If you are using one of the more popular rules engines, chances are you can blame me. I popularized the technology of forward-chaining production rules based on the Rete Algorithm. Others have certainly contributed; my path is the one that led to open-source implementations and many commercial products, including those of IBM, Oracle, SAP, TIBCO, Red Hat, and too many others to mention (e.g., see this).
Today, I want to make clear that the future prospects for production rule technology are diminishing. My objective here is to explain why most rule-based technologies are no good and why some are much better. Although production rule technology is much better than most rule-based technologies, I hope to also make clear that in the age of IBM’s Watson, Google’s Brain, and the semantic web, production rule technology is inadequate.
They are not created equal.
Rules have become so pervasive in the software business that vendors of all types of software say they have them. Consider, for example, that even Microsoft Outlook has rules!
Unfortunately, people generally don’t appreciate the differences between different types of rules. This makes it an uphill battle for someone delivering more value than another who has the most rudimentary functionality. IBM’s Operational Decision Manager (formerly Ilog JRules) and Fair Isaac’s Blaze Advisor are excellent business rules management systems (BRMS) based on production rule technology. And yet, their sales reps have to deal repeatedly with prospective clients who think products from Microsoft or Progress are effective competitors.
Since the nineties, most of the rule-based capabilities added to software of all types are nothing more than the glorified if-then-else statements found in every major programming language. Even some self-described rule-based technology vendors offer little more from a technology perspective. For example, Bosch Innovation’s Visual Modeler looks like a visual programming language in which you draw if-then-else statements. Now visual programming may be a nice and useful capability, but the benefits of artificial intelligence technology hardly accrue from dressing up a procedural language!
Bells and Whistles
There have been some nifty innovations in user interfaces to make coding business logic easier. The most significant has been the development of less technical rule languages that look something like English. My former company led that charge but the products from IBM, Fair Isaac, and Oracle are very nice even if they still require an analyst to translate from English into a structured syntax or controlled natural language.
Another significant metaphor was championed by Corticon, now owned by Progress. Their tabular metaphor cannot handle many of the rules that come up in practice, but for those willing to limit their focus and functionality to what does fit in the metaphor, it is very nice in several regards. So nice, in fact, that rather than sell against it, all the vendors, including my former company, introduced their own tabular metaphors.
The march from technical rule syntax to increasingly natural language business rules increased the market for production rule technology dramatically beginning with Haley Systems’ Authority circa 2000. The tabular metaphor was a flash in the pan by comparison. Neither advanced the underlying technology, however. In fact, as mentioned above, the tabular metaphor reduced its expressiveness and power. Nonetheless, the tabular metaphor allowed some people to begin using business rules technology who might not otherwise have adopted it as quickly.
How many people understand that IBM, Fair Isaac, and Oracle are essentially equivalent in their rule technology? How many people understand that each of these is vastly superior in capability to Microsoft, Progress or Visual Rules?
Even more importantly, how many people understand how weak each of IBM, Fair Isaac, Oracle, SAP, TIBCO, and Red Hat is in its ability to perform logical deduction? And how many understand the ramifications of weak deductive capabilities?
Who needs deduction?
One of my early frustrations with production rule technology was its inability to perform deduction. A simple example may help you see this:
- A person’s parent’s sibling’s child is his or her cousin.
- A person’s parent’s other child is his or her sibling.
In production rule technology you have to write each rule as an if-then. The Rete Algorithm efficiently computes when the if part of each rule is satisfied. This is the algorithm almost every vendor uses (e.g., the IBM, Fair Isaac, Oracle, SAP, TIBCO, Red Hat, CLIPS, and JESS engines). In today’s market, a vendor who does not use the core concepts of the Rete Algorithm is a weak competitor. (That is about to change, though.)
So, in any of these tools, you have to write the rules like this:
- if a person has a parent that has a sibling that has a child then the child is a cousin of the person
- if a person has a parent that has a child that is different than the person then the child is a sibling of the person
The problem with production rule technology is that it will end up materializing every cousin relationship, which is geometric in the number of people! (It will also end up materializing every sibling relationship.) And the problem is actually twice as bad, since cousin and sibling are symmetric relationships!
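To make the materialization problem concrete, here is a minimal Python sketch of what a data-driven engine effectively does with the two rules above. It is not the Rete Algorithm itself, only a naive stand-in that derives every conclusion eagerly. The family, the names, and the `forward_chain` helper are all hypothetical.

```python
from itertools import product

# Hypothetical toy family stored as (parent, child) pairs.
parent_of = {
    ("grandma", "ann"), ("grandma", "bob"),
    ("ann", "carl"), ("ann", "dora"),
    ("bob", "eve"), ("bob", "finn"),
}

def forward_chain(parent_of):
    """Eagerly derive every sibling and cousin fact, whether or not anyone asks."""
    # Rule: a person's parent's other child is his or her sibling.
    sibling = {
        (x, y)
        for (p1, x), (p2, y) in product(parent_of, repeat=2)
        if p1 == p2 and x != y
    }
    # Rule: a person's parent's sibling's child is his or her cousin.
    cousin = {
        (x, y)
        for (p, x) in parent_of
        for (s1, s2) in sibling if s1 == p
        for (q, y) in parent_of if q == s2
    }
    return sibling, cousin

sibling, cousin = forward_chain(parent_of)
print(len(sibling), len(cousin))  # prints: 6 8
```

Seven people already yield fourteen derived facts, each relationship stored in both directions, and all of them exist before a single question has been asked.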
Any logic programming system handles this problem easily. For example, Prolog will deduce only the cousin facts that are needed! (There is too much more here to cover, but consider whether your system could answer whether the parents of a person with a cousin were only children. Watson might. Our new stuff does.)
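For contrast, a goal-directed evaluation can be sketched in Python as follows. This is not how Prolog is implemented (there is no unification or backtracking here); it only illustrates the idea of starting from the question and touching just the facts that question needs. The family and all function names are hypothetical.

```python
# Hypothetical toy family stored as (parent, child) pairs.
parent_of = {
    ("grandma", "ann"), ("grandma", "bob"),
    ("ann", "carl"), ("ann", "dora"),
    ("bob", "eve"), ("bob", "finn"),
}

def parents(x):
    return {p for (p, c) in parent_of if c == x}

def children(p):
    return {c for (q, c) in parent_of if q == p}

def siblings(x):
    # A person's parent's other child is his or her sibling.
    return {y for p in parents(x) for y in children(p) if y != x}

def cousins(x):
    # A person's parent's sibling's child is his or her cousin.
    return {y for p in parents(x) for s in siblings(p) for y in children(s)}

def is_cousin(x, y):
    """Answer one question; nothing is stored for anyone else."""
    return y in cousins(x)

print(sorted(cousins("carl")))   # prints: ['eve', 'finn']
print(is_cousin("carl", "dora")) # prints: False
```

Nothing is derived about eve, finn, or dora unless someone asks about them, which is the behavior the paragraph above attributes to logic programming systems.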
I solved this problem a long time ago by implementing opportunistic backward chaining for the Rete Algorithm, but none of IBM, Fair Isaac, TIBCO, etc. have similar functionality!
Looking for a link back into my blog to support these points, I just found this unpublished draft from October, 2011:
Working in the shadow (or on the shoulders) of Allen Newell at CMU, we focused on the knowledge level. Unfortunately, technology got in the way. In those days, we were using OPS5, a data-driven production system, much like the business rules engines of today (most of which are based on derivatives of the Rete Algorithm first introduced in OPS5). Such languages are incapable of expressing knowledge as truth in even some of the most trivial cases. From a logical perspective, a production rule is a fairly pathetic thing. Even the simplest of logic programs can represent truths and perform many deductions that cannot be captured even in many production rules.
For example, the if-then statements above are awkward derivatives of the prior statements of truth, aren’t they? More importantly, the algorithms that implement production rule systems don’t treat the first and second pair of sentences as equivalent! It’s worth pondering that.
Run-time behavior in production rule systems is not logical!
So, why don’t BRMS use Prolog?
Now we’re getting somewhere. Well, I can tell you why I eschewed Prolog in the eighties…
- Performance of the Rete Algorithm is asymptotically independent of the number of rules (i.e., it scales well).
- The order of production rules has no effect on the nominal behavior or performance of the Rete Algorithm.
  - If rule order matters (as it does in any flow-chart-equivalent metaphor) the cost of adding rules increases as the number of rules increases (i.e., it scales poorly).
- The Rete Algorithm is perfectly comfortable with interpreting unknown as false.
  - For example, interpreting the lack of a record in a SQL database as indicating that something is not the case.
  - This is called “negation under the closed-world assumption”, which is critical in practice and big trouble for Prolog.
- Prolog can run away in recursive sub-goaling until it “runs off the end of the stack” or until it exhausts physical memory with sub-goals or irrelevant answers.
  - This makes Prolog exceedingly difficult to use and probably explains more than any other single reason why it is not more popular.
- Prolog cannot deal with negation under the closed-world assumption logically.
  - Using Prolog’s cut operator is hardly a band-aid; it makes program behavior rely on the order of rules (i.e., it scales poorly; see above).
  - Where negation matters (e.g., when using a SQL database), Prolog is more of a procedural language than a knowledge representation and reasoning system.
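The closed-world reading of negation in the list above can be illustrated with a small Python sketch. The customer names, the `has_account` set (standing in for rows of a database table), and the `flag_customers` helper are all hypothetical. The point is only that "no record" is treated as "false", and that a conclusion drawn this way can disappear when a fact is added later.

```python
# Hypothetical data: a set standing in for rows of an accounts table.
customers = ["alice", "bob", "carol"]
has_account = {"alice", "bob"}

def flag_customers(customers, has_account):
    """Rule: if a customer has no account record then flag the customer.

    Absence of a record is read as 'does not have an account'
    (negation under the closed-world assumption).
    """
    return [c for c in customers if c not in has_account]

before = flag_customers(customers, has_account)
print(before)  # prints: ['carol']

# Adding a fact withdraws an earlier conclusion: the reasoning is non-monotonic.
has_account.add("carol")
after = flag_customers(customers, has_account)
print(after)   # prints: []
```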
So if Prolog doesn’t work, what will?
Things have changed.
- In the eighties and nineties, and until recently, there was no semantic web, and until Watson there was little interest in question answering outside of Project Halo.
- More recently, there have been significant advances in logic programming that resolve each of the deficiencies mentioned above.
[The following will be redrafted as part 2 is completed.]
But even though things have changed, people in the know knew that business rule systems are logically weak. For example, OMG’s production rule representation (PRR) has gained no traction because it’s just plain boring. In W3C’s efforts to standardize rules for the semantic web, the business rule vendors carved off a production rule dialect of the rule interchange format (RIF-PRD) that they could handle while everyone working on the actual semantic web and with reasoning technology was consumed with a basic logic dialect (RIF-BLD) that would overcome the weaknesses of the earlier semantic web rule language (SWRL) and provide more logical functionality than can be expressed directly in the web ontology language (OWL).
Bottom line here: those in the know know that production rules are pathetic things, logically speaking.
But how does all the noise from the semantic web relate to business logic in enterprises!?
For now, I’ll give you a symptom that is most troubling for practitioners and vendors in the industry…
Why have standards failed repeatedly in the business rules industry? Why don’t any business rules vendors support OMG’s SBVR or RIF-BLD?
The answer, quite simply, is that they can’t! And now it matters.
To be continued…
E.g., more on first-order logic (including SBVR and Common Logic) and concerning the well-founded semantics (WFS), tabling Prolog (SLG resolution), defeasibility, non-monotonicity, and transforming first-order logic into defeasible non-monotonic logic programs based on the WFS and using SLG (i.e., Vulcan’s SILK and Coherent)