Well this is a blast from the past… I was working on “the next post” about how Watson, Google, and the semantic web threaten the major BRMS systems when I found myself rewriting that what matters about Rete Algorithm-based production rule systems is that the order of the rules does not matter. It occurred to me that I must have written that a dozen times or more in my career so I did a Google and found this old paper has been on-line at the Free Library On-Line since 2006! I recall writing this in the early nineties and the owner of an AI magazine telling me he carried it with him for the better part of a year (and printing it in his magazine, I think). Some of you may enjoy reading it again!
Best,
Paul
1. WHAT IS AN EXPERT SYSTEM?
An expert system is a program that includes expertise.
2. WHAT IS EXPERTISE?
Expertise is knowledge that enables an individual, group or program to perform an intellectual task better than that task can be performed without the expertise.
3. HOW IS EXPERTISE ENCODED WITHIN A PROGRAM?
Expertise, being knowledge, falls into a few general categories.
Some knowledge is declarative knowledge which doesn’t do anything but which represents truth or state. Such knowledge can be stored in a standard database.
Some knowledge is algorithmic and can therefore be flow charted. Such knowledge can be expressed as a routine in any procedural programming language. Not everything a person or organization knows can be easily translated into data structures and flow charts, however.
4. SO FLOW-CHARTABLE ALGORITHMS SHOULD BE CODED PROCEDURALLY?
Yes, if the flow chart is actually produced, if it is not extraordinarily complex, and if there is no reason to believe that the flow chart is incorrect or incomplete. After all, procedural languages and flow charts are isomorphic, anything that can be expressed in one can be expressed using the other. However, if it is hard to produce a flow chart, it will be even more difficult to produce a working program using a procedural language.
5. WHAT KIND OF KNOWLEDGE CANNOT BE DIAGRAMMED AS DATA OR FLOW CHARTED?
A decade of experience in the development of knowledge-based or expert systems has demonstrated that the knowledge that is neither strictly declarative nor strictly algorithmic is the crucial ingredient that distinguishes experts from non-experts. Thankfully, this is intuitively satisfying since experience–not just books–builds expertise. Books typically provide strictly declarative or algorithmic knowledge. Experience, on the other hand, teaches the special cases and the exceptions to more general knowledge that cannot be effectively taught in the classroom or enumerated in books.
6. IS ANYTHING BESIDES SPECIAL CASE KNOWLEDGE DIFFICULT TO DIAGRAM?
Special case knowledge as acquired through experience is just one form of the broader class of conditional knowledge which is inherently difficult to organize and, therefore, to flow chart or otherwise diagram. Any program whose requirements include codifying or otherwise considering a significant amount of conditional logic can be impractical to flow chart. Simply taking a large number of flow chart decision nodes, each of which have two links coming out of them, and linking them all up into a reasonable flow chart can be extremely difficult.
7. BUT WHAT IF THE CONDITIONAL KNOWLEDGE NEEDED CAN BE FLOW CHARTED?
Not all conditional logic is difficult to flow chart, but most is. If a flow chart can be produced you should consider how complex it is and how difficult it would be to remove or add a few more conditions. Of course, the complexity of the flow chart will relate directly to the complexity, and therefore to the error proneness, of its implementation in software. If changing the flow chart as inconsistent or incomplete program specifications are clarified, elaborated, or otherwise evolve is difficult, then debugging and maintaining the program will also be difficult. This is particularly tale when developing knowledge-based or expert systems. Experiential knowledge, the heart of expertise, is hard to elicit from experts, even by practiced experts in protocol and systems analysis called “knowledge engineers”. Any knowledge which is elicited must necessarily be considered unreliable and incomplete. Therefore, knowledge engineers particularly concern themselves with how easy it will be to modify or add conditional logic which accommodates the special case knowledge that constitutes expertise.
8. WHAT TOOLS DO KNOWLEDGE ENGINEERS USE TO CODIFY EXPERTISE?
Every expert system that has been developed has been rule-based
9. NOT ALL INTELLIGENT PROGRAMS CAN BE STRICTLY RULE-BASED!
No, of course not. Rule-based programming is not the only technique used within artificially intelligent programs. But role-based programming is the only technique that can be used to directly codify expertise.
Many programs have been developed which do not attempt to capture conditional logic or represent knowledge explicitly. These programs use a variety of algorithmic and analytic techniques which have little or no dependence on the domain to which they are applied. Techniques which use no domain knowledge at all include classification algorithms used in artificial neural networks and inductive learning, such as “back propagation” and “ID3”. Some areas of AI even label themselves as algorithmic rather than knowledge-based, such as “genetic algorithms”. And some AI techniques, such as best-first or alpha-beta search, mix algorithms with fairly naive evaluation functions. All of these algorithmic approaches to AI emphasize search over knowledge. It has long been known within the field of artificial intelligence that intelligence involves a search versus knowledge tradeoff. At one extreme, a program can know relatively little and search a great deal, as with programs that play chess and other games. At the other extreme, a program can know a great deal but involve little or no search, as is the case with expert systems. It is the end of the spectrum that involves more knowledge than search at which rule-based systems are most appropriate and where they in are in fact more commonly and most successfully applied.
10. WHAT DO RULE-BASED SYSTEMS DO THAT OTHER TECHNIQUES DO NOT?
The class of rule-based systems called production systems removes the burden of organizing conditional knowledge into procedural flow charts and directly supports the incremental refinement and evolution of a knowledge base. In his 1976 Phi) thesis at Carnegie-Mellon University, “Production Systems as a Programming Language for AI”, Mike Rychener first described the use of production rules as a repository for conditional knowledge where each rule’s conditions remained independent of one another and required no action on the part of a programmer to organize a body of conditional knowledge into a procedural flow chart. The principal advantage that Dr. Rychener identified is that production systems, a subclass of rule-based systems, allowed conditional knowledge to be acquired and refined incrementally–without reorganizing the entire collection of production rules each tune a new condition or special case is realized through experience.
11. WHAT DISTINGUISHES A PRODUCTION SYSTEM FROM OTHER RULE-BASED SYSTEMS?
The order of rules within a production system is irrelevant. If order were not irrelevant then the programmer would be responsible for organizing rules into flow charts which would eliminate the benefit of rale-based systems in the first place, If rules are ordered then rules can be flow charted and we would find ourselves back trying to implement and maintain incredibly complex flow charts. Second, the programmer must have no responsibility for checking the conditions of one rule versus another. If the programmer controls when to check various rules then, once again, we are left attempting to flow chart what is difficult or impossible to express or maintain as a flow chart.
12. SO THEN, PROLOG IS A RULE-BASED SYSTEM BUT IT IS NOT A PRODUCTION SYSTEM?
Correct. Rule order is extremely significant in Prolog. Any Prolog program can be converted directly into a flow chart or equivalent algorithmic program. Conversely, if it’s difficult to flow chart a program it will be difficult to implement in Prolog. Prolog is principally a logic programming language for performing deduction or proof. Prolog is not generally used for knowledge-based or expert systems. The applications of Prolog documented in the AI journals and conference proceedings bear this out.
13. IF RULES SHOULD NOT BE CHECKED UNDER PROGRAM CONTROL, WHEN SHOULD RULES BE CHECKED?
Whenever the applicability of their conditions might have changed. Production systems monitor the applicability of the conditions of rules against a database. Whenever the database changes, whether by adding, deleting, or modifying a record in the database, all the rules should be checked.
14. THAT SOUNDS PRETTY INEFFICIENT!
Indeed, such a naive approach would be horribly inefficient. Nonetheless, most commercial expert system shells land up doing something equivalent or only slightly better. Fortunately, the Rete Algorithm was developed at Carnegie-Mellon University by Charles Forgy and published in his 1979 thesis, “The Rete Algorithm: A Fast Algorithm for the Many Pattern / Many Object Pattern Matching Problem”. Dr. Forgy’s thesis introduced the notion of matching the conditions of rules in a data-driven manner rather than checking rules in a procedural manner. Using the data-driven techniques of the Rete Algorithm allows the checking of rules to be focused on precisely the data that changes and to only those rules whose applicability might actually be affected by those changes. This and other Rete Algorithm techniques produced an algorithm which dramatically outperformed all other approaches.
15. SO THE RETE ALGORITHM MAKES PRACTICAL APPLICATION OF RULE-BASED SYSTEMS POSSIBLE?
Indeed. It wasn’t until 1981, after Carnegie-Mellon developed its “Official Production Systems” based on the Rete Algorithm that John McDermott published the seminal work in the commercial application of AI, “R1: A rule-based configurer of computer systems”, which describes the first commercial expert system ever developed. R1, subsequently renamed XCON by Digital Equipment Corporation, automated the task of configuring computer components within DEC’s VAX computer family. It is worth noting that several previous attempts to do precisely the same thing using procedural programming languages had already failed before Dr. McDermott succeeded. Today, this system has grown to many thousands of rules and thrives in an environment surrounded by cooperating expert systems which coordinate many aspects of the sales, manufacturing, and design of Digital’s computer systems throughout the world. This could not have happened without the Rete Algorithm.
16. DO ALL MODERN RULE-BASED SYSTEMS USE THE RETE ALGORITHM?
No. Any rule-based system which uses the Rete Algorithm will in fact be a production system. And most modern rule-based systems remain equivalent to procedural if-then-else statements with subroutine invocation. All data-driven production systems are based on the Rete Algorithm. They all outperform non-Rete-based rule-based systems dramatically–and their relative performance advantages increase roughly linearly with respect to the number of rules.
17. HOW CAN I DISTINGUISH BETWEEN A RULE-BASED SYSTEM AND A RETE ALGORITHM-BASED PRODUCTION SYSTEM?
Non-Rete-based systems will be unbearably slow. Just ask the vendor whether their rule -based system is based on the Rete Algorithm. If you cannot evaluate their software without risk, you will need to ask more questions. Of course, you can always ask for benchmarks against Rete-based systems.
18. HOW MANY PIECES OF CONDITIONAL KNOWLEDGE CAN I GET AWAY WITH BEFORE A RETE-BASED SYSTEM BECOMES NECESSARY?
Going back to the basics, as long as you can flow chart it you don’t require anile-based system of any kind, although you might wish you started with a production system if the amount of conditional logic increases over time. If you have any trouble flow charting even just a few dozen pieces of conditional knowledge you should definitely use a data-driven production system. Independent benchmarks by NASA, Ford Aerospace, and Bell Northern Research bear this out. Each of these organizations found that Rete-based systems outperformed non-Rete-based systems by at least ten to one for as few as thirty rules and well over a hundred to one as the number of rales increased.
19. ARE ALL DATA-DRIVEN PRODUCTION SYSTEMS ROUGHLY EQUIVALENT TO OPS5?
Yes and no. Like OPS5, they all provide the benefits of a is many times as fast and much smaller than the Lisp version but is quite expensive and available only for DEC
20. WHAT TOOL SUPERCEDED OPS5?
The Automated Reasoning Tool, ART’, first demonstrated in August of 1984 at the American Association of AI conference in Austin, Texas. Early in 1984, Paul Haley left Carnegie-Mellon University to join in the forming of what became ART’s development team. Paul had been working at Carnegie-Mellon with Drs. McDermott and Forgy where he was primarily responsible for several of the DEC expert systems and contributed to the algorithms, investigations of parallelism and measurement of production systems. It was Paul’s expertise that focused Inference Corporation on data-driven production systems as the architecture of ART. As a result, ART supports a much richer pattern matching syntax than does OPS5 and supports free access to external routines within both the conditions and actions of rules. Also, ART supports automatic generation of goals and an advantageous form of backward chaining described in Haley’s “Backward Chaining” paper2.
21. BUT WASN’T ART IMPLEMENTED IN LISP?
Yes. Many AI practitioners were quite surprised that ART was able to compete with Bliss OPS5 in terms of performance. ART is vastly more efficient than Lisp OPS5. Unfortunately, ART was only available on special purpose Lisp machines which limited its applicability in commercial markets where VAX were more acceptable.
22. ARE THERE ANY OTHER ALTERNATIVES TO OPS5 OR ART?
Yes. After benchmarking many expert system shells and rule-based programming languages, NASA standardized on Rete Algorithm-based production systems and selected ART over OPS5, presumably because ART is more expressive and more functional. NASA was concerned about ART’s portability, however, and also desired personal computer versions of ART for internal use, especially for training. Since Inference was either unwilling or unable to provide NASA with a portable version of ART that would support PCs, NASA simply cloned ART’S syntax and functionality. They implemented a C-Language-Integrated Production System, dubbed CLIPS. CLIPS entered the public domain in 1986 and has since been learned and applied by literally thousands of government, corporate, and academic users.
23. SO WHY WOULD ANYONE USE ART RATHER THAN CLIPS?
From a technical perspective, CLIPS was implemented by non-AI professionals. CLIPS is not nearly as robust in its performance as is ART and ART offers much more functionality than CLIPS. From a business perspective, ARTis supported by Inference Corporation. If you need to ask a question or if you require training, customization, or consulting of any kind a commercial enterprise is there to back up its product. With Inference Corporation and ART, there is expertise and commitment. With CLIPS, no one is
24. BUT ART IS EXPENSIVE AND IT REQUIRES EXPENSIVE HARDWARE TO RUN LISP.
True and not so true. Functionality and service have their price. But ART -IM3 is now available. ART-IM builds on CLIPS and is implemented in C. Like ART, it is more robust and offers more functionality than CLIPS. CLIPS, being public domain is obviously cheaper, but in every sense of the word.
25. IS THERE ANY OBJECTIVE EVIDENCE THAT ART IS A SUPERIOR PRODUCT?
Certainly. Every year the American Association for AI recognizes certain applications as being “Innovative Applications of AI”. Invariably these applications are notable success stories concerning the commercial application of artificial intelligence. Applications developed in ART or similar syntax languages based on the Rete Algorithm have consistently dominated these awards over the years. And this dominance has occurred despite the fact that there are many times more users of alternative tools that are not based on the Rete Algorithm.
26. THEN I SHOULD USE ART-IM.
Perhaps. ART-IM is a good product, but despite being implemented in C for portability, ART-IM is still rather large and slow, and it does not offer all the functionality of Lisp-based ART. Moreover, ART -IM is staggeringly expensive–both for development and delivery of applications. If you can afford the software and there are adequate MIPS and megabytes available for both development and delivery you won’t fail simply because you select ART-IM.
27. WHAT IF I AM CONCERNED ABOUT BUDGET AND HARDWARE REQUIREMENTS?
Then you should consider Eclipse from The Haley Enterprise.
28. WHAT ARE THE RELATIVE ADVANTAGES OF ECLIPSE VERSUS CLIPS OR ART-IM?
Syntactically, Eclipse is very compatible with either but offers more functionality than CLIPS and, in some areas, is more powerful than ART-IM. Computationally, Eclipse is much more embeddable than either and requires roughly one half the memory of CLIPS and one fourth the memory required by ART -IM. In terms of performance, Eclipse varies from at least two, but typically four to well over 10 times faster than Bliss OPS5, CLIPS or ART-IM. These claims are supported by the NASA and Bell Northern Research benchmarks and by the testimony of many who have ported from these other languages to Eclipse. Economically, Eclipse is available for less than one tenth the cost of ART-IM. DOS and Windows versions of Eclipse actually cost less than the amount required to obtain CLIPS from NASA or public domain software houses. And the cost of delivering intelligent applications–especially when hardware requirements are considered–is much lower with Eclipse.
29. IS THE ONLY REASON FOR USING ECLIPSE ITS SUPERIOR PRICE/PERFORMANCE?
No. Like Inference Corporation, Haley Systems is a commercial entity that possesses substantial expertise and stands behind its products. Unlike CLIPS, experts are available to answer questions, to provide training, perform custom extensions and integration, and to perform application and other consulting as required its customers. Also, Eclipse supports more advanced functionality and better integration -especially on personal computers. Eclipse supports more powerful logical reasoning capabilities and provides more expressive pattern matching. Eclipse allows declarative domain and procedural control knowledge to be explicitly separated and codified using rules or procedures–retaining more of the advantages of a pure production system even as the requirement to procedurally control and embed a knowledge-base is implemented within any real application. And there are many other functional considerations that favor Eclipse.
30. WHAT IS HALEY SYSTEMS
Haley Systems, Inc. is the worldwide technology leader in business rules management systems that provide the fastest cycle times associated with automating business rules (policies, procedures, regulations, logic, etc.). Haley’s solutions empower both business and IT users with the ability to automate the process from requirements through deployment Haley’s knowledge management tool provides unique accessibility to business users while its rules engine provides the fastest performance and the smallest footprint. Based in Pittsburgh, PA, the home of Carnegie Mellon University, Haley provides enterprises with expert system software and services to get to market more quickly, at a lower cost, with greater agility. An experienced management team keeps Haley at the front of Business Rules Engine (BRE) and knowledge management technology innovation. They are committed to supporting customers’ requirements to deliver quality products and services today while continuing to develop solutions to meet tomorrow’s IT demands.
I’m amazed you had that lying around and found it. Thanks for sharing.