Natural Language Leadership at the Allen Institute for Artificial Intelligence (AI2)

Oren Etzioni is a marvelous choice to lead the Allen Institute for AI (aka AI2).  The NL/ML path is the right path for scaling up the deep knowledge that Paul Allen’s vision of a Digital Aristotle requires.  You can read more about it below and here’s more background on the change in the direction and on some evidence that the path holds great promise.

Going beyond Siri and Watson: Microsoft co-founder Paul Allen taps Oren Etzioni to lead new Artificial Intelligence Institute

Pedagogical applications of proofs of answers to questions

In Vulcan’s Project Halo, we developed means of extracting the structure of logical proofs that answer advanced placement (AP) questions in biology.  For example, the following shows a proof that separation of chromatids occurs during prophase.

textual explanation of entailment using the Linguist and SILK

This explanation was generated using capabilities of SILK built on those described in A SILK Graphical UI for Defeasible Reasoning, with a Biology Causal Process Example.  That paper gives more details on how the proof structures of questions answered in Project Sherlock are available for enhancing the suggested questions of Inquire (which is described in this post, which includes further references).  SILK justifications are produced using a number of higher-order axioms expressed using Flora’s higher-order logic syntax, HiLog.  These meta rules determine which logical axioms can or do result in a literal.  (A literal is a positive or negative atomic formula, such as a fact, which can be true, false, or unknown.  Something is unknown if it is not proven as true or false.  For more details, you can read about the well-founded semantics, which is supported by XSB. Flora is implemented in XSB.)
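To make the true/false/unknown distinction concrete, here is a minimal Python sketch of the well-founded semantics for a tiny propositional program, computed with the standard alternating-fixpoint construction.  It is only an illustration: the rule encoding, the function names, and the example program are my own assumptions, and this is not how XSB, Flora, or SILK actually implement it.

```python
# Each rule is (head, positive_body, negative_body); all names are hypothetical.
# Example program:
#   p :- not q.     q :- not p.     r.     s :- r, not t.
program = [
    ("p", [], ["q"]),
    ("q", [], ["p"]),
    ("r", [], []),
    ("s", ["r"], ["t"]),
]


def least_model(rules, assumed_true):
    """Least model of the reduct of `rules` with respect to `assumed_true`."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if head in derived:
                continue
            # A rule is blocked if any atom it negates is assumed true.
            if any(atom in assumed_true for atom in neg):
                continue
            if all(atom in derived for atom in pos):
                derived.add(head)
                changed = True
    return derived


def well_founded(rules):
    """Map every atom to 'true', 'false', or 'unknown'."""
    atoms = set()
    for head, pos, neg in rules:
        atoms.add(head)
        atoms.update(pos)
        atoms.update(neg)
    true = set()
    while True:
        possible = least_model(rules, true)      # overestimate of what may hold
        new_true = least_model(rules, possible)  # underestimate of what must hold
        if new_true == true:
            break
        true = new_true
    return {
        atom: "true" if atom in true else "unknown" if atom in possible else "false"
        for atom in atoms
    }


model = well_founded(program)
for atom in sorted(model):
    print(atom, model[atom])
```

In this example `r` and `s` are proven true, `t` is false because nothing can derive it, and `p` and `q` stay unknown because each depends only on the negation of the other.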

Now how does all this relate to pedagogy in future derivatives of electronic learning software or textbooks, such as Inquire?

Well, here’s a use case:

Translating English into Logic using the Linguist

Now that the patent filings are done, we can discuss and show more about the Linguist…

The following link is a video that shows a sentence from Project Sherlock being translated from English into first-order logic using the patent-pending Linguist™ software.

The hydrophobic ends of the lipids of a cell’s plasma membrane are oriented away from the cell’s cytoplasm.

This video was recorded in October, 2012.  More recent versions of the Linguist can render the logic in more ways, such as shown below:

A grammatically disambiguated and logically formalized English sentence using Automata Linguist

Semantic Technology & Business Conference (SemTechBiz)

Benjamin Grosof and I will be presenting the following review of recent work at Vulcan towards Digital Aristotle as part of Project Halo at SemTechBiz in San Francisco the first week of June.

Acquiring deep knowledge from text

We show how users can rapidly specify large bodies of deep logical knowledge starting from practically unconstrained natural language text.

English sentences are semi-automatically interpreted into predicate calculus formulas and logic programs in SILK, an expressive knowledge representation (KR) and reasoning system.  SILK tolerates the practically inevitable logical inconsistencies that arise in large knowledge bases acquired from and maintained by distributed users with varying linguistic and semantic skill sets.  These users collaboratively disambiguate grammar, logical quantification and scope, co-references, and word senses.

The resulting logic is generated as Rulelog, a draft standard under W3C Rule Interchange Format’s Framework for Logical Dialects, and relies on SILK’s support for FOL-like formulas, polynomial-time inference, and exceptions to answer questions such as those found in advanced placement exams.

We present a case study in understanding cell biology based on a first-year college level textbook.

Deep QA

Our efforts at acquiring deep knowledge from a college biology text have enabled us to answer a number of questions that are beyond what has been previously demonstrated.

For example, we’re answering questions like:

  1. Are the passageways provided by channel proteins hydrophilic or hydrophobic?
  2. Will a blood cell in a hypertonic environment burst?
  3. If a Paramecium swims from a hypotonic environment to an isotonic environment, will its contractile vacuole become more active?

A couple of these are at higher levels on the Bloom scale of cognitive skills than Watson can reach (which is significantly higher than search engines).

As some other posts have shown in images, we can translate completely natural sentences into formal logic.  We actually do the reasoning using Vulcan’s SILK, which has great capabilities, including defeasibility.  We can also output to RIF or SBVR, but the temporal aspects and various things such as modality and the need for defeasibility favor SILK or Cyc for the best reasoning and QA performance.
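To illustrate what defeasibility buys you, here is a minimal Python sketch in which a more specific rule overrides a general default.  The `Rule` class, the priorities, and the biology facts are all hypothetical names chosen for this example.  SILK resolves conflicts with far richer machinery, so treat this as a simplified prioritized-override scheme and not as SILK's mechanism.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    applies: Callable[[dict], bool]  # test against the known facts
    predicate: str                   # what the rule is about
    value: bool                      # the truth value it concludes
    priority: int                    # higher priority defeats lower


# Default: a cell in a hypotonic environment bursts.
# Exception: a cell with a cell wall does not.
rules = [
    Rule("hypotonic-default",
         lambda f: f.get("environment") == "hypotonic", "bursts", True, 1),
    Rule("cell-wall-exception",
         lambda f: f.get("environment") == "hypotonic" and f.get("cell_wall", False),
         "bursts", False, 2),
]


def conclude(rules, facts, predicate) -> Optional[bool]:
    """Return True/False if the top-priority applicable rules agree, else None."""
    applicable = [r for r in rules if r.predicate == predicate and r.applies(facts)]
    if not applicable:
        return None  # nothing proven either way: unknown
    top = max(r.priority for r in applicable)
    verdicts = {r.value for r in applicable if r.priority == top}
    return verdicts.pop() if len(verdicts) == 1 else None  # unresolved conflict


blood_cell = {"environment": "hypotonic", "cell_wall": False}
plant_cell = {"environment": "hypotonic", "cell_wall": True}
print(conclude(rules, blood_cell, "bursts"))
print(conclude(rules, plant_cell, "bursts"))
```

The default still holds for the blood cell, while the exception defeats it for the plant cell, and no rule had to be rewritten to accommodate the exception.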

One thing in particular is worth noting:  this approach does better with causality and temporal logic than is typically considered by most controlled natural language systems, whether they are translating to a business rules engine or a logic formalism, such as first order or description logic.  The approach promises better application development and knowledge management capabilities for more of the business process management and complex event processing markets.

Project Sherlock

Working as part of Vulcan’s Project Halo[1], Automata is applying a natural language understanding system that translates carefully formulated sentences into formal logic so as to answer questions that typically require deeper knowledge and inference than demonstrated by Watson.

The objective over the next three quarters is to acquire enough knowledge from the 9th edition of Campbell’s Biology textbook to demonstrate three things.

  • First, that the resulting system answers, for example, biology advanced placement (AP) exam questions more competently than existing systems (e.g., Aura[2] or Inquire[3]).
  • Second, that knowledge from certain parts of the textbook is effectively translated from English into formal knowledge with sufficient breadth and depth of coverage and semantics.
  • Third, that the knowledge acquisition process proves efficacious and accessible to less than highly skilled knowledge engineers so as to accelerate knowledge acquisition beyond 2012.

Included in the second of these is a substantial ontology of background knowledge expected of students in order to comprehend the selected parts of the textbook using a combination of OWL, logic, and English sentences from sources other than the textbook.

Automata is hiring logicians, linguists, and biologists to work as consultants, contractors, or employees for:

  • Interactive tree-banking and word-sense disambiguation of several thousand sentences.[4]
  • Extending its lexical ontology and a broad-coverage grammar of English with additional vocabulary and deeper semantics, especially concerning cellular biology and related scientific knowledge including chemistry, physics, and math.
  • Maturing its upper and middle ontology of domain independent knowledge using OWL in combination with various other technologies, including description logic, first-order logic, higher-order logic, modal logic, and defeasible logic.[5]
  • Enhancing its platform for text-driven knowledge engineering towards a collaborative wiki-like architecture for self-aware content in scientific education and biomedical applications.

Terms of engagement are flexible, ranging from small units of work to full-time employment.  We are based in Pittsburgh, Pennsylvania and Vulcan is headquartered in Seattle, Washington, but the team is distributed across the country and overseas.

Please contact Paul Haley by e-mail to his first name at this domain.


[1] Vulcan: http://www.vulcan.com/TemplateCompany.aspx?contentId=54; Project Halo: http://www.projecthalo.com/
Video introduction/overview: http://videolectures.net/aaai2011_gunning_halobook/

[2] Aura: http://www.ai.sri.com/project/aura

[3] Inquire: http://www.franz.com/success/customer_apps/artificial_intelligence/aura.lhtml

[4] tree-banking and WSD: http://www.omg.org/spec/SBVR and http://en.wikipedia.org/wiki/Word-sense_disambiguation

[5] e.g., SILK (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.1796 ) and SBVR (http://www.omg.org/spec/SBVR)

IBM Ilog JRules for business modeling and rule authoring

If you are considering the use of any of the following business rules management systems (BRMS):

  • IBM Ilog JRules
  • Red Hat JBoss Rules
  • Fair Isaac Blaze Advisor
  • Oracle Policy Automation (i.e., Haley in Siebel, PeopleSoft, etc.)
  • Oracle Business Rules (i.e., a derivative of JESS in Fusion)

you can learn a lot by carefully examining this video on decisions using scoring in Ilog.  (The video is also worth considering with respect to Corticon since it authors and renders conditions, actions, and if-then rules within a table format.)

This article is a detailed walk-through that stands completely independently of the video (I recommend skipping the first 50 seconds and watching for 3 minutes or so).  You will find detailed commentary and insights here, sometimes fairly critical but in places complimentary.  JRules is a mature and successful product.  (This is not to say to a CIO that it is an appropriate or low-risk alternative, however. I would hold on that assessment pending an understanding of strategy.)

The video starts by creating a decision table using this dialog:

Note that the decision reached by the resulting table is labeled but not defined, nor is the information needed to consult the table specified.  As it turns out, this table will take an action rather than make a decision.  As we will see, it will “set the score of result to a number”. As we will also see, it references an application.  Given an application, it follows references to related concepts, such as borrowers (which it errantly considers synonymous with applicants), concerning which it further pursues employment information.


Google vs. Facebook and Bing (again)

Almost a year ago, I wrote about semantics and social networking as threats to Google.  In that post, I referenced a prior article on investments in natural language processing, such as Microsoft’s acquisition of Powerset, which is now part of Bing.

Today, there are two articles I recommend.  The first addresses the extent to which Google’s Super Bowl ad is a response to the threat from Bing.  The second addresses Facebook overtaking Google.

How is a process an event?

Today, I came upon some commentary by a business rules colleague, Carlos Serrano-Morales of Fair Isaac, concerning a presentation I made at the Business Rules Forum.  During the presentation I showed some sentences that are beyond the current state of the art in the business rules industry.  Generally speaking, these were logical statements that did not use the word “if”.  (Note, however, that many of them could be expressed in SBVR, OMG’s semantics of business vocabulary and rules standard.)  Carlos argued that such statements should be more precisely articulated within the specific context of a business process.

Here is the slide that triggered the controversy:

AI beyond Fair Isaac


Time for the next generation of knowledge automation

In preparing for my workshop at the Business Rules Forum in Las Vegas on November 5th, I have focused on the following needs in reasoning about processes, about events, and about or over time:

  1. Reasoning at a point within a [business] process
  2. Reasoning about events that occur over time.
  3. Reasoning about a [business] process (as in deciding what comes next)
  4. Reasoning about and across different states (as in planning)

Enterprise decision management (EDM) addresses the first.  Complex event processing (CEP) is concerned with the second.  In theory, EDM could address the third but it does not in practice.  This third item includes the issue of governing and defining workflow or event-driven business processes rather than point decisions within such business processes.

Business applications of rules have not advanced to include the fourth item.  That is to say, business has yet to significantly leverage reasoning or problem solving techniques that are common in artificial intelligence.  For example, artificially intelligent question and answer systems, which are being developed for the semantic web, can do more than retrieve data – they perform inference.  Commercial database and business intelligence queries are typically much less intelligent, which presents a number of opportunities that I don’t want to go into here but would be happy to discuss with interested parties.  The point here is that business does not use reasoning much at all, let alone to search across the potential ramifications of alternative decisions or courses of action before making or taking one.  Think of playing chess or a soccer-playing robot planning how to advance the ball on goal.  Why shouldn’t business strategies or tactical business decisions benefit from a little simulated look-ahead along with a lot of inference and evaluation?
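As a rough illustration of "a little simulated look-ahead", here is a minimal Python sketch of a depth-limited search across alternative courses of action.  Everything in it is a made-up assumption for the example (the state of cash and customers, the two actions, the numbers, and the scoring function), and it is plain exhaustive search rather than any particular planner.

```python
def advertise(state):
    """Spend 50 in cash to gain 30 customers; not applicable without the cash."""
    cash, customers = state
    if cash < 50:
        return None
    return (cash - 50, customers + 30)


def sell(state):
    """Earn 2 in cash per existing customer this period."""
    cash, customers = state
    return (cash + 2 * customers, customers)


ACTIONS = [("advertise", advertise), ("sell", sell)]


def look_ahead(state, actions, evaluate, depth):
    """Return (best_score, best_plan) reachable within `depth` steps."""
    best_score, best_plan = evaluate(state), []  # stopping here is an option
    if depth == 0:
        return best_score, best_plan
    for name, apply_action in actions:
        next_state = apply_action(state)
        if next_state is None:  # action not applicable in this state
            continue
        score, plan = look_ahead(next_state, actions, evaluate, depth - 1)
        if score > best_score:
            best_score, best_plan = score, [name] + plan
    return best_score, best_plan


def cash_on_hand(state):
    return state[0]


start = (100, 10)  # 100 in cash, 10 customers
score, plan = look_ahead(start, ACTIONS, cash_on_hand, depth=3)
print(score, plan)
```

A decision made one step at a time would sell immediately, since advertising lowers cash in the short run.  Searching three steps ahead finds that advertising first and then selling twice ends with more cash, which is the kind of ramification that point decisions miss.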

Even though I have recently become more interested in the fourth of these areas, I expect the audience at the Business Rules Forum to be most interested in the first two points above.  There will also be some who have experience with complex business processes, which are common in larger enterprises.  These folks will be interested in the third item.  Only the most advanced applications, such as in biochemical process planning, will be interested in the fourth.  I don’t expect many of them to attend!

The notion of enterprise decision management (EDM) is focused on point decision making within a business process.  For enterprises that are concerned with governing business processes, a model of the process itself must be available to the business rules that govern its operation.  I’ve written elsewhere about the need for an ontology of events and processes in order to effectively integrate business process management (BPM) with business rules.  Here, and in the workshop, I intend to get a little more specific about the requirements, what is lacking in current standards and offerings, and what we’re trying to do about it.