Background for our Semantic Technology 2013 presentation

In the spring of 2012, Vulcan engaged Automata for a knowledge acquisition (KA) experiment.  This article provides background on the context of that experiment and what the results portend for artificial intelligence applications, especially in the areas of education.  Vulcan presented some of the award-winning work referenced here at an AI conference, including a demonstration of the electronic textbook discussed below.  There is a video of that presentation here.  The introductory remarks are interesting but not pertinent to this article.

Background on Vulcan’s Project Halo

Background on Vulcan's Project Halo

From 2002 to 2004, Vulcan developed a Halo Pilot that could correctly answer between 30% and 50% of the questions on advanced placement (AP) tests in chemistry.  The approaches relied on sophisticated approaches to formal knowledge representation and expert knowledge engineering.  Of three teams, Cycorp fared the worst and SRI fared the best in this competition.  SRI’s system performed at the level of scoring a 3 on the AP, which corresponds to earning course credit at many universities.  The consensus view at that time was that achieving a score of 4 on the AP was feasible with limited additional effort.  However, the cost per page for this level of performance was roughly $10,000, which needed to be reduced significantly before Vulcan’s objective of a Digital Aristotle could be considered viable.

Continue reading “Background for our Semantic Technology 2013 presentation”

Understanding English promotes better policies and requirements

Capturing some policies from a publication by the Health and Human Services department recently turned up the following….
It’s probably the case that there are more specific lists than just “some list” or “any list”, as suggested below.
This is a good thing about applying deep natural language understanding to policy statements.  It helps you say precisely what you mean, even if you are not using a rule or logic engine, but just trying to articulate your policies or requirements clearly and precisely.

HHS on expedited review

Logic from the English of Science, Government, and Business

Our software is translating even long and complicated sentences from regulations to textbooks into formal logic (i.e,, not necessarily first-order logic, but more general predicate calculus).   As you can see below, we can translate this understanding into various logical formalisms including defeasible first-order logic, which we are applying in Vulcan’s Project Halo.  This includes classical first-order logic and related standards such as RIF or SBVR, as well as building or extending an ontology or description logic (e.g., OWL-DL).

We’re excited about these capabilities in various applications, such as in advancing science and education at Vulcan and formally understanding, analyzing and automating policy and regulations in enterprises.

English translated into predicate calculus

English translated into predicate calculus

a sentence understood by Automata

an unambiguously, formally understood sentence


English translated into SILK and Prolog

English translated into SILK and Prolog

Project Sherlock

Working as part of Vulcan’s Project Halo[1], Automata is applying a natural language understanding system that translates carefully formulated sentences into formal logic so as to answer questions that typically require deeper knowledge and inference than demonstrated by Watson.

The objective over the next three quarters is to acquire enough knowledge from the 9th edition of Campbell’s Biology textbook to demonstrate three things.

  • First, that the resulting system answers, for example, biology advanced placement (AP) exam questions more competently than existing systems (e.g., Aura[2] or Inquire[3]).
  • Second, that knowledge from certain parts of the textbook is effectively translated from English into formal knowledge with sufficient breadth and depth of coverage and semantics.
  • Third, that the knowledge acquisition process proves efficacious and accessible to less than highly skilled knowledge engineers so as to accelerate knowledge acquisition beyond 2012.

Included in the second of these is a substantial ontology of background knowledge expected of students in order to comprehend the selected parts of the textbook using a combination of OWL, logic, and English sentences from sources other than the textbook.

Automata is hiring logicians, linguists, and biologists to work as consultants, contracts, or employees for:

  • Interactive tree-banking and word-sense disambiguation of several thousand sentences.[4]
  • Extending its lexical ontology and a broad-coverage grammar of English with additional vocabulary and deeper semantics, especially concerning cellular biology and related scientific knowledge including chemistry, physics, and math.
  • Maturing its upper and middle ontology of domain independent knowledge using OWL in combination with various other technologies, including description logic, first-order logic, high-order logic, modal logic, and defeasible logic.[5]
  • Enhancing its platform for text-driven knowledge engineering towards a collaborative wiki-like architecture for self-aware content in scientific education and biomedical applications.

Terms of engagement are flexible; ranging from small units of work to full-time employment.  We are based in Pittsburgh, Pennsylvania and Vulcan is headquartered in Seattle, Washington, but the team is distributed across the country and overseas.

Please contact Paul Haley by e-mail to his first name at this domain.


[1] Vulcan: http://www.vulcan.com/TemplateCompany.aspx?contentId=54; Project Halo: http://www.projecthalo.com/
Video introduction/overview :http://videolectures.net/aaai2011_gunning_halobook/

[2] Aura: http://www.ai.sri.com/project/aura

[3] Inquire: http://www.franz.com/success/customer_apps/artificial_intelligence/aura.lhtml

[4] tree-banking and WSD: http://www.omg.org/spec/SBVR and http://en.wikipedia.org/wiki/Word-sense_disambiguation

[5] e.g., SILK (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.1796 ) and SBVR (http://www.omg.org/spec/SBVR)

NLP: depictive in an HPSG lexicon?

We’re working with the English Resource Grammar (ERG), OWL, and Vulcan’s SILK to educate the machine by translating textbooks into defeasible logic.  Part of this involves an ontology that models semantics more deeply than the ERG, which is based on head-driven phrase structure grammar (HPSG), which provides deeper parsing and, with the ERG and the DELPH-IN infrastructure, also provides a simple under-specified semantic representation called minimal recursion semantics (MRS).

We’re having a great time using OWL to clarify and enrich the semantics of the rich model underlying the ERG.  Here’s an example, FYI.  If you’d like to know more (or help), please drop us a line!  Overall the project will demonstrate our capabilities for transforming everyday sentences into RIF and business rule languages using SBVR extended with defeasibility and other capabilities, all modeled in the same OWL ontology.

What triggered this blog entry was a bit of a surprise in seeing that whether or not an adjective could be used depictively is sometimes encoded in the lexicon.  This is one of the problems of TDL versus a description-logic based model with more expressiveness.  It results in more lexical entries than necessary, which has been discussed by others when contrasted with the attributed logic engine (ALE), for example.

In trying to model the semantics of words like ‘same’ and ‘different’, we are scratching our heads about these lines from the ERG’s lexicon:

  1. same_a1 := aj_pp_i-cmp-sme_le & [ ORTH < “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_same_a_as_rel”, …
  2. the_same_a1 := aj_-_i-prd-ndpt_le & [ ORTH < “the”, “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_the+same_a_1_rel”, …
  3. the_same_adv1 := av_-_i-vp-po_le & [ ORTH < “the”, “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_the+same_a_1_rel”, …
  4. exact_a2 := aj_pp_i-cmp-sme_le & [ ORTH < “exact” >, SYNSEM [ LKEYS.KEYREL.PRED “_exact_a_same-as_rel”…

One of the interesting things about lexicalized grammars is that lexical entries (i.e., ‘words’) are described with almost arbitrary combinations of their lexical, syntactic, and semantic characteristics.

The preceding code is expressed in a type description language (TDL) used by the Lisp-based LKB (and its C++ counterpart, PET, which are unification-based parsers that produce a chart of plausible parses with some efficiency.  What is given above is already deeper than what you can expect from a statistical parser (but richer descriptions of lexical entries promises to make statistical parsing much better, too).

Unfortunately, there is no available documentation on why the ERG was designed as it is, so the meaning of the above is difficult to interpret.  For example, the types of lexical entries (the symbols ending in ‘_le’) referenced above are defined as follows:

  1. aj_pp_i-cmp-sme_le := basic_adj_comp_lexent & [SYNSEM[LOCAL[CAT[HEAD superl_adj &[PRD -,MOD <[LOCAL.CAT.VAL.SPR <[–MIN def_or_demon_q_rel]>]>],VAL.SPR.FIRST.–MIN much_deg_rel],CONT.RELS <!relation,relation!>],MODIFD.LPERIPH bool,LKEYS[ALTKEYREL.PRED comp_equal_rel,–COMPKEY _as_p_comp_rel]]].
  2. aj_-_i-prd-ndpt_le := nonc-hm-nab & [SYNSEM basic_adj_abstr_lex_synsem & [LOCAL[CAT[HEAD adj & [PRD +,MINORS[MIN norm_adj_rel,NORM norm_rel],TAM #tam,MOD < anti_synsem_min >],VAL[SPR.FIRST anti_synsem_min,COMPS < >],POSTHD +],CONT[HOOK[LTOP #ltop,INDEX #arg0 &[E #tam],XARG #xarg],RELS <! #keyrel & adj_relation !>,HCONS <! !>]],NONLOC non-local_none,MODIFD notmod &[LPERIPH bool],LKEYS.KEYREL #keyrel &[LBL #ltop,ARG0 #arg0,ARG1 #xarg & non_expl-ind]]].

Needless to say, that’s a mouthful!  Chasing this down, the following ‘informs’ us that “the same”, which uses type #2 above, is defined using the following lexical types:

  1. nonc-hm-nab := nonc-h-nab & mcna.
  2. nonc-h-nab := nonconj & hc-to-phr & non_affix_bearing.
  3. mcna := word & [ SYNSEM.LOCAL.CAT.MC na ].

Which is to say that it is non-conjunctive, complements a head to form a phrase, can’t be affixed, cannot constitute a main clause, and is a word.

The fact that the lexical entry for “the same” is adjectival is given the definition of the following type(s) used in the SYNSEM feature:

  1. basic_adj_comp_lexent := compar_superl_adj_word & [SYNSEM adj_unsp_ind_twoarg_synsem & [LOCAL[CAT.VAL[COMPS <canonical_or_unexpressed & [–MIN #cmin,LOCAL [CAT basic_pp_cat,CONJ cnil,CONT.HOOK [LTOP #ltop,INDEX #ind]]]>],CONT.HOOK [ LTOP #ltop, XARG #xarg]],LKEYS [ KEYREL.ARG1 #xarg,ALTKEYREL.ARG2 #ind,–COMPKEY #cmin]]].b
  2. compar_superl_adj_word := nonc-hm-nab & [SYNSEM adj_unsp_ind_synsem & [LOCAL[CAT[HEAD[MOD <[–SIND #ind & non_expl]>,TAM #tam,MINORS.MIN abstr_adj_rel],VAL.SPR.FIRST.LOCAL.CONT.HOOK.XARG #altarg0],CONT[HOOK[XARG #ind,INDEX #arg0 & [E #tam]],RELS.LIST <[LBL #hand,ARG1 #ind],#altkeyrel & [LBL #hand,ARG0 event & #altarg0,ARG1 #arg0],…>]],LKEYS.ALTKEYREL #altkeyrel]].

Which is to say that it is a comparative or superlative adjectival word (even though it consists of two lexemes in its ‘orthography’) that involves two semantic arguments including one complement which may be unexpressed prepositional phrase.  A comparative or superlative adjective, in turn, is non-conjunctive, complements a head to form a phrase, is non-affix bearing (?), and non-clausal, as defined by the type ‘nonc-hm-nab’ above.

The types used in the syntax and semantic (i.e., SYNSEM) feature of the two lexical types are defined as follows (none of which is documented):

  1. adj_unsp_ind_twoarg_synsem := adj_unsp_ind_synsem & two_arg.
  2. adj_unsp_ind_synsem := basic_adj_lex_synsem & lex_synsem & adj_synsem_lex_or_phrase & isect_synsem & [LOCAL.CONT.HOOK.INDEX #ind,LKEYS.KEYREL.ARG0 #ind].

In a moment, we’ll discuss the types used in the second of these, but first, some basics on the semantics that are mixed with the syntax above.

In effect, the above indicates that a new ‘elementary predication’ will be needed in the MRS to represent the adjectival relationship in the logic derived in the course of parsing (i.e., that’s what ‘unsp_ind’ means, although it’s not documented, which I will try not to bemoan much further.)

The following indicates that the newly formed elementary predicate is not (initially) within any scope and that it has two arguments whose semantics (i.e., their RELations) are concatenated for propagation into the list of elementary predications that will constitute the MRS for any parses found.

  1. two_arg := basic_two_arg & [LOCAL.CONT.HCONS <! !>].
  2. basic_two_arg := unspec_two_arg & lex_synsem.
  3. unspec_two_arg := basic_lex_synsem & [LOCAL.ARG-S <[LOCAL.CONT.HOOK.–SLTOP #sltop,NONLOC [SLASH[LIST #smiddle,LAST #slast],REL [LIST #rmiddle,LAST #rlast],QUE[LIST #qmiddle,LAST #qlast]]],[LOCAL.CONT.HOOK.–SLTOP #sltop, NONLOC[SLASH[LIST #sfirst,LAST #smiddle],REL[LIST #rfirst,LAST #rmiddle],QUE[LIST #qfirst,LAST #qmiddle]]]>,LOCAL.CONT.HOOK.–SLTOP #sltop,NONLOC[SLASH[LIST #sfirst,LAST #slast],REL[LIST #rfirst,LAST #rlast],QUE[LIST #qfirst,LAST #qlast]]].
  4. lex_synsem := basic_lex_synsem & [LEX +].

The last of these expresses that the constuction is lexical rather than phrasal (which includes clausal in the ERG).

Continuing with the definition of “the same” as an adjective, the following finally clarifies what it means to be a basic adjective:

  1. basic_adj_lex_synsem := basic_adj_abstr_lex_synsem & [LOCAL[ARG-S <#spr . #comps>,CAT[HEAD adj_or_intadj,VAL[SPR<#spr & synsem_min &[–MIN degree_rel,LOCAL[CAT[VAL[SPR *olist*,SPEC <[LOCAL.CAT.HS-LEX #hslex]>],MC na],CONT.HOOK.LTOP #ltop],NONLOC.SLASH 0-dlist,OPT +],anti_synsem_min &[–MIN degree_rel]>,COMPS #comps],HS-LEX #hslex],CONT.RELS.LIST <#keyrel,…>],LKEYS.KEYREL #keyrel & [LBL #ltop]].

Well, ‘clarifies’ might not have been the right word!  Essentially, it indicates that the adjective may have an optional degree specifier (which semantically modifies the predicate of the adjective) and that the predicate specified in the lexical entry becomes the predicate used in the MRS.  The rest is defined below:

  1. basic_adj_abstr_lex_synsem := basic_adj_synsem_lex_or_phrase & abstr_lex_synsem & [LOCAL.CONT.RELS.LIST.FIRST basic_adj_relation].
  2. basic_adj_synsem_lex_or_phrase := canonical_synsem & [LOCAL[AGR #agr,CAT[HEAD[MINORS.MIN basic_adj_rel],VAL[SUBJ <>,SPCMPS <>]],CONT.HOOK[INDEX non_conj_sement,XARG #agr]]].
  3. canonical_synsem := expressed_synsem & canonical_or_unexpressed.
  4. expressed_synsem := synsem.
  5. canonical_or_unexpressed := synsem_min0.
  6. synsem_min0 := synsem_min & [LOCAL mod_local,NONLOC non-local_min].

Which ends with a bunch of basic setup types except for constraining for relation for an adjective to be ‘basically adjectivally’ on the first two lines.  Also on these first two lines, it specifies that its subject and its specifier, if any, must be completed (i.e., empty) and agree with its non-conjunctive argument (which is not to say that it cannot be conjunctive, but that it modifies the conjunction as a whole, if so.)  Whether or not it is expressed will determine if there are any further predicates about its arguments or if its unexpressed argument is identified by an otherwise unreferenced variable in any resulting MRS.

The lexical grounding of this type specification is given below, indicating that it may (or not) have phonology (e.g., pronunciation, such as whether its onset is voiced) and if and how and with what punctuation it may appear, if any.  In general, a semantic argument may be lexical or phrasal and optional but if it appears it corresponds to some semantic index (think variable) in sort of predicate in any resulting MRS.  (The *_min types do not constrain the values of their features any further).

  1. basic_lex_synsem := abstr_lex_synsem & lex_or_nonlex_synsem.
  2. abstr_lex_synsem := canonical_lex_or_phrase_synsem & [LKEYS lexkeys].
  3. canonical_lex_or_phrase_synsem := canonical_synsem & lex_or_phrase.
  4. lex_or_phrase := synsem_min2.
  5. synsem_min2 := synsem_min1 & [LEX luk,MODIFD xmod_min,PHON phon_min,PUNCT punctuation_min].
  6. synsem_min1 := synsem_min0 & [OPT bool,–MIN predsort,–SIND *top*].
  7. adj_synsem_lex_or_phrase := basic_adj_synsem_lex_or_phrase &[LOCAL[CAT.HEAD.MOD <synsem_min &[LOCAL[CAT[HEAD basic_nom_or_ttl & [POSS -],VAL[SUBJ <>,SPR.FIRST synsem &[–MIN quant_or_deg_rel],COMPS <>],MC na],CONJ cnil],–SIND #ind]>,CONT.HOOK.XARG #ind]].

Note that an adjective is not possess-able and that it modifies something nominal (or a title) and that if it has a specifier that it is a quantifier or degree (e.g., ‘very’).  Again, an adjective cannot function as a main clause or be conjunctive (in and of itself).

Finally, if you look far above you will see that the basic semantics of an adjective with an additional semantic argument is ‘intersective’, as in:

  1. isect_synsem := abstr_lex_synsem & [LOCAL[CAT.HEAD.MOD <[LOCAL intersective_mod,NONLOC.REL 0-dlist]>,CONT.HOOK.LTOP #hand],LKEYS.KEYREL.LBL #hand].

Here, the length 0 difference list and the following definitions indicate that intersective semantics do not accept anything but local modification:

  1. intersective_mod := mod_local.
  2. mod_local := *avm*.

AVM stands for ‘attribute value matrix’, which is the structure by which types and their features are defined (with nesting and unification constraints using # to indicate equality).

By now you’re probably getting the idea that there is fairly significant model of the English language, including its lexical and syntactic aspects, but if you look there is a lot about semantics here, too.

Cyc is more than encyclopedic

I had the pleasure of visiting with some fine folks at Cycorp in Austin, Texas recently.  Cycorp is interesting for many reasons, but chiefly because they have expended more effort developing a deeper model of common world knowledge than any other group on the planet.  They are different from current semantic web startups.  Unlike Metaweb‘s Freebase, for example, Cycorp is defining the common sense logic of the world, not just populating databases (which is an unjust simplification of what Freebase is doing, but is proportionally fair when comparing their ontological schemata to Cyc’s knowledge).  Not only does Cyc have the largest and most practical ontology on earth, they have almost incomprehensible numbers of formulas[1] describing the world.   Continue reading “Cyc is more than encyclopedic”

Understanding events and processes takes time

We have been teaching a computer to answer questions like, “How much did IBM’s earnings change last quarter?”  It takes a fair bit of knowledge, including how to understand English, to answer this question.  But teaching it what a “quarter” is brought back memories of debates with some former CMU colleagues about what units are and how to model time.  Since quite a few people ask me for help with knowledge engineering and ontological matters, I thought some might be interested in parts of those debates.As you will see, a strong upper ontology of common knowledge is required to understand common business knowledge.  Leveraging such an ontology is the only way to deliver business rules for under $50.

Sentences like “do something if more than a number of possibly related things have happened within a timeframe of something else happening” or “do something if nothing happens within a timeframe following something happening” are extremely common in business process management (BPM), complex event processing (CEP), and workflow.  With a sense of time, a business rules management system (BRMS) can support BPM, CEP, and workflow applications almost trivially.  Without a sense of time, most BRMS force users to perform computations.

For example, without a sense of time and an infrastructure that supports it, the sentence “call a customer if no response is received within 30 days of notifying the customer of a delinquency” has to be transformed into something like “if a notice is mailed on a date and the notice is a delinquency and the date of notification has a day number then compute the date for checking by adding 30 to the day number and check for a response to the delinquency notice on the date for checking”.  The checking on a date for a response to a notice must also be implemented as a database (or persistent queue) of events to be polled or triggered by application code.  Then a second rule is required to implement the check, as in “if checking whether a response has been received to a notice and the notice was given on a date of notice and the notice was given to a customer and there exists no record of communication with the customer since the date of notice then call the customer”.  (Note that this is actually how most BRMS products would implement this.  The natural language approach I prefer handles the original sentence.)

The discussion here reflects the general structure and content that a usable ontology for business process management requires.  Most users of business rules management tools will find the need to understand and engineer this discussion in their tool of choice.  As my Haley Systems customers know, much of this is reflected in Authority’s built-in ontology and English vocabulary, but quite a few of the points discussed here reflect improvements, especially concerning the confusion between units and amounts.

As you will see the discussion takes careful thinking.  Some readers may find it onerous.  If at any time you have had enough (or if you simply cannot take anymore!), please skip to the end and decide whether to fill in the conclusions by revisiting the body.

Continue reading “Understanding events and processes takes time”

The $50 Business Rule

Work on acquiring knowledge about science has estimated the cost of encoding knowledge in question answering or problem solving systems at $10,000 per page of relevant textbooks. Regrettably, such estimates are also consistent with the commercial experience of many business rules adopters. The cost of capturing and automating hundreds or thousands of business rules is typically several hundred dollars per rule. The labor costs alone for a implementing several hundred rules too often exceed $100,000.

The fact that most rule adopters face costs exceeding $200 per rule is even more discouraging when this cost does not include the cost of eliciting or harvesting functional requirements or policies but is just the cost of translating such content into the more technical expressions understood by business rules management systems (BRMS) or business rule engines (BRE).

I recommend against adopting any business rule approach that cannot limit the cost of automating elicited or harvested content to less than $100 per rule given a few hundred rules. In fact, Automata provides fixed price services consistent with the following graph using an approach similar to the one I developed at Haley Systems.

Cost per Harvested or Elicited Rule

Continue reading “The $50 Business Rule”

Managing Semantics, Vocabulary and Business Rules as Knowledge

A client recently asked me for guidance in establishing a center of excellence concerning business rules within their organization. Their objectives included:

  1. Accumulate requisite skills for productive success.
  2. Establish methodologies for productive, reliable and repeatable success.
  3. Accumulate and reuse content (e.g., definitions, requirements, regulations, and policies) across implementations, departments or divisions.
  4. Establish multiple tutorial and reusable reference implementations, including application development, tooling, and integration aspects.
  5. Establish centralized or transferable infrastructure, including architectural aspects, tools and repositories that reflect and support established methodologies, reusable content, and reference implementations.
  6. Establish criteria, best practices and rationale for various administrative matters, especially change management concerning the life cycles of content (e.g., regulations or policies) and applications (e.g., releases and patches).

I was quickly surprised to find myself struggling to write down recommendations for the skill set required to seed the core staff.  My recommendations were less technical than the client may have expected.   After further consideration, it became clear than any discrepancy in expectations arose from differences in our unvoiced strategic assumptions.  Objectives, such as those listed above, are no substitute for a clearly articulated mission and strategy.  

Continue reading “Managing Semantics, Vocabulary and Business Rules as Knowledge”