Ontology – Commercial Intelligence

Simply Smarter Intelligent Agents

paul@haleyAI.com — Sat, 07 Apr 2018 12:49:51 +0000

Deep learning can produce some impressive chatbots, but they are hardly intelligent. In fact, they are precisely ignorant in that they do not think or know anything.

More intelligent dialog with an artificially intelligent agent involves both knowledge and thinking. In this article, we educate an intelligent agent that reasons to answer questions.

The agent’s knowledge is expressed in simple, English sentences. The agent reasons using informal logic from those sentences and pragmatic knowledge about how to interpret and answer questions.

The domain for this intelligent agent is decision support for compliance with sales tax laws and regulations.
A pragmatic framework demonstrates simple reasoning beyond chatbots using an open-source reasoning system.

In combination, we demonstrate how the following types of reasoning can be easily incorporated into intelligent agents:

Other posts and pages address these topics and the use of natural language knowledge acquisition, representation, and reasoning more formally. This is a casual, introductory post aimed at making the lower end of the intelligent agent spectrum (i.e., chatbots) smarter.

If you are more interested in policy automation, decision management, or compliance, you might be more comfortable starting with this introduction to logical English or the 2-part confessions of a production rule vendor.

Simple Knowledge Acquisition

For simple sentences, natural language processing (NLP) systems can frequently produce knowledge representation that is sufficient for effective question answering beyond chatbots. For example, the following results immediately from parsing a simple sentence.

This knowledge representation is much simpler but not much less practical than fully-specified predicate calculus. The important thing is that it gives us a structured representation that we can work with using the machinery of logic programming and notions such as textual entailment. If you’re interested in the details and how to go further towards “axiomatic knowledge”, please see the notes.^[1]^[2]^[3] The Linguist can make many reliable deductions not shown above, but let’s stick with this simplest use case for purposes of this post.^[4]

The most basic question is what the sentence means. Then we can consider how its meaning is to be used in reasoning (e.g., to answer a question, support a decision, etc.).

Our common sense tells us that the intended meaning here is that food products and medicines do not intersect (i.e., they are disjoint; having no common members).

So how would we want to reason with this understanding?

We could ask abstract questions about food products or medicines as types of things. Or, we could ask questions about a specific food product or medicine.

For example, we could ask “Is medicine subject to sales tax?” where we know that food products are not subject to sales tax.^[5]

Note that the system understands that the above is an interrogative sentence (i.e., a question).

The following, on the other hand, is a declarative sentence:

But, of course, the answer to the question would be “I know nothing that is subject to sales tax.” At least until we say something like:

Intelligent Agents’ Inference Engine

In order for the logic of the interrogative and declarative sentences above to be used by an intelligent agent, its representation must be made available to some type of inference engine.

Assuming we have loaded results from the Linguist into Flora-2, the following query:

flora2 ?-reading(of(sentence))(?reading,?_)@parse.

Has the following answers (the question is underlined):

?reading = subject(to)(‘⊆'(medicine),’⊆'(sale(tax)))
?reading = ‘¬'(include(‘⊆'(food(product)),’⊆'(medicine)))
?reading = ‘¬'(subject(to)(‘⊆'(food(product)),’⊆'(sale(tax))))
?reading = ‘∀₊'(\##7)(‘→'(sale(of)(\##7,’⊆'(tangible(property))),subject(to)(\##7,’⊆'(sale(tax)))))^[6]

Here, \## introduces what Flora calls a skolem constant. For our purposes here, you can think of these as variables. The fact that they are constants in Flora merely prevents their undergoing unification.^[7]

Identifying Pertinent Knowledge

Now we can look at the atomic formulas^[8] of these readings using the following query:

?- reading(of(sentence))(?_reading),atom(of(reading))(?atom,?_reading).

To which the following answers are received:

?atom = include(‘⊆'(food(product)),’⊆'(medicine))
?atom = sale(of)(\##7,’⊆'(tangible(property)))
?atom = subject(to)(\##7,’⊆'(sale(tax)))
?atom = subject(to)(‘⊆'(medicine),’⊆'(sale(tax)))
?atom = subject(to)(‘⊆'(food(product)),’⊆'(sale(tax)))

These suggest that the readings that contain atoms about \##7 or ‘⊆'(food(product)) being subject to sales tax may be pertinent.
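To make the atom extraction concrete, here is a minimal Python sketch. It is not how the Linguist or Flora-2 does it. It assumes readings are encoded as nested tuples whose first element is the functor (which may itself be a tuple, to mimic HiLog terms such as subject(to)(X, Y)). The names `sub`, `atoms`, and `readings` are ours, for illustration only.

```python
# Terms are nested tuples: (functor, arg1, ..., argN). A functor may itself be
# a tuple, which mimics HiLog terms such as subject(to)(X, Y).
CONNECTIVES = {"¬", "→", "∧", "∨"}


def sub(x):
    """Wrap a noun as the '⊆' term used in the readings."""
    return ("⊆", x)


def atoms(formula):
    """Yield the atomic formulas inside a reading, skipping connectives."""
    functor = formula[0]
    if functor in CONNECTIVES:
        for arg in formula[1:]:
            yield from atoms(arg)
    elif isinstance(functor, tuple) and functor[0] == "∀₊":
        # (("∀₊", variable), body): descend into the quantified body
        yield from atoms(formula[1])
    else:
        yield formula


S7 = "\\##7"  # stands in for the skolem constant \##7
TAX = sub(("sale", "tax"))

readings = [
    (("subject", "to"), sub("medicine"), TAX),  # the question
    ("¬", ("include", sub(("food", "product")), sub("medicine"))),
    ("¬", (("subject", "to"), sub(("food", "product")), TAX)),
    (("∀₊", S7), ("→",
                  (("sale", "of"), S7, sub(("tangible", "property"))),
                  (("subject", "to"), S7, TAX))),
]

all_atoms = [a for r in readings for a in atoms(r)]
```

Under this encoding, `all_atoms` holds the same five atoms listed above, with the negations and the implication stripped away.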

Matching atoms from the question with atoms from the readings suggests 2 further deliberations:

Could ‘⊆'(food(product)) include ‘⊆'(medicine)?
Could ‘⊆'(medicine) and \##7 intersect?

The answer to the first is given by the reading that negates the pertinent atom (i.e., food products do not include medicines).

The answer to the second deliberation involves metonymy.

Note that the sentence about sales of tangible property is about the sale rather than the property but the question seems to be about medicine as the property involved in a sale. This is an example of metonymy. The question is not whether some medicine in a bathroom cabinet is subject to tax but whether a sale of medicine is subject to tax.

Metonymy

Answering questions requires handling metonymy. It’s practically impossible for human beings to avoid using metonymy. It is pervasive.

All we know about sales tax from the above sentences is that sales may be subject to it.
Presumably, we would not say that something that is not a sale is (or is not) subject to sales tax.

This suggests that we abductively infer that the question is about a sale of medicine and that the statement excluding food products from sales tax is about sales of food products. Alternatively, if we do not assume (as in abduction), we can ask.

Once again, we have narrowed the question answering process to the following point:

subject(to)(\##7,’⊆'(sale(tax)))
‘→'(sale(of)(\##7,’⊆'(tangible(property))),subject(to)(\##7,’⊆'(sale(tax)))) where
\##7 matches ‘⊆'(medicine) from our question

The implication would hold if \##7 was a sale of medicine.

To resolve the metonymy, we take one step away from medicine.
In this case, we move to something about medicine.

Let’s make this a little easier… the following answers result from simplifying the prior answers:

?simplify = sale(of(‘⊆'(tangible(property))))(?X7)
?simplify = subject(to(‘⊆'(sale(tax))))(?X7)
?simplify = subject(to(‘⊆'(sale(tax))))(‘⊆'(medicine))
?simplify = subject(to(‘⊆'(sale(tax))))(‘⊆'(food(product)))

Note that the antecedent of the implication is italicized above (the question remains underlined).

Abductive Inference

When we consider metonymy, we simply allow for some function of the phrase used, as in:

subject(to(‘⊆'(sale(tax))))(?F(‘⊆'(medicine)))

Unifying this with the second answer above yields:

?X7=?F(‘⊆'(medicine))
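The unification step can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not Flora's unifier: terms are nested tuples, variables are strings beginning with "?", and there is no occurs check. The function names are hypothetical.

```python
def is_var(term):
    """Variables are strings such as '?X7' or '?F'."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until reaching a non-variable or a free variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    """Return a substitution unifying a and b, or None (no occurs check)."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


TAX = ("⊆", ("sale", "tax"))
MEDICINE = ("⊆", "medicine")

# subject(to(tax))(?F(medicine)): the question, allowing for metonymy
question = (("subject", ("to", TAX)), ("?F", MEDICINE))
# subject(to(tax))(?X7): the consequent of the implication
consequent = (("subject", ("to", TAX)), "?X7")

bindings = unify(question, consequent)
```

Here `bindings` maps `?X7` to the term `?F(medicine)`, which leaves `?F` open until the metonymy is resolved.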

Given the implication above, the answer would be yes if ?F resolves the metonymy, as in:

holds(sale(of(‘⊆'(tangible(property))))(?F(‘⊆'(medicine))))

In general, a predicate that holds for something general holds for something more specific, as in:

holds(?predicate(?p(?X))(?predicate(?p(?Y))) :- preposition(?p), subsumes(?x,?Y).

Ontology and Common Sense

Our objective in this article is simply to convey how a little knowledge representation and inference makes the difference between a chatbot and an intelligent agent. There is more to cover about how an intelligent agent gets domain knowledge, such as an understanding of products and services offered by an enterprise. That commonly involves reading pertinent documentation to extract and align domain-specific vocabulary and ontology. Here, we admit that we gloss over all that. In a future post we will show how we do that, such as we have done for cooking.

If we have loaded an ontology or written a sentence to the effect that tangible property includes medicine, we will deduce the prior statement.

We can deduce that medicine is tangible by reading using common sense, such as:
- anything that contains^[9] something tangible must be tangible
- knowing (e.g., from WordNet) that medicine is a substance^[10]
This leads us to a better ontology where we educate the machine about transfers of ownership of property (i.e., sales).
- anything owned is property^[11]
- a sale is a transfer of ownership of property^[12]^[13]
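As a rough illustration of how such ontology links support the deduction, the Python sketch below checks subsumption by following is-a edges. The edge from "substance" to "tangible property" is an assumption we add for the example, and the names `IS_A` and `subsumes` are hypothetical. A real ontology (e.g., one derived from WordNet) would be far richer.

```python
# Hypothetical is-a edges. "medicine is a substance" comes from the text;
# "substance is tangible property" is assumed here purely for illustration.
IS_A = {
    "medicine": {"substance"},
    "substance": {"tangible property"},
}


def subsumes(general, specific, is_a=IS_A):
    """True if `general` is reachable from `specific` by following is-a edges."""
    seen, stack = set(), [specific]
    while stack:
        node = stack.pop()
        if node == general:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(is_a.get(node, ()))
    return False
```

With these edges, `subsumes("tangible property", "medicine")` succeeds, so a sale of medicine can be treated as a sale of tangible property.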

Obviously, there is much to go into here about the agent’s knowledge, inference capabilities, and reasoning heuristics. We will make the simple claim that a small body of general pragmatics combined with sufficiently precise knowledge from domain documentation can produce a smarter agent than deep learning can obtain with Big Data. The differences are:

A chatbot may respond as if it understands but cannot answer questions if any reasoning is required.
- The chatbot has no explicit, reliable knowledge and does not reason.
- [Yann] LeCun, head of AI at Facebook: “what’s still missing is reasoning”
An intelligent agent has some knowledge with which it does reason.

Learning or Clarifying by Asking

Unlike a chatbot, an intelligent agent engaged in dialogue may wish to clarify the intent of a question or even ask for help in figuring out the answer to a question.

If we have neither loaded an ontology nor read enough to deduce that medicine is tangible, the agent could ask… depending on the user, their goal, and the context of the interaction.

In XSB and Flora, we have the well-founded semantics^[14], which allows us to express the following:

subsumes(?x,?y) :- \neg subsumes(?x,?y), ask(subsumes(?x,?y)).

Which invokes the action to ask only if it is not already known or provable (positively or negatively) that one subsumes the other. (This is different from Prolog’s strictly procedural behavior).
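The intent of that rule (ask only when the answer is neither known to be true nor known to be false) can be approximated procedurally. The Python sketch below is our own illustration and does not reproduce the well-founded semantics. `known_true`, `known_false`, and the `ask` callback are hypothetical names.

```python
def subsumes_or_ask(general, specific, known_true, known_false, ask):
    """Answer from what is known; otherwise ask once and remember the reply."""
    pair = (general, specific)
    if pair in known_true:
        return True
    if pair in known_false:
        return False
    reply = bool(ask(pair))  # e.g., put the question to the user
    (known_true if reply else known_false).add(pair)
    return reply
```

Because the reply is recorded, the agent asks about a given pair at most once.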

Putting It Together

In either case, once we have that the metonymy around medicine can be resolved by:

?F=sale(of(?_))

We answer the question by deducing the following:

subject(to(‘⊆'(sale(tax))))(sale(of(‘⊆'(medicine))))

Typically, we resort to metonymy when no answer is found without it. So we ask the question without metonymy and, if no answers are obtained, we pursue the question again, this time allowing metonymy.
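This two-pass strategy is easy to sketch in Python. The code below is a simplified illustration with hypothetical names (`is_taxable`, `sale_of`, `answer`, `render`). It assumes we already know that medicine is tangible, and it stands in for, rather than reproduces, the agent's actual inference.

```python
TANGIBLE = {"medicine"}  # assumed to have been deduced or confirmed already


def is_taxable(thing):
    """A sale of tangible property is subject to sales tax."""
    return (isinstance(thing, tuple) and thing[0] == "sale of"
            and thing[1] in TANGIBLE)


def sale_of(thing):
    """One metonymic step: from a thing to a sale of that thing."""
    return ("sale of", thing)


def answer(entity, holds, coercions):
    """Try the literal question first, then retry allowing metonymy."""
    if holds(entity):
        return entity, False
    for coerce in coercions:
        candidate = coerce(entity)
        if holds(candidate):
            return candidate, True
    return None, False


def render(entity, holds=is_taxable, coercions=(sale_of,)):
    """Turn the outcome into an English reply."""
    found, used_metonymy = answer(entity, holds, coercions)
    if found is None:
        return "I do not know."
    if used_metonymy:
        return f"Yes, if you were asking about sales of {entity}."
    return "Yes."
```

Trying the literal reading first keeps the agent from silently reinterpreting questions that already have a direct answer.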

Of course, in an actual dialog, we might ask for confirmation of the metonymy and, in either case, render the confirmations and answers above as English sentences.

For example, the agent might say any of the following:

Yes.
Yes, a sale of medicine is subject to sales tax.
Yes, if you were asking about sales of medicine.

As simple as this example may be, it demonstrates general techniques, just a few of which allow an agent to be more intelligent than a chatbot. All it takes is a little thinking (i.e., knowledge and reasoning).

^[1] Bunt, Harry. “Semantic under-specification: Which technique for what purpose?.” Computing meaning. Springer, Dordrecht, 2008. 55-85.

^[2] We ignore the quantifier for “food” here but could establish its variable as equal to the variable for products, as is almost universally the case for compound nouns.

^[3] We ignore the quantifier for “not” here, which refers to the situation in which the negation holds. In effect, we interpret the negation “universally”.

^[4] For example, the Linguist realizes that:

the focal referent here is the noun phrase headed by “products” (i.e., ?x6)
- e.g., its scope is outermost and includes the quantification of ?e16
- e.g., that its quantifier is universal given its referent occurs as plural
- the quantification for a subject of an event typically has wider scope
  - i.e., the quantification of ?x6 here has scope over the quantification of ?e16
  - that “food products” is a compound noun having a single referent
    - e.g., the quantification for ?x9 has the same scope as ?x6
    - e.g., the variable ?x9 may co-refer or be unified with ?x6

^[5] We can be as precise as necessary here, but for simplicity we will omit the distinction of various taxes and their individuality per jurisdiction.

^[6] The Linguist also generates the following Flora axioms for the last reading:

subject(to)(?x7, ‘⊆'(sale(tax))) :- sale(of)(\##7,’⊆'(tangible(property))).
\neg sale(of)(\##7,’⊆'(tangible(property))) :- \neg subject(to)(?x7, ‘⊆'(sale(tax))).

The second rule reflects that (by default) something that is not subject to sales tax cannot be a sale of tangible property.

Most business rule systems would not handle this logic. Flora provides defeasible logic to handle the exceptions.

^[7] See this background information or this page at Wikipedia for further information.

^[8] definitions for terms of logic can be found here

^[9] in this WordNet sense

^[10] see this WordNet sense and its definition and hypernyms

^[11] see this WordNet sense

^[12] see this WordNet sense of sale or this in FrameNet

^[13]

^[14] see this background for more information

Affiliate Transactions covered by The Federal Reserve Act (Regulation W)

paul@haleyAI.com — Thu, 29 Aug 2013 19:19:55 +0000

Benjamin Grosof, co-founder of Coherent Knowledge Systems, is also involved with developing a standard ontology for the financial services industry (i.e., FIBO). In the course of working on FIBO, he is developing a demonstration of defeasible logic concerning Regulation W of the Federal Reserve Act. Regulation W specifies which transactions involving banks and their affiliates are prohibited under Section 23A of the Act. In the course of doing this, various documents are being captured within the Linguist platform. This is a brief note on how those documents can be imported into the platform for curation into formal semantics and logic (as Benjamin and Coherent are doing).

There is a document from the Federal Reserve with an appendix reviewing Regulation W:

http://www.federalreserve.gov/boarddocs/SRLetters/2003/SR0302a1.pdf

PDFs are challenging because they are images more than documents. There are lots of problems in getting text for natural language processing out of PDFs (as discussed in great detail in this article). Microsoft Word does not extract the text well at all. Google does a pretty nice job, but it breaks up the paragraphs into unrelated divisions. Every line on the screen becomes its own division in the resulting HTML. It renders fine on screen, but it’s not good for extracting paragraphs and sentences for knowledge acquisition. You can see for yourself by Googling for the above URL and looking at the cached HTML Google maintains. Here’s how it looks in the Linguist:

The next step is to click the XHTML button. This leads to a question about normalizing the document from the Google PDF structure. Answering yes results in the following:

At this point we’re good to go. You can use the XHTML Explorer to get rid of extraneous things, like removing headers and footers (e.g., Page X of Y). But the objective is to get sentences from the text ready for natural language processing. By clicking in a paragraph of text on the right and using a menu for the highlighted node on the left you can pop up a dialog like this:

And so on to populate the knowledge base for curation.

Automatic Knowledge Graphs for Assessment Items and Learning Objects

paul@haleyAI.com — Wed, 28 Aug 2013 12:03:52 +0000

As I mentioned in this post, we’re having fun layering questions and answers with explanations on top of electronic textbook content.

The basic idea is to couple a graph structure of questions, answers, and explanations into the text using semantics. The trick is to do that well and automatically enough that we can deliver effective adaptive learning support. This is analogous to the knowledge graph that users of Knewton‘s API create for their content. The difference is that we get the graph from the content, including the “assessment items” (that’s what educators call questions, among other things). Essentially, we parse the content, including the assessment items (i.e., the questions and each of their answers and explanations). The result of this parsing is, as we’ve described elsewhere, precise lexical, syntactic, semantic, and logic understanding of each sentence in the content. But we don’t have to go nearly that far to exceed the state of the art here.

We automatically import the text, break it into sentences organized with respect to the document structure (i.e., in an outline of groups, per chapter, section, …), and parse them. We keep track of the ambiguities in the normal workflow of collaborative curation supported by the Linguist but proceed with the best parse according to statistical NLP. The result of this is a set of logical terms using normalized vocabulary. The automated parsing cannot go as far as to produce a logical formula, but that is more than we need (we’re not actually trying to answer the questions, as we were in Project Sherlock).

Across all the sentences, including the questions and explanations of answers, we have a graph of terms and inter-term relationships that supports algorithms such as inter-sentence comparison and contrasting. Think of this graph as a semantic network (e.g., one represented in RDF). This semantic network is further enriched by a semantic web ontology (e.g., expressed in OWL) which is semi-automatically derived from the content. For example, when we see a term like “accounting information”, we know it is information and we automatically acquire that net assets and total assets are both assets. We also automatically acquire that equipment is purchased by corporations, that net assets are calculated, that a reporting process is a process modified by a gerund derived from the verb report, and a lot more.
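One small piece of this, reading an is-a link off a compound term, can be illustrated with a toy heuristic. The Python sketch below assumes English compounds are head-final, so the last word is taken as the head. That is our simplification for illustration, not a description of how the Linguist derives its ontology, and `head_hypernyms` is a hypothetical name.

```python
def head_hypernyms(terms):
    """Map each multi-word term to its head noun (assumed to be the last word)."""
    links = {}
    for term in terms:
        words = term.split()
        if len(words) > 1:
            links[term] = words[-1]
    return links


links = head_hypernyms(
    ["net assets", "total assets", "accounting information", "equipment"]
)
```

Single-word terms such as "equipment" yield no link, since they have no modifier to strip.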

Well, I’m drifting off the thought that had me start this post, so I’ll just show you a picture for now and move on below:

Personally, I’m interested in the difficulties that collaborating contributors sometimes face in disambiguating language. In looking at the automated parse results for some of these sentences, I saw the following residual:

How would you choose? Is the communication or the information for the purpose of making decisions? Does it matter?

The following shows some of the results for one choice:

Note: there is a bug in the renderings of the existentials in the live code I am working with (i.e., don’t worry about it, but the words make and decision should be there).

This gives you an idea of the semantic information that we can acquire automatically and which can be curated to an arbitrary level of precision where appropriate.

Bottom line: the approach holds great promise for curation of assessment items and learning objects into a collaboratively developed adaptive learning platform.

Higher Education on a Flatter Earth

paul@haleyAI.com — Tue, 20 Aug 2013 15:30:25 +0000

We’re collaborating on some educational work and came across this sentence in a textbook on finance and accounting:

All of these are potentially good economic decisions.

We use statistical NLP but assist with the ambiguities. In doing this, we relate questions and answers and explanations to the text.

We also extract the terminology and produce a rich lexicalized ontology of the subject matter for pedagogical uses, assessment, and adaptive learning.

Here’s one that just struck me as interesting. This is a case where the choice looks like it won’t matter much either way, but …

The logic that results from the different parses in even such a simple case can be significantly different, however.

The difference here is whether the point is that they are potentially decisions versus potentially good!

Even when you do the right thing and clarify for the machine that the semantics of potential is with regard to good, there are different logical implications.

In this case, the right logic comes from another interpretation with the same syntactic structure as parse #1 above:

The outer universal is a little tricky. It has to be interpreted within a scope, as is typical for demonstrative pronouns, such as “these”. In effect, this is provided by a quantifier in the formula for the prior sentence.

Knowledge acquisition using lexical and semantic ontology

paul@haleyAI.com — Tue, 23 Jul 2013 22:37:59 +0000

In developing a compliance application based on the institutional review board policies of Johns Hopkins’ Dept. of Medicine, we have to clarify the following sentence:

Projects involving drugs or medical devices other than the use of an approved drug or medical device in the course of medical practice and projects whose data will be submitted to or held for inspection by the FDA will not be exempt from JHM IRB review UNLESS that use falls within the Emergency Use provisions of 21 CFR 56.102 (d).

As you can see, there are a number of compound words and acronyms, as well as references to the Code of Federal Regulations that need to be defined or recognized to understand this sentence. These include the following:

In addition, from a semantics standpoint, it is preferable to have ontological concepts that are more specific than the heads of the following compounds:

medical device
medical practice
emergency use

And, perhaps, it is appropriate to have concepts for:

approved drug
JHM IRB

Ideally, perhaps, there would even be a concept for:

the Emergency Use provisions of 21 CFR 56.102 (d)

although we’ll skip some of these details to save our readers the tedium.

In the English Resource Grammar (ERG), which is one of the natural language processing systems that we use, there is already a lexical entry for “FDA”, defined as follows:

fda_n1 := n_-_pn_le & [ ORTH < “FDA” >, SYNSEM [ LKEYS.KEYREL.CARG “FDA”, PHON.ONSET voc ] ].

From our standpoint, this is impoverished in that it will produce logical axioms using a predicate like “FDA” instead of “Food(and(Drug))(Administration)”, which avoids the ambiguity of anything else FDA might stand for. Even if the existing definition of FDA was acceptable, however, the other acronyms and compounds mentioned above do not exist in the ERG. To address this, the Linguist provides for the definition of additional words, as discussed here.

In general, unknown words are common in any new domain. For example, in Project Sherlock, we had to define a few hundred biological terms that occurred in Campbell’s Biology but which were not defined in the ERG, such as:

intramolecular
semipermeable
hydrophilicity
HDL
acidity

and so on. As sentences that contain unknown words are processed by the Linguist, a dialog allows part of speech information to be entered such that the natural language processing produces better (i.e., more focused) results. In the case of this sentence, right-clicking on “FDA” brings up the part of speech dialog, which is filled out below:

For the most part, this suffices in producing logic with terms corresponding to the words (actually, the morphological lemmas or stems of the words). In the context of defining a more precise sense of the word the dialog continues with an additional form to acquire more information about the singular proper noun “FDA”:

Here we’ve clarified that “FDA” is an acronym (not just an abbreviation) and that it is voiced (i.e., that we would use “an” rather than “a” before “FDA spokesperson”). We’ve also entered the unabbreviated form of the acronym and a word sense from WordNet.
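To see why the onset matters downstream, here is a minimal Python sketch of article selection driven by the PHON.ONSET value (voc or con) recorded in the lexical entries in this post. The `ONSET` table and function names are ours, for illustration. This is not the ERG's generation machinery.

```python
# Onset values as they appear in the lexical entries in this post.
ONSET = {
    "FDA": "voc",
    "Food and Drug Administration": "con",
    "IRB": "voc",
}


def indefinite_article(phrase, onset=ONSET):
    """Pick 'an' for a vocalic onset and 'a' for a consonantal one."""
    kind = onset[phrase]
    if kind == "voc":
        return "an"
    if kind == "con":
        return "a"
    raise ValueError(f"unknown onset {kind!r} for {phrase!r}")


def with_article(phrase, noun):
    """Render e.g. 'an FDA spokesperson'."""
    return f"{indefinite_article(phrase)} {phrase} {noun}"
```

For example, `with_article("FDA", "spokesperson")` gives "an FDA spokesperson", while the spelled-out name takes "a".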

On completing this dialog, the system wants to define the term “Food and Drug Administration”, which it knows is nominal, so it presents another form, as follows:

On completing this form, the system adds a lexical entry to the ERG (roughly) as follows:

Food+and+Drug+Administration_NNP_food_and_drug_administration%1:14:00:: := n_-_pn_le & [ ORTH < “Food”,”and”,”Drug”,”Administration” >, SYNSEM [ LKEYS.KEYREL.CARG “Food and Drug Administration”, PHON.ONSET con ] ].

In addition, the software maintains a lexical ontology that represents the more precise word sense and ontological information of this lexical entry and its predicate. After completing this, the prior form is updated to reflect the more precise word sense, too:

Which results in the following lexical entry being added to the ERG’s vocabulary, along with a variety of ontological information.

FDA_NNP_food_and_drug_administration%1:14:00:: := n_-_pn_le & [ ORTH < “FDA” >, SYNSEM [ LKEYS.KEYREL.CARG “Food and Drug Administration”, PHON.ONSET voc ] ].

Some of the other cases are more fun or interesting, such as IRB:

Which we clarify as being an acronym for “institutional review board”. Note that we think an IRB is a common noun.

As was the case for “Food and Drug Administration” above, it is desirable to define the compound noun “institutional review board” more completely, as in:

Which leads to further clarification of the head and non-head of the compound.

Which leads to further dialog regarding “board” and “review”. In the case of “board”, we can select a particular sense of the noun as follows:

This is a good thing to do in the case of defining “review board”, but the following shows other lexical entries could have been selected:

This dialog also gives you some idea of additional ontological information known about various lexical entries.

Continuing with “review board”, the dialog allows the sense of “review” to be specified, as in:

Other senses of review are also available:

Having completed the senses of “review” and “board” within “review board”, we now have:

and an ERG entry with sense and ontological connections which will support logical and semantic interpretation of review boards in general (i.e., whether or not they are institutional).

review+board_NN := n_-_c_le & [ ORTH < “review”,”board” >, SYNSEM [ LKEYS.KEYREL.PRED “_review+board_n_rel”, PHON.ONSET con ] ].

This returns the dialog to “institutional review board” which offers the opportunity to clarify the semantics of “institutional”, as follows:

which is derived from the noun “institution” which can be further clarified, as in:

After saying OK to the dialogs for “institution”, “institutional”, “institutional review board”, and “IRB”, the ontology is updated and the following is added to the ERG:

institutional+review+board_NN := n_-_c_le & [ ORTH < “institutional”,”review”,”board” >, SYNSEM [ LKEYS.KEYREL.PRED “_institutional+review+board_n_rel”, PHON.ONSET voc ] ].
IRB_NN := n_-_c_le & [ ORTH < “IRB” >, SYNSEM [ LKEYS.KEYREL.PRED “_institutional+review+board_n_rel”, PHON.ONSET voc ] ].
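The common-noun entries above follow a regular pattern, so they can be produced from a few fields. The Python sketch below is a hypothetical helper that mimics that pattern as text. It is not the Linguist's code, it uses plain double quotes, and it covers only the common-noun (n_-_c_le) shape shown here.

```python
def common_noun_entry(words, onset, pred_words=None):
    """Format a TDL-style common-noun entry like the ones shown above.

    words:      the orthography, one string per token
    onset:      'voc' or 'con'
    pred_words: words used in the predicate name (defaults to `words`),
                which lets an acronym share the predicate of its expansion
    """
    pred_words = pred_words or words
    name = "+".join(words) + "_NN"
    orth = ", ".join(f'"{w}"' for w in words)
    pred = "_" + "+".join(pred_words) + "_n_rel"
    return (f"{name} := n_-_c_le & [ ORTH < {orth} >, "
            f'SYNSEM [ LKEYS.KEYREL.PRED "{pred}", PHON.ONSET {onset} ] ].')


review_board = common_noun_entry(["review", "board"], "con")
irb = common_noun_entry(["IRB"], "voc", ["institutional", "review", "board"])
```

Passing `pred_words` for "IRB" makes the acronym and its expansion share one predicate, which is the point of the two entries above.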

Hopefully, this gives you the idea as to how the lexicon can be extended to deal with unrecognized (i.e., new) vocabulary, including acronyms, compounds, and other parts of speech, including precise word sense and ontological information.

There is more depth to this than meets the eye, since there are additional capabilities to define senses as sentences of description logic and the senses of WordNet are organized with “synonym sets” and widely anchored or cross-referenced with widely available ontologies, including Yago and Open Cyc, for example.

SBVR in OWL

paul@haleyAI.com — Wed, 10 Apr 2013 15:48:16 +0000

In preparation for generating RIF and SBVR from the Linguist, we have produced an OWL ontology for the pertinent aspects of the SBVR specification. We hope that this is helpful to others and would sincerely appreciate any corrections or comments on how to improve it.

Paul

NLP: depictive in an HPSG lexicon?

paul@haleyAI.com — Fri, 08 Jun 2012 17:09:12 +0000

We’re working with the English Resource Grammar (ERG), OWL, and Vulcan’s SILK to educate the machine by translating textbooks into defeasible logic. Part of this involves an ontology that models semantics more deeply than the ERG does. The ERG is based on head-driven phrase structure grammar (HPSG), which provides deeper parsing and, with the ERG and the DELPH-IN infrastructure, also provides a simple under-specified semantic representation called minimal recursion semantics (MRS).

We’re having a great time using OWL to clarify and enrich the semantics of the rich model underlying the ERG. Here’s an example, FYI. If you’d like to know more (or help), please drop us a line! Overall the project will demonstrate our capabilities for transforming everyday sentences into RIF and business rule languages using SBVR extended with defeasibility and other capabilities, all modeled in the same OWL ontology.

What triggered this blog entry was a bit of a surprise in seeing that whether or not an adjective could be used depictively is sometimes encoded in the lexicon. This is one of the problems of TDL versus a description-logic based model with more expressiveness. It results in more lexical entries than necessary, which has been discussed by others when contrasted with the Attribute Logic Engine (ALE), for example.

In trying to model the semantics of words like ‘same’ and ‘different’, we are scratching our heads about these lines from the ERG’s lexicon:

same_a1 := aj_pp_i-cmp-sme_le & [ ORTH < “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_same_a_as_rel”, …
the_same_a1 := aj_-_i-prd-ndpt_le & [ ORTH < “the”, “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_the+same_a_1_rel”, …
the_same_adv1 := av_-_i-vp-po_le & [ ORTH < “the”, “same” >, SYNSEM [ LKEYS.KEYREL.PRED “_the+same_a_1_rel”, …
exact_a2 := aj_pp_i-cmp-sme_le & [ ORTH < “exact” >, SYNSEM [ LKEYS.KEYREL.PRED “_exact_a_same-as_rel”…

One of the interesting things about lexicalized grammars is that lexical entries (i.e., ‘words’) are described with almost arbitrary combinations of their lexical, syntactic, and semantic characteristics.

The preceding code is expressed in a type description language (TDL) used by the Lisp-based LKB (and its C++ counterpart, PET), which are unification-based parsers that produce a chart of plausible parses with some efficiency. What is given above is already deeper than what you can expect from a statistical parser (but richer descriptions of lexical entries promise to make statistical parsing much better, too).

Unfortunately, there is no available documentation on why the ERG was designed as it is, so the meaning of the above is difficult to interpret. For example, the types of lexical entries (the symbols ending in ‘_le’) referenced above are defined as follows:

aj_pp_i-cmp-sme_le := basic_adj_comp_lexent & [SYNSEM[LOCAL[CAT[HEAD superl_adj &[PRD -,MOD <[LOCAL.CAT.VAL.SPR <[–MIN def_or_demon_q_rel]>]>],VAL.SPR.FIRST.–MIN much_deg_rel],CONT.RELS ],MODIFD.LPERIPH bool,LKEYS[ALTKEYREL.PRED comp_equal_rel,–COMPKEY _as_p_comp_rel]]].
aj_-_i-prd-ndpt_le := nonc-hm-nab & [SYNSEM basic_adj_abstr_lex_synsem & [LOCAL[CAT[HEAD adj & [PRD +,MINORS[MIN norm_adj_rel,NORM norm_rel],TAM #tam,MOD < anti_synsem_min >],VAL[SPR.FIRST anti_synsem_min,COMPS < >],POSTHD +],CONT[HOOK[LTOP #ltop,INDEX #arg0 &[E #tam],XARG #xarg],RELS ,HCONS ]],NONLOC non-local_none,MODIFD notmod &[LPERIPH bool],LKEYS.KEYREL #keyrel &[LBL #ltop,ARG0 #arg0,ARG1 #xarg & non_expl-ind]]].

Needless to say, that’s a mouthful! Chasing this down, the following ‘informs’ us that “the same”, which uses type #2 above, is defined using the following lexical types:

nonc-hm-nab := nonc-h-nab & mcna.
nonc-h-nab := nonconj & hc-to-phr & non_affix_bearing.
mcna := word & [ SYNSEM.LOCAL.CAT.MC na ].

Which is to say that it is non-conjunctive, complements a head to form a phrase, can’t be affixed, cannot constitute a main clause, and is a word.
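Chasing types by hand gets tedious, so here is a small Python sketch that collects the supertypes of a type from conjunctive definitions like the three above. It records only the type names on the right-hand sides and ignores feature constraints such as [ SYNSEM.LOCAL.CAT.MC na ]. The `TYPES` table and `supertypes` function are ours, for illustration.

```python
# Parent types taken from the three definitions above (constraints omitted).
TYPES = {
    "nonc-hm-nab": ["nonc-h-nab", "mcna"],
    "nonc-h-nab": ["nonconj", "hc-to-phr", "non_affix_bearing"],
    "mcna": ["word"],
}


def supertypes(name, types=TYPES):
    """Return all ancestors of `name`, depth first, without duplicates."""
    result = []
    for parent in types.get(name, []):
        if parent not in result:
            result.append(parent)
        for ancestor in supertypes(parent, types):
            if ancestor not in result:
                result.append(ancestor)
    return result


ancestors = supertypes("nonc-hm-nab")
```

The resulting list contains nonconj, hc-to-phr, non_affix_bearing, and word, which line up with the properties paraphrased above.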

The fact that the lexical entry for “the same” is adjectival is given by the definition of the following type(s) used in the SYNSEM feature:

basic_adj_comp_lexent := compar_superl_adj_word & [SYNSEM adj_unsp_ind_twoarg_synsem & [LOCAL[CAT.VAL[COMPS ],CONT.HOOK [ LTOP #ltop, XARG #xarg]],LKEYS [ KEYREL.ARG1 #xarg,ALTKEYREL.ARG2 #ind,–COMPKEY #cmin]]].
compar_superl_adj_word := nonc-hm-nab & [SYNSEM adj_unsp_ind_synsem & [LOCAL[CAT[HEAD[MOD <[–SIND #ind & non_expl]>,TAM #tam,MINORS.MIN abstr_adj_rel],VAL.SPR.FIRST.LOCAL.CONT.HOOK.XARG #altarg0],CONT[HOOK[XARG #ind,INDEX #arg0 & [E #tam]],RELS.LIST <[LBL #hand,ARG1 #ind],#altkeyrel & [LBL #hand,ARG0 event & #altarg0,ARG1 #arg0],…>]],LKEYS.ALTKEYREL #altkeyrel]].

Which is to say that it is a comparative or superlative adjectival word (even though it consists of two lexemes in its ‘orthography’) that involves two semantic arguments, including one complement which may be an unexpressed prepositional phrase. A comparative or superlative adjective, in turn, is non-conjunctive, complements a head to form a phrase, is non-affix bearing (?), and non-clausal, as defined by the type ‘nonc-hm-nab’ above.

The types used in the syntax and semantics (i.e., SYNSEM) feature of the two lexical types are defined as follows (none of which is documented):

adj_unsp_ind_twoarg_synsem := adj_unsp_ind_synsem & two_arg.
adj_unsp_ind_synsem := basic_adj_lex_synsem & lex_synsem & adj_synsem_lex_or_phrase & isect_synsem & [LOCAL.CONT.HOOK.INDEX #ind,LKEYS.KEYREL.ARG0 #ind].

In a moment, we’ll discuss the types used in the second of these, but first, some basics on the semantics that are mixed with the syntax above.

In effect, the above indicates that a new ‘elementary predication’ will be needed in the MRS to represent the adjectival relationship in the logic derived in the course of parsing (i.e., that’s what ‘unsp_ind’ means, although it’s not documented, which I will try not to bemoan much further.)

The following indicates that the newly formed elementary predicate is not (initially) within any scope and that it has two arguments whose semantics (i.e., their RELations) are concatenated for propagation into the list of elementary predications that will constitute the MRS for any parses found.

two_arg := basic_two_arg & [LOCAL.CONT.HCONS ].
basic_two_arg := unspec_two_arg & lex_synsem.
unspec_two_arg := basic_lex_synsem & [LOCAL.ARG-S <[LOCAL.CONT.HOOK.--SLTOP #sltop,NONLOC [SLASH[LIST #smiddle,LAST #slast],REL [LIST #rmiddle,LAST #rlast],QUE[LIST #qmiddle,LAST #qlast]]],[LOCAL.CONT.HOOK.--SLTOP #sltop, NONLOC[SLASH[LIST #sfirst,LAST #smiddle],REL[LIST #rfirst,LAST #rmiddle],QUE[LIST #qfirst,LAST #qmiddle]]]>,LOCAL.CONT.HOOK.--SLTOP #sltop,NONLOC[SLASH[LIST #sfirst,LAST #slast],REL[LIST #rfirst,LAST #rlast],QUE[LIST #qfirst,LAST #qlast]]].
lex_synsem := basic_lex_synsem & [LEX +].

The last of these expresses that the construction is lexical rather than phrasal (which includes clausal in the ERG).

Continuing with the definition of “the same” as an adjective, the following finally clarifies what it means to be a basic adjective:

basic_adj_lex_synsem := basic_adj_abstr_lex_synsem & [LOCAL[ARG-S <#spr . #comps>,CAT[HEAD adj_or_intadj,VAL[SPR<#spr & synsem_min &[--MIN degree_rel,LOCAL[CAT[VAL[SPR *olist*,SPEC <[LOCAL.CAT.HS-LEX #hslex]>],MC na],CONT.HOOK.LTOP #ltop],NONLOC.SLASH 0-dlist,OPT +],anti_synsem_min &[--MIN degree_rel]>,COMPS #comps],HS-LEX #hslex],CONT.RELS.LIST <#keyrel,…>],LKEYS.KEYREL #keyrel & [LBL #ltop]].

Well, ‘clarifies’ might not have been the right word! Essentially, it indicates that the adjective may have an optional degree specifier (which semantically modifies the predicate of the adjective) and that the predicate specified in the lexical entry becomes the predicate used in the MRS. The rest is defined below:

basic_adj_abstr_lex_synsem := basic_adj_synsem_lex_or_phrase & abstr_lex_synsem & [LOCAL.CONT.RELS.LIST.FIRST basic_adj_relation].
basic_adj_synsem_lex_or_phrase := canonical_synsem & [LOCAL[AGR #agr,CAT[HEAD[MINORS.MIN basic_adj_rel],VAL[SUBJ <>,SPCMPS <>]],CONT.HOOK[INDEX non_conj_sement,XARG #agr]]].
canonical_synsem := expressed_synsem & canonical_or_unexpressed.
expressed_synsem := synsem.
canonical_or_unexpressed := synsem_min0.
synsem_min0 := synsem_min & [LOCAL mod_local,NONLOC non-local_min].

This ends with a bunch of basic setup types, except that the first two lines constrain the relation for an adjective to be ‘basically adjectival’. These two lines also specify that its subject and its specifier, if any, must be completed (i.e., empty) and agree with its non-conjunctive argument (which is not to say that the argument cannot be conjunctive, but that the adjective modifies the conjunction as a whole, if so). Whether or not an argument is expressed determines whether there are any further predicates about it or whether the unexpressed argument is identified by an otherwise unreferenced variable in any resulting MRS.

The lexical grounding of this type specification is given below, indicating that it may (or may not) have phonology (e.g., pronunciation, such as whether its onset is voiced) and whether, how, and with what punctuation it may appear. In general, a semantic argument may be lexical or phrasal and optional, but if it appears it corresponds to some semantic index (think variable) in some sort of predicate in any resulting MRS. (The *_min types do not constrain the values of their features any further.)

basic_lex_synsem := abstr_lex_synsem & lex_or_nonlex_synsem.
abstr_lex_synsem := canonical_lex_or_phrase_synsem & [LKEYS lexkeys].
canonical_lex_or_phrase_synsem := canonical_synsem & lex_or_phrase.
lex_or_phrase := synsem_min2.
synsem_min2 := synsem_min1 & [LEX luk,MODIFD xmod_min,PHON phon_min,PUNCT punctuation_min].
synsem_min1 := synsem_min0 & [OPT bool,--MIN predsort,--SIND *top*].
adj_synsem_lex_or_phrase := basic_adj_synsem_lex_or_phrase &[LOCAL[CAT.HEAD.MOD ,SPR.FIRST synsem &[--MIN quant_or_deg_rel],COMPS <>],MC na],CONJ cnil],--SIND #ind]>,CONT.HOOK.XARG #ind]].

Note that an adjective is not possess-able, that it modifies something nominal (or a title), and that if it has a specifier, that specifier is a quantifier or degree (e.g., ‘very’). Again, an adjective cannot function as a main clause or be conjunctive (in and of itself).

Finally, if you look far above you will see that the basic semantics of an adjective with an additional semantic argument is ‘intersective’, as in:

isect_synsem := abstr_lex_synsem & [LOCAL[CAT.HEAD.MOD <[LOCAL intersective_mod,NONLOC.REL 0-dlist]>,CONT.HOOK.LTOP #hand],LKEYS.KEYREL.LBL #hand].

Here, the length 0 difference list and the following definitions indicate that intersective semantics do not accept anything but local modification:

intersective_mod := mod_local.
mod_local := *avm*.

AVM stands for ‘attribute value matrix’, which is the structure by which types and their features are defined (with nesting and unification constraints using # to indicate equality).
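
To make the idea of an AVM more concrete, here is a minimal Python sketch that treats an AVM as a nested dictionary and unifies two of them. It is a simplification under loud assumptions: values are plain strings rather than types in a hierarchy, reentrancy (the # tags) is not modeled, and the names unify and UnificationFailure are hypothetical rather than taken from any ERG tooling.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures carry conflicting values."""


def unify(left, right):
    """Unify two AVMs given as nested dicts with atomic (string) leaves.

    Assumption: atoms unify only when equal. Real TDL unification would
    consult the type hierarchy and honor reentrancy tags instead.
    """
    if isinstance(left, dict) and isinstance(right, dict):
        result = dict(left)
        for feature, value in right.items():
            if feature in result:
                result[feature] = unify(result[feature], value)
            else:
                result[feature] = value
        return result
    if left == right:
        return left
    raise UnificationFailure(f"{left!r} conflicts with {right!r}")


if __name__ == "__main__":
    # Constraints borrowed from 'mcna' and 'lex_synsem' above, placed
    # side by side purely for illustration.
    not_main_clause = {"SYNSEM": {"LOCAL": {"CAT": {"MC": "na"}}}}
    lexical = {"SYNSEM": {"LEX": "+"}}
    print(unify(not_main_clause, lexical))
```

Unifying the two example structures merges their SYNSEM features into one matrix, whereas unifying [LEX +] with [LEX -] would raise UnificationFailure.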

By now you’re probably getting the idea that there is a fairly significant model of the English language here, including its lexical and syntactic aspects, but if you look there is a lot about semantics, too.

Event-centric BPM and goal-driven processing

paul@haleyAI.com — Mon, 07 Nov 2011 16:14:42 +0000

The slides for my Business Rules Forum presentation are at Event-centric BPM. The talk covers event semantics and focuses on events in order to simplify process definition and to facilitate more robust governance and compliance.

After the talk I spoke with Jan Verbeek and Gartjan Grijzen of Be Informed and reviewed their software, which is excellent. They have been quite successful with various government agencies in applying the event-centric methodology to produce goal-driven processing. Their approach is elegant and effective. It clearly demonstrates the merits of an event-centric approach and the power that emerges from understanding event-dependencies. Also, it is very semantic, ontological, and logic-programming oriented in its approach (e.g., they use OWL and a backward-chaining inference engine).

They do not have the top-down knowledge management approach that I advocate, nor do they provide the logical verification of governing policies and compliance (i.e., using theorem provers) that I mention in the talk (see Guido Governatori’s 2010 publications and Travis Breaux’s research at CMU, for example). Still, theirs is the best commercially deployed work in separating business process description from procedural implementation that comes to mind. (Note that Ed Barkmeyer of NIST reports some use of SBVR descriptions of manufacturing processes with theorem provers. Some in automotive and aerospace industries have been interested in this approach for quality purposes, too.)

Be Informed is now expanding into the United States with the assistance of Mills Davis and others. Their software is definitely worth consideration and, in my opinion, is more elegant and effective than the generic BPMN approach.

Simple problems with the semantic web

paul@haleyAI.com — Mon, 10 Oct 2011 21:30:40 +0000

The standard for defining ontologies these days is OWL and Protege. Unfortunately, OWL lacks any notion of exceptions in inheritance or any other notion of defeasibility.

So, although you may want to say that birds fly, your ontology will be broken (or become much more complicated) when you realize there are birds that can’t fly, such as penguins or ostriches, or even sick or injured birds.

Practically speaking, you need something like courteous logic or the defeasibility in SILK to handle this (or any 1980s expert system shell or even earlier frame system). OWL is very hard on mortal man (e.g., mainstream IT) in this regard.
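
To illustrate the kind of exception handling at issue, here is a minimal Python sketch of frame-style default inheritance in which a more specific class overrides a default from a more general one. It is not courteous logic and not SILK. The hierarchy, the property table, and the lookup function are hypothetical and exist only for this example.

```python
# Hypothetical class hierarchy and default properties, for illustration only.
PARENT = {
    "penguin": "bird",
    "ostrich": "bird",
    "robin": "bird",
    "bird": "animal",
}
DEFAULTS = {
    "bird": {"flies": True},      # the general rule: birds fly
    "penguin": {"flies": False},  # exceptions override the general rule
    "ostrich": {"flies": False},
}


def lookup(kind, prop):
    """Return the most specific default for prop, walking up from kind."""
    current = kind
    while current is not None:
        if prop in DEFAULTS.get(current, {}):
            return DEFAULTS[current][prop]
        current = PARENT.get(current)
    return None  # nothing is known about prop for this kind


if __name__ == "__main__":
    for kind in ("robin", "penguin", "animal"):
        print(kind, lookup(kind, "flies"))
```

Here a robin inherits the default that birds fly, a penguin overrides it, and nothing is concluded about animals in general. Adding another flightless bird means adding one exception rather than restructuring the hierarchy.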

How can I tell OWL that a pronoun is a noun but that pronouns are a closed class of words, unlike nouns, verbs, adjectives, and adverbs (in general)? Well, I’ll have to tell it about open-class nouns versus closed-class nouns. What a pain!

This is why we use Protege primarily as a drafting tool and, for example, SILK, to do reasoning. Non-defeasible description logic and first-order reasoners are difficult to get along with, in practice (and make sustainable knowledge repositories too difficult – which inhibits adoption, obviously).

Tendencies and purpose matter

paul@haleyAI.com — Mon, 07 Feb 2011 20:16:28 +0000

The Basic Formal Ontology (BFO) offers a simple, elegant process model. It adds alethic and teleological semantics to the more procedural models, among which I would include NIST’s Process Specification Language (PSL) along with BPMN.

Although alethic typically refers to necessary vs. possible, it clearly subsumes the probable or expected (albeit excluding deontics⁰). For example, consider the notion of ‘disposition’ (shown below as rendered in Protege):

For example, cells might be disposed to undergo the cell cycle, which consists of interphase, mitosis, and cytokinesis. Iron is disposed to rust. Certain customers might be disposed to comment, complain, or inquire.

Disposition is nice because it reflects things that have an unexpectedly high probability of occurring¹ but that may not be a necessary part of a process. It seems, however, that disposition is lacking from most business process models. It is prevalent in the soft and hard sciences, though. And it is important in medicine.

Disposition is distinct from what should occur or be attempted next in a process. Just because something is disposed to happen does not mean that it should or will. Although disposition is clearly related to business events and processes, it seems surprisingly lacking from business models (and CEP/BPM tooling).²

A teleological aspect of BFO is the notion of purpose or intended ‘function’, as shown below:

Function is about what something is expected to do or what it is for. For example, what is the function of an actuary? Representing such functionality of individuals or departments within enterprises may be atypical today, but is clearly relevant to skills-based routing, human resource optimization and business modeling in general.

Understanding disposition and function is clearly relevant to business modeling (including organizational structure), planning and performance optimization. Without an understanding of disposition, anticipation and foresight will be lacking. Without an understanding of function, measurement, reporting, and performance improvement will be lacking.

⁰ SBVR does a nice job with alethic and deontic augmentation of first order logic (i.e., positive and negative necessity, possibility, permission, and preference).

¹ Thanks to BG for “politicians are disposed to corruption” which indicates a population that is more likely than a larger population to be involved in certain situations.

² Cyc’s notion of ‘disposition’ or ‘tendency’ is focused on properties rather than probabilities, as in the following citation from OpenCyc. Such a notion is similarly lacking from most business models, probably because its utility requires more significant reasoning and business intelligence than is common within enterprises.

The collection of all the different quantities of dispositional properties; e.g. a particular degree of thermal conductivity. The various specializations of this collection are the collections of all the degrees of a particular dispositional property. For example, ThermalConductivity is a specialization of this collection and its instances are usually denoted with the generic value functions as in (HighAmountFn ThermalConductivity).