Robust Inference and Slacker Semantics

In preparing for some natural language generation[1], I came across some work on natural logic[2][3] and recognizing textual entailment[4] (RTE) by Richard Bergmair from his PhD work at Cambridge.

The work he describes overlaps with our approach to robust inference from the deep, variable-precision semantics that result from linguistic analysis and disambiguation using the English Resource Grammar (ERG) and the Linguist™.

Mr. Bergmair’s semantic logic project has two components:

  1. A Python platform for experimentation with semantics:
    i.e., software for converting the minimal recursion semantics (MRS) produced using the ERG into first-order logic (a toy sketch follows this list)
  2. A textual entailment engine using Monte Carlo techniques on the first-order predicate calculus (FOPC) produced above
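
To make the first component concrete, here is a toy sketch of my own (not PyPES) of what converting an under-specified, MRS-like representation into FOPC involves: the two quantifiers of “every dog chases a cat” are left unscoped, and each admissible ordering yields a distinct FOPC formula. All names in the snippet are hypothetical.

```python
from itertools import permutations

# Toy, MRS-like input: quantifiers and their restrictions are listed
# separately from the core relation, leaving scope under-specified.
quantifiers = [
    ("forall", "x", "dog(x)", "->"),   # every dog ...
    ("exists", "y", "cat(y)", "&"),    # ... a cat
]
core = "chase(x, y)"

def scope(order, body):
    """Wrap the core in the quantifiers, outermost first."""
    for q, var, restriction, connective in reversed(order):
        body = f"{q} {var}. ({restriction} {connective} {body})"
    return body

for order in permutations(quantifiers):
    print(scope(order, core))
# forall x. (dog(x) -> exists y. (cat(y) & chase(x, y)))
# exists y. (cat(y) & forall x. (dog(x) -> chase(x, y)))
```

Two quantifiers already yield two readings; n freely permuting quantifiers yield up to n! readings, which is why forcing one strict scoping over-commits.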

The following comments are particularly interesting:

  • With the scoping machinery and the first-order approximation in place, PyPES makes it possible to translate text into formulae of FOPC. This is what Boxer does for CCG and what Glue Semantics does for LFG.
    • …the main problem with Boxer and glue semantics is their strong commitment throughout to classical bivalent logic, which is limited in its ability to represent natural language semantics.
    • FOPC lacks some kinds of expressive power that are important for natural language, such as quantifiers like “most” as well as weakening and strengthening modifiers like “very”.
    • The straightforward logical encoding used by Boxer and glue semantics leads to over-commitment in some places, for example forcing strictly recursive quantifier scopings when little or nothing is known about the scopings from the natural language input.
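
The expressive-power point is easy to demonstrate. The generalized quantifier “most” is provably not definable in FOPC, yet it is trivial to evaluate model-theoretically over a finite domain, and a modifier like “very” can be treated as a degree operator. The sketch below is illustrative only; the squaring rule for “very” is Zadeh’s classic fuzzy-logic heuristic, not anything Mr. Bergmair commits to, and the tiny model is hypothetical.

```python
def most(restriction, body, domain):
    """most(A, B): more than half of the As are Bs -- a relation between
    two sets that no fixed forall/exists formula can express."""
    a = [e for e in domain if restriction(e)]
    return sum(1 for e in a if body(e)) > len(a) / 2

def very(degree_fn):
    """Strengthen a graded predicate by squaring its degree in [0, 1]."""
    return lambda e: degree_fn(e) ** 2

dogs = {"rex", "fido", "spot"}
tall = {"rex": 0.9, "fido": 0.6, "spot": 0.3}.get  # graded, not bivalent

# "Most dogs are tall" -- true here: 2 of the 3 exceed the 0.5 threshold.
print(most(lambda e: e in dogs, lambda e: tall(e) > 0.5, list(dogs)))
# "Fido is very tall" holds to a lesser degree than "Fido is tall".
print(very(tall)("fido"))  # 0.36 vs. 0.6
```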

Parenthetically, CCG is combinatory categorial grammar (spelled in various ways) and LFG is lexical functional grammar. LFG and glue semantics were developed at Xerox PARC before being commercialized at Powerset, which Microsoft subsequently acquired. The ERG is a head-driven phrase structure grammar (HPSG). All of the parsing systems that produce deeper, albeit under-specified, semantic representations[5], including HPSG, CCG, and LFG, are unification grammars.

I have long followed Johan Bos’s work on under-specified representation and his use of lambda calculus and Prolog (in Boxer) with CCG. For example, visit the Groningen Meaning Bank. Nonetheless, Mr. Bergmair’s criticism is fair and reasonable. In our work at Vulcan[6], the well-founded semantics proved critical, for example. Generalized quantification was also critical, including the use of a reasoning engine that could support such quantification defeasibly[7].
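
For readers unfamiliar with it, the well-founded semantics assigns a third truth value, undefined, to atoms that negation cannot settle either way, which is exactly the sort of departure from bivalent logic Mr. Bergmair calls for. Below is a minimal sketch of the standard alternating-fixpoint construction (van Gelder, Ross, and Schlipf) for propositional programs; the example program is hypothetical.

```python
# A rule is (head, positive_body, negative_body), each body a set of atoms.

def least_model(definite_rules):
    """Least model of a negation-free program by naive forward chaining."""
    true, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in true and pos <= true:
                true.add(head)
                changed = True
    return true

def gamma(program, assumed):
    """Gelfond-Lifschitz reduct w.r.t. `assumed`, then its least model."""
    reduct = [(h, pos) for h, pos, neg in program if not (neg & assumed)]
    return least_model(reduct)

def well_founded(program, atoms):
    """True atoms = least fixpoint of gamma∘gamma; false = atoms outside
    gamma(true); everything else is undefined (the third truth value)."""
    true = set()
    while True:
        over = gamma(program, true)       # overestimate of the true atoms
        new_true = gamma(program, over)   # refined underestimate
        if new_true == true:
            return true, atoms - over     # (true, false); the rest undefined
        true = new_true

program = [
    ("p", set(), {"q"}),   # p :- not q.
    ("q", set(), {"p"}),   # q :- not p.
    ("r", set(), {"r"}),   # r :- not r.
    ("s", set(), set()),   # s.
]
t, f = well_founded(program, {"p", "q", "r", "s"})
print(t, f)  # {'s'} set()
```

Here p and q, which negate each other, and the self-negating r come out undefined, rather than forcing an arbitrary choice or an inconsistency.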

Mr. Bergmair continues his comment regarding strict resolution of all scope ambiguities by citing the use of “slacker” semantics, i.e., reasoning that leaves scope ambiguities unresolved. Implicit in this statement is that the first component is not typically used to resolve all scopal ambiguities.

I should note that Mr. Bergmair does not discuss the resolution of grammatical ambiguity. For short sentences, as in most of his examples, grammatical ambiguity is low. But it grows combinatorially and becomes significant for sentences over 10 words long. Presumably, Mr. Bergmair works from the best-ranked parse. However, the best-ranked parse rarely includes the intended semantics for sentences significantly longer than 10 words.[8]
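
To put a number on “combinatorially”: counting only binary-branching bracketings, an n-word sentence has as many parse structures as the (n−1)-th Catalan number. The real count depends on the grammar, and lexical ambiguity multiplies it further, but the growth rate is the point:

```python
from math import comb

def catalan(n):
    """The n-th Catalan number: binary bracketings of an (n+1)-word string."""
    return comb(2 * n, n) // (n + 1)

for words in (5, 10, 15, 20):
    print(f"{words:2d} words: {catalan(words - 1):,} bracketings")
#  5 words: 14 bracketings
# 10 words: 4,862 bracketings
# 15 words: 2,674,440 bracketings
# 20 words: 1,767,263,190 bracketings
```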

I am not surprised that Mr. Bergmair does not address grammatical ambiguity and deemphasizes resolution of scopal ambiguities, since doing so can be difficult without a user interface such as the Linguist’s. Given an RTE focus, this is a reasonable position, since there is no human assistance as in a cognitive-computing approach. However, for sentences of moderate length or longer, ambiguity of grammar, let alone of logic, will produce combinatorial ambiguities of entailment that even a Monte Carlo approach may not be able to address. If so, slacker semantics (i.e., heuristic reasoning from under-specified semantics) will have to be constrained significantly, perhaps nearly to the point of eliminating ambiguity.
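
To see where the combinatorics bite, consider a toy rendering of the Monte Carlo idea; this is my gloss, not Mr. Bergmair’s engine, and the propositional formulas are hypothetical stand-ins for FOPC translations. Entailment is estimated by sampling random models and checking how often the hypothesis holds where the premise does:

```python
import random

def entailment_degree(premise, hypothesis, atoms, samples=10_000):
    """Fraction of sampled models of `premise` that also satisfy `hypothesis`."""
    holds = total = 0
    for _ in range(samples):
        model = {a: random.random() < 0.5 for a in atoms}  # random valuation
        if premise(model):
            total += 1
            holds += hypothesis(model)
    return holds / total if total else float("nan")

atoms = ["p", "q"]
# "p and q" entails "p" (degree 1.0), but not conversely (~0.5).
print(entailment_degree(lambda m: m["p"] and m["q"], lambda m: m["p"], atoms))
print(entailment_degree(lambda m: m["p"], lambda m: m["p"] and m["q"], atoms))
```

The sampling itself is cheap. The trouble is that with k candidate readings of the premise and m of the hypothesis there are k × m entailment questions to estimate, and k grows combinatorially with sentence length.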

This is precisely what we emphasize with the Linguist. Consequently, Mr. Bergmair’s approach, combined with others we are pursuing, should result in more robust, deep, and precise reasoning.


[1] Question Generation with Minimal Recursion Semantics

[2] http://maartens.home.xs4all.nl/philosophy/Dictionary/N/Natural%20Logic.htm

[3] http://nlp.stanford.edu/projects/natlog.shtml

[4] http://www.aclweb.org/aclwiki/index.php?title=Textual_Entailment_References

[5] Semantic underspecification: Which technique for what purpose?

[6] Project Sherlock and Acquiring Rich Logical Knowledge From Text – SemTech 2013

[7] The SILK Project: Semantic Inferencing on Large Knowledge

[8] It’s hard to reckon nice English