HPSG – Commercial Intelligence

February 16, 2018 (updated July 14, 2018)

‘believed by many’

A Linguist user recently had a question about part of a sentence that boiled down to something like the following:

It is believed by many.

The question was whether “many” was an adjective, cardinality, or noun in this sentence. It’s a reasonable question!

October 23, 2015 (updated August 5, 2018)

Robust Inference and Slacker Semantics

In preparing for some natural language generation^[1], I came across some work on natural logic^[2]^[3] and reasoning by textual entailment^[4] (RTE) by Richard Bergmair in his PhD at Cambridge:

Monte Carlo Semantics: Robust Inference and Logical Pattern Processing with Natural Language Text

The work he describes overlaps our approach to robust inference from the deep, variable-precision semantics that result from linguistic analysis and disambiguation using the English Resource Grammar (ERG) and the Linguist™.


November 27, 2013 (updated July 14, 2018)

Deep Parsing vs. Deep Learning

For those of us who enjoy the intersection of machine learning and natural language, including “deep learning”, which is all the rage, here is an interesting paper on generalizing vector space models of words to broader semantics of English by Jayant Krishnamurthy, a PhD student of Tom Mitchell at Carnegie Mellon University:

Krishnamurthy, Jayant, and Tom M. Mitchell. “Vector Space Semantic Parsing: A Framework for Compositional Vector Space Models.” ACL 2013 (2013): 1.

Essentially, the paper demonstrates how the features of high-precision lexicalized grammars allow machines to learn the compositional semantics of English. More specifically, the paper demonstrates learning of compositional semantics beyond the capabilities of recursive neural networks (RNN). In summary, the paper suggests that deep parsing is better than deep learning for understanding the meaning of natural language.

For more information and a different perspective, I recommend the following paper, too:

Socher, Richard, et al. “Semantic compositionality through recursive matrix-vector spaces.” Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 2012.

Note that the authors use Combinatory Categorial Grammar (CCG) while our work uses Head-driven Phrase Structure Grammar (HPSG), but this is a minor distinction. For example, compare the logical forms in the Groningen Meaning Bank with the logic produced by the Linguist. The former uses CCG to produce lambda calculus while the latter uses HPSG to produce predicate calculus (ignoring vagaries of underspecified representation, which are useful for hypothetical reasoning and textual entailment).

May 28, 2013 (updated December 18, 2018)

Background for our Semantic Technology 2013 presentation

In the spring of 2012, Vulcan engaged Automata for a knowledge acquisition (KA) experiment. This article provides background on the context of that experiment and what the results portend for artificial intelligence applications, especially in the area of education. Vulcan presented some of the award-winning work referenced here at an AI conference, including a demonstration of the electronic textbook discussed below. There is a video of that presentation here. The introductory remarks are interesting but not pertinent to this article.

Background on Vulcan’s Project Halo

From 2002 to 2004, Vulcan developed a Halo Pilot that could correctly answer between 30% and 50% of the questions on advanced placement (AP) tests in chemistry. The competing systems relied on sophisticated approaches to formal knowledge representation and expert knowledge engineering. Of the three teams in this competition, Cycorp fared the worst and SRI fared the best. SRI’s system performed at the level of scoring a 3 on the AP, which corresponds to earning course credit at many universities. The consensus view at that time was that achieving a score of 4 on the AP was feasible with limited additional effort. However, the cost per page for this level of performance was roughly $10,000, which needed to be reduced significantly before Vulcan’s objective of a Digital Aristotle could be considered viable.
