disambiguation – Commercial Intelligence

March 16, 2018 (updated July 13, 2018)

Iterative Disambiguation

In a prior post we showed how extraordinarily ambiguous, long sentences can be precisely interpreted. Here we take a simpler look upon request.

Let’s take a sentence that has more than 10 parses and configure the software to disambiguate among no more than 10.

Once again, this is a trivial sentence to disambiguate in seconds without iterative parsing!

The immediate results might present:

Suppose the intent is not that the telescope is with my friend, so veto “telescope with my friend” with a right-click.
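The veto step can be pictured with a small sketch. This is only an illustration under stated assumptions, not the Linguist's implementation: the `Parse` class, the `visible` function, the scores, and the phrase strings are all hypothetical. It shows how capping the visible parses at a limit and then vetoing a phrase lets lower-ranked parses move into view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parse:
    """Hypothetical parse record: a ranking score plus the phrases it asserts."""
    score: float
    phrases: frozenset


def visible(parses, vetoed, confirmed, limit):
    """Return up to `limit` top-scoring parses consistent with the feedback so far."""
    consistent = [
        p for p in parses
        if not (p.phrases & vetoed)      # contains no vetoed phrase
        and confirmed <= p.phrases       # contains every confirmed phrase
    ]
    return sorted(consistent, key=lambda p: p.score, reverse=True)[:limit]


# Toy candidates; phrases and scores are made up for illustration.
parses = [
    Parse(0.9, frozenset({"telescope with my friend", "saw with the telescope"})),
    Parse(0.8, frozenset({"saw with my friend", "saw with the telescope"})),
    Parse(0.7, frozenset({"man with the telescope", "telescope with my friend"})),
    Parse(0.6, frozenset({"man with the telescope", "saw with my friend"})),
]

before = visible(parses, vetoed=set(), confirmed=set(), limit=2)
after = visible(parses, vetoed={"telescope with my friend"}, confirmed=set(), limit=2)

print([p.score for p in before])
print([p.score for p in after])
```

Each round of feedback shrinks the consistent set, so repeating the call with a growing `vetoed` or `confirmed` set converges on the intended reading while never showing more than `limit` parses at once.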

Continue reading “Iterative Disambiguation”

March 4, 2018 (updated August 5, 2018)

“Only full page color ads can run on the back cover of the New York Times Magazine.”

A decade or so ago, we were debating how to educate Paul Allen’s artificial intelligence in a meeting at Vulcan headquarters in Seattle with researchers from IBM, Cycorp, SRI, and other places.

We were talking about how to “engineer knowledge” from textbooks into formal systems like Cyc or Vulcan’s SILK inference engine (which we were developing at the time). Although some progress had been made in prior years, the onus of acquiring knowledge using SRI’s Aura remained too high and the reasoning capabilities that resulted from Aura, which targeted University of Texas’ Knowledge Machine, were too limited to achieve Paul’s objective of a Digital Aristotle. Unfortunately, this failure ultimately led to the end of Project Halo and the beginning of the Aristo project under Oren Etzioni’s leadership at the Allen Institute for Artificial Intelligence.

At that meeting, I brought up the idea of simply translating English into logic, as my former product called “Authorete” did. (We renamed it before Haley Systems was acquired by Oracle, prior to the meeting.)

Continue reading ““Only full page color ads can run on the back cover of the New York Times Magazine.””

February 17, 2018 (updated July 26, 2018)

Dictionary Knowledge Acquisition

The following is motivated by Section 6359 of the California Sales and Use Tax. It demonstrates how knowledge can be acquired from dictionary definitions:

Here, we’ve taken a definition from WordNet and prefixed it with the word followed by a colon and parsed it using the Linguist.

Continue reading “Dictionary Knowledge Acquisition”

February 16, 2018 (updated July 14, 2018)

‘believed by many’

A Linguist user recently had a question about part of a sentence that boiled down to something like the following:

It is believed by many.

The question was whether “many” was an adjective, cardinality, or noun in this sentence. It’s a reasonable question!

Continue reading “‘believed by many’”

February 10, 2018 (updated September 9, 2018)

Nominal semantics of ‘meaning’

Just a quick note about a natural language interpretation that came up for the following sentence:

Under that test, the rental to an oil well driller of a “rock bit” having an effective life of but one rental is a transaction in lieu of a transfer of title within the meaning of (a) of this section.

The NLP system comes up with many hundreds of plausible parses for this sentence (mostly because it’s considering lexical and syntactic possibilities that are not semantically plausible). Among these is “meaning” as a nominalization.

From Wikipedia:

In linguistics, nominalization is the use of a word which is not a noun (e.g. a verb, an adjective or an adverb) as a noun, or as the head of a noun phrase, with or without morphological transformation.

It’s quite common to use the present participle of a verb as a noun. In this case, Google comes up with this definition for the noun ‘meaning’:

what is meant by a word, text, concept, or action.

The NLP system has a definition of “meaning” as a mass or count noun as well as definitions for several senses of the verb “mean”, such as these:

intend to convey, indicate, or refer to (a particular thing or notion); signify.
intend (something) to occur or be the case.
have as a consequence or result.

Continue reading “Nominal semantics of ‘meaning’”

December 8, 2017 (updated September 9, 2018)

Combinatorial ambiguity? No problem!

While translating some legal documentation (sales and use tax laws and regulations) into compliance logic, we came across the following sentence (and many more that are even worse):

Any transfer of title or possession, exchange, or barter, conditional or otherwise, in any manner or by any means whatsoever, of tangible personal property for a consideration.

Natural language processing systems choke on sentences like this because of their combinatorial ambiguity and NLP’s typical lack of knowledge about what can be conjoined with, complement, or modify what.

This sentence has many thousands of possible parses. They differ in the scope of each of the ‘or’s, in what is modified by conditional, otherwise, or whatsoever, and in what is complemented by in, by, of, and for.
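To get a feel for how quickly such counts grow, here is a rough sketch. It is a generic counting argument, not the parser's algorithm and not the actual parse count for this sentence: if n adjacent constituents may be grouped pairwise in any order, the number of distinct binary bracketings is the Catalan number C(n-1). The function names are ours.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number, C_n = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def bracketings(items: int) -> int:
    """Number of distinct binary bracketings of `items` adjacent constituents."""
    return catalan(items - 1)


for items in (4, 6, 8, 10):
    print(items, bracketings(items))
```

Ten freely attachable constituents already allow 4,862 bracketings, before lexical ambiguity multiplies the total further, which is why a handful of vetoes and confirmations that each eliminate a large share of the candidates is so effective.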

The following shows 2 parses remaining after we veto a number of mistakes and confirm some phrases from the 400 highest ranking parses (a few right or left clicks of the mouse):

Continue reading “Combinatorial ambiguity? No problem!”

August 20, 2013 (updated August 29, 2013)

Higher Education on a Flatter Earth

We’re collaborating on some educational work and came across this sentence in a textbook on finance and accounting:

All of these are potentially good economic decisions.

We use statistical NLP but assist with the ambiguities. In doing this, we relate questions and answers and explanations to the text.

We also extract the terminology and produce a rich lexicalized ontology of the subject matter for pedagogical uses, assessment, and adaptive learning.

Here’s one that just struck me as interesting. This is a case where the choice looks like it won’t matter much either way, but …

Continue reading “Higher Education on a Flatter Earth”

June 15, 2013 (updated July 14, 2018)

Deep question answering: Watson vs. Aristotle

At the SemTech conference last week, a few companies asked me how to respond to IBM’s Watson given my involvement with rapid knowledge acquisition for deep question answering at Vulcan. My answer varies with whether there is any subject matter focus, but essentially involves extending their approach with deeper knowledge and more emphasis on logical in addition to textual entailment.

Today, in a discussion on the LinkedIn NLP group, there was some interest in finding more technical details about Watson. A year ago, IBM published the most technical details to date about Watson in the IBM Journal of Research and Development. Most of those journal articles are available for free on the web. For convenience, here are my bookmarks to them.

Question analysis: How Watson reads a clue
Deep parsing in Watson
Good technical details on the two parsing approaches taken. However, deep parsing such as ours (e.g., using the ERG in Project Sherlock), combined with disambiguation, is a viable approach for the background knowledge that IBM dismisses too quickly (see below). Note that in order to train NLP systems in new domains, you have to go through the same process for thousands of sentences, so disambiguation technology as in the Linguist is appropriate even if proofs are based more on textual entailment than logical deduction.
Textual resource acquisition and engineering
Automatic knowledge extraction from documents
Finding needles in the haystack: Search and candidate generation
Typing candidate answers using type coercion
Textual evidence gathering and analysis
Relation extraction and scoring in DeepQA
Structured data and inference in DeepQA
IBM takes a hard line against deep knowledge here. They have the upper hand in the argument due to their impressive results, but more precise knowledge would only improve their performance. You can find more on this debate in the Deep QA FAQ and this presentation, which includes the following slide:
Continue reading “Deep question answering: Watson vs. Aristotle”