February 2018 – Commercial Intelligence

February 26, 2018 (updated July 13, 2018)

Are vitamins subject to sales tax in California?

What is the part of speech of “subject” in the sentence:

Are vitamins subject to sales tax in California?

Common sense about deep learning

I regularly build deep learning models for natural language processing, and today I tried one that has been the leader on the Stanford Question Answering Dataset (SQuAD). It is an impressive NLP platform built using PyTorch, but it’s still missing the big picture (i.e., it doesn’t “know” much).

Generally, NLP systems that emphasize Big Data (e.g., deep learning approaches) but eschew more explicit knowledge representation and reasoning are interesting but unintelligent. Think Siri and Alexa, for example. They might answer a simple factoid question if a Google search can find closely related text, but not much more.

Here is a simple demonstration of problems that the state of the art in deep machine learning is far from solving…

Here is a paragraph from a Wall Street Journal article about the Fed today where the deep learning system has “found” what the pronouns “this” and “they” reference:

The essential point here is that the deep learning system is missing common sense. It is “the need”, not “a raise”, that is referenced by “this”. And “they” references “officials”, not “the minutes”.

Bottom line: if you need your natural language understanding system to be smarter than this, you are not going to get there using deep learning alone.

February 17, 2018 (updated July 26, 2018)

Dictionary Knowledge Acquisition

The following is motivated by Section 6359 of the California Sales and Use Tax. It demonstrates how knowledge can be acquired from dictionary definitions:

Here, we’ve taken a definition from WordNet and prefixed it with the word followed by a colon and parsed it using the Linguist.


February 16, 2018 (updated July 14, 2018)

‘believed by many’

A Linguist user recently had a question about part of a sentence that boiled down to something like the following:

It is believed by many.

The question was whether “many” was an adjective, cardinality, or noun in this sentence. It’s a reasonable question!


February 11, 2018 (updated September 9, 2018)

Parsing Winograd Challenges

The Winograd Challenge is an alternative to the Turing Test for assessing artificial intelligence. The essence of the test involves resolving pronouns. To date, systems have not fared well on the test, for several reasons. Three come to mind:

The natural language processing involved in the word problems is beyond the state of the art.
Resolving many of the pronouns requires more common sense knowledge than state of the art systems possess.
Resolving many of the problems requires pragmatic reasoning beyond the state of the art.

As an example, one of the simpler exemplary problems is:

There is a pillar between me and the stage, and I can’t see around it.

A heuristic system (or a deep learning one) could infer that “it” does not refer to “me” or “I” and toss a coin between “pillar” and “stage”. A system worthy of passing the Winograd Challenge should “know” it’s the pillar.
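The coin-toss heuristic described above can be sketched in a few lines of Python. This is only an illustration of that naive baseline, not how any particular system works: the candidate list is supplied by hand (a real system would extract it from a parse), and the names `FIRST_PERSON`, `plausible_antecedents`, and `coin_toss_resolve` are made up for this example.

```python
import random

# Assumption: a small hand-written set of first-person words that "it" cannot refer to.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def plausible_antecedents(candidates):
    """Drop first-person words; everything else stays a possible antecedent."""
    return [c for c in candidates if c.lower() not in FIRST_PERSON]


def coin_toss_resolve(candidates, rng=random):
    """Pick one of the remaining candidates at random (the 'coin toss')."""
    return rng.choice(plausible_antecedents(candidates))


# Hand-listed candidates from: "There is a pillar between me and the stage,
# and I can't see around it."
candidates = ["pillar", "me", "stage", "I"]
remaining = plausible_antecedents(candidates)
guess = coin_toss_resolve(candidates, random.Random(0))

print(remaining)  # the two candidates the heuristic cannot choose between
print(guess)      # a random pick, right only about half the time
```

Nothing in this sketch encodes that pillars block lines of sight, which is exactly the knowledge needed to prefer “pillar” over “stage”.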

Even this simple sentence presents some NLP challenges that are easy to overlook. For example, does “between” modify the pillar or the verb “is”?

This is not much of a challenge, however, so let’s touch on some deeper issues and a more challenging problem…


February 10, 2018 (updated September 9, 2018)

Nominal semantics of ‘meaning’

Just a quick note about a natural language interpretation that came up for the following sentence:

Under that test, the rental to an oil well driller of a “rock bit” having an effective life of but one rental is a transaction in lieu of a transfer of title within the meaning of (a) of this section.

The NLP system comes up with many hundreds of plausible parses for this sentence (mostly because it’s considering lexical and syntactic possibilities that are not semantically plausible). Among these is “meaning” as a nominalization.

From Wikipedia:

In linguistics, nominalization is the use of a word which is not a noun (e.g. a verb, an adjective or an adverb) as a noun, or as the head of a noun phrase, with or without morphological transformation.

It’s quite common to use the present participle of a verb as a noun. In this case, Google comes up with this definition for the noun ‘meaning’:

what is meant by a word, text, concept, or action.

The NLP system has a definition of “meaning” as a mass or count noun as well as definitions for several senses of the verb “mean”, such as these:

intend to convey, indicate, or refer to (a particular thing or notion); signify.
intend (something) to occur or be the case.
have as a consequence or result.
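As an aside on the “many hundreds of plausible parses” mentioned earlier: the rock bit sentence stacks up a long run of prepositional phrases, and a textbook combinatorial model gives a feel for how quickly such attachments multiply. The sketch below assumes the classic result that a verb, its object, and k trailing prepositional phrases admit a Catalan number of attachment structures. It is not a description of how the NLP system counts parses, and the function names are made up for illustration.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number: 1, 1, 2, 5, 14, 42, ..."""
    return comb(2 * n, n) // (n + 1)


def pp_attachment_readings(num_pps: int) -> int:
    """Attachment structures for 'verb + object + num_pps PPs' under the
    Catalan model (an idealization that ignores semantics and lexical ambiguity)."""
    return catalan(num_pps + 1)


for k in range(1, 7):
    print(k, pp_attachment_readings(k))
```

Under this model six stacked prepositional phrases already yield 429 structures, before any lexical ambiguity (such as “meaning” as a noun versus a participle) is multiplied in.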
