Commercial Intelligence Rotating Header Image

Simon Says

Some folks use the term “automatic speech recognition”, ASR.  I don’t like the separation between recognition and understanding, but that’s where the technology stands.

The term ASR encourages thinking about spoken language at a technical level in which purely inductive techniques are used to generate text from an audio signal (which is hopefully some recorded speech!).

As you may know, I am very interested in what many in ASR consider “downstream” natural language tasks.  Nonetheless, I’ve been involved with speech since Carnegie Mellon in the eighties.  During Haley Systems, I hired one of the Sphinx fellows who integrated Microsoft and IBM speech products with our natural language understanding software.  Now I’m working on spoken-language understanding again…

Most common approaches to ASR these days involve deep learning, such as Baidu’s DeepSpeech.  If your notion of deep learning means lots of matrix algebra more than necessarily neural networks, then KALDI is also in the running, but it dates to 2011.  KALDI is an evolution from the hidden Markov model toolkit, HTK (once owned by Microsoft).  Hidden Markov models (HMM) were the basis of most speech recognition systems dating back to the eighties or so, including Sphinx.  All of these are open source and freely licensed.

As everyone knows, ASR performance has improved dramatically in the last 10 years. The primary metric for ASR performance is “word error rate” (WER).  Most folks think of WER as the percentage of words incorrectly recognized, although it’s not that simple.  WER can be more than 1 (e.g., if you come up with a sentence given only noise!).  Here is a comparison published in 2011.

Today, Google, Amazon, Microsoft and others have WER under 10% in many cases. To get there, it takes some talent and thousands of hours of training data.  Google is best, Alexa is close, and Microsoft lags a bit in 3rd place.  (Click the graphic for the article summarizing Vocalize.io results.)

(more…)

Character by character sentiment

This is a great page on language modeling with an awesome graphic and commentary on its learned “sentiment neuron”.

Importing documents into a knowledge base

Many users land up wanting to import sentences in the Linguist rather than type or paste them in one at a time.  One way to do this is to right click on a group within the Linguist and import a file of sentences, one per line, into that group.  But if you want to import a large document and retain its outline structure, some application-specific use of the Linguist APIs may be the way to go.

We’ve written up how to import a whole web site  on California Sales & Use Tax:

Is business logic too much for classical logic?

Business logic is not limited to mathematical logic, as in first-order predicate calculus.

Business logic commonly requires “aggregation” over sets of things, like summing the value of claims against a property to subtract it from the value of that property in order to determine the equity of the owner of that property.

  • The equity of the owner of a property in the property is the excess of the value of the property over the value of claims against it.

There are various ways of describing such extended forms of classical logic.  The most relevant to most enterprises is the relational algebra perspective, which is the base for relational databases and SQL.  Another is the notion of generalized quantifiers.

In either case, it is a practical matter to be able to capture such logic in a rigorous manner.  The example below shows how that can be accomplished using English, producing the following axiom in extended logic:

  • ∀(?x15)(property(?x15)→∀(?x10)(owner(of)(?x10,?x15)→∃(?x31)(value(of(?x15))(?x31)∧∑(?x44)(∀(?x49)(claim(against(?x15))(?x49)→value(of(?x49))(?x44))→∃(?x26)(excess(of(?x31))(over(?x44))(?x26)∧equity(in(?x15))(of(?x10))(?x26))))))

This logic can be realized in various ways, depending on the deployment platform, such as: (more…)

Simply Smarter Intelligent Agents

Deep learning can produce some impressive chatbots, but they are hardly intelligent.  In fact, they are precisely ignorant in that they do not think or know anything.

More intelligent dialog with an artificially intelligent agent involves both knowledge and thinking.  In this article, we educate an intelligent agent that reasons to answer questions.

(more…)

Problems with Probabilistic Parsing

We are using statistical techniques to increase the automation of logical and semantic disambiguation, but nothing is easy with natural language.

Here is the Stanford Parser (the probabilistic context-free grammar version) applied to a couple of sentences.  There is nothing wrong with the Stanford Parser!  It’s state of the art and worthy of respect for what it does well.

(more…)

Confessions of a production rule vendor (part 2)

Going on 5 years ago, I wrote part 1.  Now, finally, it’s time for the rest of the story.

(more…)

Simply Logical English

This is not all that simple of an article, but it walks you through, from start to finish, how we get from English to logic. In particular, it shows how English sentences can be directly translated into formal logic for use with in automated reasoning with theorem provers, logic programs as simple as Prolog, and even into production rule systems.

There is a section in the middle that is a bit technical about the relationship between full logic and more limited systems (e.g., Prolog or production rule systems). You don’t have to appreciate the details, but we include them to avoid the impression of hand-waving

The examples here are trivial. You can find many and more complex examples throughout Automata’s web site.

Consider the sentence, “A cell has a nucleus.”:

(more…)

Natural Intelligence

Deep natural language understanding (NLU) is different than deep learning, as is deep reasoning.  Deep learning facilities deep NLP and will facilitate deeper reasoning, but it’s deep NLP for knowledge acquisition and question answering that seems most critical for general AI.  If that’s the case, we might call such general AI, “natural intelligence”.

Deep learning on its own delivers only the most shallow reasoning and embarrasses itself due to its lack of “common sense” (or any knowledge at all, for that matter!).  DARPA, the Allen Institute, and deep learning experts have come to their senses about the limits of deep learning with regard to general AI.

General artificial intelligence requires all of it: deep natural language understanding[1], deep learning, and deep reasoning.  The deep aspects are critical but no more so than knowledge (including “common sense”).[2] (more…)

Iterative Disambiguation

In a prior post we showed how extraordinarily ambiguous, long sentences can be precisely interpreted. Here we take a simpler look upon request.

Let’s take a sentence that has more than 10 parses and configure the software to disambiguate among no more than 10.

Once again, this is a trivial sentence to disambiguate in seconds without iterative parsing!

The immediate results might present:

Suppose the intent is not that the telescope is with my friend, so veto “telescope with my friend” with a right-click.

(more…)