A lot of recent work has advanced the learning of increasingly context-sensitive distributed representations (i.e., so-called ‘embeddings’). In particular, DeepMind’s paper on “Contrastive Predictive Coding” (CPC) is especially interesting and advances on a number of fronts. For example, in wav2vec, Facebook AI Research (FAIR) uses CPC to obtain apparently superior acoustic modeling results compared to DeepSpeech’s connectionist temporal classification (CTC) approach. In the CPC paper, the following image is particularly striking, harkening back to the early notion of a grandmother cell.
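For intuition, here is a minimal numpy sketch of the InfoNCE-style contrastive objective at the heart of CPC. The shapes, names, and log-bilinear scorer follow my reading of the paper, not any released code:

```python
import numpy as np

def info_nce_loss(context, positive, negatives, W):
    """InfoNCE-style contrastive loss as used in CPC (sketch).

    context:   (d_c,)   summary vector c_t from the autoregressive model
    positive:  (d_x,)   the true future latent x_{t+k}
    negatives: (n, d_x) latents drawn from other positions/sequences
    W:         (d_x, d_c) log-bilinear scoring matrix for prediction step k
    """
    candidates = np.vstack([positive[None, :], negatives])  # positive first
    scores = candidates @ W @ context                       # log-bilinear scores
    scores -= scores.max()                                  # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return -np.log(probs[0])  # -log p(positive | candidates)

# toy demo with random vectors (dimensions are arbitrary)
rng = np.random.default_rng(0)
loss = info_nce_loss(rng.normal(size=4), rng.normal(size=8),
                     rng.normal(size=(16, 8)), rng.normal(size=(8, 4)))
print(loss)  # a scalar; training would minimize this across timesteps
```

The point of the objective is that the model only has to pick the true future latent out of a lineup of negatives, which is what pushes the learned representations to encode slowly varying, context-sensitive structure.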
This is an important paper in the development of neural reasoning capabilities, which should reduce the brittleness of purely symbolic approaches: Neural Logic Machines.
The potential reasoning capabilities, such as multi-step inference for problem solving and theorem proving, are most interesting, but there are important contemporary applications in machine learning and question answering. I’ll just provide a few highlights from the paper on the latter, and some more points and references on the former, below.
When I wrote “Are Vitamins Subject to Sales Tax”, I was addressing the process of translating knowledge expressed in formal documents, like laws, regulations, and contracts, into logic suitable for inference using the Linguist.
Recently, one of my favorite researchers working in natural language processing and reasoning, Luke Zettlemoyer, was among the authors of “Entailment-driven Extracting and Editing for Conversational Machine Reading”. This is a very nice turn towards knowledge extraction and inference that improves on superficial recognizing textual entailment (RTE).
I recommend this paper; it relates to BERT, which is among my current favorites in deep learning for NL/QA. Here is an image from the paper, FYI:
Some folks use the term “automatic speech recognition” (ASR). I don’t like the separation between recognition and understanding, but that’s where the technology stands.
The term ASR encourages thinking about spoken language at a technical level in which purely inductive techniques are used to generate text from an audio signal (which is hopefully some recorded speech!).
As you may know, I am very interested in what many in ASR consider “downstream” natural language tasks. Nonetheless, I’ve been involved with speech since Carnegie Mellon in the eighties. During my time at Haley Systems, I hired one of the Sphinx fellows, who integrated Microsoft and IBM speech products with our natural language understanding software. Now I’m working on spoken-language understanding again…
Most common approaches to ASR these days involve deep learning, such as Baidu’s DeepSpeech. If your notion of deep learning means lots of matrix algebra more than necessarily neural networks, then Kaldi is also in the running, but it dates to 2011. Kaldi is an evolution of the hidden Markov model toolkit, HTK (once owned by Microsoft). Hidden Markov models (HMMs) were the basis of most speech recognition systems dating back to the eighties or so, including Sphinx. All of these are open source and freely licensed.
As everyone knows, ASR performance has improved dramatically in the last 10 years. The primary metric for ASR performance is “word error rate” (WER). Most folks think of WER as the percentage of words incorrectly recognized, although it’s not that simple. WER can be more than 1 (e.g., if you come up with a sentence given only noise!). Here is a comparison published in 2011.
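To make the metric concrete, here is a small, self-contained sketch of how WER is typically computed: word-level Levenshtein distance divided by reference length (the function and examples are mine):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference).

    Computed via Levenshtein distance over words. Because insertions count
    against you but the denominator is fixed, the result can exceed 1.0,
    e.g. when "recognizing" a long sentence from pure noise.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("turn the lights off", "turn off the lights"))  # 0.5
print(word_error_rate("hello", "well hello there everyone"))          # 3.0 -- WER > 1!
```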
Today, Google, Amazon, Microsoft, and others have WER under 10% in many cases. To get there, it takes some talent and thousands of hours of training data. Google is best, Alexa is close, and Microsoft lags a bit in third place. (Click the graphic for the article summarizing Vocalize.io results.)
This is a great page on language modeling with an awesome graphic and commentary on its learned “sentiment neuron”.
Many users end up wanting to import sentences into the Linguist rather than type or paste them in one at a time. One way to do this is to right-click on a group within the Linguist and import a file of sentences, one per line, into that group. But if you want to import a large document and retain its outline structure, some application-specific use of the Linguist APIs may be the way to go.
Business logic is not limited to mathematical logic, as in first-order predicate calculus.
Business logic commonly requires “aggregation” over sets of things, such as summing the values of claims against a property and subtracting that sum from the value of the property in order to determine the equity of the owner of that property.
- The equity of the owner of a property in the property is the excess of the value of the property over the value of claims against it.
There are various ways of describing such extended forms of classical logic. The most relevant to most enterprises is the relational algebra perspective, which is the basis for relational databases and SQL. Another is the notion of generalized quantifiers.
In either case, it is a practical matter to be able to capture such logic in a rigorous manner. The example below shows how that can be accomplished using English, producing the following axiom in extended logic:
This logic can be realized in various ways, depending on the deployment platform.
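For example, here is a hedged sketch of the aggregation above in ordinary Python, just to pin down the semantics. The names are illustrative, not the Linguist’s output, and I read “the excess of X over Y” as max(X - Y, 0):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Property:
    value: float                                        # value of the property
    claims: List[float] = field(default_factory=list)  # values of claims against it

def owner_equity(prop: Property) -> float:
    """The owner's equity is the excess of the property's value over
    the sum of the values of claims against it (floored at zero)."""
    return max(prop.value - sum(prop.claims), 0.0)

print(owner_equity(Property(value=500_000.0, claims=[300_000.0, 50_000.0])))  # 150000.0
```

The same aggregation is a one-line SUM and subtraction in SQL, which is part of why the relational algebra perspective is the most familiar one in the enterprise.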
Deep learning can produce some impressive chatbots, but they are hardly intelligent. In fact, they are precisely ignorant in that they do not think or know anything.
More intelligent dialog with an artificially intelligent agent involves both knowledge and thinking. In this article, we educate an intelligent agent that reasons to answer questions.
We are using statistical techniques to increase the automation of logical and semantic disambiguation, but nothing is easy with natural language.
Here is the Stanford Parser (the probabilistic context-free grammar version) applied to a couple of sentences. There is nothing wrong with the Stanford Parser! It’s state of the art and worthy of respect for what it does well.
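If you want to reproduce this kind of parse yourself, here is a minimal sketch using NLTK’s CoreNLP client; it assumes a Stanford CoreNLP server is already running locally on port 9000, and the example sentence is mine:

```python
# Sketch: driving the Stanford parser from Python via NLTK's CoreNLP client.
# Assumes a CoreNLP server is already running locally, started with e.g.:
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
from nltk.parse.corenlp import CoreNLPParser

parser = CoreNLPParser(url="http://localhost:9000")

# raw_parse returns an iterator over constituency trees (PCFG-style output)
for tree in parser.raw_parse("Are vitamins subject to sales tax?"):
    tree.pretty_print()
```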
Going on five years ago, I wrote part 1. Now, finally, it’s time for the rest of the story.