Commercial Intelligence

TA/NLP: It’s a jungle out there!

Text analytics and natural language processing have made tremendous advances in the last few years.  Unfortunately, there is a lot more to understanding natural language than TA/NLP.

I was reading a paper today about NLP pipelines for question answering that used machine learning to find what tools are good at what tasks and to configure a pipeline by selecting the best tool for a given task from each of the types of components in the pipeline.  The paper has a long list of various components, so I checked a few out.  Most of those of interest were available on the web so that they could be easily composed into pipelines without a lot of software setup.  Looking at these I quickly tired in disappointment.  Here are some of the reasons.

I am not surprised by these results.  NLU is hard.  But they are not particularly strong results either.  I’m surprised that people find such results useful (if they do).

I know that Cal. App. is an organization (i.e., the California Court of Appeal), and I’m not surprised this system missed it, but perhaps it should have identified California?

I understand that the People can be confusing, especially at the beginning of a sentence, but a knowledgeable or learned system would have recognized People as referring to persons (the public) or an organization (e.g., a government).

Similarly, a knowledgeable or learned system would recognize this as a reference to a legal case and perhaps classify it with regard to organization, place, and event.

I guess I can understand why this system doesn’t bother identifying single words as concepts.  But I don’t really understand how “oil well” is a concept but not “oil well driller” or “rock bit” or even “effective life”.  It’s important to understand that “transfer of title” is a concept or event here, too. (Perhaps it should even classify ‘(a) of this section’ as referring to a work.)

It’s pretty clear that a company is an organization, not just a concept, so I’m a little surprised by these errors.  It missed section 6352 as a work (as above), so this system would not be very good for legal or contract purposes.

The mistake on “private” is interesting, especially that it did not identify “private individuals” as either a concept or as persons.

In real life, text is not always pretty.  This is real life. The prior examples are verbatim from the California Sales and Use Tax Law & Regulations.

The following text occurs in the body of a medical test prep exam from one of our clients.  Note that it misses lots of people here.  It’s also almost completely arbitrary about what it classifies as a concept.  It’s particularly odd that it classifies an adjective as a concept when the concept should be “clinical features”.  It misses “Pancreas” (twice) and “Blumgart’s Surgery” as referring to a person or organization and being a concept specializing “surgery”.  It also misses that various parts of this are references to works.  And it misses that 2010 and 2012 are references to years (events).

The following may be fine, but how do you use it? How do those modifiers of 23 and year-old relate to each other? Isn’t she the subject of the telling?  That’s enough for me.

The following is a little difficult to “parse”, especially the first row.  I don’t get the ‘that’ as the subject nor ‘her be’ as the object of several of these. (And I wonder, what is ‘it’ as a predicate?)

I expect activities to be the subject of ‘require’ and the object of ‘participates’.  I expect her to be the subject of ‘participates’ and (part of) the object of ‘requires’.

In the “you get what you pay for” category, here are some results from Amazon’s Comprehend service.  I like that it got the section as an entity and I don’t fault it for perceiving some negative sentiment here (except as follows).

To the extent such things (like Amazon) do well with the smallest noun phrases, they could also do some prepositional phrases, which would be nice, but they are far from understanding what phrases complement (i.e., other phrases or clauses).

I would be more impressed if it noted the more interesting phrases here, such as ‘collect or pay’ or ‘collect or pay a use tax’ or ‘duty to collect or pay a use tax’.  On the other hand, it takes a knowledgeable or very learned system to do that.

And I would be even more impressed if it recognized (with useful confidence) that it’s the company being relieved from a duty that is negated (without negative sentiment).

Ending Back Pain with AI

A nationwide physical therapy business is renowned for eliminating chronic pain.  One unique aspect of this business is that it eliminates pain not by manipulation but by providing clients with expertly selected sequences of exercises that address problems in their functional anatomy.  In effect, the business helps people fix themselves and teaches them to maintain their musculoskeletal function for a pain-free life.

The business has dozens of clinics with many more therapists.  Its expertise comes primarily from its founder and a number of long-time employees who have learned through experience.  We have been engaged to assist with several challenges on several occasions.

Combinatorial ambiguity? No problem!

Working on translating some legal documentation (sales and use tax laws and regulations) into compliance logic, we came across the following sentence (and many more that are even worse):

  • Any transfer of title or possession, exchange, or barter, conditional or otherwise, in any manner or by any means whatsoever, of tangible personal property for a consideration.

Natural language processing systems choke on sentences like this because of their combinatorial ambiguity and NLP’s typical lack of knowledge about what can be conjoined with, complement, or modify what.

This sentence has many thousands of possible parses.  They differ in the scope of each of the ‘or’s, in what is modified by conditional, otherwise, or whatsoever, and in what is complemented by in, by, of, and for.
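
To get a feel for how quickly such ambiguity multiplies, here is a minimal Python sketch.  It assumes a deliberately simplified model, not the grammar or tooling used in this work: if a sequence of phrases can be grouped by binary bracketing in any order, the number of groupings of n phrases is the Catalan number C(n-1).  The function names are hypothetical.  Real grammars rule out some bracketings and add other ambiguities, so the counts are only indicative.

```python
from math import comb


def catalan(n: int) -> int:
    """Return the n-th Catalan number, C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def binary_bracketings(num_phrases: int) -> int:
    """Count the ways to fully bracket a sequence of phrases into binary groups."""
    if num_phrases < 1:
        raise ValueError("need at least one phrase")
    return catalan(num_phrases - 1)


if __name__ == "__main__":
    # Growth of the count as more attachable phrases are added.
    for n in (2, 4, 6, 8, 10):
        print(n, binary_bracketings(n))
```

Under this simplification, ten attachable phrases already allow 4,862 bracketings, which is in line with the many thousands of parses noted above.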

The following shows 2 parses remaining after we veto a number of mistakes and confirm some phrases from the 400 highest ranking parses (a few right or left clicks of the mouse):

Simple, Fast, Effective, Active Learning

Recently, we “read” ten thousand recipes or so from a cooking web site.  The purpose of doing so was to produce a formal representation of those recipes for use in temporal reasoning by a robot.

Our task was to produce an ontology by reading the recipes, subject to conflicting goals.  On the one hand, the ontology was to be accurate so that the robot could reason, plan, and answer questions robustly.  On the other hand, the ontology was to be produced automatically (with minimal human effort).[1]

In order to minimize human effort while still obtaining deep parses from which we produce ontology, we used more techniques from statistical natural language processing than we typically do in knowledge acquisition for deep QA, compliance, or policy automation.  (Consider that NLP typically achieves less than 90% syntactic accuracy while such work demands near 100% semantic accuracy.)[2]
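
As a rough illustration of how active learning keeps human effort low, the sketch below uses uncertainty sampling: ask a person to label only the items the current tagger is least sure about.  This is a generic technique swapped in for illustration; it is not claimed to be the selection strategy used in this work.  The function names, the example words, and the probabilities are all made up.

```python
from math import log2


def entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def most_uncertain(predictions, k):
    """Return the k items whose predicted tag distributions have the highest entropy.

    `predictions` maps an item (here, a word) to a list of tag probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda item: entropy(predictions[item]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical probabilities over three tags (say NOUN, VERB, ADJ).
    predictions = {
        "whisk": [0.50, 0.45, 0.05],
        "butter": [0.70, 0.29, 0.01],
        "oven": [0.98, 0.01, 0.01],
    }
    # The two words most worth sending to a human annotator.
    print(most_uncertain(predictions, 2))
```

Words the tagger already handles confidently (“oven” here) are never shown to the annotator, which is where the savings in human effort come from.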

In the effort, we refined some prior work on representing words as vectors and semi-supervised learning.  In particular, we adapted semi-supervised, active learning similar to Stratos & Collins 2015 using enhancements to the canonical correlation analysis (CCA) of Dhillon et al 2015 to obtain accurate part of speech tagging, as conveyed in the following graphic from Stratos & Collins:

Of Kalman Filters and Hidden Markov Models

This provides some background relating to some work we did on part of speech tagging for a modest, domain-specific corpus.  The path is from Hsu et al 2012, which discusses spectral methods based on singular value decomposition (SVD) as a better method for learning hidden Markov models (HMM) and the use of word vectors instead of clustering to improve aspects of NLP, such as part of speech tagging.

The use of vector representations of words for machine learning in natural language is not all that new.  A quarter century ago, Brown et al 1992 introduced hierarchical agglomerative clustering of words based on mutual information and the use of those clusters within hidden Markov language models.  One notable difference versus today’s word vectors is that paths through the hierarchy of clusters to words at the leaves correspond to vectors of bits (i.e., Boolean features) rather than real-valued features.
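
The bit-vector view of such a hierarchy can be sketched in a few lines of Python.  The tree below is hand-built and hypothetical (it is not learned from data, and the clustering step itself is not shown); the function name is likewise an assumption.  Each word’s representation is simply the sequence of left (0) and right (1) branches taken from the root to its leaf.

```python
def bit_paths(tree, prefix=""):
    """Map each leaf word to the bit string of its path from the root.

    A tree is either a word (str) or a (left, right) tuple of subtrees;
    going left appends '0' and going right appends '1'.
    """
    if isinstance(tree, str):
        return {tree: prefix}
    left, right = tree
    paths = bit_paths(left, prefix + "0")
    paths.update(bit_paths(right, prefix + "1"))
    return paths


if __name__ == "__main__":
    # Hypothetical hierarchy, hand-built for illustration only.
    hierarchy = ((("apple", "pear"), "butter"), ("stir", "whisk"))
    for word, bits in bit_paths(hierarchy).items():
        print(word, bits)
```

Words that share a longer prefix of bits sit in the same smaller cluster, so truncating the bit strings yields coarser Boolean features.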

Artificially Intelligent Physical Therapy

Over the last 25 years we have developed two generations of AI systems for physical therapy.  The first was before the emergence of the Internet, when Windows was 16 bits.  There were no digital cameras, either.  So, physical therapists would take Polaroid pictures (anterior, left lateral, and right lateral) or simply eyeball a patient, then enter the patient’s posture into the computer using a diagram like the following:

Impressive result from Google

This is pretty impressive work by Google!

They are seeing the objective behind the query.  It’s pretty simple, in theory, to see the verb “read” operating on the object “string” with source (i.e., “from”) being consistent with an input stream (also handling the concatenated compound).

More impressive is that they have learned, from such queries and from the content that people view after such queries (and perhaps from deeper analysis), that character streams, scanners, and stream APIs are relevant.

And they have also narrowed my results based on how often I look at Java versus other implementation languages.

Robust Inference and Slacker Semantics

In preparing for some natural language generation[1], I came across some work on natural logic[2][3] and reasoning by textual entailment[4] (RTE) by Richard Bergmair in his PhD at Cambridge:

The work he describes overlaps our approach to robust inference from the deep, variable-precision semantics that result from linguistic analysis and disambiguation using the English Resource Grammar (ERG) and the Linguist™.

It’s hard to reckon nice English

The title is in tribute to Raj Reddy’s classic talk about how it’s hard to wreck a nice beach.

I came across interesting work on higher order and semantic dependency parsing today:

So I gave the software a try for the sentence I discussed in this post.  The results discussed below were somewhat disappointing but not unexpected.  So I tried a well-known parser with similar results (also shown below).

There is no surprise here.  Both parsers are marvels of machine learning and natural language processing technology.  It’s just that understanding is far beyond the ken of even the best NLP.  This may be obvious to some, but many are surprised given all the hype about Google and Watson and artificial intelligence or “deep learning” recently.

Properly disambiguating a sentence using the Linguist™

Consider the following disambiguation result from a user of Automata’s Linguist™.
