Smart Machines And What They Can Still Learn From People – Gary Marcus

This is a must-watch video from the Allen Institute for AI for anyone seriously interested in artificial intelligence.  It’s 70 minutes long, but worth it.  Some of the highlights from my perspective are:

  • 27:27, where the key reason that deep learning approaches fail at understanding language is discussed
  • 31:30, where the inability of inductive approaches to address logical quantification or variables is discussed
  • 39:30 through 42+, where the inability of deep learning to perform as well as Watson, and the inability of Watson to understand or reason, are discussed

The astute viewer and blog reader will recognize this slide as discussed by Oren Etzioni here.

Electronically enhanced learning

We are working on educational technology; that is, technology to assist in education.  More specifically, we are developing software that helps people learn.  There are many types of such software, but we are most immediately focused on two:

  1. adaptive educational technology for personalized learning
  2. cognitive tutors

The term “adaptive” has various interpretations with regard to educational technology.  Most commonly, it refers to educational technology that adapts to individuals in any of various ways, which makes it a form of personalized learning.  Personalized learning is often considered the more general term, since it also includes human tutors who adapt how they engage with and educate learners.  In the context of educational technology, however, these senses of adaptive and personalized learning are synonymous. Continue reading “Electronically enhanced learning”
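
The sense of adaptation in the first type can be made concrete with a small sketch.  The following is a minimal illustration, not our implementation; the item fields, the mastery model, and the selection policy are all hypothetical simplifications.

```python
from dataclasses import dataclass

@dataclass
class Item:
    prompt: str
    skill: str         # the skill the item exercises
    difficulty: float  # 0.0 (easy) through 1.0 (hard)

def next_item(items, mastery):
    """Adapt to the learner: pick the item whose difficulty best
    matches the current mastery estimate for its skill."""
    return min(items, key=lambda i: abs(i.difficulty - mastery.get(i.skill, 0.5)))

def update_mastery(mastery, item, correct, rate=0.2):
    """Nudge the mastery estimate toward 1 on a correct answer, toward 0 otherwise."""
    m = mastery.get(item.skill, 0.5)
    mastery[item.skill] = m + rate * ((1.0 if correct else 0.0) - m)

# Each answer shifts the estimate, which shifts what is asked next.
items = [Item("2 + 2 = ?", "arithmetic", 0.1), Item("17 * 23 = ?", "arithmetic", 0.7)]
mastery = {}
item = next_item(items, mastery)
update_mastery(mastery, item, correct=True)
print(mastery)  # -> {'arithmetic': 0.6}
```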

Deep Parsing vs. Deep Learning

For those of us who enjoy the intersection of machine learning and natural language, including “deep learning”, which is all the rage, here is an interesting paper on generalizing vector space models of words to the broader semantics of English, by Jayant Krishnamurthy, a PhD student of Tom Mitchell at Carnegie Mellon University:

Essentially, the paper demonstrates how the features of high-precision lexicalized grammars allow machines to learn the compositional semantics of English.  More specifically, it demonstrates learning of compositional semantics beyond the capabilities of recurrent neural networks (RNNs).  In summary, the paper suggests that deep parsing is better than deep learning for understanding the meaning of natural language.
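
As a toy illustration of what “compositional” means here (my example, far simpler than the paper’s model), a lexicalized grammar pairs each word with a semantic function, so the logical form of a sentence is computed by composing the meanings of its parts:

```python
# Toy compositional semantics in the spirit of a lexicalized grammar:
# each lexical entry pairs a word with a function over meanings, and a
# sentence's logical form is built by function application.  (Purely
# illustrative -- real systems derive such entries from a wide-coverage
# grammar rather than a hand-written lexicon.)

lexicon = {
    "John":  lambda: "john",
    "Mary":  lambda: "mary",
    "likes": lambda subj, obj: f"likes({subj},{obj})",
}

def interpret(subject, verb, obj):
    """Compose a logical form for a simple subject-verb-object sentence."""
    return lexicon[verb](lexicon[subject](), lexicon[obj]())

print(interpret("John", "likes", "Mary"))  # -> likes(john,mary)
```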

For more information and a different perspective, I recommend the following paper, too:

Note that the authors use Combinatory Categorial Grammar (CCG) while our work uses Head-Driven Phrase Structure Grammar (HPSG), but this is a minor distinction.  For example, compare the logical forms in the Groningen Meaning Bank with the logic produced by the Linguist.  The former uses CCG to produce lambda calculus while the latter uses HPSG to produce predicate calculus (ignoring vagaries of under-specified representations, which are useful for hypothetical reasoning and textual entailment).
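
For instance (an illustrative example of mine, drawn from neither resource), a CCG derivation of “every student read a book” composes lambda terms that beta-reduce to essentially the same first-order formula an HPSG pipeline produces as predicate calculus:

```latex
% CCG style: the determiner contributes a lambda term ...
\mathrm{every} = \lambda P.\,\lambda Q.\,\forall x\,(P(x) \rightarrow Q(x))

% ... and composing the whole sentence, then beta-reducing, yields the
% predicate-calculus formula an HPSG pipeline produces directly:
\forall x\,\bigl(\mathit{student}(x) \rightarrow \exists y\,(\mathit{book}(y) \land \mathit{read}(x,y))\bigr)
```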

IBM Watson in medical education

IBM recently posted this video, which suggests the relevance of Watson’s capabilities to medical education. The demo uses cases like those that occur on the USMLE exam and Watson’s ability to perform evidentiary reasoning over large bodies of text. The “reasoning paths” followed by Watson in presenting explanations or decision-support material use a nice, increasingly popular graphical metaphor.

One intriguing statement in the video concerns Watson “asking itself questions” during the reasoning process. It would be nice to know more about where Watson gets its knowledge about the domain, other than from statistics alone. As I’ve written previously, IBM openly admits that it avoided explicit knowledge in its approach to Jeopardy!

The demo does a particularly nice job with questions in which it is given candidate answers (e.g., multiple-choice questions). I am most impressed, however, with its response on the case beginning 3 minutes into the video.

Natural Language Leadership at the Allen Institute for Artificial Intelligence (AI2)

Oren Etzioni is a marvelous choice to lead the Allen Institute for AI (aka AI2).  The NL/ML path is the right path for scaling up the deep knowledge that Paul Allen’s vision of a Digital Aristotle requires.  You can read more about it below, and here’s more background on the change in direction and on some evidence that the path holds great promise.

Going beyond Siri and Watson: Microsoft co-founder Paul Allen taps Oren Etzioni to lead new Artificial Intelligence Institute

Semantic Technology & Business Conference (SemTechBiz)

Benjamin Grosof and I will be presenting the following review of recent work at Vulcan towards Digital Aristotle as part of Project Halo at SemTechBiz in San Francisco the first week of June.

Acquiring deep knowledge from text

We show how users can rapidly specify large bodies of deep logical knowledge starting from practically unconstrained natural language text.

English sentences are semi-automatically interpreted into predicate calculus formulas and into logic programs in SILK, an expressive knowledge representation (KR) and reasoning system.  SILK tolerates the practically inevitable logical inconsistencies that arise in large knowledge bases acquired from, and maintained by, distributed users of varying linguistic and semantic skill, who collaboratively disambiguate grammar, logical quantification and scope, co-references, and word senses.

The resulting logic is generated as Rulelog, a draft standard under the W3C Rule Interchange Format’s Framework for Logical Dialects, and relies on SILK’s support for FOL-like formulas, polynomial-time inference, and exceptions to answer questions such as those found in advanced placement exams.

We present a case study in understanding cell biology based on a first-year college level textbook.
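
As a purely illustrative sketch (my rendering in first-order notation, not the concrete SILK/Rulelog syntax), the FOL-like formulas with exceptions mentioned above might capture textbook knowledge like this:

```latex
% Default acquired from the textbook: every cell has a nucleus.
\forall x\,\bigl(\mathit{cell}(x) \land \neg\mathit{abnormal}(x)
    \rightarrow \exists y\,(\mathit{nucleus}(y) \land \mathit{has}(x,y))\bigr)

% Exception, also from the text, defeating the default for red blood cells.
\forall x\,\bigl(\mathit{redBloodCell}(x) \rightarrow \mathit{abnormal}(x)\bigr)
```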

Super Crunchers: predictive analytics is not enough

Ian Ayres, the author of Super Crunchers, gave a keynote at Fair Isaac’s Interact conference in San Francisco this morning.  He made a number of interesting points related to his thesis that intuitive decision making is doomed.  I found his points on randomized trials much more interesting, however.

In one of his examples on “The End of Intuition”, a computer program using six variables did a better job of predicting Supreme Court decisions than a team of experts.  He focused on the fact that the program “discovered” that one justice would most likely vote against an appeal if it was labeled a liberal decision.  By “discovered” we mean that a decision tree for this justice’s vote had, as its top-level decision, whether the case was labeled liberal, in which case the program had no further concern for any other information.  Continue reading “Super Crunchers: predictive analytics is not enough”
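
A hypothetical rendering of such a tree (the feature names and the lower branches are invented for illustration) makes the point concrete: once the top-level test fires, nothing else is consulted.

```python
def predict_justice_vote(case):
    """Toy decision tree mirroring the structure described above.
    The features and branches are hypothetical, not the study's."""
    # Top-level split: a case labeled a liberal decision yields a
    # predicted vote against the appeal -- no other information matters.
    if case["labeled_liberal"]:
        return "against appeal"
    # Only non-liberal cases ever reach the lower branches.
    if case["circuit_court"] in {"2nd", "9th"}:
        return "for appeal"
    return "against appeal"

print(predict_justice_vote({"labeled_liberal": True, "circuit_court": "5th"}))
# -> against appeal, regardless of any other variable
```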

Adaptive Decision Management

[Image: HR dashboard]

In this article I hope you will learn about the future of predictive analytics in decision management, and how tighter integration between rules and learning is being developed that will adaptively improve diagnostic capabilities, especially in maximizing profitability and in detecting adversarial conduct such as fraud, money laundering, and terrorism.

Business Intelligence

Visualizing business performance is obviously important, but improving business performance is even more important.  A good view of operations, such as this nice dashboard[1], helps management see the forest (and, with good drill-down, some interesting trees). 

With good visualization, management can gain insights into how to improve business processes, but if the view does not include a focus on outcomes, improvement in operational decision making will be relatively slow in coming.

Whether or not you use business intelligence software to produce your reports or present your dashboards, you can improve your operational decision management by applying statistics and other predictive analytic techniques to discover hidden correlations between what you know before a decision and what you learn afterwards, improving your decision making over time.  Continue reading “Adaptive Decision Management”
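
As a minimal sketch of that feedback loop (the features, outcomes, and data below are invented for illustration), one can tally how decisions made under known conditions turned out, and let those tallies inform the next decision:

```python
from collections import defaultdict

# Invented history: what was known before each decision, paired with
# the outcome learned afterwards.
history = [
    ({"segment": "new_customer", "channel": "web"},   "profitable"),
    ({"segment": "new_customer", "channel": "phone"}, "unprofitable"),
    ({"segment": "repeat",       "channel": "web"},   "profitable"),
]

# Tally outcomes per observed feature value -- the "hidden correlations"
# between pre-decision knowledge and post-decision results.
counts = defaultdict(lambda: defaultdict(int))
for features, outcome in history:
    for key, value in features.items():
        counts[(key, value)][outcome] += 1

def profitable_share(key, value):
    """Share of past decisions with this feature that proved profitable."""
    c = counts[(key, value)]
    return c["profitable"] / (sum(c.values()) or 1)

print(profitable_share("channel", "web"))  # -> 1.0 in this toy history
```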