A decade or so ago, we were debating how to educate Paul Allen’s artificial intelligence in a meeting at Vulcan headquarters in Seattle with researchers from IBM, Cycorp, SRI, and other places.
We were talking about how to “engineer knowledge” from textbooks into formal systems like Cyc or Vulcan’s SILK inference engine (which we were developing at the time). Although some progress had been made in prior years, the burden of acquiring knowledge using SRI’s Aura remained too high, and the reasoning capabilities that resulted from Aura, which targeted the University of Texas’ Knowledge Machine, were too limited to achieve Paul’s objective of a Digital Aristotle. Unfortunately, this failure ultimately led to the end of Project Halo and the beginning of the Aristo project under Oren Etzioni’s leadership at the Allen Institute for Artificial Intelligence.
At that meeting, I brought up the idea of simply translating English into logic, as my former company’s product, Authorete, did. (We had renamed it before Haley Systems was acquired by Oracle, prior to the meeting.)
This US News & World Report opinion is on the right track about the macro trend towards increasingly technology-enabled education:
But it also sounds like what I heard during the dot-com boom of the 1990s when a lot of companies—including Blackboard—began using technology to “disrupt” the education status quo. Since then we’ve made some important progress, but in many ways the classroom still looks the same as it did 100 years ago. So what’s different this time? Is all the talk just hype? Or are we really starting to see the beginnings of major change? I believe we are.
The comments about active learning are particularly on-target. Delivering a textbook electronically or a course online is hardly the point. The action will soon be in textbooks and courses that understand their subject matter well enough to ask appropriate questions, explain the answers, assess the learner’s comprehension, guide them through the subject matter, and accommodate their learning style dynamically. This is not at all far-fetched or years off. Look at Watson and some of these links to see how imminent such educational technology could be!
- Award-winning video of Inquire: An Intelligent Textbook
- Presentation of Vulcan’s Digital Aristotle (PDF slides, streaming recording)
- article on Vulcan’s Digital Aristotle, Aura, Inquire, and Campbell’s Biology (PDF)
We’ve been working for several years on applications of artificial intelligence in education, as in Project Sherlock and this presentation. Please get in touch if you’re interested in advancing education along such lines.
Recently, John Sowa has commented on LinkedIn, and in correspondence with some of us at Coherent Knowledge Systems, on the old distinction due to Schank between the Neats and the Scruffies. The Neats want nice formal logics as the basis of artificial intelligence. This includes anyone who prefers classical logic (e.g., Common Logic, RIF-BLD, or SBVR) or standard ontologies (e.g., OWL-DL) for representing knowledge and reasoning with it. The Scruffies may use well-defined technology, but are not constrained by it. They’ll do whatever they think works now, whether or not it is a good long-term solution and despite its shortcomings, as long as it achieves immediate objectives.
Watson is scruffy. It doesn’t try to understand or formally represent knowledge. It combines a lot of effective technologies into an evidentiary framework that allows it to effectively “guess”.
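To make the “evidentiary framework” idea concrete, here is a minimal, hypothetical sketch of how independent scorers’ evidence can be combined into a confidence for each candidate answer. The scorer names, weights, and numbers are all made up for illustration; DeepQA trained such weights over hundreds of scorers, and nothing here is IBM’s actual model.

```python
import math

# Hypothetical evidence scores for candidate answers to one question.
# Each scorer (e.g., passage search, answer-type matching, popularity)
# votes independently; none "understands" the question formally.
candidates = {
    "Toronto": {"passage": 0.42, "type_match": 0.10, "popularity": 0.80},
    "Chicago": {"passage": 0.55, "type_match": 0.90, "popularity": 0.60},
}

# Illustrative weights (in practice these would be learned from
# question/answer training data, not hand-set as here).
weights = {"passage": 2.0, "type_match": 3.0, "popularity": 0.5}
bias = -2.5

def confidence(features):
    """Combine independent evidence into one confidence score
    via a logistic (sigmoid) over a weighted sum of features."""
    z = bias + sum(weights[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Rank candidates by combined confidence; the system "guesses"
# the top-ranked answer if its confidence clears a threshold.
ranked = sorted(candidates, key=lambda c: confidence(candidates[c]),
                reverse=True)
```

The point of the sketch is that the ranking emerges from aggregated statistical evidence rather than from any formal representation of what the question means.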
Today, in response to continued discussion in the Natural Language Processing group on LinkedIn under the topic “This is Watson”, I’m posting the following presentation on Project Sherlock and the Linguist vs. Google and IBM.
Essentially, the neat approach is more viable today than ever. So, chalk one up for the Neats, including Dr. Sowa and Menno Mafait’s comment in that discussion.
During a presentation at CMU after winning the game show, IBM admitted that in order to get the last leg of improvement needed to win Jeopardy!, they needed to do some “neat” ontological knowledge acquisition, too!
At the SemTech conference last week, a few companies asked me how to respond to IBM’s Watson, given my involvement with rapid knowledge acquisition for deep question answering at Vulcan. My answer varies with whether there is any subject-matter focus, but essentially involves extending their approach with deeper knowledge and more emphasis on logical in addition to textual entailment.
Today, in a discussion on the LinkedIn NLP group, there was some interest in finding more technical details about Watson. A year ago, IBM published the most technical details to date about Watson in the IBM Journal of Research and Development. Most of those journal articles are available for free on the web. For convenience, here are my bookmarks to them.
In the spring of 2012, Vulcan engaged Automata for a knowledge acquisition (KA) experiment. This article provides background on the context of that experiment and what the results portend for artificial intelligence applications, especially in education. Vulcan presented some of the award-winning work referenced here at an AI conference, including a demonstration of the electronic textbook discussed below. There is a video of that presentation here. The introductory remarks are interesting but not pertinent to this article.
Background on Vulcan’s Project Halo
From 2002 to 2004, Vulcan developed a Halo Pilot that could correctly answer between 30% and 50% of the questions on advanced placement (AP) tests in chemistry. The approach relied on sophisticated formal knowledge representation and expert knowledge engineering. Of three teams, Cycorp fared the worst and SRI the best in this competition. SRI’s system performed at the level of a 3 on the AP, which corresponds to earning course credit at many universities. The consensus view at that time was that achieving a score of 4 on the AP was feasible with limited additional effort. However, the cost per page for this level of performance was roughly $10,000, which needed to be reduced significantly before Vulcan’s objective of a Digital Aristotle could be considered viable.
Benjamin Grosof and I will be presenting the following review of recent work at Vulcan towards Digital Aristotle as part of Project Halo at SemTechBiz in San Francisco the first week of June.
Acquiring deep knowledge from text
We show how users can rapidly specify large bodies of deep logical knowledge starting from practically unconstrained natural language text.
English sentences are semi-automatically interpreted into predicate calculus formulas and logic programs in SILK, an expressive knowledge representation (KR) and reasoning system. SILK tolerates the practically inevitable logical inconsistencies that arise in large knowledge bases acquired from and maintained by distributed users with varying linguistic and semantic skills, who collaboratively disambiguate grammar, logical quantification and scope, co-references, and word senses.
The resulting logic is generated as Rulelog, a draft standard under the W3C Rule Interchange Format’s Framework for Logical Dialects. It relies on SILK’s support for FOL-like formulas, polynomial-time inference, and exceptions to answer questions such as those found on advanced placement exams.
We present a case study in understanding cell biology based on a first-year college level textbook.
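To illustrate the kind of interpretation involved, consider a hypothetical biology sentence such as “Every eukaryotic cell has a nucleus.” A predicate-calculus reading along the following lines is the sort of target formula such a pipeline might produce (the predicate names here are invented for the example, not SILK’s or Rulelog’s actual vocabulary):

```latex
% Hypothetical interpretation of:
% "Every eukaryotic cell has a nucleus."
\forall x\, \bigl( \mathit{eukaryoticCell}(x) \rightarrow
  \exists y\, ( \mathit{nucleus}(y) \wedge \mathit{has}(x, y) ) \bigr)
```

Even for a sentence this simple, a user must confirm the quantifier scoping (a possibly different nucleus for each cell) and the intended sense of “has,” which is exactly the collaborative disambiguation described above.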