Project Sherlock

Working as part of Vulcan’s Project Halo[1], Automata is applying a natural language understanding system that translates carefully formulated sentences into formal logic so as to answer questions that typically require deeper knowledge and inference than demonstrated by Watson.

The objective over the next three quarters is to acquire enough knowledge from the 9th edition of Campbell’s Biology textbook to demonstrate three things.

  • First, that the resulting system answers, for example, biology advanced placement (AP) exam questions more competently than existing systems (e.g., Aura[2] or Inquire[3]).
  • Second, that knowledge from certain parts of the textbook is effectively translated from English into formal knowledge with sufficient breadth and depth of coverage and semantics.
  • Third, that the knowledge acquisition process proves efficacious and accessible to less than highly skilled knowledge engineers so as to accelerate knowledge acquisition beyond 2012.

Included in the second of these is a substantial ontology of background knowledge expected of students in order to comprehend the selected parts of the textbook using a combination of OWL, logic, and English sentences from sources other than the textbook.

Automata is hiring logicians, linguists, and biologists to work as consultants, contracts, or employees for:

  • Interactive tree-banking and word-sense disambiguation of several thousand sentences.[4]
  • Extending its lexical ontology and a broad-coverage grammar of English with additional vocabulary and deeper semantics, especially concerning cellular biology and related scientific knowledge including chemistry, physics, and math.
  • Maturing its upper and middle ontology of domain independent knowledge using OWL in combination with various other technologies, including description logic, first-order logic, high-order logic, modal logic, and defeasible logic.[5]
  • Enhancing its platform for text-driven knowledge engineering towards a collaborative wiki-like architecture for self-aware content in scientific education and biomedical applications.

Terms of engagement are flexible; ranging from small units of work to full-time employment.  We are based in Pittsburgh, Pennsylvania and Vulcan is headquartered in Seattle, Washington, but the team is distributed across the country and overseas.

Please contact Paul Haley by e-mail to his first name at this domain.


[1] Vulcan: http://www.vulcan.com/TemplateCompany.aspx?contentId=54; Project Halo: http://www.projecthalo.com/
Video introduction/overview :http://videolectures.net/aaai2011_gunning_halobook/

[2] Aura: http://www.ai.sri.com/project/aura

[3] Inquire: http://www.franz.com/success/customer_apps/artificial_intelligence/aura.lhtml

[4] tree-banking and WSD: http://www.omg.org/spec/SBVR and http://en.wikipedia.org/wiki/Word-sense_disambiguation

[5] e.g., SILK (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.1796 ) and SBVR (http://www.omg.org/spec/SBVR)