Commercial Intelligence

logic

Dictionary Knowledge Acquisition

The following is motivated by Section 6359 of the California Sales and Use Tax.  It demonstrates how knowledge can be acquired from dictionary definitions:

Here, we’ve taken a definition from WordNet and prefixed it with the word followed by a colon and parsed it using the Linguist.

(more…)

‘believed by many’

A Linguist user recently had a question about part of a sentence that boiled down to something like the following:

  • It is believed by many.

The question was whether “many” was an adjective, cardinality, or noun in this sentence.  It’s a reasonable question!

(more…)

Higher Education on a Flatter Earth

We’re collaborating on some educational work and came across this sentence in a textbook on finance and accounting:

  • All of these are potentially good economic decisions.

We use statistical NLP but assist with the ambiguities.  In doing this, we relate questions and answers and explanations to the text.

We also extract the terminology and produce a rich lexicalized ontology of the subject matter for pedagogical uses, assessment, and adaptive learning.

Here’s one that just struck me as interesting.  This is a case where the choice looks like it won’t matter much either way, but …

(more…)

Acquiring Rich Logical Knowledge from Text (Semantic Technology 2013)

As noted in prior posts about Project Sherlock, we have acquired knowledge from a biology textbook to build the business case for applications like Inquire.  We reported our results at SemTech recently.  The slides are available here.

Logic acquired from English sentences in Campbell's college biology textbook published by Pearson

Project Sherlock

Working as part of Vulcan’s Project Halo[1], Automata is applying a natural language understanding system that translates carefully formulated sentences into formal logic so as to answer questions that typically require deeper knowledge and inference than demonstrated by Watson.

The objective over the next three quarters is to acquire enough knowledge from the 9th edition of Campbell’s Biology textbook to demonstrate three things.

  • First, that the resulting system answers, for example, biology advanced placement (AP) exam questions more competently than existing systems (e.g., Aura[2] or Inquire[3]).
  • Second, that knowledge from certain parts of the textbook is effectively translated from English into formal knowledge with sufficient breadth and depth of coverage and semantics.
  • Third, that the knowledge acquisition process proves efficacious and accessible to less than highly skilled knowledge engineers so as to accelerate knowledge acquisition beyond 2012.

Included in the second of these is a substantial ontology of background knowledge expected of students in order to comprehend the selected parts of the textbook using a combination of OWL, logic, and English sentences from sources other than the textbook.

Automata is hiring logicians, linguists, and biologists to work as consultants, contractors, or employees for:

  • Interactive tree-banking and word-sense disambiguation of several thousand sentences.[4]
  • Extending its lexical ontology and a broad-coverage grammar of English with additional vocabulary and deeper semantics, especially concerning cellular biology and related scientific knowledge including chemistry, physics, and math.
  • Maturing its upper and middle ontology of domain-independent knowledge using OWL in combination with various other technologies, including description logic, first-order logic, higher-order logic, modal logic, and defeasible logic.[5]
  • Enhancing its platform for text-driven knowledge engineering towards a collaborative wiki-like architecture for self-aware content in scientific education and biomedical applications.

Terms of engagement are flexible, ranging from small units of work to full-time employment.  We are based in Pittsburgh, Pennsylvania and Vulcan is headquartered in Seattle, Washington, but the team is distributed across the country and overseas.

Please contact Paul Haley by e-mail to his first name at this domain.


[1] Vulcan: http://www.vulcan.com/TemplateCompany.aspx?contentId=54; Project Halo: http://www.projecthalo.com/
Video introduction/overview: http://videolectures.net/aaai2011_gunning_halobook/

[2] Aura: http://www.ai.sri.com/project/aura

[3] Inquire: http://www.franz.com/success/customer_apps/artificial_intelligence/aura.lhtml

[4] tree-banking and WSD: http://www.omg.org/spec/SBVR and http://en.wikipedia.org/wiki/Word-sense_disambiguation

[5] e.g., SILK (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.174.1796) and SBVR (http://www.omg.org/spec/SBVR)

Rules are not enough. Knowledge is core to reuse.

James Taylor’s blog post today on rules being core to BPM and SOA, in which he discussed reuse, had a particularly strong impact on me following a trip yesterday.  During a meeting with the insurance and retail banking practice leaders at a large consulting firm, we looked for synergies between applications related to investment and applications related to risk.  Of course, during that conversation, we discussed whether operational rules could be usefully shared across these currently siloed areas, but we ended up discussing what they had in common in terms of business concepts, definitions, and fundamental truths or enterprise-wide governance.  It was clear to us that this was the most fruitful area in which to develop core, reusable knowledge assets.

In his post, James agrees with the Butler Group’s statement:

Possibly the most important aspect of a rules repository, certainly in respect of the stated promise of BPM, Service Oriented Architecture (SOA), and BRMS, is the ability for the developer to re-use rules within multiple process deployments.

I have several problems with this statement: (more…)

Missing Goals and Requirements in Business Rules

Both of the following statements are true, but the first is more informative:

  1. Business Rules Management Systems (BRMS) typically produce forward chaining production rules that are interpreted by[1] a business rules engine (BRE) based on the Rete Algorithm.
  2. BRMS typically generate rules that are interpreted by a BRE.

First, dropping the word “production” before “rules” loses information. BRMS do not typically generate rules that are not production rules. Consider, for example, that the BRMS vendors involved in the OMG effort produced the Production Rule Representation (PRR) standard. The obvious question is:

  • What is different about production rules?

Second, dropping the words “based on the Rete Algorithm” loses information. The dominant rules vendors and open-source engines are all based on the Rete Algorithm.

  • Why does the Rete Algorithm matter?

Third, dropping the word “chaining” before “rules” loses information. Chaining refers to the sequential application of rules, as in a chain where each link is the application of one rule and links are tied together by their interaction. But:

  • Why does chaining matter?

Fourth, dropping the word “forward” before “chaining” loses information. Forward chaining reacts to information without requiring goals. This raises the question:

  • Don’t goals matter?
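
The chaining described above can be sketched in a few lines of Python. This is a naive illustration only, not how a Rete-based engine works internally: it re-tests every rule on every cycle, which is the repeated matching work that Rete is designed to avoid. The rule names and facts are hypothetical, chosen just to show one rule's conclusion enabling another rule. Note that no goal is ever supplied; the loop simply reacts to whatever facts are present.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str              # hypothetical rule name, for tracing only
    conditions: frozenset  # facts that must all be present for the rule to fire
    conclusion: str        # fact asserted when the rule fires


def forward_chain(facts, rules):
    """Fire rules until no rule adds a new fact.

    Returns the final set of facts and the order in which rules fired.
    """
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # A rule fires when all its conditions hold and it would add something new.
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                fired.append(rule.name)
                changed = True
    return known, fired


# Hypothetical rules: the second rule's conclusion is the first rule's condition,
# so the two rules form a chain of two links.
RULES = [
    Rule("flag-for-review", frozenset({"high-risk"}), "needs-review"),
    Rule("derive-high-risk", frozenset({"young-driver", "sports-car"}), "high-risk"),
]

final_facts, firing_order = forward_chain({"young-driver", "sports-car"}, RULES)
```

Here “derive-high-risk” fires first and “flag-for-review” fires second, even though the rules are listed in the opposite order: the order of application emerges from the data rather than from the order in which the rules were written.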

(more…)

Managing Semantics, Vocabulary and Business Rules as Knowledge

A client recently asked me for guidance in establishing a center of excellence concerning business rules within their organization. Their objectives included:

  1. Accumulate requisite skills for productive success.
  2. Establish methodologies for productive, reliable and repeatable success.
  3. Accumulate and reuse content (e.g., definitions, requirements, regulations, and policies) across implementations, departments or divisions.
  4. Establish multiple tutorial and reusable reference implementations, including application development, tooling, and integration aspects.
  5. Establish centralized or transferable infrastructure, including architectural aspects, tools and repositories that reflect and support established methodologies, reusable content, and reference implementations.
  6. Establish criteria, best practices and rationale for various administrative matters, especially change management concerning the life cycles of content (e.g., regulations or policies) and applications (e.g., releases and patches).

I was quickly surprised to find myself struggling to write down recommendations for the skill set required to seed the core staff.  My recommendations were less technical than the client may have expected.  After further consideration, it became clear that any discrepancy in expectations arose from differences in our unvoiced strategic assumptions.  Objectives, such as those listed above, are no substitute for a clearly articulated mission and strategy.

(more…)