Sentence Patterns

(This is a copy of my blog post on the WordPress OpenCog Brainwave blog on 8 September 2009 -- Linas Vepstas.)

I’ve recently resumed work on the question-answering chatbot, and am trying to get it to comprehend a broader range of questions and statements. The “big idea” is to create a number of “sentence patterns” that the pattern matcher can recognize and respond to. This is a “big” idea because I am trying to avoid anything algorithmic or procedural: everything is to be done by specifying OpenCog hypergraphs, and NOT by writing C++ code, or scheme code (or python code, etc.). The reason for working entirely with patterns and hypergraphs, rather than with C++ or scheme, is that this puts the “knowledge” of the system into a form that AI routines can manipulate: learning algos can learn new hypergraphs; statistical algos can gather usage information on which hypergraphs get triggered, and so on. This is all easier said than done: although I’ve eliminated a fair amount of question-answering code previously written in C++, I’ve also had to write some new scheme code. Bummer. :-(

Pattern matching is now used throughout the OpenCog NLP pipeline, although not in a unified manner. The Link Grammar parser uses patterns (called “disjuncts”) to determine how the words in a sentence can link to one another, thus “parsing”, or pulling the grammatical structure out of a sentence (this paper provides an excellent overview). The RelEx dependency relation extractor applies patterns to the Link Grammar output to extract syntactic relations. For example, the sentence “John threw a rock” becomes

_obj(throw, rock)
_subj(throw, John)

after RelEx gets done with it. And now, there are a dozen patterns inside of OpenCog that can pick out certain kinds of questions and statements from RelEx output, and pattern-match questions to find answers to them.

For example, the new OpenCog patterns convert “The capital of France is Paris” into

capital_of(France, Paris)

and similarly, “What is the capital of France?” into

capital_of(France, what)

Treating “what” as a variable, there is yet another pattern that matches up the form of the question to the form of the answer, thus deducing that “what” must be “Paris”.
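The matching step just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the OpenCog pattern matcher: the tuple representation, the `VARIABLES` set and the `match` function are all hypothetical names invented for this example, and the second fact about Germany is made-up sample data.

```python
# Hypothetical representation: a relation is a tuple (name, arg1, arg2).
# Words listed in VARIABLES are treated as variables; all others are constants.
VARIABLES = {"what", "who"}

def match(query, fact):
    """Return the variable bindings if `fact` answers `query`, else None."""
    if len(query) != len(fact):
        return None
    bindings = {}
    for q, f in zip(query, fact):
        if q in VARIABLES:
            # A variable that appears twice must bind to the same value.
            if bindings.setdefault(q, f) != f:
                return None
        elif q != f:
            # Constants must agree exactly.
            return None
    return bindings

# Sample data; the Germany entry is an assumption added for contrast.
facts = [
    ("capital_of", "France", "Paris"),
    ("capital_of", "Germany", "Berlin"),
]
query = ("capital_of", "France", "what")

answers = [b for b in (match(query, f) for f in facts) if b is not None]
print(answers)  # [{'what': 'Paris'}]
```

In the real system both the question and the stored statements are hypergraphs rather than flat tuples, but the principle is the same: constants must line up, and whatever sits in the variable's position is the answer.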

Somewhat harder is using patterns to distinguish similar from dissimilar concepts, so that sentences like “John threw a green ball” aren’t used as answers to questions such as “Did John throw a red ball?” The word “ball” with modifier “green” has to be detected as a different entity than the word “ball” with modifier “red” (such entities are called “semes” in the code). In fact, out of laziness, I’ve punted on this one: the promotion of word-instances to “semes” is done by code, rather than by pattern matching. But soon, I hope, this will change. In the meanwhile, the README file provides a more detailed discussion.
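To make the “ball with modifier” idea concrete, here is a rough Python sketch. It is not how the OpenCog code promotes word-instances to semes; the `Seme` class, the `satisfies` rule (a question matches when its modifiers are a subset of the statement's modifiers) and `who_threw` are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Seme:
    """Hypothetical stand-in for a seme: a head word plus its modifiers."""
    head: str
    modifiers: frozenset = frozenset()

def satisfies(candidate, asked):
    # Assumed rule: same head word, and every modifier mentioned in the
    # question is also present on the candidate entity.
    return candidate.head == asked.head and asked.modifiers <= candidate.modifiers

# Who threw what, as (thrower, thing thrown) pairs.
throws = [
    ("John", Seme("ball", frozenset({"green"}))),
    ("Fred", Seme("ball", frozenset({"red"}))),
    ("Mary", Seme("rock", frozenset({"blue"}))),
]

def who_threw(asked):
    return sorted(who for who, thing in throws if satisfies(thing, asked))

print(who_threw(Seme("ball")))                      # ['Fred', 'John']
print(who_threw(Seme("ball", frozenset({"red"}))))  # ['Fred']
```

Under this rule a plain “ball” in a question is satisfied by any ball, while a “red ball” is satisfied only by an entity that actually carries the modifier “red”.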

Here are some patterns that work these days:

<me> John threw a green ball.
<me> Fred threw a red ball
<me> Mary threw a blue rock
<me> who threw a ball?
<cogita-bot> Syntax pattern match found: Fred John
<me> who threw a red ball?
<cogita-bot> Syntax pattern match found: Fred

<me> Did Fred throw a ball?
<cogita-bot> Truth query determined “yes”: throw

<me> Did Fred throw a red ball?
<cogita-bot> Truth query determined “yes”: throw

<me> The color of the book is red.
<me> What is the color of the book?
<cogita-bot> Triples abstraction found: red

<me> the cat sat on the mat
<me> what did the cat sit on?
<cogita-bot> Triples abstraction found: mat

And here are some that don’t yet work. “Did Fred throw a green ball?” gets no reply, because the system can’t find an answer, and doesn’t make the common-sense leap of “can’t find an answer, so the answer must be no”. Another common-sense problem is illustrated by “Did Fred throw a round ball?”: the system doesn’t know that balls are round, and simply assumes that a “round ball” is some special kind of “ball”. Oh well. There’s work to be done.
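The missing “can’t find an answer, so the answer must be no” leap is often called a closed-world assumption. A toy Python sketch of the difference, using a made-up flat fact table and a hypothetical `truth_query` function (neither is part of the chatbot):

```python
# Made-up fact table: (verb, subject, object, modifier of the object).
facts = {
    ("throw", "John", "ball", "green"),
    ("throw", "Fred", "ball", "red"),
}

def truth_query(fact, closed_world=False):
    """Answer "yes" if the fact is known.

    With closed_world=False (the chatbot's current behavior as described
    above), an unknown fact yields None, i.e. no reply. With
    closed_world=True, anything not known is assumed to be false.
    """
    if fact in facts:
        return "yes"
    return "no" if closed_world else None

print(truth_query(("throw", "Fred", "ball", "red")))                       # yes
print(truth_query(("throw", "Fred", "ball", "green")))                     # None
print(truth_query(("throw", "Fred", "ball", "green"), closed_world=True))  # no
```

Note that this does nothing for the “round ball” case: answering that one needs background knowledge that balls are round, not just a different default.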

You can try out the chatbot yourself (when it’s up, and not broken!) on the IRC chat channel #opencog on the freenode.net chat servers.