Friday, May 8, 2009

Similarity, bots, uhm ...varia

I'm writing a paper on modeling various kinds of similarity relations with relational models (these are modified Bugajski models for similarity [JPL 1983, vol. 12]), so that those of Williamson's constraints on four-place similarity relations [NDJFL 1988, vol. 29] that I find convincing are satisfied.

Also, contra Bugajski, who argued that a set of properties generating a similarity relation has to contain vague properties if the resulting structure is to be non-trivial, I argue that even with sharp properties we get fairly intuitive and yet quite non-trivial structures, if we assume that our concepts are more like dynamic frames (a fairly new theory of concepts uses this idea and does seem to have some empirical support, see Barsalou's stuff).
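
To make the idea a bit more concrete, here is a toy sketch in Python. It is not the model from the paper and not Bugajski's construction. It only assumes that a concept can be written as a frame (a dict from attributes to sharp values), that the degree of similarity is the number of attributes on which two frames agree, and that the four-place relation compares these degrees. The frames, attribute names and the functions `overlap` and `at_least_as_similar` are all made up for illustration, and the sketch makes no attempt to satisfy Williamson's constraints.

```python
def overlap(a: dict, b: dict) -> int:
    """Count the attributes on which two frames carry the same sharp value."""
    return sum(1 for attr in a.keys() & b.keys() if a[attr] == b[attr])


def at_least_as_similar(a: dict, b: dict, c: dict, d: dict) -> bool:
    """Four-place relation: a is at least as similar to b as c is to d."""
    return overlap(a, b) >= overlap(c, d)


# Hypothetical frames; every attribute has a sharp (non-vague) value.
fish = {"can_swim": True, "has_wheels": False, "alive": True}
mountain_bike = {"can_swim": False, "has_wheels": True, "alive": False,
                 "tyres": "knobby"}
road_bike = {"can_swim": False, "has_wheels": True, "alive": False,
             "tyres": "slick"}

# The two bikes agree on three attributes, the fish and the bike on none.
print(at_least_as_similar(mountain_bike, road_bike, fish, mountain_bike))
print(at_least_as_similar(fish, mountain_bike, mountain_bike, road_bike))
```

Even with only sharp values the ordering is not trivial: different pairs get different degrees, and the relation is not simply identity versus everything else.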

Anyway, I was looking for some good similarity jokes to use as examples, and found two I like:

What's the difference between a fish and a mountain bike?
Both can swim, except for the mountain bike.

How does a shotgun with a broken firing pin resemble a government worker?
It won't work and you can't fire it.

More importantly, I stumbled upon an article about work being done by Julia Taylor and Larry Mazlack to get bots to understand jokes based on puns. The task is quite non-trivial, considering the vast computational complexity of background knowledge searches etc.

Of course, this doesn't mean a great breakthrough is to be expected anytime soon (or anytime at all), but still, finding similarities between words and employing them efficiently in jokes isn't really that easy. The way the bot works is supposed to be essentially this:
"The program then checks to see if the message is consistent with what would make sense. If it doesn’t, the bot searches to see if the word sounds similar to a word that would fit. If this is the case, the bot flags it as humor."
Three things come to mind: (1) defining what you mean by consistency with what would make sense, and finding a way to check it, might be a serious problem. (2) It's not clear whether (a) the bot is supplied with a list of similarities between words, or (b) it can search its dictionary and identify words that sound similar (case (b), of course, sounds a bit more interesting). (3) It would be even more interesting if the bot could crack new jokes.
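
Just to see the shape of the quoted procedure, here is a minimal Python sketch. It is certainly not Taylor and Mazlack's implementation, and it dodges both hard parts: "what would make sense" is a hand-supplied set of fitting words (the hypothetical `fitting_words` argument), and "sounds similar" is faked by comparing spellings with the standard library's `difflib.SequenceMatcher`, which is only a crude stand-in for real phonetic matching. The function names and the 0.6 threshold are my own assumptions.

```python
from difflib import SequenceMatcher


def sounds_like(a: str, b: str, threshold: float = 0.6) -> bool:
    """Crude proxy for phonetic similarity: compares spellings, not sounds."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def classify(word: str, fitting_words: set) -> str:
    """Label a word as 'sensible', 'humor' or 'nonsense' in a given context.

    fitting_words is a hypothetical, hand-written set of words that would
    make sense in the context; a real bot would need background knowledge.
    """
    if word.lower() in fitting_words:
        return "sensible"   # consistent with what would make sense
    if any(sounds_like(word, w) for w in fitting_words):
        return "humor"      # does not fit, but sounds like a word that would
    return "nonsense"       # does not fit and resembles nothing that would


# Context: "Good ___, sleep tight."
print(classify("night", {"night"}))   # sensible
print(classify("knight", {"night"}))  # humor
print(classify("banana", {"night"}))  # nonsense
```

This corresponds to case (b) above only in a very weak sense: the search runs over the words that would fit, not over a whole dictionary, and spelling similarity will miss puns that sound alike but are spelled very differently.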

I found another interesting thing by following a link from Wikipedia (I was looking for OpenCyc to play around with, but didn't know what it was called). Wikipedia, in its article on Cyc, mentions a fairly new application of it:
"The comprehensive Terrorism Knowledge Base is an application of cyc in development that will try to ultimately contain all relevant knowledge about terrorist groups, their members, leaders, ideology, founders, sponsors, affiliations, facilities, locations, finances, capabilities, intentions, behaviors, tactics, and full descriptions of specific terrorist events. The knowledge is stored as statements in mathematical logic, suitable for computer understanding and reasoning."
So, in fact, there are moments where logic comes in handy. Good to know. A slightly more serious report about how this database works is available here.
