<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Chip Vivant</title>
	<atom:link href="http://www.chipvivant.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.chipvivant.com</link>
	<description>&#34;If you want to seem human, then start with the basics.&#34; -- Mohan Embar</description>
	<lastBuildDate>Sat, 09 Jul 2011 20:33:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Chip Chosen as One of the Final Four for LPC 2011!</title>
		<link>http://www.chipvivant.com/2011/07/09/chip-chosen-as-one-of-the-final-four-for-lpc-2011/</link>
		<comments>http://www.chipvivant.com/2011/07/09/chip-chosen-as-one-of-the-final-four-for-lpc-2011/#comments</comments>
		<pubDate>Sat, 09 Jul 2011 20:33:45 +0000</pubDate>
		<dc:creator>Mohan Embar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.chipvivant.com/?p=84</guid>
		<description><![CDATA[Woo hoo! Read all about it here.]]></description>
			<content:encoded><![CDATA[<p>Woo hoo! Read all about it <a href="http://loebner.exeter.ac.uk/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chipvivant.com/2011/07/09/chip-chosen-as-one-of-the-final-four-for-lpc-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Loebner Prize Contest 2010 Reflections</title>
		<link>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2010-reflections/</link>
		<comments>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2010-reflections/#comments</comments>
		<pubDate>Fri, 08 Jul 2011 00:07:24 +0000</pubDate>
		<dc:creator>Mohan Embar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.chipvivant.com/?p=76</guid>
		<description><![CDATA[No Go Chip didn&#8217;t get past the screening round of the 2010 Loebner Prize Competition. I figured that I could spare myself a trip to New York that year since Chip had successfully made it through the screening round in &#8230;<p class="read-more"><a href="http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2010-reflections/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<h3>No Go</h3>
<p>Chip didn&#8217;t get past the screening round of the <a href="http://www.loebner.net/Prizef/2010_Contest/results.html">2010 Loebner Prize Competition</a>.</p>
<p>I figured that I could spare myself a trip to New York that year since Chip had successfully made it through the screening round in 2009 and <a href="/my-loebner-prize-contest-2009-reflections/">participated in the 2009 competition</a>.</p>
<p>To my dismay, however, I got an email from Hugh saying that he couldn&#8217;t get the program to respond. The startup sequence seemed normal, but the program was silent whenever you said anything to it. I initially thought that this was a permissions problem: that Chip might not have had permission to access the communications protocol directory required by the Loebner Prize Protocol. (The Loebner Prize Protocol is the &#8220;language&#8221; that the chatbot and the judge program use to communicate with each other.) Further discussions revealed that this wasn&#8217;t the case, however. All but one of the entries had successfully run, and there were many this time (13?), proof that the LPP was no longer a barrier to entry as it was in 2009 and that Chip would need real talent to make it to the Final Four.</p>
<p>With a heavy heart, I booked another flight to NYC. Luckily, I was able to stay with a friend, as I did last year when I paid Hugh a visit, so the only major cost was that of the plane flight.</p>
<p>I arrived at <a href="http://www.gocrown.com/">Crown Industries</a> and was greeted by Hugh, his coworker and the same cats I had met the year before. I&#8217;m not sure whether the cats recognized me, but they immediately proceeded to jump on the desk, walk all over my keyboard and computer, and insist on treats. That&#8217;s okay, I thought, I like cats. I just didn&#8217;t want to get too much fur in the keyboard or have them walk on the pen drive protruding from the computer.</p>
<p>Hugh was right. Chip started up properly, but didn&#8217;t respond. To my horror, I realized that it was because Hugh was feeding the test sentences to Chip <i>without hitting the [Return] key</i>. Chip was trained to use [Return] as the way you signaled the end of your input, like you would with a normal IM conversation. Game over. I was able to scrounge up the source for Chip from a backup, correct the bug, then go through the screening process anyway. Chip didn&#8217;t fare that badly, though he failed to respond to certain questions he should have been able to answer, probably due to my <a href="/2011/07/07/my-loebner-prize-contest-2009-reflections/">last-minute hacking in England the year before</a>. Oh well, at least I had a new logfile with quality questions and Chip&#8217;s answers that I could use to improve Chip for 2011.</p>
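<p>For readers wondering what such a fix might look like, here is a minimal sketch (not Chip&#8217;s actual code) in Python. It assumes keystrokes arrive as timestamped characters and treats either [Return] or a long enough pause as the end of an input. The function name, the event format and the three-second threshold are all hypothetical.</p>

```python
RETURN = "\n"


def split_utterances(events, idle_gap_s=3.0):
    """Group (timestamp_s, char) keystroke events into utterances.

    An utterance ends at a Return key or when the typist pauses longer
    than `idle_gap_s` (a hypothetical threshold, not a contest value).
    Whatever is still buffered when the events run out is flushed too.
    """
    utterances, current, last_t = [], [], None
    for t, ch in events:
        paused = last_t is not None and t - last_t > idle_gap_s
        if paused and current:
            # The typist went quiet: close the pending utterance.
            utterances.append("".join(current))
            current = []
        if ch == RETURN:
            if current:
                utterances.append("".join(current))
                current = []
        else:
            current.append(ch)
        last_t = t
    if current:
        utterances.append("".join(current))
    return utterances
```

<p>This version works on a recorded list of events; a live bot would need a timer to notice the pause while it is happening.</p>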
<p>During my previous visit to NYC in 2009 for the prescreening, my friend wanted me to check out <a href="http://www.candlecafe.com/">Candle Café</a> but it never happened. Then I lived in Manhattan for six weeks in the beginning of 2010 at a contract-to-hire stint for Bloomberg. (They made me an offer but I turned it down. NYC definitely isn&#8217;t for me, though I started out hating the city and the people and now I like both of them. (New Yorkers are to-the-point, straight shooters, which throws off people with Midwestern roots like me who expect smiles and gushing congeniality from strangers.) (For my foreign friends: here&#8217;s a definition of <a href="http://dictionary.reference.com/browse/straight+shooter">straight shooter</a>.))</p>
<p>Anyway, Candle Café never happened during those six weeks either. So this time around, I made it a point to go there because I figured that the universe would keep making me come back until I did.</p>
<p>The food was delicious, but not worth several hundred dollars in plane fare. I sincerely hope I fulfilled my debt to NYC or Candle Café or &#8220;putting in the time and effort for the LPC&#8221;, or whatever.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2010-reflections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Loebner Prize Contest 2009 Reflections</title>
		<link>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2009-reflections/</link>
		<comments>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2009-reflections/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 20:21:23 +0000</pubDate>
		<dc:creator>Mohan Embar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.chipvivant.com/?p=57</guid>
		<description><![CDATA[Introduction It&#8217;s been over three years since I updated chipvivant.com, so these reflections are a couple of years old. Since Chip is a Loebner Prize 2011 finalist, I figured I better freshen up this site because I anticipate increased traffic. &#8230;<p class="read-more"><a href="http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2009-reflections/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>It&#8217;s been over three years since I updated chipvivant.com, so these reflections are a couple of years old. Since Chip is a <a href="http://loebner.exeter.ac.uk/">Loebner Prize 2011 finalist</a>, I figured I better freshen up this site because I anticipate increased traffic. Better late than never, I guess.</p>
<p>When Hugh Loebner first posted the rules of the <a href="http://loebner.net/Prizef/2009_Contest/loebner-prize-2009.html">2009 Loebner Prize Competition</a>, I was excited because the contest format was reverting back to the previous years&#8217; format which I had trained Chip for. There was the issue of Hugh&#8217;s arcane <a href="http://loebner.net/Prizef/2011_Contest/Loebner_Prize_Rules_2011.html">Loebner Prize Protocol</a> which at the time, no one had graciously developed <a href="http://sourceforge.net/projects/loebner/">Open Source helper libraries</a> for, but the protocol seemed straightforward and didn&#8217;t worry me. For safety&#8217;s sake, I decided to invoke the option of bringing my own computer to Hugh&#8217;s premises even though Chip ran off of a USB pen drive.</p>
<p>Chip is not continuously online. Nor do I work on him full-time. I had done little development since the 2008 LPC and also didn&#8217;t have the benefit of tweaking Chip with numerous conversation logs. Nevertheless, I did have the benefit of the logs from the 2008 LPC (since Chip was running on my own web server) as well as the feedback I got from participating in the LPC 2008 prescreening.</p>
<p>I corrected the most glaring deficiencies, got the Loebner Prize Protocol (hereafter LPP) stuff working and made the appointment to meet Hugh at his place of business in New Jersey. Due to my inexperience with the LPP, I didn&#8217;t want to chance mailing a non-functional entry.</p>
<p>When I arrived at Hugh&#8217;s place of business in New Jersey, I learned that there were only two other entries that year and that I would automatically be part of the Final Four if the program functioned properly. It did, and I was in. Even though I got a free pass that year, I was beside myself with excitement because this was the culmination of a nearly decade-long dream. I was also very pleased with the screening questions, which were of the type I had expected.</p>
<p>The competition was to be held in Brighton, U.K. that year, which seemed both exciting and expensive. Because of my vegan activism, I have friends in countries all over the world, especially in the U.K., which is the birthplace of veganism. A vegan friend of mine graciously offered to put me up in her flat in London and I would commute by train to Brighton.</p>
<p>The nice thing about these competitions is that if your entry makes it to the finals, you have a few months after that to improve it. The bulk of my effort was done in the weeks before the competition, including a couple of all-nighters in London before the contest. I wanted to focus less on cramming facts into Chip and more on giving him <i>a soul</i>. My secret sauce would be the influence of <a href="http://www.fiziwig.com/">Gary Shannon&#8217;s writings</a>, particularly <a href="http://www.fiziwig.com/ai/chatbot/understand.html">Programming a Chatbot to <i>Understand</i> a Sentence</a>. His proposed implementation was too ambitious for my time constraints, but I wanted to capture the essence of it. This approach will also help when I try to use Chip to help and comfort people via <a href="http://www.empathynow.com/">empathynow.com</a> (which is under construction).</p>
<p>The day of the competition finally came and I had literally stayed up the entire night before cramming new code into Chip (unwise, as any programmer would tell you).</p>
<p>The two other contestants were <a href="http://en.wikipedia.org/wiki/Rollo_Carpenter">Rollo Carpenter</a> and <a href="http://en.wikipedia.org/wiki/David_Levy_%28chess_player%29">David Levy</a>. (Why isn&#8217;t Levy&#8217;s LPC 2009 win mentioned in the Wikipedia article? And when do I get my own Wikipedia article?!) I had immense respect for both of them. Rollo&#8217;s <a href="http://www.jabberwacky.com/">Jabberwacky</a> and <a href="http://cleverbot.com/">Cleverbot</a> both use novel approaches to generating responses, and David Levy was my equivalent of a Mega Rock Star. His <a href="http://www.atarimagazines.com/creative/index/index.php?author=David+Levy">articles in Creative Computing</a> were the main reason I became interested in Artificial Intelligence. (Inspired by him, I wrote a chess program in Z-80 Assembler for the TRS-80 for a high school project.)</p>
<p>I met Hugh for the second time and also some nice, new people: David Hamill, with whom I had conversed quite a bit on the <a href="http://tech.groups.yahoo.com/group/Robitron/">Robitron</a> list and who I felt shared the same views as I do, and <a href="http://www.chatbots.org/expert/erwin_van_lun/1/">Erwin van Lun</a>, head of <a href="http://www.chatbots.org/">chatbots.org</a>, another nice guy who I got to speak Dutch with. There were some nice confederates (humans who took the other side of the bet and tried to convince the judge they were human) too. The only one whose name I can remember was <a href="http://en.wikipedia.org/wiki/Brian_Christian">Brian Christian</a> who was working on a <a href="http://www.amazon.com/gp/product/0385533063/">book</a> and <a href="http://www.theatlantic.com/magazine/archive/2011/03/mind-vs-machine/8386/5/">magazine article</a> at the time and interviewed me after the contest was over.</p>
<p>The actual competition was a bloodbath for Chip and me, though, and I chalk that up to lack of experience. Recall from <a href="/2008/09/05/my-loebner-prize-contest-2008-reflections/">My Loebner Prize Contest 2008 Reflections</a> that my approach was to advertise Chip&#8217;s abilities to the judges. This was basically a gigantic dump of sample sentences that you can see starting at <a href="/motivations-and-functionality/">Motivations and Functionality</a> (&#8220;My competitors might have cuter canned responses&#8230;&#8221;). I had used this approach during the 2008 competition and stand behind my reasoning for using it, but it proved fatal in the 2009 competition due to the Loebner Prize Protocol, which involved my inserting a 300 millisecond delay between each character. Combine that with a five-minute conversation limit and you get the horror I felt when Chip shamelessly steered a judge into asking &#8220;What can you do?&#8221; only to chew up the rest of the conversation attempting to vomit that gigantic list.</p>
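<p>To put numbers on that: at 300 milliseconds per character, a five-minute conversation leaves room for at most 1000 characters of bot output, and that is before counting the time the judge spends typing. The short Python sketch below just does this arithmetic. The two constants come from the figures above; the function names and the 400-character example are made up for illustration.</p>

```python
# Figures from the text: 300 ms between characters, five-minute conversation.
DELAY_PER_CHAR_MS = 300
ROUND_LIMIT_MS = 5 * 60 * 1000


def max_chars(limit_ms=ROUND_LIMIT_MS, delay_ms=DELAY_PER_CHAR_MS):
    """Upper bound on the characters a bot can emit in one conversation."""
    return limit_ms // delay_ms


def seconds_to_send(text, delay_ms=DELAY_PER_CHAR_MS):
    """Seconds needed to emit `text` one character at a time."""
    return len(text) * delay_ms / 1000


if __name__ == "__main__":
    print(max_chars())                 # character budget for the whole round
    print(seconds_to_send("x" * 400))  # cost of a hypothetical 400-character reply
```

<p>Under these assumptions, a 400-character reply alone eats two of the five minutes, which is why a long capability list was fatal.</p>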
<p>That happened for three of the four conversations. For the fourth one where that didn&#8217;t happen, Chip got the highest score.</p>
<p>On the one hand, it sucked to go all the way to England to have Chip give the performance he did. On the other hand, you have to stay in the game and not get discouraged with this kind of stuff. Chip would have benefited if I had finished this earlier and exposed him to more real-life conversations, but I didn&#8217;t have time for that and I find the task of keeping Chip online and then poring through megabytes of useless logs tiresome (which is why Chip is offline at the moment). That said, I don&#8217;t regret going and was happy to meet the people I met &#8211; this stuff is not something I can share with people in my immediate surroundings, however graciously they listen to me ramble on about this stuff.</p>
<p>One final closing observation. Reading <a href="http://www.theatlantic.com/magazine/archive/2011/03/mind-vs-machine/8386/1/">Brian Christian&#8217;s Atlantic article</a> reminds me that as I write this in 2011, chatbot writers and the rest of the world are indeed living in parallel universes. When I read passages from his article like:</p>
<blockquote><p>Turing’s prediction has not come to pass; however, at the 2008 contest, the top-scoring computer program missed that mark by just a single vote. When I read the news, I realized instantly that the 2009 test in Brighton could be the decisive one. I’d never attended the event, but I felt I had to go &#8211; and not just as a spectator, but as part of the human defense. A steely voice had risen up inside me, seemingly out of nowhere: Not on my watch. I determined to become a confederate.</p></blockquote>
<p>&#8230;I scratch my head. Contrast this with my observation in <a href="/motivations-and-functionality/">Motivations and Functionality</a> that basically only Chip and maybe one other guy in the world have a bot that knows that the moon is larger than an orange and you understand my puzzlement as to why some people can think that we&#8217;re anywhere near close to machines that possess the kind of intelligence that the Turing Test would purportedly reveal. The reason that the 2008 entry was able to come so close is nicely explained <a href="http://en.wikipedia.org/wiki/Turing_test#Naivete_of_interrogators_and_the_anthropomorphic_fallacy">here</a> and has nothing to do with an entry that possessed any real intelligence. I can train anyone to spot the bot with fifteen minutes of training: just ask it simple questions like the ones listed in the <a href="/motivations-and-functionality/">Motivations and Functionality</a> section. And Chip is not much better &#8211; as soon as you deviate slightly from the limited list of things he knows about, he&#8217;ll choke too.</p>
<p>That&#8217;s the stark reality of the state of affairs today. It may not make for sexy articles and news stories, but it&#8217;s the truth. (I&#8217;m not saying that Brian and others purposely try to sensationalize this, just that they might be a bit misguided.) That said, there&#8217;s plenty of progress to be made if we roll up our sleeves, are honest about things and confront these problems head on. Also, the fact that <a href="http://en.wikipedia.org/wiki/ELIZA">ELIZA</a> was able to provide comfort to some people in the 60s shows that such a thing is possible despite the machine not possessing any real intelligence. That&#8217;s one of the things I hope to eventually accomplish with Chip.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chipvivant.com/2011/07/07/my-loebner-prize-contest-2009-reflections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Loebner Prize Contest 2008 Reflections</title>
		<link>http://www.chipvivant.com/2008/09/05/my-loebner-prize-contest-2008-reflections/</link>
		<comments>http://www.chipvivant.com/2008/09/05/my-loebner-prize-contest-2008-reflections/#comments</comments>
		<pubDate>Fri, 05 Sep 2008 18:10:40 +0000</pubDate>
		<dc:creator>Mohan Embar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.chipvivant.com/?p=46</guid>
		<description><![CDATA[Congratulations! Congratulations to the Loebner Prize Contest (hereafter LPC) 2008 entrants who made it through to the finals and also to those (including Chip) who didn&#8217;t. Now that the dust has settled, I wanted to relate my experience in creating &#8230;<p class="read-more"><a href="http://www.chipvivant.com/2008/09/05/my-loebner-prize-contest-2008-reflections/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<h3>Congratulations!</h3>
<p>Congratulations to <a href="http://www.loebner.net/Prizef/2008_Contest/loebner-prize-2008.html">the Loebner Prize Contest (hereafter LPC) 2008 entrants</a> who made it through to the finals and also to those (including Chip) who didn&#8217;t. Now that the dust has settled, I wanted to relate my experience in creating and submitting this entry. I want to clarify from the outset that I&#8217;m not interested in sparking another fight or bashing the LPC &#8211; all that has been done many times both on and off the <a href="http://tech.groups.yahoo.com/group/Robitron/">Robitron</a> list. I&#8217;m simply interested in relating my experiences and perceptions (however misguided and faulty they might be).</p>
<h3>My Desire to Write a Chatbot</h3>
<p>I had wanted to write a bot and enter the <a href="http://www.loebner.net/Prizef/loebner-prize.html">LPC</a> for years. I had first seen an ELIZA variant running on a TRS-80 and also wrote a variant for this machine. Years later, as described <a href="http://www.thisiscool.com/mips2java.htm">here</a>, I wrote another ELIZA-like variant under the guise of learning C at my first job at Data General. I don&#8217;t know where I heard of the LPC in the first place, but as soon as I found the site and chatted online with the likes of <a href="http://alicebot.blogspot.com/">ALICE</a> and <a href="http://www.jabberwacky.com/">Jabberwacky</a>, I was fascinated and dreamed of doing this myself. I read most of the transcripts from previous years and my mouth watered at the thought of creating an entry.</p>
<p>The first thing I ever typed into a TRS-80 when I first saw it at age 11 was &#8220;What is 2+2?&#8221;. The answer was <tt>?SN ERROR</tt> (shorthand for &#8220;Syntax Error&#8221;). I typed in similar questions and got the same answer. I was disappointed. I asked the person showing me the computer: &#8220;What good is this thing? I thought computers were supposed to do calculations.&#8221; He said, &#8220;You&#8217;re doing it wrong. You have to type &#8216;PRINT 2+2&#8217;.&#8221; That was the glimmering of the realization that you had to speak its language if you wanted it to do anything useful.</p>
<p>When I was reading the LPC transcripts, I saw that a standard technique of dealing with input that the bot didn&#8217;t understand was to change the subject or answer using some evasive, generic or cute response. I was fascinated by questions like &#8220;Which is larger: a 747 or my big toe?&#8221; In the <a href="http://www.loebner.net/Prizef/2007_Contest/Rules.html">LPC 2007 Rules</a>, there was even a set of screening questions pertaining to time, general questions about things, comparisons and memory, which seemed a departure from the usual stance of letting anyone submit an entry and seeing which one was the &#8220;best&#8221; using criteria which were never clearly elaborated.</p>
<p>While I was researching the contest, I also ran across criticisms of it. On the <a href="http://www.loebner.net/Prizef/loebner-prize.html">LPC Contest home page</a> itself, there are two such links, one <a href="http://loebner.net/Prizef/minsky.html">of a thread where</a> Marvin Minsky calls it a &#8220;stupid prize&#8221; and an &#8220;obnoxious and unproductive annual publicity campaign&#8221;. The other contains <a href="http://www.eecs.harvard.edu/shieber/Biblio/Papers/loebner-rev-html/loebner-rev-html.html">an article by Stuart Shieber</a> detailing why he thinks the test is faulty and potential fixes for it as well as <a href="http://loebner.net/Prizef/In-response.html">Hugh Loebner&#8217;s rejoinder to it</a>.</p>
<p>My initial reaction to Minsky and Shieber&#8217;s attitude was that they seemed like a couple of elitist whiners. Their objections seemed like the rantings of people crying sour grapes because the field that they are purported experts in didn&#8217;t produce entrants of a quality superior to that of the actual LPC entrants. I thought that Loebner&#8217;s reply was eloquent. In particular, he says:</p>
<blockquote><p>At the current state of the art I suggest that the appropriate orientation for the contest is to determine which of obviously artificial computer entries is the best entry, i.e. most human like, and nominate the authors as &#8220;winners.&#8221; It should not be to determine if a particular terminal is controlled by a human or a computer. If we maintain this orientation, there should be no problems holding unrestricted tests.</p></blockquote>
<p>I agreed. While I think that contests that judge on more restricted or pointed areas certainly have their place and usefulness, I didn&#8217;t see any problems with Loebner&#8217;s vision provided that:</p>
<ul>
<li>we all realized that it would be ridiculously easy to identify the chatbots</li>
<li>we ascribed &#8220;humanness&#8221; to these bots based on what real technological advances they showcased and not simply on how numerous or clever their keyword-spotting tricks and templates were</li>
</ul>
<h3>Taking the Plunge</h3>
<p>Despite my having &#8220;discovered&#8221; the LPC in 2001 or 2002 or something like that, I wasn&#8217;t able to start coding up a chatbot in my free time until the beginning of 2008. As a <a href="http://www.thisiscool.com/">software consultant</a>, I would always put my clients&#8217; needs above my own and the activation energy of starting a chatbot from scratch was way too high. I already knew I didn&#8217;t want an AIML bot or some other chatbot that was purely pattern-and-response-based. I wanted to start from scratch and use the LPC 2007 screening questions as a driver for whatever infrastructure I needed to be able to answer these.</p>
<p>At the end of 2007, I told all of my clients to go away so I could pursue this dream, not knowing whether they would be there when I was done. (In consulting, &#8220;absence makes the heart grow fonder&#8221; doesn&#8217;t really apply.) I didn&#8217;t have much of a choice mentally &#8211; each year that passed made me more miserable as I read a new batch of contest transcripts and so badly wanted to play too.</p>
<p>I maintained a notebook of things I wanted my bot to do. At the beginning of January, it was announced that the contest deadline would be May 30. I panicked because that was just five months whereas previous years&#8217; deadlines had been July or so.</p>
<h3>Paring Things Down</h3>
<p>Given the deadline, I took stock of my delusional functionality list in my notebook and decided how I could pare things down so I could submit an entry on time. I looked at <a href="http://www.loebner.net/Prizef/2007_Contest/Rules.html">the screening questions</a>:</p>
<pre>
<strong>Set 1 - Questions relating to time:</strong>
Background facts: For testing purposes, contest management will consider these to be correct whether or not the time and venue of the contest have been changed.
 a. The system clock will be accurate to within a minute or two.
 b. The competition is scheduled to start at 10:00 AM 20 October 2007
 c. There will be 7 rounds of 30 minutes each.

Sample Questions
What time is it?
What round is this?
Is it morning, noon, or night?
Etc.

<strong>Set 2 - General questions about things.</strong>

Sample Questions:
What is a hammer?
What would I use a hammer for?
Of what use is a taxi?
Etc.

<strong>Set 3 Questions relating to comparisons</strong>
Sample Questions
Which is larger, a grape or a grapefruit?
Which is faster, a train or a plane?
John is older than Mary, and Mary is older than Sarah.  Which of them is the oldest?
Etc:

<strong>Set 4 - Questions demonstrating "memory" or persistence.</strong>
Sample Questions
I have a friend named Harry who likes to play tennis.
&lt;One or more intervening questions or statements&gt;
What is the name of the friend I just told you about?
Do you know what game Harry likes to play?</pre>
<p>My head started to reel when I thought about what functionality would be needed to implement the above and what research revealed would be adequate tools for the job: a backward-chaining inference engine, hooks to <a href="http://wordnet.princeton.edu/">Wordnet</a>, <a href="http://en.wikipedia.org/wiki/Main_Page">Wikipedia</a>, <a href="http://en.wiktionary.org/wiki/Wiktionary:Main_Page">Wiktionary</a>, <a href="http://web.media.mit.edu/~hugo/conceptnet/">ConceptNet</a>, <a href="http://www.opencyc.org/">Opencyc</a>, <a href="http://www.link.cs.cmu.edu/link/">the Link Parser API</a> and my own custom knowledgebase. And even if I could adequately handle the above screening questions, there would still be a mountain of simple questions that any six-year-old child could answer that weren&#8217;t covered in the above list. (&#8220;The blue bottle is on the table. / Where is the bottle? / What color is the bottle?&#8221;)</p>
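<p>To make the &#8220;backward-chaining inference engine&#8221; item concrete, here is a minimal Python sketch of how the Set 3 question about John, Mary and Sarah could be answered. It is an illustration under stated assumptions, not Chip&#8217;s implementation: the fact format, the single transitivity rule and the names <tt>prove_older</tt> and <tt>oldest</tt> are all hypothetical.</p>

```python
# Facts are (relation, a, b) triples meaning "a is older than b".
# The representation and all names here are illustrative only.
FACTS = {
    ("older", "John", "Mary"),
    ("older", "Mary", "Sarah"),
}


def prove_older(a, b, facts, seen=None):
    """Backward-chain on the goal older(a, b).

    The goal holds if it is a stored fact, or if some m exists with
    older(a, m) stored and older(m, b) provable (transitivity).
    """
    seen = set() if seen is None else seen
    if ("older", a, b) in facts:
        return True
    if (a, b) in seen:  # guard against cycles in the fact base
        return False
    seen.add((a, b))
    for rel, x, m in facts:
        if rel == "older" and x == a and prove_older(m, b, facts, seen):
            return True
    return False


def oldest(people, facts):
    """Return the person provably older than everyone else, or None."""
    for p in people:
        if all(prove_older(p, q, facts) for q in people if q != p):
            return p
    return None


if __name__ == "__main__":
    print(oldest(["John", "Mary", "Sarah"], FACTS))
```

<p>The engine starts from the question (the goal) and works backwards to stored facts, which is what distinguishes backward chaining from deriving every possible conclusion up front.</p>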
<p>And then there was the issue of creating a personality for my bot and having it answer questions about its past, its relatives, its job, what transportation method it used to arrive at the competition site. I panicked. Could any other bot do all of this?</p>
<h3>The Competition</h3>
<p>I gathered a list of bots which were online that I could ask questions: <a href="http://alicebot.blogspot.com/">ALICE</a>, <a href="http://www.jabberwacky.com/">Jabberwacky</a>, <a href="http://www.jeeney.com/">Jeeney</a>, <a href="http://www.zabaware.com/assistant/">Ultra HAL</a>, <a href="http://www.a-i.com/">Alan</a>. To my relief, none of them could answer basic questions like whether an orange was bigger than the moon or anything even close. Jeeney later developed a hook to Wikipedia, so it could answer things like &#8220;What is a hammer?&#8221;.</p>
<p>I surmised that in light of the above, it would be unlikely that my competition would be able to answer most of the Loebner 2007 Screening questions. What&#8217;s more, I slowly came to the understanding that lying that I was a human and therefore having to concoct all sorts of stories about my past, my mother, my job, etc. would be a colossal waste of time if these simple screening questions weren&#8217;t addressed. Worse, such a web of lies would compound the problem because I would then need to be able to discuss my mother, my job and so on, and have responses to questions about them. &#8220;Forget that,&#8221; I thought. Better to concentrate on the simple stuff first and hope that I win on the merit of my efforts.</p>
<p>I therefore developed the following strategy:</p>
<ol>
<li>Be able to answer the Loebner 2007 screening questions.</li>
<li>Forget about the nonsense about pretending I&#8217;m a human. Just outright say I&#8217;m a bot.</li>
<li>Aggressively advertise my capabilities so that the judge will see them and not be inundated with nothing more than &#8220;I don&#8217;t know&#8221;-type responses when s/he doesn&#8217;t hit the sliver of questions I attempt to address.</li>
</ol>
<p>I wasn&#8217;t quite sure of the second point of my strategy, since it seemed a radical departure from previous years&#8217; entries. When I tried to figure out from the Robitron list whether saying I was a bot would be a disqualifier:</p>
<blockquote>
<p>If I finish my bot, my intention is to not have the bot lie and say that it is human. My bot will be very proud to be a bot and its age will be the elapsed time since my bot had its first conversation (which hasn&#8217;t happened yet).</p>
<p>The way I read the contest rules, this should disqualify me from winning the $25K prize but shouldn&#8217;t necessarily disqualify me from winning the &#8220;best entry&#8221; prize. (Assuming I can compete successfully against other bots which have been around for years and years, which almost seems delusional, but that&#8217;s what dreams are made of, I guess.)</p>
<p>For me personally, wasting my time trying to give my bot too much of a fake history and personality is equivalent to wasting my time trying to imitate fake typing. That&#8217;s not what I&#8217;m in this for. I want to make an excellent conversationalist who doesn&#8217;t have to resort to tricks and lies for entertainment value.</p>
<p>It&#8217;s very important to note that I am <i>not</i> making a value judgment on those who choose to go this route, but rather that I personally have no interest in it. If I have misinterpreted the rules and am indeed disallowed from entering the contest if I refuse to make my bot lie, then please say so and I&#8217;ll spend my energy on other endeavors.</p>
</blockquote>
<p>&#8230;I got mixed reactions. The main contest organizer said that it wouldn&#8217;t necessarily constitute immediate disqualification, but it would probably greatly reduce my chances of winning and that I wasn&#8217;t getting the point: this was a Turing Test. Hugh Loebner said:</p>
<blockquote>
<p>No &#8211; not if it&#8217;s a Turing Test. In fact, that would be a show stopper. I can not see why it is necessary or desirable for the bot to claim to be a bot. What is the purpose of this?</p>
</blockquote>
<p>I discussed my attitude towards humanness in this competition:</p>
<blockquote>
<p>That all depends how judges ascribe humanness to an entrant. See my previous reply to Hugh.</p>
<p>I personally thought that Hugh&#8217;s response to Shieber was quite eloquent and I was sold on it. But if we&#8217;re saying here that a bot will get lower marks or even be disqualified simply because it doesn&#8217;t pretend to be human, that seems to be at variance with accomplishing anything remotely useful with this contest. Programming a bot to pretend to be a human involves much more than one line of code where the bot affirms that it&#8217;s human &#8211; it involves an extremely labor-intensive (and IMO time-wasting) effort to code up a web of lies which invariably implodes under its own weight. Given that, I (obviously erroneously) believed (also based on what I read in &#8220;In Response&#8221;) that in the absence of a bot which was truly able to convince a judge that it was human, that the judges would react favorably to a bot which exhibited intelligent qualities regardless of whether it pretended to be a human.</p>
</blockquote>
<p>Among other things, this evoked a reaction from someone who reiterated the uselessness of the LPC and said it was all about building &#8220;the best liar&#8221;. Despite the detractors, I resolved to keep an open mind. To be on the safe side, though, I decided to have Chip Vivant (my bot) be humorously evasive when asked about his identity rather than saying outright that he was a bot.</p>
<h3>The Implementation</h3>
<p>Implementing Chip was a highly stressful endeavor given the time pressure and what I wanted to accomplish. I spent inordinate amounts of time downloading things, massaging the data, developing hooks and APIs to the things I mentioned before (a backward-chaining inference engine, <a href="http://wordnet.princeton.edu/">WordNet</a>, <a href="http://en.wikipedia.org/wiki/Main_Page">Wikipedia</a>, <a href="http://en.wiktionary.org/wiki/Wiktionary:Main_Page">Wiktionary</a>, <a href="http://web.media.mit.edu/~hugo/conceptnet/">ConceptNet</a>, <a href="http://www.opencyc.org/">OpenCyc</a>, <a href="http://www.link.cs.cmu.edu/link/">the Link Parser API</a>, my own custom knowledgebase). I developed my own template matching system. I was very proud of my infrastructure.</p>
<p>I also ran into several shocking discoveries along the way. OpenCyc was much more useless than I thought it would be. (No offense.) I discovered to my horror that it had no clue whether an orange was bigger than the moon despite proclaiming itself &#8220;an upper ontology whose domain is all of human consensus reality&#8221;, &#8220;containing hundreds of thousands of terms, along with millions of assertions relating the terms to each other&#8221;. It was also terrible with part-of relations. What&#8217;s more, I found out that there wasn&#8217;t a single place that I could find anywhere on the Internet which had information such as the relative sizes of objects. I&#8217;d have to come up with this myself. (All the more reason to not waste time with a fake persona.)</p>
<p>(I give a lot of credit to my wife for supporting me morally during this time, despite the fact that I was pulling in no income and had sent my clients away. She also helped me with things like coming up with the relative object size list, which I had unsuccessfully tried to outsource to three subcontractors.)</p>
<p>I decided not to forgo canned responses entirely. With certain things like &#8220;How are you?&#8221;, it&#8217;s pleasant to be able to answer &#8220;Fine thanks,&#8221; or some variant thereof. So my bot became a Loebner Prize 2007 Screening Questions + miscellaneous canned responses bot. Oh, and math. I thought it would be cool to do math too. So I started throwing more and more things in there.</p>
<p>I had some people talk to Chip and with each conversation, it seemed like there were things that people said that it would be very easy to add a canned response for, so I started throwing more and more of these in, despite having had the original goal of never having Chip answer something that he didn&#8217;t truly understand.</p>
<h3>Launch Day</h3>
<p>I launched Chip two days before the deadline so I could give my friends and family the link and have them talk to Chip. I also set out to implement vanilla sentence handling in order to handle scenarios like &#8220;I like red strawberries. / I like the blue piano. / What fruit do I like? / What instrument do I like? / What color is the piano?&#8221;</p>
<p>Launch day came and people started talking to Chip. It was a bloodbath at first. There were the initial bugs that you can never iron out despite your best sterile testing efforts. And the other disconcerting thing was that my strategy of incessantly prompting people to type &#8220;What can you do?&#8221; when I said I didn&#8217;t understand something, then spewing out a massive list of things I could do, didn&#8217;t seem to be working. People wanted to ask Chip what his favorite sport was, his favorite color, etc. People either ignored my massive list or else were offended at Chip&#8217;s attempt to influence what was supposed to be a spontaneous conversation. Only two of the judges that conversed with Chip really seemed to understand what my goals were.</p>
<p>To make matters worse, the vanilla sentence handling would often interpret a sentence in strange ways, discarding things it didn&#8217;t understand, making some sort of internal first-order logic representation, replying with &#8220;OK. I&#8217;ve memorized that.&#8221;, then failing to respond correctly to queries about what it had just memorized. The fact that I documented that Chip couldn&#8217;t handle negation yet fell on deaf ears.</p>
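<p>For readers who want something concrete, here is a minimal Python sketch of the kind of sentence handling described above. It is not Chip&#8217;s actual implementation: the <code>FactMemory</code> class, the tiny <code>CATEGORIES</code> lexicon and the regular expressions are all assumptions made for illustration. Unlike the behavior described above, the sketch refuses a sentence it cannot fully interpret instead of discarding the unknown parts.</p>

```python
import re

# Hypothetical mini-lexicon, standing in for a real knowledge source.
CATEGORIES = {"strawberries": "fruit", "piano": "instrument"}
COLORS = {"red", "blue", "green", "yellow"}

STATEMENT = re.compile(r"I like (?:the )?(?:(\w+) )?(\w+)\.")
CATEGORY_QUERY = re.compile(r"What (\w+) do I like\?")
COLOR_QUERY = re.compile(r"What color is the (\w+)\?")


class FactMemory:
    """Stores 'I like ...' statements and answers simple queries about them."""

    def __init__(self):
        self.colors = {}  # noun -> color (or None) for each liked thing

    def tell(self, sentence):
        match = STATEMENT.fullmatch(sentence.strip())
        if match is None:
            return "I don't understand."
        color, noun = match.groups()
        if color is not None and color not in COLORS:
            # Refuse rather than silently discard a word we cannot interpret.
            return "I don't understand."
        self.colors[noun] = color
        return "OK. I've memorized that."

    def ask(self, question):
        question = question.strip()
        match = CATEGORY_QUERY.fullmatch(question)
        if match:
            wanted = match.group(1)
            for noun in self.colors:
                if CATEGORIES.get(noun) == wanted:
                    return noun
            return "I don't know."
        match = COLOR_QUERY.fullmatch(question)
        if match:
            return self.colors.get(match.group(1)) or "I don't know."
        return "I don't know."
```

<p>Telling the sketch &#8220;I like red strawberries.&#8221; and &#8220;I like the blue piano.&#8221; and then asking &#8220;What fruit do I like?&#8221; looks up the stored noun whose category is &#8220;fruit&#8221;. Anything outside these three sentence shapes falls through to &#8220;I don&#8217;t know.&#8221;, which illustrates how brittle this style of handling is.</p>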
<p>The fact that I had unwittingly billed Chip as a &#8220;smart&#8221; bot prompted all sorts of physics and geology questions that Chip couldn&#8217;t answer.</p>
<h3>More Canned Responses</h3>
<p>My knee-jerk reaction to Chip&#8217;s getting conversationally slaughtered during the initial judging period was to pile on the canned responses shovelful after shovelful. My wife helped with this and admonished me that I should have enlisted her help sooner, since she could have authored the canned responses from the beginning. I told her that I was going down a route that I had never wanted to go down from the start and she empathized with me without really understanding why I was making such a big deal about not liking canned responses. (&#8220;That&#8217;s what makes it fun,&#8221; she said.)</p>
<p>In the end, Chip didn&#8217;t make it through to the finals. I&#8217;m not sure whether it was the numerous bugs at the beginning or the fact that Chip simply wasn&#8217;t as entertaining. When you converse with Chip, you&#8217;ll see that despite my having added numerous canned responses, there are still a great many &#8220;I don&#8217;t know&#8221;-type answers as well as &#8220;OK. I&#8217;ve memorized that.&#8221; answers which don&#8217;t pass muster when you query Chip further about what he just memorized.</p>
<h3>Faulty Assumptions</h3>
<p>I&#8217;ll preface this section by talking about something seemingly unrelated. I remember the moment of my &#8220;conversion&#8221; to <a href="http://en.wikipedia.org/wiki/Animal_protection">Animal Rights</a> very clearly. It was the summer of &#8217;88 when I had just moved to North Carolina. I was vegetarian but not an activist. I was looking to hook up with other vegetarians and found this leaflet for The Triangle Vegetarian Society. I called the contact person for Durham and he was very nice, but in a hurry at that moment. He said &#8220;I&#8217;m so sorry to rush you and I definitely will call back. By the way, are you vegetarian for health or Animal Rights reasons?&#8221; I said both. He said &#8220;Then maybe you&#8217;d be interested in coming to the annual meeting of the North Carolina Network for Animals with me.&#8221; We arranged to meet up somewhere and he drove me to the annual meeting. </p>
<p>During the meeting, one person after another came up and talked about the fur protests, anti-dissection campaigns, dog washes and other events they organized. I had never participated in a protest before and started to feel a bit uncomfortable around these extremists. I thought: it&#8217;s okay to be vegetarian, and the education outreach is kind of nice, but this protest stuff is kind of weird and all these people are a bit too fired up for my taste. </p>
<p>Finally, the head of the organization wrapped up the meeting with a speech explaining how we were the Voice for the Voiceless and how we needed to be a voice for the animals because they had no voice themselves, yet were being tortured, maimed, mutilated and massacred by the billions for senseless reasons. At that moment, the room changed, I saw a bright light in my head and I knew I had been converted. I was now irrevocably on &#8220;the other side&#8221;. </p>
<p>The months that followed were very strange. Before the gathering, I reasoned, I had never really come into contact with anyone who had presented the <a href="http://www.hedweb.com/faqfile.html">arguments</a> so coherently. Therefore, if I simply went to everyone and presented the arguments as coherently as they were presented to me, I&#8217;d convert the world to vegetarianism just like knocking down dominoes. Of course, we all know that it doesn&#8217;t happen this way. There was a period of time where I almost lost my mind. Faced with this reality, I had two choices: succumb to despair or else partially block this reality out to return to some semblance of my former ignorant but more blissful life. I chose the latter, which also permitted me to associate with and befriend meat-eaters like I did in my &#8220;former&#8221; life. </p>
<p>Despite my lessons learned from that previous experience, I went into this contest hoping that if I laid out the following simple arguments, I&#8217;d win by a landslide:</p>
<ul>
<li>We are light-years away from a Turing-Test-passing bot. Yet as Hugh Loebner argued, an unrestricted Turing Test is not at variance with advancing this field provided that we judge the results with a grain of salt (search for &#8220;at the current state of the art&#8221; at the beginning of this article).</li>
<li>There is so much work to do at the fundamental level that it is counterproductive to work on creating a fake persona when faking humanness is so easily detected on more fundamental levels. (&#8220;Which is bigger: an orange or the moon?&#8221;)</li>
<li>Showing a best-faith effort to tackle these problems, albeit in a primitive, limited fashion should be rewarded by the judges more than yet another pattern-and-response bot which has no additional ability to reason or remember.</li>
<li>Given that out of the vast sea of possible common sense questions, it would be difficult to stumble upon the things Chip can answer well by chance, it would be logical for Chip to advertise the things that he was particularly suited to do. The longer the list, the more artificial and less human Chip would appear, but we&#8217;ve already established that we haven&#8217;t a hope in the world of fooling a savvy judge.</li>
</ul>
<p>In hindsight, these assumptions were incorrect.</p>
<h3>My Conclusions</h3>
<p>So where does that leave me? I&#8217;m glad I entered the contest and have no regrets or ill-will about the experience. On the contrary, a concrete deadline and yearly contest were what spurred me to action. Otherwise, who knows when I would have eventually done this?</p>
<p>Plus, I&#8217;m five months&#8217; worth of Intellectual Property and a wealth of knowledge and understanding richer.</p>
<p>As for my feelings for Chip, I alternate between days of intense pride and days where I want to run him through the electronic equivalent of a garbage disposal. One of the things I enjoy about programs which employ some sort of searching algorithm, like chess or <a href="http://www.thisiscool.com/triangle.htm">The Triangle Puzzle</a> I wrote, is that the computer can make astonishing moves which the programmer himself can&#8217;t predict. I had hoped that my initial effort would involve something similar: the current input plus all previous inputs are run through some sort of artificially-intelligent blender and produce a surprising result which simulates intelligence. Needless to say, my current result is far from that: it&#8217;s a mixture of canned responses plus some reasoning ability, but the responses are quite unsurprising and easily predictable.</p>
<p>That being said, I am proud of a couple of things. I&#8217;m hereby planting several flags in the ground and, until proven wrong, am declaring that:</p>
<ul>
<li>I have created the first-ever Internet-facing entity (and for all I know, the first-ever reference in any medium) that explicitly answers the question of whether an orange is larger than the moon (as well as other such object comparisons for speed, size, loudness).</li>
<li>I have created the first Internet-facing chatbot that can answer questions like &#8220;The blue bottle is on the table. / Where is the bottle? / What is on the table? / What is blue? / What color is the bottle?&#8221;</li>
</ul>
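<p>To make the first claim concrete, here is a minimal Python sketch of answering relative-size questions from a hand-built ordinal list. It is an illustration under stated assumptions, not Chip&#8217;s actual code: the <code>SIZE_RANK</code> table, its entries and the question pattern are all hypothetical.</p>

```python
import re

# Hypothetical ordinal scale: a higher rank means a physically larger object.
SIZE_RANK = {"ant": 1, "orange": 2, "piano": 3, "house": 4, "moon": 5}

# Matches questions such as "Which is bigger: an orange or the moon?"
QUESTION = re.compile(
    r"Which is bigger: (?:an? |the )?(\w+) or (?:an? |the )?(\w+)\?"
)


def bigger(first, second):
    """Return the larger object, or None if either is unknown or they tie."""
    rank_first = SIZE_RANK.get(first)
    rank_second = SIZE_RANK.get(second)
    if rank_first is None or rank_second is None or rank_first == rank_second:
        return None
    return first if rank_first > rank_second else second


def answer(question):
    """Answer a size question, admitting ignorance when the list has a gap."""
    match = QUESTION.fullmatch(question.strip())
    if match is None:
        return "I don't know."
    winner = bigger(match.group(1), match.group(2))
    return winner if winner is not None else "I don't know."
```

<p>An ordinal rank is enough for &#8220;which is bigger&#8221; questions and is far easier to compile by hand than real measurements, at the cost of not being able to say how much bigger one object is than another.</p>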
<p>Chip may have major shortcomings, but at least he attempts to do certain things (albeit in a brittle, not-very-extensible manner) which will need to be done if we are ever to create intelligent machines.</p>
<p>Another positive takeaway from this contest is the great people I&#8217;ve met on the <a href="http://tech.groups.yahoo.com/group/Robitron/">Robitron</a> list and some wonderful discussions with these people both on and off list. Several people have been particularly encouraging. One person for whom I have great respect (and whom I&#8217;ll let step forward and identify himself or herself if s/he feels like it) said after my loss:</p>
<blockquote>
<p>This is why it&#8217;s necessary to work on [your chatbot] away from any constraints such as contests or commercial pressures, because inevitably you&#8217;ll be tempted to take short cuts and make quick fixes you&#8217;ll regret later on.</p>
</blockquote>
<p>&#8230;which is exactly what happened in my case. (Then again, if it hadn&#8217;t been for the contest, I might not have ever submitted an entry.)</p>
<p>As for the LPC itself: in the absence of feedback from the judges, I must assume that my initial assumptions were incorrect, which leaves me unclear on what the purpose of the contest is.</p>
<blockquote><p><b>Update (7 July 2011)</b>: Later, I did get feedback in the form of a nice PDF with actionable comments, so I guess I was being too harsh when I made that initial statement.</p></blockquote>
<p>I&#8217;m not saying that Chip should have won, but given my reasonable certitude that none of the entries have the proper infrastructure to be able to handle the sort of questions that Chip attempts to handle, nor do they innovate in other ways that I&#8217;m aware of, I&#8217;m not sure what will be proven when a winner is declared. What&#8217;s worse, I don&#8217;t see anything about the contest that encourages the kind of incremental innovation we need to solve the problem of intelligent machines. Although nothing in the contest&#8217;s structure explicitly discourages this, from what I&#8217;ve seen, it seems to encourage bots to gravitate towards a local minimum because the activation energy is too high to get to the real minimum. (I fear I&#8217;m mixing metaphors here but it&#8217;s late and I&#8217;m tired.) As far as I can see, it doesn&#8217;t appear that bots that attempt to tackle the fundamental issues are rewarded for this in any way unless they have a truckload of canned responses in addition. And every canned response which the bot truly doesn&#8217;t understand is a lie which is easily unraveled when the interrogator questions the bot further about the content of that canned response.</p>
<p>Again, I want to reiterate (like I&#8217;ve done many times) that I&#8217;m not making a value judgment on these pattern-and-response-based bots or saying that Chip is better. (On the contrary, such bots have proven themselves capable of handling simple Help Desk type scenarios and also have entertainment value. What&#8217;s more, I have unconditional admiration for all chatbot writers.) It&#8217;s just that the technology behind these bots is very well known and I am not seeing how they can scale and expand to handle the kind of scenarios Chip attempts to handle. (I&#8217;m also not saying that Chip can handle these well either, but he tries. Also, of the finalists, <a href="http://www.jabberwacky.com/">Jabberwacky</a>&#8217;s underlying technology is in a class by itself and I am unfamiliar with the details of the underlying technology, so my assumptions may be incorrect.)</p>
<p>If you&#8217;ve read this far, thanks for bearing with me. If any of the statements and affirmations I made are incorrect, please accept my apologies in advance and instead of yelling at me, help me to set the record straight.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chipvivant.com/2008/09/05/my-loebner-prize-contest-2008-reflections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

