Kelly Miyahara from Jeopardy Clue Crew at the CES09 set
When it comes to tackling a challenge as tough as answering a human question, the best computational approach may be to break the job down into multiple parts and run them all in parallel, IBM is betting.
IBM will be taking this strategy next month when its custom-built computer, nicknamed Watson, will compete in an episode of the Jeopardy game show against two previous champions.
While IBM has thus far been silent about Watson's exact configuration, Watson lead manager David Ferrucci recently shared a few insights with the IDG News Service about how the system was built to take on this formidable task.
For IBM, the Jeopardy challenge represents the next stage in mimicking human intelligence in computer form. In 1997, IBM's Deep Blue computer won a game of chess against grandmaster Garry Kasparov. Jeopardy will be an even tougher job, Ferrucci said.
"In chess, there is nothing tacit, nothing contextual," Ferrucci said. In contrast, the questions in a Jeopardy match assume an understanding of how people communicate, including the many references and allusions they use. "It's a huge challenge," he said.
"Natural language processing is so difficult because of the many different ways the same information can be expressed," Ferrucci said.
Watson's approach is to divide and conquer. "You have to look at the data from so many different perspectives and combine the [results], because you can never rely on there being only one way to express that content."
In the game show, contestants compete to correctly answer a series of questions. In a grammatical twist, the questions are phrased as answers and the contestants must provide their answers in the form of questions.
To make the contest even more difficult, often the questions are phrased in elliptic ways, forcing the contestants to think about what is really being asked. One typical question: "This measurement of cloth is equal to 40 yards." (Answer: What is a bolt?).
Only when host Alex Trebek finishes asking the question are the contestants allowed to indicate that they know the answer, by hitting a buzzer. The first contestant hitting the buzzer gets the first chance to answer the question.
Typically, it takes about three seconds for the announcer to finish asking the question, Ferrucci said. It is in this compact time frame that Watson must determine a plausible answer.
At first glance, the challenge might seem like an easy one. After all, Internet search engines do these sorts of searches millions of times a day. But it is not so easy, Ferrucci said.
"There is a misconception that [the computer] is just looking the answer up somewhere. I wish it were that easy," he said. Google and other Internet search services return only the documents that may provide the answers, not the answers themselves. And databases hold material that can be accessed only through precisely worded queries.
"The reality is that you have to interpret the question and relate the question to the millions of different ways that the answer might be expressed," Ferrucci said.
The software orchestrating the process of returning an answer is called DeepQA. It combines capabilities in natural language processing, machine learning and information retrieval.
When given a question, the software initially analyzes it, identifying any names, dates, geographic locations or other entities. It also examines the phrase structure and the grammar of the question for hints of what the question is asking.
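To make the analysis step concrete, here is a toy sketch in Python. It is not DeepQA's code: the regular expressions, the function name analyze_clue and the list of units are all assumptions chosen for illustration, and a production system would rely on trained language models rather than hand-written patterns.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions, not IBM's rules).
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
QUANTITY = re.compile(r"\b(\d+)\s+(yards|miles|feet|years)\b")
PROPER_NAME = re.compile(r"\b(?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)+\b")


def analyze_clue(clue: str) -> dict:
    """Pull a few surface features out of a clue."""
    words = clue.split()
    # Clues often open with "This <noun>", which hints at the answer type.
    focus = None
    if len(words) >= 2 and words[0].lower() in ("this", "these"):
        focus = words[1].lower()
    return {
        "focus": focus,
        "years": YEAR.findall(clue),
        "quantities": QUANTITY.findall(clue),
        "names": PROPER_NAME.findall(clue),
    }


features = analyze_clue("This measurement of cloth is equal to 40 yards.")
print(features["focus"], features["quantities"])
```

For the cloth clue quoted earlier, the sketch picks out "measurement" as the kind of thing being asked for and "40 yards" as a quantity, which is the sort of hint later stages could use to filter candidate answers.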
Sometimes the question is an obvious one, and a query to a specific database will do the trick. Most times, however, the question will kick off five or 10 searches across different data sources, each an interpretation of what the question might be.
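The fan-out of several interpretations across several data sources might look something like the following Python sketch. The two in-memory "sources", the keyword matching and the names search and fan_out are hypothetical stand-ins; the article does not describe how DeepQA actually queries its data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory data sources standing in for encyclopedias, news, etc.
SOURCES = {
    "textiles": {"bolt": "a bolt of cloth is a roll, often 40 yards long"},
    "weather": {"bolt": "a bolt of lightning is an electrical discharge"},
}


def search(source_name: str, query: str) -> list:
    """Return (candidate, source) pairs whose passage mentions every query term."""
    terms = query.lower().split()
    hits = []
    for candidate, passage in SOURCES[source_name].items():
        if all(term in passage for term in terms):
            hits.append((candidate, source_name))
    return hits


def fan_out(queries: list) -> list:
    """Run every (source, query) pair concurrently and merge the hits."""
    jobs = [(source, query) for source in SOURCES for query in queries]
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(lambda job: search(*job), jobs))
    return [hit for hits in results for hit in hits]


# Two interpretations of the same clue, each tried against every source.
candidates = fan_out(["cloth 40 yards", "measurement cloth"])
print(candidates)
```

Each search is independent of the others, which is what makes this kind of work a natural fit for running in parallel.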
For this challenge, IBM has amassed an immense amount of reference material, including multiple encyclopedias, millions of news stories, novels, plays and other digital books. Some of the material is in structured databases; other material resides in unstructured text files.
The process is iterative. A set of results may require a new set of searches to be undertaken. "So, now you might have hundreds of processes, each generating additional candidate answers. Imagine that fan-out," Ferrucci said. An end-result may have 10,000 sets of possible questions and their corresponding answers.
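One simple way to picture this iteration is a breadth-first expansion in which each result can trigger follow-up searches, up to a fixed number of rounds. The Python sketch below assumes a hypothetical FOLLOW_UPS table and a made-up round limit; it illustrates the fan-out idea only and is not a description of Watson's actual control flow.

```python
from collections import deque

# Hypothetical table: a query whose results suggest further searches.
FOLLOW_UPS = {
    "cloth measurement": ["bolt of cloth", "ell of cloth"],
    "bolt of cloth": ["bolt length yards"],
}


def expand(initial_query: str, max_rounds: int = 3) -> list:
    """Breadth-first expansion of queries, capped at max_rounds of follow-ups."""
    seen = {initial_query}
    order = [initial_query]
    frontier = deque([(initial_query, 0)])
    while frontier:
        query, depth = frontier.popleft()
        if depth == max_rounds:
            continue  # do not expand past the round limit
        for follow_up in FOLLOW_UPS.get(query, []):
            if follow_up not in seen:
                seen.add(follow_up)
                order.append(follow_up)
                frontier.append((follow_up, depth + 1))
    return order


all_queries = expand("cloth measurement")
print(all_queries)
```

The cap on rounds matters: without one, a system with only a few seconds to respond could keep generating new searches indefinitely.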
Of course, Jeopardy requires only a single answer, preferably the right one. So once all the possible answers are collected, the system uses about 100 algorithms to rate each one, assessing it from different perspectives: Does the answer match the approximate time frame that the question hints at? Is it in the right geographic region? Does the grammatical form of the answer match what is required by the question? A categorical check is done: If the question is looking for a kind of liquid, is the answer a kind of liquid?
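A rough Python sketch of rating candidates from several perspectives follows. It uses only two scorers where Ferrucci describes about 100, and the Candidate fields, the weights and the weighted-average combination are all assumptions made for illustration; the article does not say how DeepQA combines its scores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    text: str
    category: str               # e.g. "measurement" or "liquid"
    year: Optional[int] = None  # date associated with the candidate, if any


def category_match(cand: Candidate, clue: dict) -> float:
    """Categorical check: is the candidate the kind of thing the clue asks for?"""
    return 1.0 if cand.category == clue["category"] else 0.0


def time_frame_match(cand: Candidate, clue: dict) -> float:
    """Time-frame check; neutral when either side has no date."""
    if cand.year is None or clue.get("year") is None:
        return 0.5
    return 1.0 if abs(cand.year - clue["year"]) <= 10 else 0.0


# (scorer, weight) pairs; the weights are invented for this example.
SCORERS = [(category_match, 0.7), (time_frame_match, 0.3)]


def confidence(cand: Candidate, clue: dict) -> float:
    """Weighted average of all scorer outputs, in the range 0 to 1."""
    total_weight = sum(weight for _, weight in SCORERS)
    return sum(weight * scorer(cand, clue) for scorer, weight in SCORERS) / total_weight


def rank(cands: list, clue: dict) -> list:
    """Order candidates from most to least confident."""
    return sorted(cands, key=lambda c: confidence(c, clue), reverse=True)


clue = {"category": "measurement", "year": None}
ranked = rank(
    [Candidate("lightning bolt", "weather"), Candidate("bolt", "measurement")],
    clue,
)
print([c.text for c in ranked])
```

Because each scorer looks at a candidate independently, new checks can be added to the list without changing the ranking logic.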
If the answer with the highest score meets a preliminary threshold of confidence, that answer will be submitted.
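The threshold rule can be sketched in a few lines of Python. The function name choose_answer and the 0.5 cutoff are assumptions; the article gives no figure for Watson's actual threshold.

```python
def choose_answer(scored, threshold=0.5):
    """Return the top-scoring candidate if it clears the threshold, else None.

    `scored` is a list of (candidate, confidence) pairs. The default
    threshold of 0.5 is an arbitrary value chosen for illustration.
    """
    if not scored:
        return None
    best, best_confidence = max(scored, key=lambda pair: pair[1])
    return best if best_confidence >= threshold else None


print(choose_answer([("bolt", 0.85), ("lightning bolt", 0.15)]))
print(choose_answer([("ell", 0.30), ("yard", 0.25)]))
```

In the second call no candidate is confident enough, so the function returns None, the equivalent of a contestant choosing not to buzz in.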
This approach, by itself, would take a single CPU-based machine about two hours to formulate an answer to a single question, Ferrucci said. Here is where the IBM hardware comes in handy. Watson itself is composed of two racks of IBM Power7 System servers, or about 2,500 processor cores, all acting in harmony in a clustered configuration.
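A back-of-the-envelope calculation shows why the core count matters. The Python sketch below takes the two figures Ferrucci cites (about two hours on one CPU, about 2,500 cores) and assumes perfect linear scaling, which real clusters rarely achieve. The second function applies Amdahl's law with a made-up parallel fraction to show how sensitive the result is to any work that cannot be split up; neither number is an IBM measurement.

```python
def ideal_time(serial_seconds: float, cores: int) -> float:
    """Run time if the work divides perfectly across all cores."""
    return serial_seconds / cores


def amdahl_time(serial_seconds: float, cores: int, parallel_fraction: float) -> float:
    """Run time under Amdahl's law: only parallel_fraction of the work speeds up."""
    return serial_seconds * ((1 - parallel_fraction) + parallel_fraction / cores)


single_cpu_seconds = 2 * 60 * 60  # "about two hours" on a single CPU
cores = 2500                      # "about 2,500 processor cores"

# Perfect scaling: roughly 2.88 seconds, close to the three-second window.
print(ideal_time(single_cpu_seconds, cores))

# Illustrative only: if just 0.1% of the work were serial, the answer
# would take roughly 10 seconds instead.
print(amdahl_time(single_cpu_seconds, cores, 0.999))
```

Under the ideal assumption, two hours of work spread over 2,500 cores comes to just under three seconds, which lines up with the time it takes the host to read a clue.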
Each socket, which can accommodate either six- or eight-core processors, is able to handle 32 independent threads, said Tom Rosamilia, IBM General Manager of Power and z Systems. Each thread can host a separate search, or some other individual action.
"The great advantage that the hardware provides is the ability to run multi-threaded multicore" processes, Rosamilia said. In other words, running the software across multiple servers dramatically cuts the execution time.
Despite all this hardware muscle and software prowess, Watson's victory on the game floor is anything but assured. Last June, The New York Times reported that the system still had to be improved quite a bit to match fast-thinking Jeopardy aces.
But even as Ferrucci and his team work feverishly to make last-minute adjustments, the lessons they learn will have wider applicability, both for IBM and for the IT industry in general. Ultimately, IBM plans to use this software to build commercial systems that could answer specific questions in selected fields, such as health care, tech support, and the legal field.
"At the end of the day, whether Watson beats Jeopardy champions Ken Jennings and Brad Rutter is relatively unimportant," wrote Charles King, head of Pund-IT, in a weekly newsletter that the IT analysis firm issues. "However, a computing system demonstrating a form of essentially cognitive capabilities represents a huge technological step that will likely foreshadow profound developments in commercial IT systems and solutions."