IBM at 100: A Computer Called Watson
Watson is an efficient analytical engine that pulls many sources of data together in real time, leverages natural language processing, discovers an insight, and determines a degree of confidence.
In my continuing series of IBM at 100 achievements, I saved the Watson achievement posting for today. In a historic event beginning tonight (February 2011), IBM’s Watson computer will compete on Jeopardy! against the TV quiz show’s two biggest all-time champions. Watson is a supercomputer running software called DeepQA, developed by IBM Research. While the grand challenge driving the project is to win on Jeopardy!, the broader goal of Watson is to create a new generation of technology that can find answers in unstructured data more effectively than standard search technology.
Watson does a remarkable job of understanding a tricky question and finding the best answer. IBM’s scientists have been quick to say that Watson does not actually think. “The goal is not to model the human brain,” said David Ferrucci, who spent 15 years working at IBM Research on natural language problems and finding answers amid unstructured information. “The goal is to build a computer that can be more effective in understanding and interacting in natural language, but not necessarily the same way humans do it.”
Computers have never been good at finding answers. Search engines don’t answer a question–they deliver thousands of search results that match keywords. University researchers and company engineers have long worked on question answering software, but the very best could only comprehend and answer simple, straightforward questions (How many Oscars did Elizabeth Taylor win?) and would typically still get them wrong nearly one third of the time. That wasn’t good enough to be useful, much less beat Jeopardy! champions.
The questions on this show are full of subtlety, puns and wordplay—the sorts of things that delight humans but choke computers. “What is The Black Death of a Salesman?” is the correct response to the Jeopardy! clue, “Colorful fourteenth century plague that became a hit play by Arthur Miller.” The only way to get to that answer is to put together pieces of information from various sources, because the exact answer is not likely to be written anywhere.
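To make the wordplay concrete, here is a minimal sketch of how two separately retrieved phrases could be spliced on a shared word to form a "before and after" style response. The function name splice and the word-overlap rule are my own illustration, not a description of how DeepQA actually handles such clues.

```python
from typing import Optional


def splice(first: str, second: str) -> Optional[str]:
    """Join two phrases where the end of `first` overlaps the start of `second`.

    Hypothetical helper for illustration only: it compares whole words,
    ignoring case, and prefers the longest overlap.
    """
    a = first.split()
    b = second.split()
    # Try the longest possible overlap first, then shorter ones.
    for k in range(min(len(a), len(b)), 0, -1):
        tail = [w.lower() for w in a[-k:]]
        head = [w.lower() for w in b[:k]]
        if tail == head:
            # Keep the shared words once, then append the rest of `second`.
            return " ".join(a + b[k:])
    return None  # No shared word, so nothing to splice.


# "The Black Death" (the plague) + "Death of a Salesman" (the play)
print(splice("The Black Death", "Death of a Salesman"))
```

The hard part for a real system is not the splice itself but finding the two phrases from the clue in the first place, which is where the evidence gathering described later comes in.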
Watson leverages IBM Content Analytics for part of the natural language processing. Watson runs on a cluster of Power 750™ computers: ten racks holding 90 servers, for a total of 2880 processor cores. It’s really a room lined with black cabinets stuffed with thousands of processor cores plus storage systems that can hold the equivalent of about one million books worth of information. Over a period of years, Watson was fed mountains of information, including text from commercial sources, such as the World Book Encyclopedia, and sources that allow open copying of their content, such as Wikipedia and books from Project Gutenberg. Learn more about the technology under the covers on my previous posting 10 Things You Need to Know About the Technology Behind Watson.
When a question is put to Watson, more than 100 algorithms analyze the question in different ways, and find many different plausible answers–all at the same time. Yet another set of algorithms ranks the answers and gives them a score. For each possible answer, Watson finds evidence that may support or refute that answer. So for each of hundreds of possible answers it finds hundreds of bits of evidence and then with hundreds of algorithms scores the degree to which the evidence supports the answer. The answer with the best evidence assessment will earn the most confidence. The highest-ranking answer becomes the answer. However, during a Jeopardy! game, if the highest-ranking possible answer isn’t rated high enough to give Watson enough confidence, Watson decides not to buzz in and risk losing money if it’s wrong. The Watson computer does all of this in about three seconds.
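The flow described above (gather evidence for each candidate answer, score it, rank the candidates, and buzz in only when confidence is high enough) can be sketched in a few lines. This is a toy illustration under my own assumptions: the Candidate class, the plain averaging of evidence scores, and the 0.5 buzz threshold are all hypothetical, and the real system combines its many scorers in far more sophisticated ways.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """One possible answer plus the evidence scores collected for it."""
    answer: str
    # Each score is assumed to lie in [0, 1]; higher means more support.
    evidence_scores: List[float] = field(default_factory=list)

    def confidence(self) -> float:
        # Hypothetical combination rule: a plain average of the scores.
        if not self.evidence_scores:
            return 0.0
        return sum(self.evidence_scores) / len(self.evidence_scores)


def decide(candidates: List[Candidate],
           buzz_threshold: float = 0.5) -> Tuple[Optional[str], float]:
    """Pick the highest-confidence candidate, or decline to buzz in.

    Returns (answer, confidence); answer is None when the best
    candidate falls below the (assumed) buzz threshold.
    """
    if not candidates:
        return None, 0.0
    best = max(candidates, key=lambda c: c.confidence())
    conf = best.confidence()
    if conf < buzz_threshold:
        return None, conf  # Not confident enough to risk losing money.
    return best.answer, conf


strong = [Candidate("Moscow", [0.9, 0.8, 0.7]), Candidate("Kiev", [0.2, 0.4])]
weak = [Candidate("A", [0.3, 0.2]), Candidate("B", [0.1])]
print(decide(strong))  # buzzes in with the top-ranked answer
print(decide(weak))    # declines, since no candidate clears the threshold
```

Separating the ranking step from the buzz decision mirrors the description above: the best-supported answer always wins the ranking, but whether to answer at all is a separate judgment about risk.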
By late 2010, in practice games at IBM Research in Yorktown Heights, N.Y., Watson was good enough at finding the correct answers to win about 70 percent of games against former Jeopardy! champions. Then in early 2011, Watson went up against Jeopardy! superstars Ken Jennings and Brad Rutter.
Watson’s question-answering technology is expected to evolve into a commercial product. “I want to create something that I can take into every other retail industry, in the transportation industry, you name it,” John Kelly, who runs IBM Research, told The New York Times. “Any place where time is critical and you need to get advanced state-of-the-art information to the front decision-makers. Computers need to go from just being back-office calculating machines to improving the intelligence of people making decisions.”
When you’re looking for an answer to a question, where do you turn? If you’re like most people these days, you go to a computer, phone or mobile device, and type your question into a search engine. You’re rewarded with a list of links to websites where you might find your answer. If that doesn’t work, you revise your search terms until you find the answer. We’ve come a long way since the time of phone calls and visits to the library to find answers.
But what if you could just ask your computer the question, and get an actual answer rather than a list of documents or websites? Question answering (QA) computing systems are being developed to understand simple questions posed in natural language, and provide the answers in textual form. You ask “What is the capital of Russia?” The computer answers “Moscow,” based on the information that has been loaded into it.
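As a rough picture of what such a simple QA system does, here is a minimal sketch that answers exactly one question form from facts loaded in advance. The CAPITALS table, the regular expression, and the answer function are assumptions made up for this example, not any real product's design.

```python
import re
from typing import Optional

# The "information that has been loaded into it": a tiny, hand-made fact table.
CAPITALS = {"russia": "Moscow", "france": "Paris", "japan": "Tokyo"}

# Only one question shape is understood: "What is the capital of X?"
_PATTERN = re.compile(r"^\s*what is the capital of (.+?)\s*\?*\s*$", re.IGNORECASE)


def answer(question: str) -> Optional[str]:
    """Return the stored answer, or None if the question is not understood."""
    match = _PATTERN.match(question)
    if match is None:
        return None  # Unknown phrasing: the system cannot answer.
    return CAPITALS.get(match.group(1).strip().lower())


print(answer("What is the capital of Russia?"))
```

A system like this breaks as soon as the phrasing changes or the fact is missing from the table, which is exactly the gap that motivates a deeper approach.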
IBM is taking this one step further, developing the Watson computer to understand the actual meaning behind words, distinguish between relevant and irrelevant content, and ultimately demonstrate confidence to deliver precise final answers. Because of its deeper understanding of language, it can process and answer more complex questions that include puns, irony and riddles common in natural language. On February 14–16, 2011, IBM’s Watson computer will be put to the test, competing in three episodes of Jeopardy! against the two most successful players in the quiz show’s history: Ken Jennings and Brad Rutter.
The full text of this article can be found on IBM at 100: http://www.ibm.com/ibm100/us/en/icons/watson/
As for me … I am anxiously waiting to see what happens starting tonight. See my previous blog postings on Watson at: “What is Content Analytics?, Alex”, 10 Things You Need to Know About the Technology Behind Watson and Goodbye Search … It’s About Finding Answers … Enter Watson vs. Jeopardy!
Good luck tonight to Watson, Ken Jennings and Brad Rutter … may the best man win (so to speak)!