David Ferrucci – Information Science and Technology Colloquium Series

David Ferrucci
An Overview of DeepQA for the Jeopardy! Challenge

Wednesday, November 17, 2010
Building 3 Auditorium - 11:00 AM
(Coffee and cookies at 10:30 AM)

Computer systems that can directly and accurately answer people's questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Open-domain question answering holds tremendous promise for facilitating informed decision making over vast volumes of natural language content. Applications in business intelligence, healthcare, customer support, enterprise knowledge management, social computing, science and government would all benefit from deep language processing. The DeepQA project (www.ibm.com/deepqa) is aimed at exploring how advancing and integrating Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), Knowledge Representation and Reasoning (KR&R), and massively parallel computation can greatly advance open-domain automatic Question Answering.

An exciting proof-point in this challenge is to develop a computer system that can successfully compete against top human players at the Jeopardy! quiz show. Attaining champion-level performance at Jeopardy! requires a computer system to rapidly and accurately answer rich open-domain questions, and to predict its own performance on any given category/question. The system must deliver high degrees of precision and confidence over a very broad range of knowledge and natural language content with a 3-second response time. To do this, DeepQA gathers evidence for and evaluates many competing hypotheses. A key to success is automatically learning and combining accurate confidences across an array of complex algorithms and over different dimensions of evidence. Accurate confidences are needed to know when to "buzz in" against your competitors and how much to bet. High precision and accurate confidence computations are just as critical for providing real value in business settings, where helping users focus on the right content sooner and with greater confidence can make all the difference. The need for speed and high precision demands a massively parallel computing platform capable of generating, evaluating and combining thousands of hypotheses and their associated evidence. In this talk I will introduce the audience to the Jeopardy! Challenge. I will describe our technical approach and how it is founded on a massively parallel platform using UIMA-AS, and I will touch on the promise of DeepQA beyond Jeopardy!.

Dr. David Ferrucci is a Research Staff Member at IBM's T.J. Watson Research Center, where he leads the Semantic Analysis and Integration department. His research focuses on technologies for discovering knowledge in natural language and for leveraging the results in a variety of intelligent search and knowledge management solutions. He has been the Principal Investigator (PI) on several government-funded research programs on automatic question answering, intelligent systems and scalable text analytics. His team consists of 25 researchers and software engineers specializing in the areas of Natural Language Processing (NLP), Software Architecture, Information Retrieval, Machine Learning and Knowledge Representation and Reasoning (KR&R). Dr. Ferrucci, as chief architect, led the UIMA project at IBM and chaired the UIMA standards committee at OASIS. UIMA is a software framework and open standard used by industry and academia for integrating, deploying and scaling advanced text and multi-modal analytics. The UIMA framework is deployed in IBM products and has been contributed to Apache open-source to facilitate broader adoption and development. UIMA helped lay the foundation for doing large-scale, collaborative unstructured information research.

In 2007, Dr. Ferrucci took on the Jeopardy! Challenge, tasked with creating a computer system that can rival human champions at the game of Jeopardy!. As the PI for the exploratory research project dubbed DeepQA, he focused on advancing automatic, open-domain question answering using massively parallel, evidence-based hypothesis generation and evaluation. He explored the feasibility of, won support for, and has set and driven the technical agenda for the Jeopardy! Challenge. He engaged top university researchers in the field to help explore better ways to openly and collaboratively accelerate research at a workshop on the open advancement of Question Answering. By building on UIMA and on key university collaborations, and by taking bold research, engineering and management steps, he led his team to integrate and advance many search, NLP and semantic technologies to deliver results that have outperformed all expectations and have demonstrated world-class performance at a task previously thought insurmountable with the current state of the art. Watson, the computer system built by Ferrucci's team, is now competing with top Jeopardy! players. Next steps are to demonstrate how DeepQA can help make dramatic advances for intelligent decision support in areas including medicine, government and law. Dr. Ferrucci graduated from Manhattan College with a BS in Biology and from Rensselaer Polytechnic Institute in 1994 with a PhD in Computer Science specializing in knowledge representation and reasoning. He is published in the areas of AI, KR&R, NLP and automatic question answering.

IS&T Colloquium Committee Host: Tony Gualtieri

Sign language interpreter upon request: 301-286-7040