Campus Event Calendar

Event Entry

What and Who

Is question answering an acquired skill?

Soumen Chakrabarti
IIT Bombay
Talk
AG 1, AG 2, AG 3, AG 4, AG 5  
MPI Audience

Date, Time and Location

Monday, 13 September 2004
11:00
-- Not specified --
46.1 - MPII
Rotunde 4th floor
Saarbrücken

Abstract

We present a question answering (QA) system which learns how to detect
and rank answer passages by analyzing questions and their answers (QA
pairs) provided as training data. Our key technique is to recover, from
the question, fragments of what might have been posed as a structured
query, had a suitable schema been available. One fragment comprises
_selectors_: tokens that are likely to appear (almost) unchanged in an
answer passage. The other fragment contains question tokens which give
clues about the _answer_type_, and are expected to be _replaced_ in the
answer passage by tokens which _specialize_ or _instantiate_ the desired
answer type. Selectors are like constants in where-clauses in relational
queries, and answer types are like column names. We propose a simple
conditional exponential model over a pair of feature vectors, one
derived from the question and the other derived from a candidate
passage.  Features are extracted using a lexical network (WordNet) and
surface context as in named entity extraction, except that there is no
direct supervision available in the form of fixed entity types and their
examples. We do not need any manually-designed question type system,
supervised question classification, or customization of the lexical
network. Using the exponential model, we filter candidate passages and
see substantial improvement in the mean rank at which the first answer
passage is found. With TREC QA data, our system achieves mean reciprocal
rank (MRR) scores that compare favorably with the best scores in recent
years, and generalizes from one corpus to another.
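
As a rough illustration of the two quantities the abstract mentions, the
sketch below scores candidate passages with a log-linear (exponential) model
over pairs of question and passage features, and computes mean reciprocal
rank. The feature names, the weight table, and the choice to normalize over
the candidate set are all assumptions made for this example; the abstract
does not give the actual parameterization used in the talk.

```python
import math


def passage_probabilities(question_feats, passages, weights):
    """Score candidate passages with a log-linear model (illustrative only).

    question_feats: set of question feature names (hypothetical).
    passages: list of sets of passage feature names (hypothetical).
    weights: dict mapping (question_feature, passage_feature) to a weight;
             in a real system these would be learned from QA pairs.
    Returns one probability per passage, normalized over the candidates.
    """
    scores = [
        sum(weights.get((qf, pf), 0.0) for qf in question_feats for pf in p)
        for p in passages
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_reciprocal_rank(first_answer_ranks):
    """MRR over questions; each rank is 1-based, None if no answer was found."""
    reciprocal = [0.0 if r is None else 1.0 / r for r in first_answer_ranks]
    return sum(reciprocal) / len(reciprocal)


if __name__ == "__main__":
    # Toy, made-up features: an answer-type clue paired with passage context.
    weights = {("atype:city", "ctx:location"): 2.0}
    probs = passage_probabilities(
        {"atype:city"}, [{"ctx:location"}, {"ctx:person"}], weights
    )
    print(probs)
    print(mean_reciprocal_rank([1, 2, None]))
```

Here a higher probability would move a passage up the ranking, and MRR
rewards finding the first answer passage at a low rank.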

Contact

Gerhard Weikum

Tags, Category, Keywords and additional notes

Homepage:
http://www.cse.iitb.ac.in/~soumen

Biography:
Soumen Chakrabarti received his B.Tech in Computer Science from the
Indian Institute of Technology, Kharagpur, in 1991 and his M.S. and
Ph.D. in Computer Science from the University of California, Berkeley in
1992 and 1996. At Berkeley he worked on compilers and runtime systems
for running scalable parallel scientific software on message passing
multiprocessors. He was a Research Staff Member at IBM Almaden Research
Center from 1996 to 1999, where he worked on the Clever Web search
project and led the Focused Crawling project. In 1999 he joined the
Department of Computer Science and Engineering at the Indian Institute of
Technology, Bombay, as Assistant Professor, where he has been an
Associate Professor since 2003. In Spring 2004 he was a Visiting Associate
Professor at Carnegie Mellon University. He has published in the WWW,
SIGIR, SIGKDD, SIGMOD, VLDB, ICDE, SODA, STOC, SPAA and other
conferences as well as Scientific American, IEEE Computer, VLDB and
other journals. He holds eight US patents on Web-related inventions. He
has served as vice-chair or program committee member for WWW, SIGIR,
SIGKDD, VLDB, ICDE, SODA and other conferences, and guest editor or
editorial board member for the DMKD and TKDE journals. He is also the author of
a new book on Web Mining. His current research interests include
question answering, Web analysis, monitoring and search, mining
irregular and relational data, and textual data integration.

Uwe Brahm, 09/12/2004 01:04 -- Created document.