Campus Event Calendar

Event Entry

New for: D1, D2, D3, D4

What and Who

Question Answering Technology: Getting to Know the New Kid on the Block

Marc Light
The MITRE Corporation, Boston
Computerlinguistisches Kolloquium
AG 1, AG 2, AG 3, AG 4  
Expert Audience

Date, Time and Location

Thursday, 14 February 2002
16:15
-- Not specified --
17.3 - Computerlinguistik
Seminar Room
Saarbrücken

Abstract


Question answering (QA) systems aim to allow users to ask questions
such as "which New England communities have reported outbreaks of
encephalitis this year?" and to receive succinct answers. Such
systems can be viewed as fine-grained search engines that return short
snippets of text containing the answer to a question as opposed to a
list of relevant documents.
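
The "fine-grained search engine" view can be made concrete with a
minimal sketch: rank individual sentences, rather than whole documents,
by word overlap with the question and return the best one as a snippet.
The function names and toy data below are illustrative assumptions, not
taken from the talk.

    import re

    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def answer_snippet(question, documents):
        """Return the sentence sharing the most words with the question."""
        q_words = set(tokenize(question))
        best_sentence, best_overlap = None, -1
        for doc in documents:
            for sentence in doc.split("."):
                overlap = len(q_words & set(tokenize(sentence)))
                if overlap > best_overlap:
                    best_sentence, best_overlap = sentence.strip(), overlap
        return best_sentence

    docs = ["Outbreaks of encephalitis were reported in two New England "
            "towns. Officials urged residents to avoid mosquitoes."]
    print(answer_snippet("Which New England communities reported "
                         "outbreaks of encephalitis?", docs))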

For the past three years, the National Institute of Standards and
Technology has hosted an evaluation of QA systems funded by DARPA and
ARDA. The best system this year was able to provide a correct answer
among its top five responses for 70% of the questions in the test
set. The test questions were taken from search engine logs and the
answers were to be found in a document collection consisting of over a
million newswire-like texts.
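
In the spirit of that evaluation, the headline number can be computed as
a simple "answer in the top five" rate; the official TREC QA measure was
mean reciprocal rank over those same top five responses. A minimal
sketch of both, where the data layout and the judge function are assumed
for illustration:

    def top5_accuracy(runs, judge):
        """Fraction of questions with a correct answer in the top five.

        runs: list of (question, ranked_responses) pairs, best first.
        judge: (question, response) -> True if the response is correct.
        """
        hits = sum(any(judge(q, r) for r in responses[:5])
                   for q, responses in runs)
        return hits / len(runs)

    def mean_reciprocal_rank(runs, judge):
        """Average of 1/rank of the first correct response (0 if none)."""
        total = 0.0
        for q, responses in runs:
            for rank, r in enumerate(responses[:5], start=1):
                if judge(q, r):
                    total += 1.0 / rank
                    break
        return total / len(runs)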

In general, the performance of these systems outstripped
expectations. Despite this success, there is little understanding of
why these systems work: which aspects of the systems and of the
evaluation were crucial to their performance, what would cause
performance to decline, and which aspects account for their errors.

In this talk, we take a detailed look at the performance of components
of an idealized question answering system on two different tasks: the
TREC Question Answering task and a set of reading comprehension
exams. We carry out three types of analysis: inherent properties of
the data, feature analysis, and performance bounds. Based on these
analyses we explain some of the performance results of the current
generation of QA systems and make predictions about future work. In
particular, we present four findings: (1) QA system performance is
correlated with answer redundancy, (2) relative overlap scores are
more effective than absolute overlap scores, (3) equivalence classes
on scoring functions can be used to quantify performance bounds, and
(4) perfect answer typing still leaves a great deal of ambiguity for a
QA system because sentences often contain several items of the same type.
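
Finding (2) contrasts two families of scoring functions. The abstract
does not give their exact form; a generic sketch of the distinction,
assuming simple bag-of-words overlap between a question and a candidate
sentence:

    def absolute_overlap(question_words, sentence_words):
        # Raw count of shared words: biased toward long sentences,
        # which match more question words by chance.
        return len(set(question_words) & set(sentence_words))

    def relative_overlap(question_words, sentence_words):
        # Shared-word count normalized by sentence length, so a short
        # sentence matching most of the question outranks a long one
        # with the same raw count.
        if not sentence_words:
            return 0.0
        shared = len(set(question_words) & set(sentence_words))
        return shared / len(set(sentence_words))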

This is joint work with Gideon Mann, Ellen Riloff, and Eric Breck.

If you would like to meet with the speaker, please contact:

Detlef Prescher

This seminar series is jointly organized by the Department of
Computational Linguistics and Phonetics and the European Post-Graduate
College in Language Technology and Cognitive Systems.

A current version of the program for this term can be found at:

http://www.coli.uni-sb.de/colloquium/

Contact

--email hidden
