Campus Event Calendar: Soumen Chakrabarti (09/13/2004 in 46.1

Campus Event Calendar

Campus Event Calendar:
- All Upcoming:
  - only for D1
  - only for D2
  - only for INET
  - only for D4
  - only for D5
  - only for D6
  - only for RG1
  - Mailing Lists
  - by Speaker
  - by Type
  - by Category
  - by Title
  - Calendar
  - RSS Feed
- History of Events:

Event Entry

What and Who

Is question answering an acquired skill?

Soumen Chakrabarti

IIT Bombay

Talk

AG 1, AG 2, AG 3, AG 4, AG 5

MPI Audience

Note: We use this to send email in the morning.

Date, Time and Location

Monday, 13 September 2004

11:00

-- Not specified --

46.1 - MPII

Rotunde 4th floor

Saarbrücken

Abstract

We present a question answering (QA) system which learns how to detect and rank answer passages by analyzing questions and their answers (QA pairs) provided as training data. Our key technique is to recover, from the question, fragments of what might have been posed as a structured query, had a suitable schema been available. One fragment comprises _selectors_: tokens that are likely to appear (almost) unchanged in an answer passage. The other fragment contains question tokens which give clues about the _answer_type_, and are expected to be _replaced_ in the answer passage by tokens which _specialize_ or _instantiate_ the desired answer type. Selectors are like constants in where-clauses in relational queries, and answer types are like column names. We propose a simple conditional exponential model over a pair of feature vectors, one derived from the question and the other derived from the a candidate passage. Features are extracted using a lexical network (WordNet) and surface context as in named entity extraction, except that there is no direct supervision available in the form of fixed entity types and their examples. We do not need any manually-designed question type system, supervised question classification, or customization of the lexical network. Using the exponential model, we filter candidate passages and see substantial improvement in the mean rank at which the first answer passage is found. With TREC QA data, our system achieves mean reciprocal rank (MRR) scores that compare favorably with the best scores in recent years, and generalizes from one corpus to another.

Contact

Gerhard Weikum

--email hidden

System used:

Meeting URL:

Meeting ID:

Passcode:

passcode not visible

Code Visible for:

logged in users only

Tags, Category, Keywords and additional notes

Note:

Homepage: http://www.cse.iitb.ac.in/~soumen Biography: Soumen Chakrabarti received his B.Tech in Computer Science from the Indian Institute of Technology, Kharagpur, in 1991 and his M.S. and Ph.D. in Computer Science from the University of California, Berkeley in 1992 and 1996. At Berkeley he worked on compilers and runtime systems for running scalable parallel scientific software on message passing multiprocessors. He was a Research Staff Member at IBM Almaden Research Center from 1996 to 1999, where he worked on the Clever Web search project and led the Focused Crawling project. In 1999 he moved as Assistant Professor to Department of Computer Science and Engineering at the Indian Institute of Technology, Bombay, where he has been an Associate professor since 2003. In Spring 2004 he was Visiting Associate professor at Carnegie-Mellon University. He has published in the WWW, SIGIR, SIGKDD, SIGMOD, VLDB, ICDE, SODA, STOC, SPAA and other conferences as well as Scientific American, IEEE Computer, VLDB and other journals. He holds eight US patents on Web-related inventions. He has served as vice-chair or program committee member for WWW, SIGIR, SIGKDD, VLDB, ICDE, SODA and other conferences, and guest editor or editorial board member for DMKD and TKDE journals. He is also author of a new book on Web Mining. His current research interests include question answering, Web analysis, monitoring and search, mining irregular and relational data, and textual data integration.

Uwe Brahm, 09/12/2004 01:04 -- Created document.

Imprint / Impressum | Data Protection / Datenschutzhinweis