MPI-INF Logo
Campus Event Calendar

Event Entry

What and Who

Question Answering over Knowledge Bases with Continuous Learning

Abdalghani Abujabal
MMCI
Promotionskolloquium
AG 1, AG 2, AG 3, INET, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Friday, 12 April 2019
11:00
60 Minutes
E1 4
024
Saarbrücken

Abstract

Answering complex natural language questions with crisp answers is crucial towards satisfying the information needs of advanced users. With the rapid growth of knowledge bases (KBs) such as Yago and Freebase, this goal has become attainable by translating questions into formal queries like SPARQL queries. Such queries can then be evaluated over knowledge bases to retrieve crisp answers. To this end, two research issues arise:

(i) how to develop methods that are robust to lexical and syntactic variations in questions and can handle complex questions, and
(ii) how to design and curate datasets to advance research in question answering. In this dissertation, we make the following contributions in the area of question answering (QA).  For issue (i), we present QUINT,  an approach for answering natural language questions over knowledge bases using automatically learned templates. Templates are an important asset for QA over KBs, simplifying the semantic parsing of input questions and generating formal queries for interpretable answers. We also introduce NEQA, a framework for continuous learning for QA over KBs. NEQA starts with a small seed of training examples in the form of question-answer pairs, and improves its performance over time. NEQA combines both syntax, through template-based answering, and semantics, via a semantic similarity function. For issues (i) and (ii), we present TEQUILA, a framework for answering complex questions with explicit and implicit temporal conditions over KBs. TEQUILA is built on a rule-based framework that detects and decomposes temporal questions into simpler sub-questions that can be answered by standard KB-QA systems. TEQUILA reconciles the results of sub-questions into final answers. TEQUILA is accompanied with a dataset called TempQuestions, which consists of 1,271 temporal questions with gold-standard answers over Freebase. This collection is derived by judiciously selecting time-related questions from existing QA datasets. For issue (ii), we publish ComQA, a large-scale manually-curated dataset for QA. ComQA contains questions that represent real information needs and exhibit a wide range of difficulties such as the need for temporal reasoning, comparison, and compositionality. ComQA contains paraphrase clusters of semantically-equivalent questions that can be exploited by QA systems.

Contact

Petra Schaaf
5000
--email hidden
passcode not visible
logged in users only

Petra Schaaf, 04/02/2019 09:24 -- Created document.