Campus Event Calendar

Event Entry

What and Who

Automatic Extraction of Facts, Relations, and Entities for Web-Scale Knowledge Base Population

Ndapandula T. Nakashole, M.Sc.
Max-Planck-Institut für Informatik - D5
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience

Date, Time and Location

Thursday, 20 December 2012
60 Minutes
E1 4


Equipping machines with knowledge, through the construction of machine-readable knowledge bases, is a key asset for semantic search, machine translation, question answering, and other formidable challenges in artificial intelligence. However, human knowledge predominantly resides in books and other natural language text forms. This means that knowledge bases must be extracted and synthesized from natural language text.

When the source of text is the Web, extraction methods must cope with ambiguity, noise, scale, and updates. The goal of this dissertation is to develop knowledge base population methods that address the aforementioned characteristics of Web text. The dissertation makes three contributions. The first contribution is a method for mining high-quality facts at scale, through distributed constraint reasoning and a pattern representation model that is robust against noisy patterns. The second contribution is a method for mining a large, comprehensive collection of relation types beyond those commonly found in existing knowledge bases. The third contribution is a method for extracting facts from dynamic Web sources such as news articles and social media, where one of the key challenges is the constant emergence of new entities. All methods have been evaluated through experiments involving Web-scale text collections.


Petra Schaaf

Petra Schaaf, 12/13/2012 13:22 -- Created document.