Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Methods and Tools for Summarization of Entities and Facts in Knowledge Bases
Speaker:Tomasz Tylenda
coming from:Max-Planck-Institut für Informatik - D5
Speakers Bio:
Event Type:Promotionskolloquium
Visibility:D1, D2, D3, D4, D5, SWS, RG1, MMCI
Language:English
Date, Time and Location
Date:Monday, 28 September 2015
Time:11:00
Duration:-- Not specified --
Location:Saarbrücken
Building:E1 4
Room:0.24
Abstract
Knowledge bases have become key assets for search and analytics over large document corpora.
They are used in applications ranging from highly specialized tasks in bioinformatics to
general-purpose search engines. The large amount of structured knowledge they contain calls
for effective summarization and ranking methods.

The goal of this dissertation is to develop methods for automatic summarization of entities in knowledge
bases, which also involves augmenting them with information about the importance of particular facts
about entities of interest. We make two main contributions.

First, we develop a method to generate a summary of information about an entity using the type
information contained in a knowledge base. We call such a summary a semantic snippet. Our method
relies on having importance information about types, which is external to the knowledge base.
We show that such information can be obtained using human computation platforms, such as Amazon
Mechanical Turk, or extracted from the edit history of encyclopedic articles in Wikipedia.
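
As a rough illustration of the idea only, and not the method developed in the dissertation, the sketch below builds a snippet by keeping the entity's types with the highest externally supplied importance scores. The function name `semantic_snippet`, the example types, and the scores are all hypothetical.

```python
def semantic_snippet(entity_types, type_importance, k=3):
    """Return the k types of an entity with the highest importance score.

    entity_types: types the knowledge base assigns to the entity.
    type_importance: external scores (assumed here to come from
    crowdsourcing or edit histories); unknown types default to 0.0.
    """
    ranked = sorted(
        entity_types,
        key=lambda t: type_importance.get(t, 0.0),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical types and scores, invented for this example.
types = ["person", "physicist", "Nobel laureate", "vegetarian"]
importance = {
    "person": 0.1,
    "physicist": 0.9,
    "Nobel laureate": 0.8,
    "vegetarian": 0.2,
}
print(semantic_snippet(types, importance, k=2))
```

Very general types such as "person" score low in this toy data, so the snippet prefers the more informative ones.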

Our second contribution is linking facts to their occurrences in supplementary documents.
Information retrieval on text uses the frequency of terms in a document to judge their importance.
Such an approach, while natural, is difficult to apply to facts extracted from text, because information
extraction is only concerned with finding any one occurrence of a fact. To overcome this limitation, we
propose linking known facts with all their occurrences in a process we call fact spotting. We develop
two solutions to this problem and evaluate them on a real-world corpus of biographical documents.
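
To make the contrast with term frequency concrete, here is a deliberately naive sketch of counting all occurrences of known facts in a text. It is not either of the two solutions from the dissertation: it assumes a fact occurs in a sentence whenever the surface forms of its subject and object both appear there, and the function name `spot_facts` and the sample data are hypothetical.

```python
import re
from collections import Counter


def spot_facts(facts, document):
    """Count, per fact, the sentences mentioning both subject and object.

    facts: iterable of (subject, predicate, object) string triples.
    Assumption: plain case-insensitive substring matching stands in
    for real entity and relation recognition.
    """
    counts = Counter()
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", document)
    for sentence in sentences:
        lowered = sentence.lower()
        for subject, predicate, obj in facts:
            if subject.lower() in lowered and obj.lower() in lowered:
                counts[(subject, predicate, obj)] += 1
    return counts


# Hypothetical facts and text, invented for this example.
facts = [
    ("Marie Curie", "bornIn", "Warsaw"),
    ("Marie Curie", "wonPrize", "Nobel Prize"),
]
text = (
    "Marie Curie was born in Warsaw. "
    "Marie Curie won the Nobel Prize in 1903. "
    "In 1911 Marie Curie received a second Nobel Prize."
)
counts = spot_facts(facts, text)
for fact, n in counts.items():
    print(fact, n)
```

Once every occurrence is linked, the per-fact counts can play the role that term frequency plays for words.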

Contact
Name(s):Daniela Alessi
Phone:068193255000
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):

Created by:Daniela Alessi/MPI-INF, 09/21/2015 01:49 PM
Last modified by:Uwe Brahm/MPII/DE, 11/24/2016 04:13 PM
  • Daniela Alessi, 09/21/2015 02:21 PM
  • Daniela Alessi, 09/21/2015 02:18 PM -- Created document.