Campus Event Calendar

Event Entry

What and Who

Efficient Entity Disambiguation via Similarity Hashing

Ba Dat Nguyen
International Max Planck Research School for Computer Science - IMPRS
PhD Application Talk
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Monday, 8 October 2012
09:00
60 Minutes
E1 4
024
Saarbrücken

Abstract

The task of Named Entity Disambiguation (NED), which maps mentions of
ambiguous names in natural language onto a set of known entities, has been
an important issue in many areas including machine translation and
information extraction. When working with a huge amount of data (e.g. more
than three million entities in Yago), several parts of an NED system can
become bottlenecks: those that estimate the probability of a mention
matching an entity, the similarity between a mention and an entity, and
the coherence among the entity candidates for all mentions together. It is
therefore challenging for an interactive NED system to achieve both high
accuracy and efficiency.
This talk presents an efficient way of disambiguating named entities by
similarity hashing. Our framework is integrated with AIDA, an online tool
for entity detection and disambiguation developed at the Max Planck
Institute for Informatics. We apply various state-of-the-art approaches,
for example Locality Sensitive Hashing (LSH) and Spectral Hashing, to
several forms of the similarity search problem, such as near-duplicate
search for mention-entity matching and, in particular, related-pair
detection for entity-entity mapping. The latter is not a typical
application of hashing techniques, because similarities between entities
are usually low.
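
As a rough illustration of the near-duplicate search mentioned above, the
following is a minimal sketch of MinHash-based LSH with banding. It is not
the implementation used in AIDA or in the talk; all names (shingles,
minhash, candidate_pairs) and parameter values (3-character shingles, 16
bands of 2 rows) are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased string."""
    s = text.lower()
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, salted with a seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(tokens, num_hashes=32):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_hashes))


def candidate_pairs(items, bands=16, rows=2):
    """Group items whose signatures agree on at least one band.

    items maps a (hypothetical) identifier to its text. Items that share a
    bucket in any band are returned as candidate pairs for exact comparison.
    """
    buckets = defaultdict(set)
    for name, text in items.items():
        sig = minhash(shingles(text), bands * rows)
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            buckets[key].add(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            pairs.add((a, b))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)
```

Fewer rows per band make a collision more likely for pairs with low
similarity, which is the regime described for entity-entity relatedness,
at the cost of more candidate pairs that must be checked exactly.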

Marc Schmitt, 10/05/2012 16:05 -- Created document.