Campus Event Calendar: Aliaksandr Talaika (02/10/2014 in E1 4/024)


Event Entry

What and Who

IBEX: Id-Based Entity Extraction

Aliaksandr Talaika

Fachrichtung Informatik - Saarbrücken

PhD Application Talk

Master of Science

AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI

Public Audience

English


Date, Time and Location

Monday, 10 February 2014

10:50

90 Minutes

E1 4

024

Saarbrücken

Abstract

Several academic and industrial projects have started extracting entities from the Web. In this thesis, we show that a certain subclass of entities, namely those that have unique identifiers, can be extracted at large scale and with high precision from Web data. This applies most notably to commercial products, but also to email addresses, scientific publications, chemical substances, and a wide variety of other entities. By making systematic use of the identifiers, our algorithm can bypass page segmentation, complex named entity recognition, and table alignment. Our method can extract millions of items, each disambiguated to a canonical entity, with a precision of 73-96%. This yields a database of unique entities at Web scale, which lets us compute detailed statistics on the presence of commercial products, people, and other objects on the Internet.
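
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of using identifiers as anchors, not the method from the thesis. It assumes the identifiers are 13-digit GTIN/EAN product codes, which carry a standard check digit; the names `CANDIDATE`, `is_valid_gtin13`, and `extract_entities`, as well as the input format (a mapping from URL to page text), are hypothetical and chosen for illustration.

```python
import re
from collections import defaultdict

# Assumption: candidates are runs of exactly 13 digits that are not part
# of a longer digit run. Real pages would need more identifier formats.
CANDIDATE = re.compile(r"(?<!\d)\d{13}(?!\d)")


def is_valid_gtin13(code: str) -> bool:
    """Return True if `code` has a correct GTIN-13 (EAN-13) check digit."""
    if len(code) != 13 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from the left.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def extract_entities(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each checksum-valid identifier to the URLs that mention it.

    The identifier itself serves as the canonical entity key, so mentions
    on different pages are merged without any page segmentation or
    named entity recognition.
    """
    index: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for match in CANDIDATE.finditer(text):
            code = match.group()
            if is_valid_gtin13(code):
                index[code].add(url)
    return dict(index)
```

Called on two pages that mention the same code, `extract_entities` returns a single entry for that code with both URLs attached, while digit strings that fail the checksum are dropped. The checksum filter is what keeps precision high in this sketch; how the thesis handles identifiers without check digits is not stated in the abstract.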


Aaron Alsancak, 02/06/2014 10:38 -- Created document.
