Campus Event Calendar: Sunita Sarawagi (09/13/2004 in 46.1

Event Entry

What and Who

Models and indices for integrating unstructured data with a relational database

Sunita Sarawagi

IIT Bombay

Talk

AG 1, AG 2, AG 3, AG 4, AG 5

MPI Audience

Date, Time and Location

Monday, 13 September 2004

15:00

-- Not specified --

46.1 - MPII

Rotunde 4th floor

Saarbrücken

Abstract

Database systems are islands of structure in a sea of unstructured data sources. Many real-world applications now need to bridge the two, integrating semi-structured sources with existing structured databases for seamless querying and mining. This integration requires extracting structured column values from the unstructured source and mapping them to known database entities. Existing data integration methods do not effectively exploit the wealth of information available in multi-relational entities. We present statistical models for co-reference resolution and information extraction in a database setting. We then discuss the performance challenges of training and applying these models efficiently over very large databases. Meeting these challenges requires breaking open a black-box statistical model and extracting predicates over indexable attributes of the database. We show how to extract such predicates for several classification models, including naive Bayes classifiers and support vector machines. We extend these indexing methods to support the similarity predicates needed during data integration.

Contact

Gerhard Weikum

0681/9325-500

Tags, Category, Keywords and additional notes

Note:

Homepage: http://www.it.iitb.ac.in/~sunita/
Biography: Sunita Sarawagi conducts research in databases, data mining, and machine learning. She is an associate professor at IIT Bombay. Prior to that she was a research staff member at IBM Almaden Research Center. She received her PhD in databases from the University of California at Berkeley and a bachelor's degree from IIT Kharagpur. She was a visiting associate professor at CMU from January to May 2004. She has several publications in international conferences on databases and data mining, as well as several patents. She has served as a program committee member for the ACM SIGMOD, VLDB, ACM SIGKDD, IEEE ICDE and ICML conferences and is editor-in-chief of the ACM SIGKDD newsletter.

Uwe Brahm, 09/12/2004 01:08
Uwe Brahm, 09/12/2004 01:07 -- Created document.
