Campus Event Calendar

Event Entry

What and Who

The Bellman data quality browser

Dr. Divesh Srivastava
AT&T Labs
Talk
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, RG2  
Expert Audience
English

Date, Time and Location

Friday, 14 March 2008
14:15
45 Minutes
E1 5
433
Saarbrücken

Abstract

Data quality is a serious concern in complex industrial-scale
databases, which often have thousands of tables and tens of thousands
of columns.  Commonly encountered problems include duplicates and
default values in columns treated as keys, data inconsistencies, and
poor-quality join paths.  Compounding the data quality problems are
incomplete and out-of-date metadata about the database and the
processes used to populate the database. These problems make the task
of analyzing data particularly challenging.  The Bellman data quality
browser has been built to effectively address such problems.  Bellman
profiles the database and computes concise statistical summaries of
the contents of the database to identify approximate keys, frequent
values of a field (often default values), and joinable fields, and to
understand database dynamics (changes in a database over time). In
this talk, I'll describe the technology underlying Bellman and how
it is used to help make sense of complex databases.
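
As a rough illustration of the kinds of column statistics the abstract mentions, the sketch below computes three of them exactly on small in-memory lists: a distinct-value ratio (a crude signal for approximate keys), the most frequent values of a field, and the Jaccard resemblance of two columns' value sets (a crude signal for joinability). This is not Bellman's implementation, which the abstract describes only as using concise statistical summaries. The function names, sample columns, and the choice of Jaccard resemblance are assumptions made for illustration.

```python
from collections import Counter


def distinct_ratio(values):
    """Fraction of non-null values that are distinct (1.0 means an exact key)."""
    present = [v for v in values if v is not None]
    return len(set(present)) / len(present) if present else 0.0


def frequent_values(values, top=3):
    """Most common non-null values; a dominant value often signals a default."""
    return Counter(v for v in values if v is not None).most_common(top)


def resemblance(a, b):
    """Jaccard similarity of two columns' non-null value sets."""
    sa, sb = set(a) - {None}, set(b) - {None}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical sample columns, invented for this sketch.
customer_id = [101, 102, 103, 104, 104, 105]      # one duplicate
order_customer = [101, 101, 103, 105, 999]        # one dangling reference
phone = ["555-0100", "000-0000", "000-0000", "555-0101", "000-0000", None]

print(distinct_ratio(customer_id))                # close to, but below, 1.0
print(frequent_values(phone, top=1))              # the suspected default value
print(resemblance(customer_id, order_customer))   # overlap of the two columns
```

On a database with thousands of tables, exact computations like these would be too expensive to run for every column and column pair, which is presumably why a profiler would rely on compact summaries instead.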

Contact

Gerhard Weikum

Petra Schaaf, 03/12/2008 10:33
Petra Schaaf, 03/10/2008 07:57
Petra Schaaf, 02/21/2008 12:03
Petra Schaaf, 02/21/2008 12:00 -- Created document.