Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Series Discovery with Missing and Erroneous Values
Speaker:Dr. Pei Li
coming from:University of Zurich
Speakers Bio:http://www.ifi.uzh.ch/dbtg/Staff/peili.html
Event Type:AG5 Talk
Visibility:D5
Level:AG Audience
Language:English
Date, Time and Location
Date:Thursday, 9 April 2015
Time:10:00
Duration:60 Minutes
Location:Saarbrücken
Building:E1 4
Room:433
Abstract
A series of real-world data, such as a series of music records, is often generated with order
dependency semantics; for example, music records in a series with larger catalog numbers are
usually released in later years. In this talk, I will discuss how order dependencies can be
exploited to discover series as well as to repair missing and erroneous values of ordered
attributes in a dataset. The problem is challenging in the following aspects. First, order
dependency mechanisms are unknown a priori and can vary among series. For example, a series can
assign catalog numbers to records in either increasing or decreasing order over time. Second,
order dependencies are often not satisfied by every record pair in a real-world series. There can
be a substantial number of records that slightly violate an order dependency. Existing ordering
integrity constraints would consider such records as exceptions, and blindly label them as
outliers. These two factors make our goal of "one shot, two kills" (series discovery as well as
error detection) extremely challenging.
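To make the two challenges concrete, here is a minimal sketch that counts how many record pairs break an assumed order dependency, for either direction. The function name `violation_rate`, the `(catalog_number, year)` tuple layout, and the sample records are all hypothetical illustrations, not taken from the talk.

```python
from itertools import combinations


def violation_rate(records, increasing=True):
    """Fraction of record pairs that break the assumed order dependency.

    records: list of (catalog_number, year) tuples (hypothetical layout).
    increasing=True assumes larger catalog numbers mean later years;
    increasing=False assumes the opposite direction.
    """
    pairs = 0
    violations = 0
    for a, b in combinations(records, 2):
        if a[0] == b[0]:
            continue  # equal catalog numbers impose no order
        lo, hi = (a, b) if a[0] < b[0] else (b, a)
        pairs += 1
        if increasing and lo[1] > hi[1]:
            violations += 1
        elif not increasing and lo[1] < hi[1]:
            violations += 1
    return violations / pairs if pairs else 0.0


# Made-up series: one record (catalog 103) is slightly out of order.
records = [(101, 1990), (102, 1991), (103, 1989), (104, 1992), (105, 1993)]
rate_up = violation_rate(records, increasing=True)
rate_down = violation_rate(records, increasing=False)
```

Comparing the two rates hints at the direction a series follows, and a nonzero rate in the better direction shows why a strict constraint would reject records that are only slightly misplaced.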
To make order dependencies applicable to real-world series, we propose the notion of longest
monotonic bands, which characterize series while distinguishing slight violations of
order dependencies from local outliers in a series. We also provide an efficient framework for
discovering series that are approximated by longest monotonic bands. In this talk, I will present
analyses of our proposed algorithms, and show the effectiveness of our framework with preliminary
results on real-world datasets.
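The abstract does not define longest monotonic bands, so the sketch below uses a simpler, well-known stand-in: the longest monotonic subsequence of years, taken over records sorted by catalog number. It tolerates no slack, unlike a band, and is only meant to show how records outside the dominant order could be flagged. All names and the sample values are hypothetical.

```python
from bisect import bisect_right


def longest_nondecreasing_subsequence(values):
    """Return indices of one longest non-decreasing subsequence of values."""
    tails = []      # tails[k]: index ending the best subsequence of length k + 1
    tail_vals = []  # values at those indices, kept sorted for bisect
    prev = [-1] * len(values)
    for i, v in enumerate(values):
        k = bisect_right(tail_vals, v)
        if k > 0:
            prev[i] = tails[k - 1]
        if k == len(tails):
            tails.append(i)
            tail_vals.append(v)
        else:
            tails[k] = i
            tail_vals[k] = v
    out = []
    i = tails[-1] if tails else -1
    while i != -1:
        out.append(i)
        i = prev[i]
    return out[::-1]


def longest_monotonic_subsequence(values):
    """Try both directions and keep the longer subsequence."""
    up = longest_nondecreasing_subsequence(values)
    down = longest_nondecreasing_subsequence([-v for v in values])
    return up if len(up) >= len(down) else down


# Years of a made-up series, already sorted by catalog number.
years = [1990, 1991, 1989, 1992, 1993]
kept = longest_monotonic_subsequence(years)
suspects = [i for i in range(len(years)) if i not in kept]
```

Records whose indices land in `suspects` are candidates for repair; a band-based approach as described in the abstract would additionally accept records that deviate only slightly from the monotonic trend.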
Contact
Name(s):Petra Schaaf
Phone:5000
EMail:--email address not disclosed on the web
Video Broadcast
Video Broadcast:No
To Location:
Tags, Category, Keywords and additional notes
Note:
Attachments, File(s):
Created by:Petra Schaaf/AG5/MPII/DE, 04/08/2015 10:10 AM
Last modified by:Uwe Brahm/MPII/DE, 11/24/2016 04:13 PM
  • Petra Schaaf, 04/08/2015 10:13 AM -- Created document.