Campus Event Calendar

Event Entry

What and Who

Exploratory Analysis with Imprecise Queries

Prof. Mirek Riedewald
Northeastern University, Boston
MPI-Kolloquium

Mirek Riedewald received his PhD from the University of California at Santa Barbara, USA. He is currently an Associate Professor in the College of Computer and Information Science at Northeastern University in Boston, USA. Prior to joining Northeastern University, he was a Research Associate at Cornell University. He has also held visiting research positions at Microsoft Research in Redmond and at the Max Planck Institute for Informatics in Germany. Prof. Riedewald's research interests are in databases and data mining, with an emphasis on designing scalable analysis techniques for data-driven science. He has collaborated successfully with scientists from different domains, including ornithology, physics, mechanical and aerospace engineering, and astronomy. This work has resulted in novel approaches for data warehousing, data stream processing, prediction, and parallel data processing using computer clusters. He is now focusing on exploratory analysis of massive observational data and on techniques for automated reconstruction of the structure and dynamics of neural circuits, a crucial step toward understanding the functionality of the brain. Prof. Riedewald's work has been published in premier peer-reviewed data management research venues such as ACM SIGMOD, VLDB, IEEE ICDE, and IEEE TKDE, as well as in domain science journals.
AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI  
Public Audience
English

Date, Time and Location

Tuesday, 5 July 2016
11:00
60 Minutes
E1 4
0.24
Saarbrücken

Abstract

It all started with a seemingly simple request for help from the Cornell Lab of Ornithology, one of the world's leaders in research about birds and the environment. To reach beyond the thousands of regular contributors to their citizen-science programs, they wanted to leverage their vast collections of bird observation data to help less experienced users identify the species of an observed bird. This quickly turned into a challenging problem at the intersection of big-data management and machine learning.

The result is Merlin, a system for exploratory search in large databases. The user interacts with it by specifying probability distributions over attributes, which express imprecise conditions about the entities of interest. Merlin helps the user home in on the right query conditions by addressing three key challenges: (1) efficiently computing results for an imprecise query, (2) providing feedback about the sensitivity of the result to changes in individual conditions, and (3) suggesting new conditions. We provide an overview of Merlin, formally introduce the notion of sensitivity, and present novel algorithms for quantifying the effect of uncertainty in user-specified conditions. To support interactive responses, we also develop techniques that can deliver probability estimates within a given real-time limit. Finally, we discuss the challenges in accurately estimating probabilities, e.g., the value of P in "The bird you are looking for is species S with probability P," and how Merlin addresses them in an interactive environment with hard real-time constraints.
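
To make the idea of an imprecise query concrete, the following is a minimal Python sketch. It is not Merlin's algorithm, which the abstract does not spell out. The toy observations, the attribute names, the functions score_species and sensitivity, and the choice of total variation distance as a sensitivity measure are all assumptions made for illustration. Each observation is weighted by the probability the user assigned to its attribute values, and sensitivity is taken here as how much the species ranking shifts when one condition is dropped.

```python
from collections import defaultdict

# Hypothetical toy data: (species, color, size). Not real observation data.
OBSERVATIONS = [
    ("robin", "red", "small"),
    ("robin", "red", "small"),
    ("robin", "brown", "small"),
    ("cardinal", "red", "medium"),
    ("cardinal", "red", "medium"),
    ("crow", "black", "large"),
    ("crow", "black", "medium"),
    ("crow", "black", "large"),
]
ATTRIBUTES = ("color", "size")


def score_species(query):
    """Return normalized species scores for an imprecise query.

    `query` maps an attribute name to a distribution {value: probability}.
    Each observation is weighted by the product of the probabilities the
    user assigned to its attribute values; attributes missing from the
    query are treated as unconstrained (weight 1).
    """
    weights = defaultdict(float)
    for species, *values in OBSERVATIONS:
        w = 1.0
        for attr, value in zip(ATTRIBUTES, values):
            if attr in query:
                w *= query[attr].get(value, 0.0)
        weights[species] += w
    total = sum(weights.values())
    if total == 0:
        return {s: 0.0 for s in weights}
    return {s: w / total for s, w in weights.items()}


def sensitivity(query, attr):
    """Assumed measure: total variation distance between the result with
    and without the condition on `attr` (0 = no effect, 1 = maximal)."""
    full = score_species(query)
    reduced = score_species({a: d for a, d in query.items() if a != attr})
    return 0.5 * sum(abs(full[s] - reduced[s]) for s in full)


if __name__ == "__main__":
    # "Probably red, maybe brown; probably small, maybe medium."
    query = {
        "color": {"red": 0.8, "brown": 0.2},
        "size": {"small": 0.7, "medium": 0.3},
    }
    print(score_species(query))
    for attr in query:
        print(attr, sensitivity(query, attr))
```

A full scan over all observations, as done here, would not meet an interactive time limit on a large database; the sketch only illustrates the query semantics, not the efficiency techniques the talk covers.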

Contact

Daniela Alessi
5000

Petra Schaaf, 07/04/2016 08:59
Daniela Alessi, 07/01/2016 12:43 -- Created document.