Max-Planck-Institut für Informatik
max planck institut
mpii logo Minerva of the Max Planck Society


Top-k query processing in probabilistic databases with non-materialized views

Dylla, Maximilian and Miliaraki, Iris and Theobald, Martin

MPI-I-2012-5-002. June 2012, 59 pages. | Status: available - back from printing | Next --> Entry | Previous <-- Entry

Abstract in LaTeX format:
In this paper, we investigate a novel approach of computing confidence bounds
for top-k ranking queries in probabilistic databases with non-materialized views.
Unlike prior approaches, we present an exact pruning algorithm for finding the
top-ranked query answers according to their marginal probabilities without the
need to first materialize all answer candidates via the views. Specifically, we
consider conjunctive queries over multiple levels of select-project-join views, the
latter of which are cast into Datalog rules, where also the rules themselves may
be uncertain, i.e., be valid with some degree of confidence. To our knowledge,
this work is the first to address integrated data and confidence computations in the
context of probabilistic databases by considering confidence bounds over partially
evaluated query answers with first-order lineage formulas. We further extend our
query processing techniques by a tool-suite of scheduling strategies based on se-
lectivity estimation and the expected impact of subgoals on the final confidence
of answer candidates. Experiments with large datasets demonstrate drastic run-
time improvements over both sampling and decomposition-based methods—even
in the presence of recursive rules.
References to related material:

To download this research report, please select the type of document that fits best your needs.Attachement Size(s):
MPI-I-2012-5-002.pdfMPI-I-2012-5-002.pdf431 KBytes
Please note: If you don't have a viewer for PostScript on your platform, try to install GhostScript and GhostView
URL to this document:
Hide details for BibTeXBibTeX
  AUTHOR = {Dylla, Maximilian and Miliaraki, Iris and Theobald, Martin},
  TITLE = {Top-k query processing in probabilistic databases with non-materialized views},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2012-5-002},
  MONTH = {June},
  YEAR = {2012},
  ISSN = {0946-011X},