Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, workshop


Author, Editor

Author(s):

Theobald, Martin
Schenkel, Ralf
Weikum, Gerhard




Editor(s):

Böhm, Klemens
Jensen, Christian S.
Haas, Laura M.
Kersten, Martin L.
Larson, Per-Ake
Ooi, Beng Chin


Not MPII Editor(s):

Böhm, Klemens
Jensen, Christian S.
Haas, Laura M.
Kersten, Martin L.
Larson, Per-Ake
Ooi, Beng Chin

BibTeX cite key*:

TheobaldSW05a

Title, Booktitle

Title*:

An Efficient and Versatile Query Engine for TopX Search


vldb2005_tsw.pdf (448.83 KB)

Booktitle*:

Proceedings of the 31st International Conference on Very Large Data Bases (VLDB 2005)

Event, URLs

URL of the conference:

http://vldb.idi.ntnu.no/

URL for downloading the paper:

http://www.vldb2005.org/program/paper/thu/p625-theobald.pdf

Event Address*:

Trondheim, Norway

Language:

English

Event Date*
(no longer used):


Organization:


Event Start Date:

30 August 2005

Event End Date:

2 September 2005

Publisher

Name*:

ACM

URL:


Address*:

New York, USA

Type:


Vol, No, Year, pp.

Series:


Volume:


Number:


Month:


Pages:

625-636

Year*:

2005

VG Wort Pages:

53

ISBN/ISSN:

1-59593-154-6; 1-59593-177-5

Sequence Number:


DOI:




Note, Abstract, ©

Note:

Acceptance ratio 1:6

(LaTeX) Abstract:

This paper presents a novel engine, coined TopX, for efficient ranked retrieval of XML documents over semistructured but nonschematic data collections. The algorithm follows the paradigm of threshold algorithms for top-k query processing with a focus on inexpensive sequential accesses to index lists and only a few judiciously scheduled random accesses. The difficulties in applying the existing top-k algorithms to XML data lie in 1) the need to consider scores for XML elements while aggregating them at the document level, 2) the combination of vague content conditions with XML path conditions, 3) the need to relax query conditions if too few results satisfy all conditions, and 4) the selectivity estimation for both content and structure conditions and their impact on evaluation strategies. TopX addresses these issues by precomputing score and path information in an appropriately designed index structure, by largely avoiding or postponing the evaluation of expensive path conditions so as to preserve the sequential access pattern on index lists, and by selectively scheduling random accesses when they are cost-beneficial. In addition, TopX can compute approximate top-k results using probabilistic score estimators, thus speeding up queries with a small and controllable loss in retrieval precision.
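
As an illustration of the threshold-algorithm paradigm the abstract refers to, the following is a minimal sketch of the classic threshold algorithm over score-sorted index lists. It is not the TopX engine: it ignores XML elements and path conditions, and it performs a random access for every newly seen document, whereas the abstract describes scheduling only a few random accesses. The function name, the in-memory list layout, and the sum aggregation are assumptions made for this example.

```python
import heapq


def threshold_topk(index_lists, k):
    """Classic threshold algorithm (simplified, hypothetical helper).

    index_lists: one list per query term, holding (doc_id, score) pairs
                 sorted by descending score.
    Returns the k best (doc_id, total_score) pairs, best first.
    """
    if k <= 0:
        return []
    # One dict per list stands in for random access by doc_id.
    lookups = [dict(lst) for lst in index_lists]
    seen = set()
    top = []  # min-heap of (total_score, doc_id), at most k entries
    depth = max((len(lst) for lst in index_lists), default=0)
    for pos in range(depth):
        frontier = []  # score last read from each list at this depth
        for lst in index_lists:
            if pos >= len(lst):
                frontier.append(0.0)  # exhausted list contributes nothing
                continue
            doc, score = lst[pos]  # sequential access
            frontier.append(score)
            if doc in seen:
                continue
            seen.add(doc)
            # Random accesses: fetch this document's score in every list.
            total = sum(lk.get(doc, 0.0) for lk in lookups)
            if len(top) < k:
                heapq.heappush(top, (total, doc))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, doc))
        # No unseen document can score higher than the sum of the frontier.
        threshold = sum(frontier)
        if len(top) == k and top[0][0] >= threshold:
            break
    return [(doc, total) for total, doc in sorted(top, reverse=True)]


lists = [
    [("d1", 0.75), ("d2", 0.5), ("d3", 0.125)],
    [("d2", 0.75), ("d3", 0.5), ("d1", 0.25)],
]
result = threshold_topk(lists, 2)
```

In this toy input the scan stops after two positions per list, because the second-best total already equals the threshold formed by the scores at that depth.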



Download
Access Level:

Public

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Databases and Information Systems Group

Audience:

popular

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, scientific advisory board (Fachbeirat), VG Wort



BibTeX Entry:

@INPROCEEDINGS{TheobaldSW05a,
AUTHOR = {Theobald, Martin and Schenkel, Ralf and Weikum, Gerhard},
EDITOR = {B{\"o}hm, Klemens and Jensen, Christian S. and Haas, Laura M. and Kersten, Martin L. and Larson, Per-Ake and Ooi, Beng Chin},
TITLE = {An Efficient and Versatile Query Engine for {TopX} Search},
BOOKTITLE = {Proceedings of the 31st International Conference on Very Large Data Bases (VLDB 2005)},
PUBLISHER = {ACM},
YEAR = {2005},
PAGES = {625--636},
ADDRESS = {Trondheim, Norway},
ISBN = {1-59593-154-6; 1-59593-177-5},
NOTE = {Acceptance ratio 1:6},
}


Entry last modified by Martin Theobald, 04/14/2009
Edit History

Created: 05/11/2005 09:22:43 AM, Ralf Schenkel

Revisions (number, editor, edit date):
17. Martin Theobald, 04/14/2009 02:41:41 PM
16. Ralf Schenkel, 06.11.2006 11:08:21
15. Ralf Schenkel, 06.11.2006 11:07:14
14. Adriana Davidescu, 11.08.2006 13:24:15
13. Adriana Davidescu, 14.06.2006 14:43:24
Attachment Section
TheobaldSW05-a.pdf
View attachments here:


vldb2005_tsw.pdf