Electronic Proceedings Article
@InProceedings
Internet contribution in conference proceedings, workshop



Author, Editor

Author(s):

Theobald, Martin
AbuJarour, Mohammed
Schenkel, Ralf


Not MPG Author(s):

AbuJarour, Mohammed

Editor(s):

Geva, Shlomo
Kamps, Jaap
Trotman, Andrew


Not MPII Editor(s):

Geva, Shlomo
Kamps, Jaap
Trotman, Andrew

BibTeX cite key*:

TheobaldAS_INEX08preproc

Title, Conference

Title*:

TopX 2.0 at the INEX 2008 Efficiency Track

Booktitle*:

Advances in Focused Retrieval: 7th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2008)

Event Address*:

Schloss Dagstuhl, Germany

URL of the conference:


Event Date*:
(no longer used):


URL for downloading the paper:

http://www.inex.otago.ac.nz/data/proceedings/INEX2008-preproceedings.pdf

Event Start Date:

15 December 2008

Event End Date:

18 December 2008

Language:

English

Organization:


Publisher

Publisher's Name:

Springer

Publisher's URL:


Address*:

Heidelberg

Type:


Vol, No, pp., Year

Series:


Volume:


Number:


Month:


Pages:

224-236



Sequence Number:


Year*:

2008

ISBN/ISSN:

978-3-642-03760-3





Abstract, Links, ©

URL for Reference:


Note:


(LaTeX) Abstract:

For the INEX Efficiency Track 2008, we were just in time to finish and (for the first time) evaluate our brand-new TopX 2.0 prototype. Complementing our long-running effort on efficient top-k query processing on top of a relational back-end, we have now switched to a compressed object-oriented storage for text-centric XML data with direct access to customized inverted files, along with a complete reimplementation of the engine in C++. The core of the new engine is a multiple-nested block-index structure that seamlessly integrates top-k-style sorted access to large blocks stored as inverted files on disk with in-memory merge-joins for efficient score aggregations. The main challenge in designing this new index structure was to marry no fewer than three different paradigms in search engine design: 1) sorting blocks in descending order of the maximum element score they contain for threshold-based candidate pruning and top-k-style early termination; 2) sorting elements within each block by their id to support efficient in-memory merge-joins; and 3) encoding both structural and content-related information into a single, unified index structure. Our INEX 2008 experiments demonstrate efficiency gains of up to a factor of 30 compared to the previous Java/JDBC-based TopX 1.0 implementation over a relational back-end. TopX 2.0 achieves overall runtimes of less than 51 seconds for the entire batch of 568 Efficiency Track topics in their content-and-structure (CAS) version and less than 29 seconds for the content-only (CO) version, using a top-15, focused (i.e., non-overlapping) retrieval mode. This corresponds to an average of merely 89 ms per CAS query and 49 ms per CO query.
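
The first two paradigms named in the abstract can be illustrated with a small in-memory sketch. It is not the TopX 2.0 implementation (which the abstract describes as a C++ engine over compressed inverted files on disk); the names `Block`, `build_block_list`, `top_k` and `merge_join` are hypothetical, scores are plain floats, and only a single inverted list is scanned. The sketch orders blocks by descending maximum score so that a scan can stop early, and keeps the entries inside each block sorted by element id so that two blocks can be merge-joined.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Tuple

Entry = Tuple[int, float]  # (element_id, score)


@dataclass
class Block:
    block_id: int
    entries: List[Entry]   # sorted by element_id (paradigm 2)
    max_score: float       # highest score inside the block (paradigm 1)


def build_block_list(postings: Dict[int, List[Entry]]) -> List[Block]:
    """Sort entries by id inside each block, then blocks by descending max score."""
    blocks = [
        Block(block_id, sorted(entries), max(score for _, score in entries))
        for block_id, entries in postings.items()
    ]
    blocks.sort(key=lambda b: b.max_score, reverse=True)
    return blocks


def top_k(blocks: List[Block], k: int):
    """Scan blocks in sorted order; stop once no remaining block can beat the k-th score.

    Returns (results, blocks_read), where results are (score, block_id, element_id)
    tuples in descending score order.
    """
    heap: List[Tuple[float, int, int]] = []  # min-heap holding the current top k
    blocks_read = 0
    for block in blocks:
        # Threshold test: all later blocks have max_score <= this block's max_score.
        if len(heap) == k and block.max_score <= heap[0][0]:
            break
        blocks_read += 1
        for element_id, score in block.entries:
            item = (score, block.block_id, element_id)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif score > heap[0][0]:
                heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True), blocks_read


def merge_join(left: List[Entry], right: List[Entry]) -> List[Entry]:
    """Sum the scores of elements that occur in both id-sorted entry lists."""
    i = j = 0
    joined: List[Entry] = []
    while i < len(left) and j < len(right):
        if left[i][0] == right[j][0]:
            joined.append((left[i][0], left[i][1] + right[j][1]))
            i += 1
            j += 1
        elif left[i][0] < right[j][0]:
            i += 1
        else:
            j += 1
    return joined
```

Because the blocks are visited in descending order of their maximum score, the scan in `top_k` can terminate as soon as the next block's maximum cannot exceed the current k-th best score. The id order inside each block is what allows `merge_join` to run in a single linear pass.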

URL for the Abstract:




Tags, Categories, Keywords:


HyperLinks / References / URLs:


Copyright Message:


Personal Comments:


Download
Access Level:

Public

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Databases and Information Systems Group

Audience:

popular

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat

BibTeX Entry:
@INPROCEEDINGS{TheobaldAS_INEX08preproc,
AUTHOR = {Theobald, Martin and AbuJarour, Mohammed and Schenkel, Ralf},
EDITOR = {Geva, Shlomo and Kamps, Jaap and Trotman, Andrew},
TITLE = {{TopX 2.0 at the INEX 2008 Efficiency Track}},
BOOKTITLE = {Advances in Focused Retrieval: 7th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2008)},
PUBLISHER = {Springer},
YEAR = {2008},
PAGES = {224--236},
ADDRESS = {Schloss Dagstuhl, Germany},
ISBN = {978-3-642-03760-3},
}


Entry last modified by Martin Theobald, 06/10/2014
Edit History

Created: 03/24/2009 05:16:17 PM, Editor: [Library]

Revision  Editor           Edit Date
3.        Martin Theobald  02/14/2011 08:27:44 AM
2.        Martin Theobald  04/20/2009 02:00:12 PM
1.        Martin Theobald  04/14/2009 02:40:51 PM
0.        Ralf Schenkel    24.03.2009 17:20:49