Proceedings Article, Paper
@InProceedings
Contribution to Conference Proceedings, Workshop


Author, Editor

Author(s):

Berberich, Klaus
Bedathur, Srikanta
Neumann, Thomas
Weikum, Gerhard

Editor(s):

Kraaij, Wessel
de Vries, Arjen P.
Clarke, Charles L. A.
Fuhr, Norbert
Kando, Noriko

Not MPII Editor(s):

Kraaij, Wessel
de Vries, Arjen P.
Clarke, Charles L. A.
Fuhr, Norbert
Kando, Noriko

BibTeX cite key*:

BerberichBNW2007a

Title, Booktitle

Title*:

A Time Machine for Text Search

Booktitle*:

Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007)

Event, URLs

Event Address*:

Amsterdam, Netherlands

Language:

English

Organization:

Association for Computing Machinery (ACM)

Event Start Date:

23 July 2007

Event End Date:

27 July 2007

Publisher

Name*:

ACM

Address*:

New York

Vol, No, Year, pp.

Pages:

519-526

Year*:

2007

VG Wort Pages:

38

ISBN/ISSN:

978-1-59593-597-7

DOI:

http://doi.acm.org/10.1145/1277741.1277831



Note, Abstract, ©


(LaTeX) Abstract:

Text search over temporally versioned document collections such as web archives has received little attention as a research problem. As a consequence, there is no scalable and principled solution to search such a collection as of a specified time. In this work, we address this shortcoming and propose an efficient solution for time-travel text search by extending the inverted file index to make it ready for temporal search. We introduce approximate temporal coalescing as a tunable method to reduce the index size without significantly affecting the quality of results. In order to further improve the performance of time-travel queries, we introduce two principled techniques to trade off index size for its performance. These techniques can be formulated as optimization problems that can be solved to near-optimality. Finally, our approach is evaluated in a comprehensive series of experiments on two large-scale real-world datasets. Results unequivocally show that our methods make it possible to build an efficient "time machine" scalable to large versioned text collections.



Download
Access Level:

MPG

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Databases and Information Systems Group

Audience:

popular

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort



BibTeX Entry:

@INPROCEEDINGS{BerberichBNW2007a,
  AUTHOR       = {Berberich, Klaus and Bedathur, Srikanta and Neumann, Thomas and Weikum, Gerhard},
  EDITOR       = {Kraaij, Wessel and de Vries, Arjen P. and Clarke, Charles L. A. and Fuhr, Norbert and Kando, Noriko},
  TITLE        = {A Time Machine for Text Search},
  BOOKTITLE    = {Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007)},
  PUBLISHER    = {ACM},
  YEAR         = {2007},
  ORGANIZATION = {Association for Computing Machinery (ACM)},
  PAGES        = {519--526},
  ADDRESS      = {Amsterdam, Netherlands},
  ISBN         = {978-1-59593-597-7},
  DOI          = {http://doi.acm.org/10.1145/1277741.1277831},
}


Entry last modified by Martin Theobald, 04/15/2009
Edit History

Created by Klaus Berberich, 04/10/2007 08:07:18 AM

Revision   Editor               Edit Date
7.         Martin Theobald      04/15/2009 01:17:53 PM
6.         Srikanta Bedathur    03/15/2009 10:34:21 PM
5.         Adriana Davidescu    02.01.2008 15:06:40
4.         Adriana Davidescu    21.09.2007 16:19:40
3.         Adriana Davidescu    04/20/2007 10:19:19 AM