Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, Workshop


Author, Editor

Author(s):

Schenkel, Ralf

Editor(s):

Gurrin, Cathal
He, Yulan
Kazai, Gabriella
Kruschwitz, Udo
Little, Suzanne
Roelleke, Thomas
Rüger, Stefan
van Rijsbergen, Keith


Not MPII Editor(s):

Gurrin, Cathal
He, Yulan
Kazai, Gabriella
Kruschwitz, Udo
Little, Suzanne
Roelleke, Thomas
Rüger, Stefan
van Rijsbergen, Keith

BibTeX cite key*:

SchenkelECIR2010

Title, Booktitle

Title*:

Temporal Shingling for Version Identification in Web Archives

Booktitle*:

Advances in Information Retrieval : 32nd European Conference on IR Research, ECIR 2010

Event, URLs

URL of the conference:

http://kmi.open.ac.uk/events/ecir2010/

URL for downloading the paper:

http://dx.doi.org/10.1007/978-3-642-12275-0_44

Event Address*:

Milton Keynes, UK

Language:

English

Organization:


Event Start Date:

28 March 2010

Event End Date:

31 March 2010

Publisher

Name*:

Springer

URL:


Address*:

Berlin

Type:


Vol, No, Year, pp.

Series:

Lecture Notes in Computer Science

Volume:

5993

Number:


Month:


Pages:

508-519

Year*:

2010

VG Wort Pages:


ISBN/ISSN:

978-3-642-12274-3

Sequence Number:


DOI:

10.1007/978-3-642-12275-0_44



Note, Abstract, ©


(LaTeX) Abstract:

Building and preserving archives of the evolving Web has been an important problem in research. Given the huge volume of content that is added or updated daily, identifying the right versions of pages to store in the archive is an important building block of any large-scale archival system. This paper presents temporal shingling, an extension of the well-established shingling technique for measuring how similar two snapshots of a page are. This novel method considers the lifespan of shingles to differentiate between important updates that should be archived and transient changes that may be ignored. Extensive experiments demonstrate the tradeoff between archive size and version coverage, and show that the novel method yields better archive coverage at smaller sizes than existing techniques.
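
As a rough illustration of the idea described in the abstract, the sketch below computes classic w-word shingles and their Jaccard similarity, then applies a simple lifespan filter before comparing snapshots. This record does not give the paper's definitions, so the function names, the shingle width `w`, the `min_lifespan` threshold, and the rule "keep a shingle only if it survives in at least `min_lifespan` consecutive snapshots" are all assumptions made for illustration and should not be read as the author's method.

```python
from typing import List, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, w: int = 3) -> Set[Shingle]:
    """Return the set of contiguous w-word shingles of a text."""
    words = text.lower().split()
    if len(words) < w:
        # Texts shorter than w words yield a single short shingle (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity of two shingle sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stable_shingles(snapshots: List[str], w: int = 3,
                    min_lifespan: int = 2) -> List[Set[Shingle]]:
    """Per snapshot, keep only shingles whose run of consecutive snapshots
    containing them is at least min_lifespan long (hypothetical rule)."""
    sets = [shingles(s, w) for s in snapshots]
    n = len(sets)
    stable: List[Set[Shingle]] = [set() for _ in range(n)]
    for sh in set().union(*sets):
        i = 0
        while i < n:
            if sh not in sets[i]:
                i += 1
                continue
            # Find the end of the run of snapshots that contain this shingle.
            j = i
            while j < n and sh in sets[j]:
                j += 1
            if j - i >= min_lifespan:
                for k in range(i, j):
                    stable[k].add(sh)
            i = j
    return stable


if __name__ == "__main__":
    snaps = [
        "breaking news the market rose today",
        "breaking news the market rose today ad buy shoes now",
        "breaking news the market fell sharply today",
    ]
    plain = jaccard(shingles(snaps[0], 2), shingles(snaps[1], 2))
    filtered = stable_shingles(snaps, w=2, min_lifespan=2)
    print("plain similarity:", plain)
    print("lifespan-filtered similarity:", jaccard(filtered[0], filtered[1]))
```

In this toy setup, text that appears in only one snapshot (such as a rotating advertisement) no longer lowers the similarity between neighbouring snapshots. One limitation of this simplified rule is that content first appearing in the most recent snapshot is always treated as transient until a later snapshot confirms it.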



Download
Access Level:

Public

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Databases and Information Systems Group

Audience:

Expert

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort



BibTeX Entry:

@INPROCEEDINGS{SchenkelECIR2010,
AUTHOR = {Schenkel, Ralf},
EDITOR = {Gurrin, Cathal and He, Yulan and Kazai, Gabriella and Kruschwitz, Udo and Little, Suzanne and Roelleke, Thomas and R{\"u}ger, Stefan and van Rijsbergen, Keith},
TITLE = {Temporal Shingling for Version Identification in Web Archives},
BOOKTITLE = {Advances in Information Retrieval : 32nd European Conference on IR Research, ECIR 2010},
PUBLISHER = {Springer},
YEAR = {2010},
VOLUME = {5993},
PAGES = {508--519},
SERIES = {Lecture Notes in Computer Science},
ADDRESS = {Milton Keynes, UK},
ISBN = {978-3-642-12274-3},
DOI = {10.1007/978-3-642-12275-0_44},
}


Entry last modified by Anja Becker, 02/01/2011
Edit History

Editor(s): [Library]
Created: 11/24/2009 06:48:02 AM

Revision  Editor         Edit Date
2.        Anja Becker    01.02.2011 12:18:16
1.        Ralf Schenkel  24.11.2009 06:58:28
0.        Ralf Schenkel  24.11.2009 06:53:31
