MPI-I-2006-1-007
Output-sensitive autocompletion search
Bast, Holger and Weber, Ingmar and Mortensen, Christian W.
June 2006, 17 pages.
.
Status: available - back from printing
We consider the following autocompletion search scenario: imagine a user
of a search engine typing a query; then with every keystroke display those
completions of the last query word that would lead to the best hits, and
also display the best such hits. The following problem is at the core of
this feature: for a fixed document collection, given a set $D$ of
documents, and an alphabetical range $W$ of words, compute the set of all
word-in-document pairs $(w,d)$ from the collection such that $w \in W$
and $d\in D$.
We present a new data structure with the help of which such
autocompletion queries can be processed, on the average, in time linear
in the input plus output size, independent of the size of the underlying
document collection. At the same time, our data structure uses no more
space than an inverted index. Actual query processing times on a large test collection
correlate almost perfectly with our theoretical bound.
-
- Attachement: MPI-I-2006-1-007.pdf (268 KBytes)
URL to this document: https://domino.mpi-inf.mpg.de/internet/reports.nsf/NumberView/2006-1-007
BibTeX
@TECHREPORT{BastWeberMortensen2006,
AUTHOR = {Bast, Holger and Weber, Ingmar and Mortensen, Christian W.},
TITLE = {Output-sensitive autocompletion search},
TYPE = {Research Report},
INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
NUMBER = {MPI-I-2006-1-007},
MONTH = {June},
YEAR = {2006},
ISSN = {0946-011X},
}