Output-sensitive autocompletion search

Holger Bast, Christian W. Mortensen, Ingmar Weber

Research output: Contribution to journal (Article)

16 Citations (Scopus)

Abstract

We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
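
To make the core problem concrete, the following is a minimal sketch of the query itself, answered with a plain inverted index. It is not the data structure presented in the paper; it is only the straightforward baseline, and all names (`build_inverted_index`, `autocomplete`, `prefix_range`, the toy documents) are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def build_inverted_index(docs):
    """Map each word to the sorted list of ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, []).append(doc_id)
    for postings in index.values():
        postings.sort()
    return index


def prefix_range(prefix):
    """Alphabetical range [lo, hi] covering every word that starts with prefix."""
    return prefix, prefix + "\U0010ffff"


def autocomplete(index, vocab, D, lo, hi):
    """Return all pairs (w, d) with lo <= w <= hi, d in D, and w occurring in d.

    vocab is the sorted list of all index words; D is a set of document ids.
    """
    start = bisect_left(vocab, lo)
    end = bisect_right(vocab, hi)
    pairs = []
    for w in vocab[start:end]:
        # Baseline cost: every posting of w is scanned, even when d is not in D.
        for d in index[w]:
            if d in D:
                pairs.append((w, d))
    return pairs


# Toy collection (assumption, for illustration only).
docs = {
    1: "search engine autocompletion",
    2: "inverted index for prefix search",
    3: "autocomplete queries on an index",
}
index = build_inverted_index(docs)
vocab = sorted(index)

# D: hits for the earlier query words; W: completions of the prefix "auto".
lo, hi = prefix_range("auto")
print(autocomplete(index, vocab, {1, 3}, lo, hi))
# prints [('autocomplete', 3), ('autocompletion', 1)]
```

In this baseline the running time grows with the total length of the posting lists of all words in W, regardless of how many pairs are actually reported. The abstract's bound, linear in the input plus output size, is the property this sketch lacks.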

Original language: English
Pages (from-to): 269-286
Number of pages: 18
Journal: Information Retrieval
Volume: 11
Issue number: 4
DOI: 10.1007/s10791-008-9048-x
Publication status: Published - 1 Aug 2008
Externally published: Yes

Keywords

  • Autocompletion
  • Index data structure
  • Output-sensitive
  • Prefix search

ASJC Scopus subject areas

  • Information Systems

Cite this

Output-sensitive autocompletion search. / Bast, Holger; Mortensen, Christian W.; Weber, Ingmar.

In: Information Retrieval, Vol. 11, No. 4, 01.08.2008, p. 269-286.

@article{2b30e219acb7491dbf48d15fbdc965f0,
title = "Output-sensitive autocompletion search",
abstract = "We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that $w \in W$ and $d \in D$. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.",
keywords = "Autocompletion, Index data structure, Output-sensitive, Prefix search",
author = "Holger Bast and Mortensen, {Christian W.} and Ingmar Weber",
year = "2008",
month = "8",
day = "1",
doi = "10.1007/s10791-008-9048-x",
language = "English",
volume = "11",
pages = "269--286",
journal = "Information Retrieval",
issn = "1386-4564",
publisher = "Springer Netherlands",
number = "4",

}

TY - JOUR

T1 - Output-sensitive autocompletion search

AU - Bast, Holger

AU - Mortensen, Christian W.

AU - Weber, Ingmar

PY - 2008/8/1

Y1 - 2008/8/1

N2 - We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.

KW - Autocompletion

KW - Index data structure

KW - Output-sensitive

KW - Prefix search

UR - http://www.scopus.com/inward/record.url?scp=43949116913&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=43949116913&partnerID=8YFLogxK

U2 - 10.1007/s10791-008-9048-x

DO - 10.1007/s10791-008-9048-x

M3 - Article

VL - 11

SP - 269

EP - 286

JO - Information Retrieval

JF - Information Retrieval

SN - 1386-4564

IS - 4

ER -