Output-sensitive autocompletion search

Holger Bast, Christian W. Mortensen, Ingmar Weber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
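
The core problem stated in the abstract can be illustrated with a naive baseline over a plain inverted index. This is only a sketch to make the problem definition concrete; it is not the data structure presented in the paper. The function name `autocomplete_pairs`, the dictionary layout of `index`, and the use of `"\uffff"` as an upper sentinel for a prefix range are all assumptions made for this illustration.

```python
from bisect import bisect_left, bisect_right


def autocomplete_pairs(index, doc_ids, lo, hi):
    """Return all (w, d) with lo <= w <= hi and d in doc_ids.

    index   -- hypothetical inverted index: dict mapping word -> list of doc ids
    doc_ids -- the document set D
    lo, hi  -- inclusive bounds of the alphabetical word range W
    """
    words = sorted(index)              # a real system would keep this precomputed
    start = bisect_left(words, lo)     # first word >= lo
    end = bisect_right(words, hi)      # one past the last word <= hi
    docs = set(doc_ids)
    pairs = []
    for w in words[start:end]:
        # Baseline: scan the full posting list of every word in the range.
        for d in index[w]:
            if d in docs:
                pairs.append((w, d))
    return pairs


# Toy collection (made up for illustration).
index = {"search": [1, 2, 5], "seal": [2, 3], "sort": [1, 4], "sea": [5]}
D = {2, 5}

# All completions of the prefix "sea", expressed as the range ["sea", "sea\uffff"].
result = autocomplete_pairs(index, D, "sea", "sea\uffff")
print(result)
```

Note that this baseline reads every posting of every word in W, so its running time depends on the lengths of those lists rather than only on the sizes of D and of the output, which is the dependence the abstract says the new data structure avoids on average.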

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 150-162
Number of pages: 13
Volume: 4209 LNCS
Publication status: Published - 31 Oct 2006
Externally published: Yes
Event: 13th International Conference on String Processing and Information Retrieval, SPIRE 2006 - Glasgow, United Kingdom
Duration: 11 Oct 2006 to 13 Oct 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4209 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th International Conference on String Processing and Information Retrieval, SPIRE 2006
Country: United Kingdom
City: Glasgow
Period: 11/10/06 to 13/10/06

Fingerprint

Data structures
Query
Hits
Search engines
Query processing
Output
Correlate
Completion
Linear Time
Scenarios
Range of data

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Bast, H., Mortensen, C. W., & Weber, I. (2006). Output-sensitive autocompletion search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4209 LNCS, pp. 150-162). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4209 LNCS).

@inproceedings{f949bf613cd14acb8081cb88056fbae6,
title = "Output-sensitive autocompletion search",
abstract = "We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.",
author = "Holger Bast and Mortensen, {Christian W.} and Ingmar Weber",
year = "2006",
month = "10",
day = "31",
language = "English",
isbn = "3540457747",
volume = "4209 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "150--162",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Output-sensitive autocompletion search

AU - Bast, Holger

AU - Mortensen, Christian W.

AU - Weber, Ingmar

PY - 2006/10/31

Y1 - 2006/10/31

AB - We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.

UR - http://www.scopus.com/inward/record.url?scp=33750295990&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33750295990&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:33750295990

SN - 3540457747

SN - 9783540457749

VL - 4209 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 150

EP - 162

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -