Topical query decomposition

Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting documents corresponds approximately to that of the original query. Ideally, these queries should represent coherent, conceptually well-separated topics. We provide an abstract formulation of the query decomposition problem, and we tackle it from two different perspectives. We first show how the problem can be instantiated as a specific variant of a set cover problem, for which we provide an efficient greedy algorithm. Next, we show how the same problem can be seen as a constrained clustering problem, with a very particular kind of constraint, i.e., clustering with predefined clusters. We develop a two-phase algorithm based on hierarchical agglomerative clustering followed by dynamic programming. Our experiments, conducted on a set of actual queries in a Web scale search engine, confirm the effectiveness of the proposed solutions.
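The set cover view mentioned in the abstract can be illustrated with the textbook greedy heuristic. This is a minimal sketch only, not the authors' algorithm: the abstract describes "a specific variant" of set cover whose details are not given here, so the function name `greedy_cover`, the `max_queries` budget, and the "jaguar" example data are all hypothetical. The sketch also ignores topical coherence and documents that a candidate query returns outside the original result set.

```python
from typing import Dict, Hashable, List, Set


def greedy_cover(
    target: Set[Hashable],
    candidates: Dict[str, Set[Hashable]],
    max_queries: int,
) -> List[str]:
    """Pick up to max_queries candidate queries whose results cover target.

    Textbook greedy set cover (an assumption, not the paper's variant):
    at each step, take the candidate query that returns the largest
    number of still-uncovered target documents.
    """
    uncovered = set(target)
    chosen: List[str] = []
    while uncovered and len(chosen) < max_queries:
        best, best_gain = None, 0
        for name in sorted(candidates):  # sorted for deterministic tie-breaks
            if name in chosen:
                continue
            gain = len(candidates[name] & uncovered)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining candidate covers anything new
            break
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen


# Hypothetical data: document ids returned for the original query "jaguar"
# and for four candidate queries.
target = {1, 2, 3, 4, 5, 6}
candidates = {
    "jaguar car": {1, 2, 3},
    "jaguar animal": {4, 5},
    "jaguar os": {3, 6},
    "jaguar": {1, 4},
}
print(greedy_cover(target, candidates, max_queries=3))
```

With this toy input the three more specific queries together return every document of the original query, and the overlapping candidate "jaguar" is never selected because it adds nothing new after the first pick.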

Original language: English
Title of host publication: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 52-60
Number of pages: 9
DOI: https://doi.org/10.1145/1401890.1401902
Publication status: Published - 1 Dec 2008
Externally published: Yes
Event: 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008 - Las Vegas, NV, United States
Duration: 24 Aug 2008 to 27 Aug 2008

Fingerprint

Decomposition
Information retrieval systems
Search engines
Dynamic programming
World Wide Web
Experiments

Keywords

  • Clustering
  • Query recommendation
  • Set cover

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Bonchi, F., Castillo, C., Donato, D., & Gionis, A. (2008). Topical query decomposition. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 52-60). https://doi.org/10.1145/1401890.1401902

@inproceedings{d1b4de54ea594eec80db6d4c0f43289c,
title = "Topical query decomposition",
abstract = "We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting documents corresponds approximately to that of the original query. Ideally, these queries should represent coherent, conceptually well-separated topics. We provide an abstract formulation of the query decomposition problem, and we tackle it from two different perspectives. We first show how the problem can be instantiated as a specific variant of a set cover problem, for which we provide an efficient greedy algorithm. Next, we show how the same problem can be seen as a constrained clustering problem, with a very particular kind of constraint, i.e., clustering with predefined clusters. We develop a two-phase algorithm based on hierarchical agglomerative clustering followed by dynamic programming. Our experiments, conducted on a set of actual queries in a Web scale search engine, confirm the effectiveness of the proposed solutions.",
keywords = "Clustering, Query recommendation, Set cover",
author = "Francesco Bonchi and Carlos Castillo and Debora Donato and Aristides Gionis",
year = "2008",
month = "12",
day = "1",
doi = "10.1145/1401890.1401902",
language = "English",
isbn = "9781605581934",
pages = "52--60",
booktitle = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
}

TY - GEN

T1 - Topical query decomposition

AU - Bonchi, Francesco

AU - Castillo, Carlos

AU - Donato, Debora

AU - Gionis, Aristides

PY - 2008/12/1

Y1 - 2008/12/1

AB - We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting documents corresponds approximately to that of the original query. Ideally, these queries should represent coherent, conceptually well-separated topics. We provide an abstract formulation of the query decomposition problem, and we tackle it from two different perspectives. We first show how the problem can be instantiated as a specific variant of a set cover problem, for which we provide an efficient greedy algorithm. Next, we show how the same problem can be seen as a constrained clustering problem, with a very particular kind of constraint, i.e., clustering with predefined clusters. We develop a two-phase algorithm based on hierarchical agglomerative clustering followed by dynamic programming. Our experiments, conducted on a set of actual queries in a Web scale search engine, confirm the effectiveness of the proposed solutions.

KW - Clustering

KW - Query recommendation

KW - Set cover

UR - http://www.scopus.com/inward/record.url?scp=65449115787&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65449115787&partnerID=8YFLogxK

U2 - 10.1145/1401890.1401902

DO - 10.1145/1401890.1401902

M3 - Conference contribution

AN - SCOPUS:65449115787

SN - 9781605581934

SP - 52

EP - 60

BT - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

ER -