Analysis and automatic classification of web search queries for diversification requirements

Sumit Bhatia, Cliff Brunk, Prasenjit Mitra

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

Search result diversification enables modern search engines to construct a result list whose documents are relevant to the user query and, at the same time, diverse enough to meet the expectations of a diverse user population. However, not all queries received by a search engine may benefit from diversification. Further, different types of queries may benefit from different diversification mechanisms. In this paper, we present an analysis of the logs of a commercial web search engine and study web search queries for their diversification requirements. We analyze queries based on their click entropy and popularity and propose a query taxonomy based on their diversification requirements. We then carry out the task of automatically classifying web search queries into one of the classes of our proposed taxonomy. We utilize various query-based, click-based, and reformulation-based features for the query classification task and achieve strong classification results.
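
The abstract refers to click entropy without defining it. As an illustration only, the sketch below uses a common definition, the Shannon entropy of the distribution of clicked results for a query; this definition, the function names, and the (query, clicked URL) log layout are assumptions and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def click_entropy(clicked_urls):
    """Shannon entropy, in bits, of the clicks observed for one query.

    Assumed definition: H(q) = sum_d p(d|q) * log2(1 / p(d|q)), where
    p(d|q) is the fraction of the query's clicks that landed on URL d.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks observed, treat entropy as zero
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_by_query(log_rows):
    """Group hypothetical (query, clicked_url) rows and score each query."""
    clicks = defaultdict(list)
    for query, url in log_rows:
        clicks[query].append(url)
    return {query: click_entropy(urls) for query, urls in clicks.items()}


if __name__ == "__main__":
    # Toy log rows, invented for illustration.
    rows = [
        ("jaguar", "cars.example"),
        ("jaguar", "animals.example"),
        ("jaguar", "cars.example"),
        ("jaguar", "animals.example"),
        ("facebook login", "facebook.example"),
        ("facebook login", "facebook.example"),
    ]
    for query, h in entropy_by_query(rows).items():
        print(f"{query}: {h:.2f} bits")
```

Under this assumed definition, a query whose clicks all land on one result scores zero, while a query whose clicks are spread over several results scores higher, which is the kind of signal that could suggest a query benefits from diversified results.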

Original language: English
Journal: Proceedings of the ASIST Annual Meeting
Volume: 49
Issue number: 1
DOI: 10.1002/meet.14504901188
Publication status: Published - 2012
Externally published: Yes

Keywords

  • Diversity
  • Query classification
  • Query logs
  • Web search

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences

Cite this

Analysis and automatic classification of web search queries for diversification requirements. / Bhatia, Sumit; Brunk, Cliff; Mitra, Prasenjit.

In: Proceedings of the ASIST Annual Meeting, Vol. 49, No. 1, 2012.

@article{cc57552ef8064c34a0f58a495bb230b9,
title = "Analysis and automatic classification of web search queries for diversification requirements",
abstract = "Search result diversification enables the modern day search engines to construct a result list that consists of documents that are relevant to the user query and at the same time, diverse enough to meet the expectations of a diverse user population. However, all the queries received by a search engine may not benefit from diversification. Further, different types of queries may benefit from different diversification mechanisms. In this paper we present an analysis of logs of a commercial web search engine and study the web search queries for their diversification requirements. We analyze queries based on their click entropy and popularity and propose a query taxonomy based on their diversification requirements. We then carry out the task of automatically classifying web search queries into one of the classes of our proposed taxonomy. We utilize various query-based, click-based and reformulation-based features for the query classification task and achieve strong classification results.",
keywords = "Diversity, Query classification, Query logs, Web search",
author = "Sumit Bhatia and Cliff Brunk and Prasenjit Mitra",
year = "2012",
doi = "10.1002/meet.14504901188",
language = "English",
volume = "49",
journal = "Proceedings of the ASIST Annual Meeting",
issn = "1550-8390",
publisher = "Learned Information",
number = "1",
}

TY - JOUR

T1 - Analysis and automatic classification of web search queries for diversification requirements

AU - Bhatia, Sumit

AU - Brunk, Cliff

AU - Mitra, Prasenjit

PY - 2012

Y1 - 2012

N2 - Search result diversification enables the modern day search engines to construct a result list that consists of documents that are relevant to the user query and at the same time, diverse enough to meet the expectations of a diverse user population. However, all the queries received by a search engine may not benefit from diversification. Further, different types of queries may benefit from different diversification mechanisms. In this paper we present an analysis of logs of a commercial web search engine and study the web search queries for their diversification requirements. We analyze queries based on their click entropy and popularity and propose a query taxonomy based on their diversification requirements. We then carry out the task of automatically classifying web search queries into one of the classes of our proposed taxonomy. We utilize various query-based, click-based and reformulation-based features for the query classification task and achieve strong classification results.

KW - Diversity

KW - Query classification

KW - Query logs

KW - Web search

UR - http://www.scopus.com/inward/record.url?scp=84878616838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878616838&partnerID=8YFLogxK

U2 - 10.1002/meet.14504901188

DO - 10.1002/meet.14504901188

M3 - Article

VL - 49

JO - Proceedings of the ASIST Annual Meeting

JF - Proceedings of the ASIST Annual Meeting

SN - 1550-8390

IS - 1

ER -