Predicting query reformulation during Web searching

Bernard Jansen, Danielle Booth, Amanda Spink

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and to predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.
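The n-gram idea in the abstract can be illustrated with a minimal sketch: count how often each reformulation state follows a history of preceding states, then predict the most frequent follower. This is not the authors' implementation. The function names `train_ngram` and `predict_next`, the toy sessions, and the labels `New` and `ContentChange` are hypothetical; only "Reformulation" and "Assistance" are named in the abstract. The sketch also assumes that an order-k model conditions on the k preceding states, which the abstract does not spell out.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, order):
    """Count how often each state follows each history of `order` preceding states."""
    counts = defaultdict(Counter)
    for states in sessions:
        for i in range(order, len(states)):
            history = tuple(states[i - order:i])
            counts[history][states[i]] += 1
    return counts


def predict_next(counts, history):
    """Return (most frequent next state, its relative frequency), or None if the history was never seen."""
    followers = counts.get(tuple(history))
    if not followers:
        return None
    state, n = followers.most_common(1)[0]
    return state, n / sum(followers.values())


# Toy sessions with illustrative state labels (assumed, not taken from the study's data).
sessions = [
    ["New", "Reformulation", "Assistance"],
    ["New", "Assistance", "Reformulation"],
    ["New", "Reformulation", "Assistance", "ContentChange", "Assistance"],
    ["New", "Reformulation", "Reformulation"],
]

first_order = train_ngram(sessions, order=1)
second_order = train_ngram(sessions, order=2)

print(predict_next(first_order, ["New"]))
print(predict_next(second_order, ["New", "Reformulation"]))
```

Higher-order models condition on longer histories, so each history is observed less often. That sparsity is one plausible reason why lower-order models can predict better, as the abstract reports for the first and second order models.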

Original language: English
Title of host publication: Conference on Human Factors in Computing Systems - Proceedings
Pages: 3907-3912
Number of pages: 6
DOIs: https://doi.org/10.1145/1520340.1520592
Publication status: Published - 2009
Externally published: Yes
Event: 27th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI 2009 - Boston, MA
Duration: 4 Apr 2009 → 9 Apr 2009

Other

Other: 27th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI 2009
City: Boston, MA
Period: 4/4/09 → 9/4/09

Keywords

  • N-grams
  • Query reformulation
  • Stochastic process
  • Web queries
  • Web sessions

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Jansen, B., Booth, D., & Spink, A. (2009). Predicting query reformulation during Web searching. In Conference on Human Factors in Computing Systems - Proceedings (pp. 3907-3912) https://doi.org/10.1145/1520340.1520592

@inproceedings{e7bc94c9ea8948adb242ce4963fe3fb0,
title = "Predicting query reformulation during Web searching",
abstract = "This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and to predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.",
keywords = "N-grams, Query reformulation, Stochastic process, Web queries, Web sessions",
author = "Bernard Jansen and Danielle Booth and Amanda Spink",
year = "2009",
doi = "10.1145/1520340.1520592",
language = "English",
isbn = "9781605582474",
pages = "3907--3912",
booktitle = "Conference on Human Factors in Computing Systems - Proceedings",
}

TY  - GEN
T1  - Predicting query reformulation during Web searching
AU  - Jansen, Bernard
AU  - Booth, Danielle
AU  - Spink, Amanda
PY  - 2009
Y1  - 2009
AB  - This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and to predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.
KW  - N-grams
KW  - Query reformulation
KW  - Stochastic process
KW  - Web queries
KW  - Web sessions
UR  - http://www.scopus.com/inward/record.url?scp=70349190160&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=70349190160&partnerID=8YFLogxK
U2  - 10.1145/1520340.1520592
DO  - 10.1145/1520340.1520592
M3  - Conference contribution
AN  - SCOPUS:70349190160
SN  - 9781605582474
SP  - 3907
EP  - 3912
BT  - Conference on Human Factors in Computing Systems - Proceedings
ER  -