Classifying the user intent of web queries using k-means clustering

Ashish Kathuria, Bernard Jansen, Carolyn Hafernik, Amanda Spink

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Purpose: Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries.

Design/methodology/approach: For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k-means clustering approach based on a variety of query traits.

Findings: The research findings show that more than 75 percent of web queries (clustered into eight classifications) are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters, six primarily informational, and one each of primarily transactional and navigational.

Research limitations/implications: This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs.

Practical implications: The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results in web searching by implementing a real time classification method as presented in this research.

Originality/value: This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay-off for web search engines can be quite beneficial.
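The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the query traits used, so the three features below (scaled query length, a navigational cue flag, a transactional cue flag), the cue word lists, the sample queries, and the choice of k = 3 for a tiny sample are all assumptions made for illustration. The paper itself reports eight clusters over 130,000 queries.

```python
import math
import random

# Hypothetical cue lists; the paper's actual query traits are not given in the abstract.
NAV_CUES = ("www", ".com", ".org", "http")
TRANS_CUES = ("download", "buy", "free", "lyrics", "software")


def featurize(query):
    """Map a query string to a small numeric vector (illustrative traits only)."""
    q = query.lower()
    terms = q.split()
    return [
        min(len(terms), 10) / 10.0,                     # query length in terms, scaled to [0, 1]
        float(any(cue in q for cue in NAV_CUES)),       # looks like a site name or URL
        float(any(cue in terms for cue in TRANS_CUES)), # contains a transactional term
    ]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


# Made-up sample queries, not taken from the study's data set.
queries = [
    "what causes tides",
    "how do vaccines work",
    "history of rome",
    "www.ebay.com",
    "facebook.com login",
    "download free music",
    "buy laptop online",
    "free mp3 download",
]
points = [featurize(q) for q in queries]
centroids, labels = kmeans(points, k=3)

for query, label in zip(queries, labels):
    print(label, query)
```

K-means only groups queries; it does not name the groups. Labelling a cluster as primarily informational, navigational, or transactional is a separate interpretation step, for example by inspecting which traits dominate each centroid.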

Original language: English
Pages (from-to): 563-581
Number of pages: 19
Journal: Internet Research
Volume: 20
Issue number: 5
DOI: 10.1108/10662241011084112
Publication status: Published - Oct 2010
Externally published: Yes

Fingerprint

World Wide Web
Search engines
search engine
Web services
K-means clustering
Query
Websites
transaction
Internet
Web search
methodology
Values

Keywords

  • Methods of enquiry
  • Search engines
  • User interfaces

ASJC Scopus subject areas

  • Communication
  • Sociology and Political Science
  • Economics and Econometrics

Cite this

Classifying the user intent of web queries using k-means clustering. / Kathuria, Ashish; Jansen, Bernard; Hafernik, Carolyn; Spink, Amanda.

In: Internet Research, Vol. 20, No. 5, 10.2010, p. 563-581.

@article{ec1764bcd2fe4b299ba35fef7beaac89,
title = "Classifying the user intent of web queries using k-means clustering",
keywords = "Methods of enquiry, Search engines, User interfaces",
author = "Ashish Kathuria and Bernard Jansen and Carolyn Hafernik and Amanda Spink",
year = "2010",
month = "10",
doi = "10.1108/10662241011084112",
language = "English",
volume = "20",
pages = "563--581",
journal = "Internet Research",
issn = "1066-2243",
publisher = "Emerald Group Publishing Ltd.",
number = "5",

}

TY - JOUR

T1 - Classifying the user intent of web queries using k-means clustering

AU - Kathuria, Ashish

AU - Jansen, Bernard

AU - Hafernik, Carolyn

AU - Spink, Amanda

PY - 2010/10

Y1 - 2010/10

KW - Methods of enquiry

KW - Search engines

KW - User interfaces

UR - http://www.scopus.com/inward/record.url?scp=78049471879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78049471879&partnerID=8YFLogxK

U2 - 10.1108/10662241011084112

DO - 10.1108/10662241011084112

M3 - Article

VL - 20

SP - 563

EP - 581

JO - Internet Research

JF - Internet Research

SN - 1066-2243

IS - 5

ER -