Who uses web search for what? And how?

Ingmar Weber, Alejandro Jaimes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

58 Citations (Scopus)

Abstract

We analyze a large query log of 2.3 million anonymous registered users from a web-scale U.S. search engine in order to jointly study their online behavior in terms of who they might be (demographics), what they search for (query topics), and how they search (session analysis). We examine basic demographics from registration information provided by the users, augmented with U.S. census data; analyze basic session statistics; classify queries into types (navigational, informational, transactional) based on click entropy; classify queries into topic categories; and cluster users based on the queries they issued. We then examine the resulting clusters in terms of demographics and search behavior. Our analysis of the data suggests that there are important differences in search behavior across demographic groups, both in the topics they search for and in how they search (e.g., white conservatives are users likely to have voted Republican, mostly white males, who search for business, home, and gardening-related topics; Baby Boomers tend to be primarily interested in Finance, and a large fraction of their sessions consist of simple navigational queries related to online banking). Finally, we examine regional search differences, which seem to correlate with differences in local industries (e.g., gambling-related queries are highest in Las Vegas and lowest in Salt Lake City; searches related to actors are about three times higher in L.A. than in any other region).
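The abstract says queries are classified into types based on click entropy but gives neither the formula nor the cut-offs. As a rough illustration only, the sketch below computes the Shannon entropy of the clicked-URL distribution for each query and flags low-entropy queries as likely navigational. The log layout, the function names, the example URLs, and the 0.5-bit threshold are all assumptions made for this example; they are not taken from the paper, and the sketch does not attempt to separate informational from transactional queries.

```python
import math
from collections import Counter, defaultdict

def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the click distribution for one query."""
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    # An empty list yields an empty sum, so the result is 0.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_by_query(click_log):
    """Group (query, clicked_url) pairs by query and compute each entropy."""
    clicks = defaultdict(list)
    for query, url in click_log:
        clicks[query].append(url)
    return {query: click_entropy(urls) for query, urls in clicks.items()}

# Hypothetical cut-off, chosen for illustration; not a value from the paper.
NAVIGATIONAL_MAX_ENTROPY = 0.5

def looks_navigational(entropy, threshold=NAVIGATIONAL_MAX_ENTROPY):
    """Low entropy means clicks concentrate on very few URLs."""
    return entropy <= threshold

# Made-up log: one query with concentrated clicks, one with spread-out clicks.
sample_log = (
    [("bank login", "bank.example/login")] * 9
    + [("bank login", "bank.example/help")]
    + [("pasta recipes", f"site{i}.example") for i in range(4)]
)

entropies = entropy_by_query(sample_log)
for query, h in entropies.items():
    print(f"{query}: {h:.3f} bits, navigational={looks_navigational(h)}")
```

Intuitively, when nearly every user who issues a query clicks the same result, the entropy is close to zero, which matches the idea of a navigational query; clicks spread evenly over many results give a high entropy.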

Original language: English
Title of host publication: Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
Pages: 15-24
Number of pages: 10
DOIs: https://doi.org/10.1145/1935826.1935839
Publication status: Published - 14 Mar 2011
Externally published: Yes
Event: 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 - Hong Kong, China
Duration: 9 Feb 2011 to 12 Feb 2011

Other

Other: 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
Country: China
City: Hong Kong
Period: 9/2/11 to 12/2/11

Fingerprint

Finance
Search engines
World Wide Web
Industry
Entropy
Statistics

Keywords

  • Demographics
  • Query logs
  • Session analysis
  • Topic classification

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Software

Cite this

Weber, I., & Jaimes, A. (2011). Who uses web search for what? And how? In Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 (pp. 15-24) https://doi.org/10.1145/1935826.1935839

@inproceedings{ce53a58bc4a54f839eb4b24bfaaaae60,
title = "Who uses web search for what? And how?",
keywords = "Demographics, Query logs, Session analysis, Topic classification",
author = "Ingmar Weber and Alejandro Jaimes",
year = "2011",
month = "3",
day = "14",
doi = "10.1145/1935826.1935839",
language = "English",
isbn = "9781450304931",
pages = "15--24",
booktitle = "Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011",

}

TY - GEN

T1 - Who uses web search for what? And how?

AU - Weber, Ingmar

AU - Jaimes, Alejandro

PY - 2011/3/14

Y1 - 2011/3/14

KW - Demographics

KW - Query logs

KW - Session analysis

KW - Topic classification

UR - http://www.scopus.com/inward/record.url?scp=79952418852&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952418852&partnerID=8YFLogxK

U2 - 10.1145/1935826.1935839

DO - 10.1145/1935826.1935839

M3 - Conference contribution

SN - 9781450304931

SP - 15

EP - 24

BT - Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011

ER -