Real time search on the web

Queries, topics, and economic value

Bernard Jansen, Zhe Liu, Courtney Weaver, Gerry Campbell, Matthew Gregg

Research output: Contribution to journal (Article)

16 Citations (Scopus)

Abstract

Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results show that 30% of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60% of the search traffic comes from the search engine's application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user base. However, 36% of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US $33,000,000 (US $33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the mainstream as an information source.
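The "unique query" measure in the abstract (queries used only once in the entire dataset) can be illustrated with a small sketch. This is not the authors' code: the function name `unique_query_share`, the whitespace and case normalization, the choice of total submissions as the denominator, and the sample log are all assumptions made for illustration.

```python
from collections import Counter


def unique_query_share(queries):
    """Return the fraction of query submissions whose text occurs exactly once."""
    # Assumed normalization: trim surrounding whitespace and ignore case.
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # A query is "unique" here if it was submitted only once in the whole log.
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / total


# Hypothetical six-entry log; "iphone" appears three times after normalization.
sample_log = ["iphone", "iPhone", "world cup", "election results", "iphone ", "oscars"]
share = unique_query_share(sample_log)
print(f"{share:.0%} of submissions are unique queries")
```

A different denominator (distinct query strings rather than all submissions) would give a different percentage, so the choice should match how the study defines the figure.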

Original language: English
Pages (from-to): 491-506
Number of pages: 16
Journal: Information Processing and Management
Volume: 47
Issue number: 4
DOI: 10.1016/j.ipm.2011.01.007
Publication status: Published - Jul 2011
Externally published: Yes

Fingerprint

economic value
Search engines
Economics
search engine
Application programs
Interfaces (computer)
Marketing
time
Query
World Wide Web
Economic value
traffic
entertainment

Keywords

  • Collecta
  • Economic value of search
  • Real time content
  • Real time search
  • Search topics
  • Twitter

ASJC Scopus subject areas

  • Media Technology
  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
  • Management Science and Operations Research

Cite this

Real time search on the web: Queries, topics, and economic value. / Jansen, Bernard; Liu, Zhe; Weaver, Courtney; Campbell, Gerry; Gregg, Matthew.

In: Information Processing and Management, Vol. 47, No. 4, 07.2011, p. 491-506.

Jansen, Bernard; Liu, Zhe; Weaver, Courtney; Campbell, Gerry; Gregg, Matthew. / Real time search on the web: Queries, topics, and economic value. In: Information Processing and Management. 2011; Vol. 47, No. 4. pp. 491-506.
@article{3384e116e4aa4971a7f71462801e6149,
title = "Real time search on the web: Queries, topics, and economic value",
abstract = "Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results show that 30{\%} of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60{\%} of the search traffic comes from the search engine's application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user base. However, 36{\%} of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US {\$}33,000,000 (US {\$}33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the mainstream as an information source.",
keywords = "Collecta, Economic value of search, Real time content, Real time search, Search topics, Twitter",
author = "Bernard Jansen and Zhe Liu and Courtney Weaver and Gerry Campbell and Matthew Gregg",
year = "2011",
month = "7",
doi = "10.1016/j.ipm.2011.01.007",
language = "English",
volume = "47",
pages = "491--506",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited",
number = "4",

}

TY - JOUR

T1 - Real time search on the web

T2 - Queries, topics, and economic value

AU - Jansen, Bernard

AU - Liu, Zhe

AU - Weaver, Courtney

AU - Campbell, Gerry

AU - Gregg, Matthew

PY - 2011/7

Y1 - 2011/7

N2 - Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results show that 30% of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60% of the search traffic comes from the search engine's application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user base. However, 36% of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US $33,000,000 (US $33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the mainstream as an information source.

AB - Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results show that 30% of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60% of the search traffic comes from the search engine's application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user base. However, 36% of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US $33,000,000 (US $33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the mainstream as an information source.

KW - Collecta

KW - Economic value of search

KW - Real time content

KW - Real time search

KW - Search topics

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=79957601305&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79957601305&partnerID=8YFLogxK

U2 - 10.1016/j.ipm.2011.01.007

DO - 10.1016/j.ipm.2011.01.007

M3 - Article

VL - 47

SP - 491

EP - 506

JO - Information Processing and Management

JF - Information Processing and Management

SN - 0306-4573

IS - 4

ER -