Time series analysis of a Web search engine transaction log

Ying Zhang, Bernard Jansen, Amanda Spink

Research output: Contribution to journal (Article)

43 Citations (Scopus)

Abstract

In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers' behaviors over time and investigate if time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users' actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10% of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest ranked results. We discuss implications, including predictive value, and future research.

Original language: English
Pages (from-to): 230-245
Number of pages: 16
Journal: Information Processing and Management
Volume: 45
Issue number: 2
DOI: 10.1016/j.ipm.2008.07.003
Publication status: Published - Mar 2009
Externally published: Yes

Fingerprint

Time series analysis
Search engines
Transaction
Transfer functions
Web search
Scenario
Interaction
Query

Keywords

  • ARIMA
  • Box-Jenkins model
  • Search engine
  • Time series analysis
  • Transactional log

ASJC Scopus subject areas

  • Media Technology
  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
  • Management Science and Operations Research

Cite this

Time series analysis of a Web search engine transaction log. / Zhang, Ying; Jansen, Bernard; Spink, Amanda.

In: Information Processing and Management, Vol. 45, No. 2, 03.2009, p. 230-245.

@article{b027a286867c48a5838a630eca8089c5,
title = "Time series analysis of a Web search engine transaction log",
abstract = "In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers' behaviors over time and investigate if time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users' actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10{\%} of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest ranked results. We discuss implications, including predictive value, and future research.",
keywords = "ARIMA, Box-Jenkins model, Search engine, Time series analysis, Transactional log",
author = "Ying Zhang and Bernard Jansen and Amanda Spink",
year = "2009",
month = "3",
doi = "10.1016/j.ipm.2008.07.003",
language = "English",
volume = "45",
pages = "230--245",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited",
number = "2",
}

TY  - JOUR
T1  - Time series analysis of a Web search engine transaction log
AU  - Zhang, Ying
AU  - Jansen, Bernard
AU  - Spink, Amanda
PY  - 2009/3
Y1  - 2009/3
AB  - In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers' behaviors over time and investigate if time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users' actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10% of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest ranked results. We discuss implications, including predictive value, and future research.
KW  - ARIMA
KW  - Box-Jenkins model
KW  - Search engine
KW  - Time series analysis
KW  - Transactional log
UR  - http://www.scopus.com/inward/record.url?scp=60549085552&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=60549085552&partnerID=8YFLogxK
U2  - 10.1016/j.ipm.2008.07.003
DO  - 10.1016/j.ipm.2008.07.003
M3  - Article
VL  - 45
SP  - 230
EP  - 245
JO  - Information Processing and Management
JF  - Information Processing and Management
SN  - 0306-4573
IS  - 2
ER  -