A study of results overlap and uniqueness among major Web search engines

Amanda Spink, Bernard Jansen, Chris Blakely, Sherry Koshman

Research output: Contribution to journal › Article

98 Citations (Scopus)

Abstract

The performance and capabilities of Web search engines are an important area of research. Millions of people worldwide use Web search engines every day. This paper reports the results of a major study examining the overlap among results retrieved by multiple Web search engines for a large set of more than 10,000 queries. Previous smaller studies have discussed a lack of overlap in results returned by Web search engines for the same queries. The goal of the current study was to measure, at large scale, the overlap of search results on the first result page (both non-sponsored and sponsored) across the four most popular Web search engines, at specific points in time using a large number of queries. The Web search engines included in the study were MSN Search, Google, Yahoo! and Ask Jeeves. Our study then compares these results with the first page results retrieved for the same queries by the metasearch engine Dogpile.com. Two sets of randomly selected user-entered queries, one of 10,316 queries and the other of 12,570 queries, from Infospace's Dogpile.com search engine (the first set was from Dogpile, the second from across the Infospace Network of search properties) were submitted to the four single Web search engines. Findings show that the percentage of total results unique to only one of the four Web search engines was 84.9%, shared by two of the four Web search engines was 11.4%, shared by three was 2.6%, and shared by all four was 1.1%. This small degree of overlap shows the significant difference in the way major Web search engines retrieve and rank results in response to given queries. Results point to the value of metasearch engines in Web retrieval to overcome the biases of individual search engines.
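The overlap measure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `overlap_distribution`, the engine labels, and the URLs are hypothetical. The abstract also does not say whether percentages were pooled across queries or averaged per query, so the sketch handles a single query only. It counts how many engines returned each distinct first-page result and reports the share returned by exactly one, two, three, or four engines.

```python
from collections import Counter


def overlap_distribution(results_by_engine):
    """Return {k: percent of distinct results returned by exactly k engines}."""
    # Count, for each distinct URL, how many engines returned it.
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):  # de-duplicate within one engine's page
            engines_per_url[url] += 1

    total = len(engines_per_url)
    if total == 0:
        return {}

    # Tally how many URLs were returned by exactly 1, 2, ... engines.
    urls_per_count = Counter(engines_per_url.values())
    return {
        k: 100.0 * urls_per_count.get(k, 0) / total
        for k in range(1, len(results_by_engine) + 1)
    }


# Hypothetical first-page results for one query (made-up identifiers).
toy = {
    "engine_a": ["u1", "u2", "u3", "u4"],
    "engine_b": ["u1", "u5", "u6"],
    "engine_c": ["u1", "u2", "u7"],
    "engine_d": ["u1", "u8", "u9", "u10"],
}
dist = overlap_distribution(toy)
print(dist)
```

In this toy data, eight of the ten distinct results come from a single engine, one is shared by two engines, and one by all four. Real results would first need URL normalization (for example, trailing slashes or tracking parameters) before matching, which this sketch omits.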

Original language: English
Pages (from-to): 1379-1391
Number of pages: 13
Journal: Information Processing and Management
Volume: 42
Issue number: 5
DOIs: 10.1016/j.ipm.2005.11.001
Publication status: Published - Sep 2006
Externally published: Yes

Fingerprint

Search engines
Uniqueness
Web search
Engines
Query
World Wide Web

Keywords

  • Ask Jeeves
  • Dogpile
  • Google
  • Infospace Inc
  • MSN Search
  • Overlap
  • Web search engine
  • Yahoo

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Library and Information Sciences

Cite this

A study of results overlap and uniqueness among major Web search engines. / Spink, Amanda; Jansen, Bernard; Blakely, Chris; Koshman, Sherry.

In: Information Processing and Management, Vol. 42, No. 5, 09.2006, p. 1379-1391.

Spink, Amanda ; Jansen, Bernard ; Blakely, Chris ; Koshman, Sherry. / A study of results overlap and uniqueness among major Web search engines. In: Information Processing and Management. 2006 ; Vol. 42, No. 5. pp. 1379-1391.
@article{7d5ad346922748489021a680ef514c11,
title = "A study of results overlap and uniqueness among major Web search engines",
keywords = "Ask Jeeves, Dogpile, Google, Infospace Inc, MSN Search, Overlap, Web search engine, Yahoo",
author = "Amanda Spink and Bernard Jansen and Chris Blakely and Sherry Koshman",
year = "2006",
month = "9",
doi = "10.1016/j.ipm.2005.11.001",
language = "English",
volume = "42",
pages = "1379--1391",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited",
number = "5",

}

TY - JOUR

T1 - A study of results overlap and uniqueness among major Web search engines

AU - Spink, Amanda

AU - Jansen, Bernard

AU - Blakely, Chris

AU - Koshman, Sherry

PY - 2006/9

Y1 - 2006/9

KW - Ask Jeeves

KW - Dogpile

KW - Google

KW - Infospace Inc

KW - MSN Search

KW - Overlap

KW - Web search engine

KW - Yahoo

UR - http://www.scopus.com/inward/record.url?scp=33748272156&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33748272156&partnerID=8YFLogxK

U2 - 10.1016/j.ipm.2005.11.001

DO - 10.1016/j.ipm.2005.11.001

M3 - Article

VL - 42

SP - 1379

EP - 1391

JO - Information Processing and Management

JF - Information Processing and Management

SN - 0306-4573

IS - 5

ER -