Adversarial Web search

Carlos Castillo, Brian D. Davison

Research output: Contribution to journal (Article)

49 Citations (Scopus)

Abstract

Web search engines have become indispensable tools for finding content. As the popularity of the Web has increased, the efforts to exploit the Web for commercial, social, or political advantage have grown, making it harder for search engines to discriminate between truthful signals of content quality and deceptive attempts to game search engines' rankings. This problem is further complicated by the open nature of the Web, which allows anyone to write and publish anything, and by the fact that search engines must analyze ever-growing numbers of Web pages. Moreover, increasing expectations of users, who over time rely on Web search for information needs related to more aspects of their lives, further deepen the need for search engines to develop effective counter-measures against deception. In this monograph, we consider the effects of the adversarial relationship between search systems and those who wish to manipulate them, a field known as "Adversarial Information Retrieval". We show that search engine spammers create false content and misleading links to lure unsuspecting visitors to pages filled with advertisements or malware. We also examine work over the past decade or so that aims to discover such spamming activities to get spam pages removed or their effect on the quality of the results reduced. Research in Adversarial Information Retrieval has been evolving over time, and currently continues both in traditional areas (e.g., link spam) and newer areas, such as click fraud and spam in social media, demonstrating that this conflict is far from over.

Original language: English
Pages (from-to): 377-486
Number of pages: 110
Journal: Foundations and Trends in Information Retrieval
Volume: 4
Issue number: 5
DOIs: 10.1561/1500000021
Publication status: Published - 1 Dec 2010
Externally published: Yes

Fingerprint

Search engines
World Wide Web
Information retrieval
Spamming
Websites

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Information Systems

Cite this

Adversarial Web search. / Castillo, Carlos; Davison, Brian D.

In: Foundations and Trends in Information Retrieval, Vol. 4, No. 5, 01.12.2010, p. 377-486.

Castillo, Carlos ; Davison, Brian D. / Adversarial Web search. In: Foundations and Trends in Information Retrieval. 2010 ; Vol. 4, No. 5. pp. 377-486.
@article{d08730652dc84d38a27dedc395d4a7f5,
title = "Adversarial Web search",
abstract = "Web search engines have become indispensable tools for finding content. As the popularity of the Web has increased, the efforts to exploit the Web for commercial, social, or political advantage have grown, making it harder for search engines to discriminate between truthful signals of content quality and deceptive attempts to game search engines' rankings. This problem is further complicated by the open nature of the Web, which allows anyone to write and publish anything, and by the fact that search engines must analyze ever-growing numbers of Web pages. Moreover, increasing expectations of users, who over time rely on Web search for information needs related to more aspects of their lives, further deepen the need for search engines to develop effective counter-measures against deception. In this monograph, we consider the effects of the adversarial relationship between search systems and those who wish to manipulate them, a field known as {"}Adversarial Information Retrieval{"}. We show that search engine spammers create false content and misleading links to lure unsuspecting visitors to pages filled with advertisements or malware. We also examine work over the past decade or so that aims to discover such spamming activities to get spam pages removed or their effect on the quality of the results reduced. Research in Adversarial Information Retrieval has been evolving over time, and currently continues both in traditional areas (e.g., link spam) and newer areas, such as click fraud and spam in social media, demonstrating that this conflict is far from over.",
author = "Castillo, Carlos and Davison, {Brian D.}",
year = "2010",
month = "12",
day = "1",
doi = "10.1561/1500000021",
language = "English",
volume = "4",
pages = "377--486",
journal = "Foundations and Trends in Information Retrieval",
issn = "1554-0669",
publisher = "Now Publishers Inc",
number = "5",

}

TY - JOUR

T1 - Adversarial Web search

AU - Castillo, Carlos

AU - Davison, Brian D.

PY - 2010/12/1

Y1 - 2010/12/1

AB - Web search engines have become indispensable tools for finding content. As the popularity of the Web has increased, the efforts to exploit the Web for commercial, social, or political advantage have grown, making it harder for search engines to discriminate between truthful signals of content quality and deceptive attempts to game search engines' rankings. This problem is further complicated by the open nature of the Web, which allows anyone to write and publish anything, and by the fact that search engines must analyze ever-growing numbers of Web pages. Moreover, increasing expectations of users, who over time rely on Web search for information needs related to more aspects of their lives, further deepen the need for search engines to develop effective counter-measures against deception. In this monograph, we consider the effects of the adversarial relationship between search systems and those who wish to manipulate them, a field known as "Adversarial Information Retrieval". We show that search engine spammers create false content and misleading links to lure unsuspecting visitors to pages filled with advertisements or malware. We also examine work over the past decade or so that aims to discover such spamming activities to get spam pages removed or their effect on the quality of the results reduced. Research in Adversarial Information Retrieval has been evolving over time, and currently continues both in traditional areas (e.g., link spam) and newer areas, such as click fraud and spam in social media, demonstrating that this conflict is far from over.

UR - http://www.scopus.com/inward/record.url?scp=79251544948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79251544948&partnerID=8YFLogxK

U2 - 10.1561/1500000021

DO - 10.1561/1500000021

M3 - Article

AN - SCOPUS:79251544948

VL - 4

SP - 377

EP - 486

JO - Foundations and Trends in Information Retrieval

JF - Foundations and Trends in Information Retrieval

SN - 1554-0669

IS - 5

ER -