Graph regularization methods for web spam detection

Jacob Abernethy, Olivier Chapelle, Carlos Castillo

Research output: Contribution to journal > Article

29 Citations (Scopus)

Abstract

We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.
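
The abstract does not state the objective being optimized, so the following is only a generic illustration of graph regularization, not the paper's formulation. It is a minimal sketch, assuming a linear scorer over per-host features, squared loss on the labeled hosts, and a smoothness penalty that pulls the scores of linked hosts together. The function names, hyperparameters (`gamma`, `lam`, `lr`, `steps`), and the toy data are all hypothetical.

```python
from typing import Dict, List, Tuple


def predict(w: List[float], x: List[float]) -> float:
    """Linear score; positive means 'spam' in this toy convention."""
    return sum(wk * xk for wk, xk in zip(w, x))


def train_graph_regularized(
    features: List[List[float]],
    labels: Dict[int, float],      # node index -> +1 (spam) or -1 (non-spam)
    edges: List[Tuple[int, int]],  # hyperlinks between nodes
    gamma: float = 1.0,            # weight of the graph smoothness penalty (assumed)
    lam: float = 0.01,             # L2 penalty on the weights (assumed)
    lr: float = 0.05,              # gradient-descent step size (assumed)
    steps: int = 2000,
) -> List[float]:
    """Plain gradient descent on
    sum_labeled (f_i - y_i)^2 + gamma * sum_edges (f_i - f_j)^2 + lam * ||w||^2,
    where f_i = w . x_i.
    """
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(steps):
        scores = [predict(w, x) for x in features]
        grad = [2.0 * lam * wk for wk in w]
        # Fit term: squared error on the labeled nodes only.
        for i, y in labels.items():
            err = scores[i] - y
            for k in range(dim):
                grad[k] += 2.0 * err * features[i][k]
        # Graph term: linked nodes should receive similar scores.
        for i, j in edges:
            diff = scores[i] - scores[j]
            for k in range(dim):
                grad[k] += 2.0 * gamma * diff * (features[i][k] - features[j][k])
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data (hypothetical): one content feature plus a constant bias term.
features = [
    [1.0, 1.0],   # host 0, labeled spam
    [0.8, 1.0],   # host 1, unlabeled, linked to host 0
    [-1.0, 1.0],  # host 2, labeled non-spam
    [-0.7, 1.0],  # host 3, unlabeled, linked to host 2
]
labels = {0: 1.0, 2: -1.0}
edges = [(0, 1), (2, 3)]

w = train_graph_regularized(features, labels, edges)
for host in (1, 3):
    verdict = "spam" if predict(w, features[host]) > 0 else "non-spam"
    print(f"host {host}: {verdict}")
```

In this sketch the unlabeled hosts are scored both through their own features and through the penalty on their links, which is the general idea behind combining graph structure with content features.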

Original language: English
Pages (from-to): 207-225
Number of pages: 19
Journal: Machine Learning
Volume: 81
Issue number: 2
DOI: https://doi.org/10.1007/s10994-010-5171-1
Publication status: Published - 1 Nov 2010
Externally published: Yes

Keywords

  • Adversarial information retrieval
  • Graph regularization
  • Spam detection
  • Web spam

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software

Cite this

Abernethy, J., Chapelle, O., & Castillo, C. (2010). Graph regularization methods for web spam detection. Machine Learning, 81(2), 207-225. https://doi.org/10.1007/s10994-010-5171-1

Graph regularization methods for web spam detection. / Abernethy, Jacob; Chapelle, Olivier; Castillo, Carlos.

In: Machine Learning, Vol. 81, No. 2, 01.11.2010, p. 207-225.

Abernethy, J, Chapelle, O & Castillo, C 2010, 'Graph regularization methods for web spam detection', Machine Learning, vol. 81, no. 2, pp. 207-225. https://doi.org/10.1007/s10994-010-5171-1

Abernethy, Jacob; Chapelle, Olivier; Castillo, Carlos. / Graph regularization methods for web spam detection. In: Machine Learning. 2010; Vol. 81, No. 2. pp. 207-225.

@article{2b405485b26741c8867232510a5effc7,
title = "Graph regularization methods for web spam detection",
abstract = "We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.",
keywords = "Adversarial information retrieval, Graph regularization, Spam detection, Web spam",
author = "Jacob Abernethy and Olivier Chapelle and Carlos Castillo",
year = "2010",
month = "11",
day = "1",
doi = "10.1007/s10994-010-5171-1",
language = "English",
volume = "81",
pages = "207--225",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "2",
}

TY  - JOUR
T1  - Graph regularization methods for web spam detection
AU  - Abernethy, Jacob
AU  - Chapelle, Olivier
AU  - Castillo, Carlos
PY  - 2010/11/1
Y1  - 2010/11/1
N2  - We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.
AB  - We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.
KW  - Adversarial information retrieval
KW  - Graph regularization
KW  - Spam detection
KW  - Web spam
UR  - http://www.scopus.com/inward/record.url?scp=78049527603&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=78049527603&partnerID=8YFLogxK
U2  - 10.1007/s10994-010-5171-1
DO  - 10.1007/s10994-010-5171-1
M3  - Article
VL  - 81
SP  - 207
EP  - 225
JO  - Machine Learning
JF  - Machine Learning
SN  - 0885-6125
IS  - 2
ER  -