Query-log mining for detecting spam

Carlos Castillo, Claudio Corsi, Debora Donato, Paolo Ferragina, Aristides Gionis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

Every day millions of users search for information on the web via search engines, and provide implicit feedback on the results shown for their queries by clicking, or not clicking, on them. This feedback is encoded in the form of a query log that consists of a sequence of search actions, one per user query, each describing the following information: (i) terms composing a query, (ii) documents returned by the search engine, (iii) documents that have been clicked, (iv) the rank of those documents in the list of results, (v) date and time of the search action/click, (vi) an anonymous identifier for each session, and more. In this work, we investigate the idea of characterizing the documents and the queries belonging to a given query log with the goal of improving algorithms for detecting spam, both at the document level and at the query level.
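
As a rough illustration of the record layout listed in the abstract, the following Python sketch models one search action with fields (i) to (vi) and counts clicks per document. The class name, field names, helper function, and sample data are all hypothetical; the abstract does not specify a concrete format or which features the authors derive from the log.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SearchAction:
    """One query-log entry; all field names here are hypothetical."""
    session_id: str                 # (vi) anonymous session identifier
    timestamp: datetime             # (v) date and time of the search action
    terms: tuple[str, ...]          # (i) terms composing the query
    results: tuple[str, ...]        # (ii) documents returned, in rank order
    clicked: tuple[str, ...] = ()   # (iii) documents that were clicked

    def clicked_ranks(self) -> dict[str, int]:
        # (iv) 1-based rank of each clicked document in the result list
        return {doc: self.results.index(doc) + 1 for doc in self.clicked}


def clicks_per_document(log: list[SearchAction]) -> Counter:
    """Count how often each document was clicked across the whole log."""
    counts = Counter()
    for action in log:
        counts.update(action.clicked)
    return counts


# Made-up sample data, for illustration only.
log = [
    SearchAction("s1", datetime(2008, 4, 22, 9, 0), ("cheap", "flights"),
                 ("a.example", "b.example", "c.example"), ("b.example",)),
    SearchAction("s2", datetime(2008, 4, 22, 9, 5), ("flights",),
                 ("b.example", "d.example"), ("b.example", "d.example")),
]
print(clicks_per_document(log))
print(log[0].clicked_ranks())
```

Aggregates of this kind (clicks per document, rank of clicked results) are one plausible starting point for characterizing documents and queries; they are an assumption of this sketch, not the paper's stated method.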

Original language: English
Title of host publication: AIRWeb 2008 - Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web
Pages: 17-20
Number of pages: 4
DOIs: https://doi.org/10.1145/1451983.1451987
Publication status: Published - 1 Dec 2008
Externally published: Yes
Event: 4th International Workshop on Adversarial Information Retrieval on the Web, AIRWeb 2008 - Beijing, China
Duration: 22 Apr 2008 → 22 Apr 2008

Other

Other: 4th International Workshop on Adversarial Information Retrieval on the Web, AIRWeb 2008
Country: China
City: Beijing
Period: 22/4/08 → 22/4/08

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems

Cite this

Castillo, C., Corsi, C., Donato, D., Ferragina, P., & Gionis, A. (2008). Query-log mining for detecting spam. In AIRWeb 2008 - Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web (pp. 17-20). https://doi.org/10.1145/1451983.1451987