Endorsements and rebuttals in blog distillation

Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani, Fabrizio Silvestri

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

In this paper we test a new approach to blog distillation, the task in which, given a user query, the system ranks blogs in descending order of relevance to the query topic. Our approach adds a link analysis phase to the standard retrieval-by-topicality phase. However, unlike other link analysis methods, we check whether a given hyperlink is a citation of a positive or a negative nature, i.e., whether it expresses approval or disapproval of the hyperlinked page by the hyperlinking page. This allows us to test the hypothesis that distinguishing approval from disapproval benefits the blog distillation task. We tested our method on the Blogs08 collection used in the last two editions (2009 and 2010) of the TREC Blog Track, a collection consisting of more than one million blogs and more than 28 million blog posts. Unfortunately, the experimental results seem to disconfirm this hypothesis, owing to the low level of connectivity of the collection, which severely limits the impact of a link analysis phase (and, a fortiori, of the attempt to distinguish endorsements from rebuttals). Application contexts other than the blogosphere (such as the domain of eBay transactions) are probably better suited to such an approach.
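
The general idea in the abstract, a topical ranking adjusted by links that carry a positive or negative polarity, can be illustrated with a minimal sketch. This is not the authors' method: the one-step propagation rule, the function name `rerank`, the weight `alpha`, and the toy scores and links are all assumptions made for illustration only.

```python
from collections import defaultdict


def rerank(topical, links, alpha=0.5):
    """Rank blogs by topical score adjusted with signed in-links.

    topical: dict mapping blog id -> topical relevance score (hypothetical values).
    links:   iterable of (source, target, polarity), polarity +1 for an
             endorsement and -1 for a rebuttal (assumed to be given).
    alpha:   assumed weight of the link-based adjustment.
    """
    adjustment = defaultdict(float)
    for source, target, polarity in links:
        # Only links between retrieved blogs contribute to the adjustment.
        if source in topical and target in topical:
            adjustment[target] += polarity * topical[source]
    final = {
        blog: score + alpha * adjustment[blog]
        for blog, score in topical.items()
    }
    # Descending order of adjusted score.
    return sorted(final, key=final.get, reverse=True)


# Toy data, invented for this example.
topical = {"a": 0.9, "b": 0.8, "c": 0.7}
links = [("a", "c", +1), ("c", "b", -1)]
ranking = rerank(topical, links)
print(ranking)  # ['c', 'a', 'b']
```

With no links the ranking reduces to the topical ranking, which mirrors the observation in the abstract that a sparsely connected collection leaves little room for a link analysis phase to change the results.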

Original language: English
Pages (from-to): 38-47
Number of pages: 10
Journal: Information Sciences
Volume: 249
DOIs: https://doi.org/10.1016/j.ins.2013.05.037
Publication status: Published - 10 Nov 2013
Externally published: Yes

Keywords

  • Blog distillation
  • Blog search
  • Link analysis
  • Sentiment analysis

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management

Cite this

Berardi, G., Esuli, A., Sebastiani, F., & Silvestri, F. (2013). Endorsements and rebuttals in blog distillation. Information Sciences, 249, 38-47. https://doi.org/10.1016/j.ins.2013.05.037

@article{87c292c178fd461ea85392ef5b4b147c,
title = "Endorsements and rebuttals in blog distillation",
keywords = "Blog distillation, Blog search, Link analysis, Sentiment analysis",
author = "Giacomo Berardi and Andrea Esuli and Fabrizio Sebastiani and Fabrizio Silvestri",
year = "2013",
month = "11",
day = "10",
doi = "10.1016/j.ins.2013.05.037",
language = "English",
volume = "249",
pages = "38--47",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Endorsements and rebuttals in blog distillation

AU - Berardi, Giacomo

AU - Esuli, Andrea

AU - Sebastiani, Fabrizio

AU - Silvestri, Fabrizio

PY - 2013/11/10

Y1 - 2013/11/10

KW - Blog distillation

KW - Blog search

KW - Link analysis

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=84883179516&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883179516&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2013.05.037

DO - 10.1016/j.ins.2013.05.037

M3 - Article

AN - SCOPUS:84883179516

VL - 249

SP - 38

EP - 47

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

ER -