Using content-level structures for summarizing microblog repost trees

Jing Li, Wei Gao, Zhongyu Wei, Baolin Peng, Kam Fai Wong

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework that differentiates two kinds of messages on repost trees, called leaders and followers, which are derived from content-level structure information, i.e., the contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.
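
The ranking idea in the abstract can be illustrated with a generic random walk over a repost tree whose restart step is biased toward likely leaders. This is only a minimal sketch under stated assumptions, not the chapter's model: the function name rank_messages, the parents encoding of the tree, the damping value, and the leader_prob inputs are all hypothetical, and the leader probabilities are supplied by hand here where the chapter obtains them from a CRF.

```python
def rank_messages(parents, leader_prob, damping=0.85, iters=100):
    """Score messages with a leader-biased random walk on a repost tree.

    parents[i] is the index of the message that message i reposts
    (None for the root). leader_prob[i] is an assumed, externally
    supplied probability that message i is a leader; at least one
    entry must be positive.
    """
    n = len(parents)

    # Treat each reposting relation as an undirected edge.
    neighbours = [[] for _ in range(n)]
    for child, parent in enumerate(parents):
        if parent is not None:
            neighbours[child].append(parent)
            neighbours[parent].append(child)

    # Restart (teleport) distribution proportional to leader probability.
    total = sum(leader_prob)
    teleport = [p / total for p in leader_prob]

    scores = [1.0 / n] * n
    for _ in range(iters):
        # With probability (1 - damping) the walker restarts at a likely leader.
        new_scores = [(1.0 - damping) * t for t in teleport]
        for i, nbrs in enumerate(neighbours):
            if nbrs:
                # Otherwise it moves to a uniformly chosen neighbour.
                share = damping * scores[i] / len(nbrs)
                for j in nbrs:
                    new_scores[j] += share
            else:
                # A message with no edges hands its mass back via the restart.
                for j in range(n):
                    new_scores[j] += damping * scores[i] * teleport[j]
        scores = new_scores
    return scores


# Toy tree: message 0 is the root, 1 and 2 repost 0, 3 and 4 repost 1.
toy_parents = [None, 0, 0, 1, 1]
toy_leader_prob = [0.9, 0.8, 0.1, 0.1, 0.1]  # made-up values for illustration
toy_scores = rank_messages(toy_parents, toy_leader_prob)
ranking = sorted(range(len(toy_scores)), key=lambda i: toy_scores[i], reverse=True)
```

Messages at the top of ranking would be the candidates for a summary. The sketch uses only the tree structure and the leader probabilities; a content-aware variant would also weight edges by message similarity.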

Original language: English
Title of host publication: Social Media Content Analysis
Subtitle of host publication: Natural Language Processing and Beyond
Publisher: World Scientific Publishing Co. Pte Ltd
Pages: 255-275
Number of pages: 21
ISBN (Electronic): 9789813223615
ISBN (Print): 9789813223608
DOI: https://doi.org/10.1142/9789813223615_0018
Publication status: Published - 1 Jan 2017

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Li, J., Gao, W., Wei, Z., Peng, B., & Wong, K. F. (2017). Using content-level structures for summarizing microblog repost trees. In Social Media Content Analysis: Natural Language Processing and Beyond (pp. 255-275). World Scientific Publishing Co. Pte Ltd. https://doi.org/10.1142/9789813223615_0018

@inbook{47533c62bb8d4e70a2ff09a193433b4c,
title = "Using content-level structures for summarizing microblog repost trees",
author = "Jing Li and Wei Gao and Zhongyu Wei and Baolin Peng and Wong, {Kam Fai}",
year = "2017",
month = "1",
day = "1",
doi = "10.1142/9789813223615_0018",
language = "English",
isbn = "9789813223608",
pages = "255--275",
booktitle = "Social Media Content Analysis",
publisher = "World Scientific Publishing Co. Pte Ltd",
address = "Singapore",
}

TY - CHAP

T1 - Using content-level structures for summarizing microblog repost trees

AU - Li, Jing

AU - Gao, Wei

AU - Wei, Zhongyu

AU - Peng, Baolin

AU - Wong, Kam Fai

PY - 2017/1/1

Y1 - 2017/1/1

AB - A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework by effectively differentiating two kinds of messages on repost trees called leaders and followers, which are derived from content-level structure information, i.e., contents of messages and the reposting relations. To this end, Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection, and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.

UR - http://www.scopus.com/inward/record.url?scp=85041583795&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041583795&partnerID=8YFLogxK

U2 - 10.1142/9789813223615_0018

DO - 10.1142/9789813223615_0018

M3 - Chapter

SN - 9789813223608

SP - 255

EP - 275

BT - Social Media Content Analysis

PB - World Scientific Publishing Co. Pte Ltd

ER -