Using content-level structures for summarizing microblog repost trees

Jing Li, Wei Gao, Zhongyu Wei, Baolin Peng, Kam Fai Wong

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Citations (Scopus)

Abstract

A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework by effectively differentiating two kinds of messages on repost trees called leaders and followers, which are derived from content-level structure information, i.e., contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection, and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.
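The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a plain PageRank-style random walk over bag-of-words cosine similarities, and it assumes the leader probabilities (which the paper obtains from a CRF) are simply given as a list and used to bias the restart step. The names `cosine`, `rank_messages`, `leader_prob`, and `damping` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_messages(messages, leader_prob, damping=0.85, iters=100, tol=1e-12):
    """Score messages with a random walk whose restart favours likely leaders.

    Assumption: leader_prob[i] is an externally supplied probability that
    message i is a leader; this sketch does not train or run a CRF.
    """
    n = len(messages)
    if n == 0:
        return []
    bags = [Counter(m.lower().split()) for m in messages]

    # Row-normalised similarity matrix; a message with no similar
    # neighbour falls back to a uniform transition row.
    trans = []
    for i in range(n):
        row = [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)

    # Restart distribution proportional to the leader probabilities.
    z = sum(leader_prob)
    restart = [p / z for p in leader_prob] if z > 0 else [1.0 / n] * n

    score = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) * restart[j]
            + damping * sum(score[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(x - y) for x, y in zip(new, score))
        score = new
        if delta < tol:
            break
    return score
```

Calling `rank_messages(texts, probs)` returns one score per message, and the scores sum to 1; a summary could then be formed by taking the highest-scoring messages. Biasing only the restart step is a design shortcut chosen here for brevity and does not reproduce the adjustment steps described in the abstract.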

Original language: English
Title of host publication: Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics (ACL)
Pages: 2168-2178
Number of pages: 11
ISBN (Print): 9781941643327
Publication status: Published - 2015
Event: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal
Duration: 17 Sep 2015 to 21 Sep 2015

Other

Other: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Country: Portugal
City: Lisbon
Period: 17/9/15 to 21/9/15

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Cite this

Li, J., Gao, W., Wei, Z., Peng, B., & Wong, K. F. (2015). Using content-level structures for summarizing microblog repost trees. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 2168-2178). Association for Computational Linguistics (ACL).

Using content-level structures for summarizing microblog repost trees. / Li, Jing; Gao, Wei; Wei, Zhongyu; Peng, Baolin; Wong, Kam Fai.

Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL), 2015. p. 2168-2178.

Li, J, Gao, W, Wei, Z, Peng, B & Wong, KF 2015, Using content-level structures for summarizing microblog repost trees. in Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL), pp. 2168-2178, Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, 17/9/15.
Li J, Gao W, Wei Z, Peng B, Wong KF. Using content-level structures for summarizing microblog repost trees. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL). 2015. p. 2168-2178
Li, Jing ; Gao, Wei ; Wei, Zhongyu ; Peng, Baolin ; Wong, Kam Fai. / Using content-level structures for summarizing microblog repost trees. Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (ACL), 2015. pp. 2168-2178
@inproceedings{ddacc943284d41cbb8a11f59a75b2093,
title = "Using content-level structures for summarizing microblog repost trees",
abstract = "A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework by effectively differentiating two kinds of messages on repost trees called leaders and followers, which are derived from content-level structure information, i.e., contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection, and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.",
author = "Jing Li and Wei Gao and Zhongyu Wei and Baolin Peng and Wong, {Kam Fai}",
year = "2015",
language = "English",
isbn = "9781941643327",
pages = "2168--2178",
booktitle = "Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN

T1 - Using content-level structures for summarizing microblog repost trees

AU - Li, Jing

AU - Gao, Wei

AU - Wei, Zhongyu

AU - Peng, Baolin

AU - Wong, Kam Fai

PY - 2015

Y1 - 2015

N2 - A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework by effectively differentiating two kinds of messages on repost trees called leaders and followers, which are derived from content-level structure information, i.e., contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection, and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.

UR - http://www.scopus.com/inward/record.url?scp=84959895371&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959895371&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84959895371

SN - 9781941643327

SP - 2168

EP - 2178

BT - Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing

PB - Association for Computational Linguistics (ACL)

ER -