Joint topic modeling for event summarization across news and social media streams

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Social media streams such as Twitter are regarded as faster, first-hand sources of information generated by massive numbers of users. The content diffused through this channel, although noisy, provides an important complement to, and sometimes even a substitute for, traditional news media reporting. In this chapter, we describe a novel unsupervised approach based on topic modeling that summarizes trending subjects by jointly discovering representative and complementary information from news and tweets. Our method captures content that enriches the subject matter by reinforcing the identification of complementary sentence-tweet pairs. To evaluate the complementarity of a pair, we leverage the topic modeling formalism, combining a two-dimensional topic-aspect model with a cross-collection approach from the multi-document summarization literature. The final summaries are generated by co-ranking news sentences and tweets on both sides simultaneously. Experiments give promising results compared with state-of-the-art baselines.

Original language: English
Title of host publication: Social Media Content Analysis
Subtitle of host publication: Natural Language Processing and Beyond
Publisher: World Scientific Publishing Co. Pte Ltd
Pages: 321-346
Number of pages: 26
ISBN (Electronic): 9789813223615
ISBN (Print): 9789813223608
DOI: https://doi.org/10.1142/9789813223615_0022
Publication status: Published - 1 Jan 2017

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Gao, W., Li, P., & Darwish, K. (2017). Joint topic modeling for event summarization across news and social media streams. In Social Media Content Analysis: Natural Language Processing and Beyond (pp. 321-346). World Scientific Publishing Co. Pte Ltd. https://doi.org/10.1142/9789813223615_0022

@inbook{2e695b93662044a7bebe85aa3a8bc10e,
title = "Joint topic modeling for event summarization across news and social media streams",
author = "Wei Gao and Peng Li and Kareem Darwish",
year = "2017",
month = "1",
day = "1",
doi = "10.1142/9789813223615_0022",
language = "English",
isbn = "9789813223608",
pages = "321--346",
booktitle = "Social Media Content Analysis",
publisher = "World Scientific Publishing Co. Pte Ltd",
address = "Singapore",
}

TY  - CHAP
T1  - Joint topic modeling for event summarization across news and social media streams
AU  - Gao, Wei
AU  - Li, Peng
AU  - Darwish, Kareem
PY  - 2017/1/1
Y1  - 2017/1/1
UR  - http://www.scopus.com/inward/record.url?scp=85041598289&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85041598289&partnerID=8YFLogxK
U2  - 10.1142/9789813223615_0022
DO  - 10.1142/9789813223615_0022
M3  - Chapter
SN  - 9789813223608
SP  - 321
EP  - 346
BT  - Social Media Content Analysis
PB  - World Scientific Publishing Co. Pte Ltd
ER  -