Social media streams such as Twitter are regarded as faster, first-hand sources of information generated by a massive number of users. The content diffused through this channel, although noisy, provides an important complement, and sometimes even a substitute, to traditional news media reporting. In this chapter, we describe a novel unsupervised approach based on topic modeling to summarize trending subjects by jointly discovering representative and complementary information from news and tweets. Our method captures content that enriches the subject matter by reinforcing the identification of complementary sentence-tweet pairs. To evaluate the complementarity of a pair, we leverage the topic modeling formalism by combining a two-dimensional topic-aspect model with a cross-collection approach from the multi-document summarization literature. The final summaries are generated by co-ranking the news sentences and tweets on both sides simultaneously. Experiments give promising results compared with state-of-the-art baselines.
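The co-ranking step can be illustrated with a minimal sketch. This is not the chapter's actual algorithm: it assumes that complementarity scores for every sentence-tweet pair are already available as a non-negative matrix (in the chapter these would come from the topic model), and it swaps in a generic HITS-style mutual reinforcement update. The function name `co_rank` and the parameters `pair_scores` and `iterations` are hypothetical.

```python
def co_rank(pair_scores, iterations=50):
    """Jointly rank news sentences and tweets by mutual reinforcement.

    pair_scores[i][j] is an ASSUMED non-negative complementarity score
    between news sentence i and tweet j; how it is computed is outside
    the scope of this sketch.
    """
    n_sent = len(pair_scores)
    n_tweet = len(pair_scores[0])

    # Start from uniform scores on both sides.
    sent = [1.0 / n_sent] * n_sent
    tweet = [1.0 / n_tweet] * n_tweet

    for _ in range(iterations):
        # A sentence scores highly if it pairs well with high-scoring tweets.
        new_sent = [
            sum(pair_scores[i][j] * tweet[j] for j in range(n_tweet))
            for i in range(n_sent)
        ]
        total = sum(new_sent) or 1.0  # guard against an all-zero matrix
        sent = [s / total for s in new_sent]

        # A tweet scores highly if it pairs well with high-scoring sentences.
        new_tweet = [
            sum(pair_scores[i][j] * sent[i] for i in range(n_sent))
            for j in range(n_tweet)
        ]
        total = sum(new_tweet) or 1.0
        tweet = [t / total for t in new_tweet]

    return sent, tweet
```

Under these assumptions, the top-scoring sentences and tweets on each side would be selected for the two summaries; a practical system would also need a redundancy control, which this sketch omits.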
|Title of host publication|Social Media Content Analysis|
|Subtitle of host publication|Natural Language Processing and Beyond|
|Publisher|World Scientific Publishing Co. Pte Ltd|
|Number of pages|26|
|Publication status|Published - 1 Jan 2017|
ASJC Scopus subject areas
- Computer Science(all)