Topic extraction from microblog posts using conversation structures

Jing Li, Ming Liao, Wei Gao, Yulan He, Kam Fai Wong

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Conventional topic models are ineffective for topic extraction from microblog messages because the lack of structure and context among the posts yields poor message-level word co-occurrence patterns. In this work, we organize microblog posts as conversation trees based on reposting and replying relations, which enriches context information and alleviates data sparseness. Our model generates words according to topic dependencies derived from the conversation structures. Specifically, we distinguish leader messages, which initiate key aspects of previously focused topics or shift the focus to different topics, from follower messages, which do not introduce any new information but simply echo topics from the messages that they repost or reply to. Our model captures the different extents to which leader and follower messages may contain the key topical words, and thus further enhances the quality of the induced topics. The results of thorough experiments demonstrate the effectiveness of our proposed model.
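
As a rough illustration of the first step described in the abstract, the sketch below groups posts into conversation trees using reply/repost links and then gathers the words of each tree into one context. It is not the chapter's model: the field names `id`, `parent_id`, and `text`, both function names, and the whitespace tokenization are assumptions made for this example, and pooling words per tree is only one simple way to enrich context.

```python
from collections import defaultdict


def build_conversation_trees(posts):
    """Return (roots, children) for a list of posts.

    Each post is a dict with hypothetical fields "id", "parent_id"
    (None for an original post), and "text". A post whose parent is
    missing from the data is treated as the root of its own tree.
    """
    known_ids = {p["id"] for p in posts}
    children = defaultdict(list)
    roots = []
    for p in posts:
        parent = p["parent_id"]
        if parent is None or parent not in known_ids:
            roots.append(p["id"])
        else:
            children[parent].append(p["id"])
    return roots, dict(children)


def conversation_words(root_id, children, by_id):
    """Collect the words of every message in one tree, depth-first."""
    words, stack = [], [root_id]
    while stack:
        node = stack.pop()
        # Naive whitespace tokenization, assumed for illustration only.
        words.extend(by_id[node]["text"].lower().split())
        # Reverse so that children are visited in their original order.
        stack.extend(reversed(children.get(node, [])))
    return words


posts = [
    {"id": 1, "parent_id": None, "text": "New phone launch"},
    {"id": 2, "parent_id": 1, "text": "Nice camera"},
    {"id": 3, "parent_id": 2, "text": "Agreed"},
    {"id": 4, "parent_id": None, "text": "Rain today"},
]
by_id = {p["id"]: p for p in posts}
roots, children = build_conversation_trees(posts)
pooled = {r: conversation_words(r, children, by_id) for r in roots}
```

Here the short reply "Agreed" shares a context with the two messages above it in the tree, so it contributes to richer co-occurrence statistics than it would as a standalone one-word document.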

Original language: English
Title of host publication: Social Media Content Analysis
Subtitle of host publication: Natural Language Processing and Beyond
Publisher: World Scientific Publishing Co. Pte Ltd
Pages: 419-437
Number of pages: 19
ISBN (Electronic): 9789813223615
ISBN (Print): 9789813223608
DOIs: https://doi.org/10.1142/9789813223615_0026
Publication status: Published - 1 Jan 2017

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Li, J., Liao, M., Gao, W., He, Y., & Wong, K. F. (2017). Topic extraction from microblog posts using conversation structures. In Social Media Content Analysis: Natural Language Processing and Beyond (pp. 419-437). World Scientific Publishing Co. Pte Ltd. https://doi.org/10.1142/9789813223615_0026

@inbook{8c09f546a1414279bc363242d470c2e6,
title = "Topic extraction from microblog posts using conversation structures",
abstract = "Conventional topic models are ineffective for topic extraction from microblog messages because the lack of structure and context among the posts yields poor message-level word co-occurrence patterns. In this work, we organize microblog posts as conversation trees based on reposting and replying relations, which enriches context information and alleviates data sparseness. Our model generates words according to topic dependencies derived from the conversation structures. Specifically, we distinguish leader messages, which initiate key aspects of previously focused topics or shift the focus to different topics, from follower messages, which do not introduce any new information but simply echo topics from the messages that they repost or reply to. Our model captures the different extents to which leader and follower messages may contain the key topical words, and thus further enhances the quality of the induced topics. The results of thorough experiments demonstrate the effectiveness of our proposed model.",
author = "Jing Li and Ming Liao and Wei Gao and Yulan He and Wong, {Kam Fai}",
year = "2017",
month = "1",
day = "1",
doi = "10.1142/9789813223615_0026",
language = "English",
isbn = "9789813223608",
pages = "419--437",
booktitle = "Social Media Content Analysis",
publisher = "World Scientific Publishing Co. Pte Ltd",
address = "Singapore",

}

TY - CHAP

T1 - Topic extraction from microblog posts using conversation structures

AU - Li, Jing

AU - Liao, Ming

AU - Gao, Wei

AU - He, Yulan

AU - Wong, Kam Fai

PY - 2017/1/1

Y1 - 2017/1/1

AB - Conventional topic models are ineffective for topic extraction from microblog messages because the lack of structure and context among the posts yields poor message-level word co-occurrence patterns. In this work, we organize microblog posts as conversation trees based on reposting and replying relations, which enriches context information and alleviates data sparseness. Our model generates words according to topic dependencies derived from the conversation structures. Specifically, we distinguish leader messages, which initiate key aspects of previously focused topics or shift the focus to different topics, from follower messages, which do not introduce any new information but simply echo topics from the messages that they repost or reply to. Our model captures the different extents to which leader and follower messages may contain the key topical words, and thus further enhances the quality of the induced topics. The results of thorough experiments demonstrate the effectiveness of our proposed model.

UR - http://www.scopus.com/inward/record.url?scp=85041565783&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041565783&partnerID=8YFLogxK

U2 - 10.1142/9789813223615_0026

DO - 10.1142/9789813223615_0026

M3 - Chapter

AN - SCOPUS:85041565783

SN - 9789813223608

SP - 419

EP - 437

BT - Social Media Content Analysis

PB - World Scientific Publishing Co. Pte Ltd

ER -