Unsupervised adaptive microblog filtering for broad dynamic topics

Walid Magdy, Tamer Elsayed

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Information filtering has long been a major task in the field of information retrieval (IR), traditionally focusing on well-formed documents such as news articles. Recently, more interest has been directed towards applying filtering to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering is essential for many applications, such as social studies on large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of a broad and dynamic nature. We propose an entirely unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved an 84% increase in recall relative to the baseline approach, while maintaining acceptable precision, which dropped by about 8%. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the tweets relevant to different regions in the Middle East, presenting them as comprehensive reports that include the top stories and news in each region.

Original language: English
Journal: Information Processing and Management
Publication status: Accepted/In press - 28 Mar 2015

Keywords

  • Arabic tweets
  • Broad dynamic topics
  • Microblog filtering
  • Twitter
  • Unsupervised adaptive filtering

ASJC Scopus subject areas

  • Media Technology
  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
  • Management Science and Operations Research
