Unsupervised adaptive microblog filtering for broad dynamic topics

Walid Magdy, Tamer Elsayed

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Information filtering has been a major task of study in the field of information retrieval (IR) for a long time, focusing on filtering well-formed documents such as news articles. Recently, more interest has been directed towards applying filtering tasks to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering in microblogs is essential for many applications such as social studies on large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of a broad and dynamic nature. We propose an entirely unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved an 84% increase in recall relative to the baseline approach, while maintaining an acceptable precision that showed a drop of about 8%. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the tweets relevant to different regions in the Middle East, presenting them in the form of comprehensive reports that include top stories and news in each region.

Original language: English
Journal: Information Processing and Management
DOI: 10.1016/j.ipm.2015.11.004
Publication status: Accepted/In press - 28 Mar 2015

Fingerprint

Adaptive filtering
news
Information filtering
Information retrieval
Websites
social studies
Middle East
scenario
event
time

Keywords

  • Arabic tweets
  • Broad dynamic topics
  • Microblog filtering
  • Twitter
  • Unsupervised adaptive filtering

ASJC Scopus subject areas

  • Media Technology
  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
  • Management Science and Operations Research

Cite this

Unsupervised adaptive microblog filtering for broad dynamic topics. / Magdy, Walid; Elsayed, Tamer.

In: Information Processing and Management, 28.03.2015.

@article{f221b7a6ea544e05ab610582ce5ec535,
title = "Unsupervised adaptive microblog filtering for broad dynamic topics",
abstract = "Information filtering has been a major task of study in the field of information retrieval (IR) for a long time, focusing on filtering well-formed documents such as news articles. Recently, more interest was directed towards applying filtering tasks to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering in microblogs is essential for many applications such as social studies on large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of broad and dynamic nature. We propose an entirely-unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved 84{\%} increase in recall relative to the baseline approach, while maintaining an acceptable precision that showed a drop of about 8{\%}. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the relevant tweets to different regions in the Middle East to be presented in the form of comprehensive reports that include top stories and news in each region.",
keywords = "Arabic tweets, Broad dynamic topics, Microblog filtering, Twitter, Unsupervised adaptive filtering",
author = "Walid Magdy and Tamer Elsayed",
year = "2015",
month = "3",
day = "28",
doi = "10.1016/j.ipm.2015.11.004",
language = "English",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Unsupervised adaptive microblog filtering for broad dynamic topics

AU - Magdy, Walid

AU - Elsayed, Tamer

PY - 2015/3/28

Y1 - 2015/3/28

N2 - Information filtering has been a major task of study in the field of information retrieval (IR) for a long time, focusing on filtering well-formed documents such as news articles. Recently, more interest was directed towards applying filtering tasks to user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs targets the detection of posts that are relevant to long-standing broad and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering in microblogs is essential for many applications such as social studies on large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of broad and dynamic nature. We propose an entirely-unsupervised approach that adapts to new aspects of the topic to retrieve relevant microblogs. We evaluated our filtering approach using 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved 84% increase in recall relative to the baseline approach, while maintaining an acceptable precision that showed a drop of about 8%. Our filtering method is currently implemented on TweetMogaz, a news portal generated from tweets. The website compiles the stream of Arabic tweets and detects the relevant tweets to different regions in the Middle East to be presented in the form of comprehensive reports that include top stories and news in each region.

KW - Arabic tweets

KW - Broad dynamic topics

KW - Microblog filtering

KW - Twitter

KW - Unsupervised adaptive filtering

UR - http://www.scopus.com/inward/record.url?scp=84952886552&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84952886552&partnerID=8YFLogxK

U2 - 10.1016/j.ipm.2015.11.004

DO - 10.1016/j.ipm.2015.11.004

M3 - Article

JO - Information Processing and Management

JF - Information Processing and Management

SN - 0306-4573

ER -