Microblogs data management: a survey

Amr Magdy, Laila Abdelhafeez, Yunfan Kang, Eric Ong, Mohamed F. Mokbel

Research output: Contribution to journal (Article)


Microblogs data is the micro-length user-generated data posted on the web, e.g., tweets, online reviews, and comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Nowadays, microblogs applications span a wide spectrum of interests, including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been devoted to managing, analyzing, and visualizing microblogs to support different applications. This paper gives a comprehensive review of major research and system work in microblogs data management. The paper reviews core components that enable large-scale querying and indexing for microblogs data. A dedicated part focuses on system-level issues and ongoing efforts to support microblogs through the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout the different parts, we highlight the challenges, innovations, and future opportunities in microblogs data research.

Original language: English
Journal: VLDB Journal
Publication status: Accepted/In press - 1 Jan 2019



  • Aggregation
  • Classification
  • Clustering
  • Data analysis
  • Data management
  • Event
  • Event analysis
  • Event detection
  • Flushing policy
  • Geo
  • Geotagging
  • Graph
  • Indexing
  • Keyword
  • Main-memory
  • Memory management
  • Microblogs
  • Probabilistic models
  • Query processing
  • Ranking
  • Recommendation
  • Sampling
  • Social media
  • Spatial
  • Statistical
  • Summarization
  • Systems
  • Temporal
  • Textual
  • Top-k
  • Twitter
  • User
  • Visual analysis

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
