Microblogs data management: a survey

Amr Magdy, Laila Abdelhafeez, Yunfan Kang, Eric Ong, Mohamed F. Mokbel

Research output: Contribution to journal (Article)

Abstract

Microblogs data is micro-length user-generated data posted on the web, e.g., tweets, online reviews, and comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Microblog applications now span a wide spectrum of interests, including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been devoted to managing, analyzing, and visualizing microblogs to support these applications. This paper gives a comprehensive review of major research and system work in microblogs data management. It reviews the core components that enable large-scale querying and indexing of microblogs data. A dedicated part focuses on system-level issues and ongoing efforts to support microblogs in the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout, we highlight the challenges, innovations, and future opportunities in microblogs data research.

Original language: English
Journal: VLDB Journal
DOIs: 10.1007/s00778-019-00569-6
Publication status: Accepted/In press - 1 Jan 2019

Fingerprint

Information management
Public health
Marketing
Visualization
Innovation

Keywords

  • Aggregation
  • Classification
  • Clustering
  • Data analysis
  • Data management
  • Event
  • Event analysis
  • Event detection
  • Flushing policy
  • Geo
  • Geotagging
  • Graph
  • Indexing
  • Keyword
  • Main-memory
  • Memory management
  • Microblogs
  • Probabilistic models
  • Query processing
  • Ranking
  • Recommendation
  • Sampling
  • Social media
  • Spatial
  • Statistical
  • Summarization
  • Systems
  • Temporal
  • Textual
  • Top-k
  • Twitter
  • User
  • Visual analysis

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture

Cite this

Microblogs data management: a survey. / Magdy, Amr; Abdelhafeez, Laila; Kang, Yunfan; Ong, Eric; Mokbel, Mohamed F.

In: VLDB Journal, 01.01.2019.

Magdy, Amr; Abdelhafeez, Laila; Kang, Yunfan; Ong, Eric; Mokbel, Mohamed F. / Microblogs data management: a survey. In: VLDB Journal. 2019.
@article{074ef0419b9740c3bdbde545a671228a,
title = "Microblogs data management: a survey",
abstract = "Microblogs data is the microlength user-generated data that is posted on the web, e.g., tweets, online reviews, comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Nowadays, microblogs applications span a wide spectrum of interests including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been spent to manage, analyze, and visualize microblogs to support different applications. This paper gives a comprehensive review of major research and system work in microblogs data management. The paper reviews core components that enable large-scale querying and indexing for microblogs data. A dedicated part gives particular focus for discussing system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout the different parts, we highlight the challenges, innovations, and future opportunities in microblogs data research.",
keywords = "Aggregation, Classification, Clustering, Data analysis, Data management, Event, Event analysis, Event detection, Flushing policy, Geo, Geotagging, Graph, Indexing, Keyword, Main-memory, Memory management, Microblogs, Probabilistic models, Query processing, Ranking, Recommendation, Sampling, Social media, Spatial, Statistical, Summarization, Systems, Temporal, Textual, Top-k, Twitter, User, Visual analysis",
author = "Amr Magdy and Laila Abdelhafeez and Yunfan Kang and Eric Ong and Mokbel, {Mohamed F.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s00778-019-00569-6",
language = "English",
journal = "VLDB Journal",
issn = "1066-8888",
publisher = "Springer New York",

}

TY - JOUR

T1 - Microblogs data management

T2 - a survey

AU - Magdy, Amr

AU - Abdelhafeez, Laila

AU - Kang, Yunfan

AU - Ong, Eric

AU - Mokbel, Mohamed F.

PY - 2019/1/1

Y1 - 2019/1/1

AB - Microblogs data is the microlength user-generated data that is posted on the web, e.g., tweets, online reviews, comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Nowadays, microblogs applications span a wide spectrum of interests including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been spent to manage, analyze, and visualize microblogs to support different applications. This paper gives a comprehensive review of major research and system work in microblogs data management. The paper reviews core components that enable large-scale querying and indexing for microblogs data. A dedicated part gives particular focus for discussing system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout the different parts, we highlight the challenges, innovations, and future opportunities in microblogs data research.

KW - Aggregation

KW - Classification

KW - Clustering

KW - Data analysis

KW - Data management

KW - Event

KW - Event analysis

KW - Event detection

KW - Flushing policy

KW - Geo

KW - Geotagging

KW - Graph

KW - Indexing

KW - Keyword

KW - Main-memory

KW - Memory management

KW - Microblogs

KW - Probabilistic models

KW - Query processing

KW - Ranking

KW - Recommendation

KW - Sampling

KW - Social media

KW - Spatial

KW - Statistical

KW - Summarization

KW - Systems

KW - Temporal

KW - Textual

KW - Top-k

KW - Twitter

KW - User

KW - Visual analysis

UR - http://www.scopus.com/inward/record.url?scp=85074010168&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074010168&partnerID=8YFLogxK

U2 - 10.1007/s00778-019-00569-6

DO - 10.1007/s00778-019-00569-6

M3 - Article

AN - SCOPUS:85074010168

JO - VLDB Journal

JF - VLDB Journal

SN - 1066-8888

ER -