Venus: Scalable Real-Time Spatial Queries on Microblogs with Adaptive Load Shedding

Amr Magdy, Mohamed Mokbel, Sameh Elnikety, Suman Nath, Yuxiong He

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

Microblogging services have become among the most popular services on the web in the last few years. This has led to a significant increase in data size, speed, and applications. This paper presents Venus, a system that supports real-time spatial queries on microblogs. Venus supports its queries on a spatial boundary R and a temporal boundary T, from which only the top-k microblogs are returned in the query answer based on a spatio-temporal ranking function. Supporting such queries requires Venus to digest hundreds of millions of real-time microblogs in main memory at high rates while still providing low query response times and efficient memory utilization. To this end, Venus employs: (1) an efficient in-memory spatio-temporal index that digests high rates of incoming microblogs in real time, (2) a scalable query processor that prunes the search space, R and T, effectively to provide low query latency on millions of items in real time, and (3) a group of memory optimization techniques that provide system administrators with different options to save significant memory resources while keeping the query accuracy almost perfect. Venus's memory optimization techniques make use of the local arrival rates of microblogs to smartly shed microblogs that are old enough not to contribute to any query answer. In addition, Venus can adaptively, in real time, adjust its load shedding based on both the spatial distribution and the parameters of incoming query loads. All Venus components can accommodate different spatial and temporal ranking functions that are able to capture the importance of each dimension differently depending on the application's requirements. Extensive experimental results based on real Twitter data and actual locations of Bing search queries show that Venus supports high arrival rates of up to 64K microblogs/second and an average query latency of 4 msec.
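
The query semantics described in the abstract can be illustrated with a minimal brute-force sketch. It is not the Venus index or query processor: it assumes that R is a circle of a given radius around the query point, that T is a time window ending at the query time, and that the ranking function is a weighted sum of normalized distance and age. The abstract does not specify the ranking function, so the names `Microblog`, `score`, `top_k`, and the weight `alpha` are all hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Microblog:
    text: str
    x: float          # location, in hypothetical planar units
    y: float
    timestamp: float  # arrival time in seconds


def score(m, qx, qy, now, radius, window, alpha=0.5):
    """Hypothetical ranking: weighted sum of normalized distance and age.

    Lower is better. alpha weights the spatial dimension against the
    temporal one. This formula is an assumption, not the paper's.
    """
    spatial = math.hypot(m.x - qx, m.y - qy) / radius
    temporal = (now - m.timestamp) / window
    return alpha * spatial + (1 - alpha) * temporal


def top_k(microblogs, qx, qy, now, radius, window, k, alpha=0.5):
    """Return the k best-ranked microblogs inside boundaries R and T."""
    candidates = [
        m for m in microblogs
        if math.hypot(m.x - qx, m.y - qy) <= radius  # spatial boundary R
        and 0 <= now - m.timestamp <= window         # temporal boundary T
    ]
    return heapq.nsmallest(
        k, candidates,
        key=lambda m: score(m, qx, qy, now, radius, window, alpha),
    )
```

A linear scan like this touches every stored microblog on every query. The index, search-space pruning, and load shedding named in the abstract exist to avoid that cost at high arrival rates.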

Original language: English
Article number: 7303957
Pages (from-to): 356-370
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 28
Issue number: 2
DOI: 10.1109/TKDE.2015.2493531
Publication status: Published - 1 Feb 2016
Externally published: Yes

Fingerprint

  • Data storage equipment
  • Spatial distribution
  • Computer systems

Keywords

  • Efficiency
  • Location
  • Memory Optimization
  • Microblogs
  • Performance
  • Scalability
  • Social
  • Spatial
  • Temporal

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

Venus: Scalable Real-Time Spatial Queries on Microblogs with Adaptive Load Shedding. / Magdy, Amr; Mokbel, Mohamed; Elnikety, Sameh; Nath, Suman; He, Yuxiong.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 28, No. 2, 7303957, 01.02.2016, p. 356-370.

Magdy, Amr; Mokbel, Mohamed; Elnikety, Sameh; Nath, Suman; He, Yuxiong. / Venus: Scalable Real-Time Spatial Queries on Microblogs with Adaptive Load Shedding. In: IEEE Transactions on Knowledge and Data Engineering. 2016; Vol. 28, No. 2. pp. 356-370.
@article{2f7062e2752641fd974031c1be5d0675,
title = "Venus: Scalable Real-Time Spatial Queries on Microblogs with Adaptive Load Shedding",
keywords = "Efficiency, Location, Memory Optimization, Microblogs, Performance, Scalability, Social, Spatial, Temporal",
author = "Amr Magdy and Mohamed Mokbel and Sameh Elnikety and Suman Nath and Yuxiong He",
year = "2016",
month = "2",
day = "1",
doi = "10.1109/TKDE.2015.2493531",
language = "English",
volume = "28",
pages = "356--370",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "2",

}

TY - JOUR
T1 - Venus
T2 - Scalable Real-Time Spatial Queries on Microblogs with Adaptive Load Shedding
AU - Magdy, Amr
AU - Mokbel, Mohamed
AU - Elnikety, Sameh
AU - Nath, Suman
AU - He, Yuxiong
PY - 2016/2/1
Y1 - 2016/2/1
KW - Efficiency
KW - Location
KW - Memory Optimization
KW - Microblogs
KW - Performance
KW - Scalability
KW - Social
KW - Spatial
KW - Temporal
UR - http://www.scopus.com/inward/record.url?scp=84962385770&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962385770&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2015.2493531
DO - 10.1109/TKDE.2015.2493531
M3 - Article
AN - SCOPUS:84962385770
VL - 28
SP - 356
EP - 370
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
IS - 2
M1 - 7303957
ER -