Mars

Real-time spatio-temporal queries on microblogs

Amr Magdy, Ahmed M. Aly, Mohamed Mokbel, Sameh Elnikety, Yuxiong He, Suman Nath

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

The Mars demonstration exploits the location information of microblogs to support a wide variety of important spatio-temporal queries, including range, nearest-neighbor, and aggregate queries. Mars works in a challenging environment where streams of microblogs arrive at high rates. Mars distinguishes itself with three novel contributions: (1) Efficient in-memory digestion/expiration techniques that can handle arrival rates of up to 64,000 microblogs/sec. This also includes highly accurate and efficient hopping-window-based aggregation of incoming microblog keywords. (2) Smart memory optimization and load shedding techniques that adjust in-memory contents based on the expected query load, trading significant storage savings for a slight and bounded accuracy loss. (3) Scalable real-time query processing that exploits the Zipf distribution of microblog data for efficient top-k aggregate query processing. In addition, Mars employs a scalable real-time nearest-neighbor and range query processing module that uses various pruning techniques to serve heavy query workloads in real time. Mars is demonstrated using a stream of real tweets obtained from the Twitter firehose with a production query workload obtained from Bing web search. We show that Mars serves incoming queries with an average latency of less than 4 msec and with 99% answer accuracy while saving up to 70% of storage overhead for different query loads.
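
To make the hopping-window keyword aggregation and top-k query mentioned in the abstract more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the data structure used in Mars: the class name `HoppingWindowCounter`, its parameters, and the pane-based layout are all hypothetical, and the sketch ignores the spatial dimension and load shedding entirely.

```python
from collections import Counter, deque


class HoppingWindowCounter:
    """Keyword counts over a hopping window built from fixed-size panes.

    Hypothetical illustration only; not the structure described in the paper.
    """

    def __init__(self, window_sec: int, hop_sec: int) -> None:
        if window_sec % hop_sec != 0:
            raise ValueError("window_sec must be a multiple of hop_sec")
        self.hop_sec = hop_sec
        self.num_panes = window_sec // hop_sec
        self.panes = deque()  # (pane_id, Counter) pairs, oldest first

    def _expire(self, current_pane: int) -> None:
        # Drop whole panes that fell out of the window ending at current_pane.
        oldest_allowed = current_pane - self.num_panes + 1
        while self.panes and self.panes[0][0] < oldest_allowed:
            self.panes.popleft()

    def add(self, timestamp: float, keywords: list[str]) -> None:
        """Digest one microblog; timestamps are assumed to be non-decreasing."""
        pane_id = int(timestamp // self.hop_sec)
        if not self.panes or self.panes[-1][0] != pane_id:
            self.panes.append((pane_id, Counter()))
        self.panes[-1][1].update(keywords)
        self._expire(pane_id)

    def top_k(self, k: int, now: float) -> list[tuple[str, int]]:
        """Return the k most frequent keywords in the window ending at `now`."""
        self._expire(int(now // self.hop_sec))
        total = Counter()
        for _, counts in self.panes:
            total.update(counts)
        return total.most_common(k)


if __name__ == "__main__":
    window = HoppingWindowCounter(window_sec=60, hop_sec=20)
    window.add(0, ["a", "b"])
    window.add(25, ["a"])
    window.add(45, ["a", "c"])
    print(window.top_k(1, now=45))
```

Expiring whole panes instead of individual microblogs keeps expiration cheap, at the cost of a window boundary that moves in steps of one hop.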

Original language: English
Title of host publication: 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014
Publisher: IEEE Computer Society
Pages: 1238-1241
Number of pages: 4
ISBN (Print): 9781479925544
DOIs: https://doi.org/10.1109/ICDE.2014.6816750
Publication status: Published - 1 Jan 2014
Externally published: Yes
Event: 30th IEEE International Conference on Data Engineering, ICDE 2014 - Chicago, IL, United States
Duration: 31 Mar 2014 to 4 Apr 2014

Other

Other: 30th IEEE International Conference on Data Engineering, ICDE 2014
Country: United States
City: Chicago, IL
Period: 31/3/14 to 4/4/14

Fingerprint

Query processing
Data storage equipment
Demonstrations
Agglomeration

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Magdy, A., Aly, A. M., Mokbel, M., Elnikety, S., He, Y., & Nath, S. (2014). Mars: Real-time spatio-temporal queries on microblogs. In 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014 (pp. 1238-1241). [6816750] IEEE Computer Society. https://doi.org/10.1109/ICDE.2014.6816750

Mars: Real-time spatio-temporal queries on microblogs. / Magdy, Amr; Aly, Ahmed M.; Mokbel, Mohamed; Elnikety, Sameh; He, Yuxiong; Nath, Suman.

2014 IEEE 30th International Conference on Data Engineering, ICDE 2014. IEEE Computer Society, 2014. p. 1238-1241 6816750.

Magdy, A, Aly, AM, Mokbel, M, Elnikety, S, He, Y & Nath, S 2014, Mars: Real-time spatio-temporal queries on microblogs. in 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014., 6816750, IEEE Computer Society, pp. 1238-1241, 30th IEEE International Conference on Data Engineering, ICDE 2014, Chicago, IL, United States, 31/3/14. https://doi.org/10.1109/ICDE.2014.6816750
Magdy A, Aly AM, Mokbel M, Elnikety S, He Y, Nath S. Mars: Real-time spatio-temporal queries on microblogs. In 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014. IEEE Computer Society. 2014. p. 1238-1241. 6816750 https://doi.org/10.1109/ICDE.2014.6816750
Magdy, Amr; Aly, Ahmed M.; Mokbel, Mohamed; Elnikety, Sameh; He, Yuxiong; Nath, Suman. / Mars: Real-time spatio-temporal queries on microblogs. 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014. IEEE Computer Society, 2014. pp. 1238-1241
@inproceedings{122db7c3253a4117ba142872469c3bd3,
title = "Mars: Real-time spatio-temporal queries on microblogs",
abstract = "Mars demonstration exploits the microblogs location information to support a wide variety of important spatio-temporal queries on microblogs. Supported queries include range, nearest-neighbor, and aggregate queries. Mars works under a challenging environment where streams of microblogs are arriving with high arrival rates. Mars distinguishes itself with three novel contributions: (1) Efficient in-memory digestion/expiration techniques that can handle microblogs of high arrival rates up to 64,000 microblog/sec. This also includes highly accurate and efficient hopping-window based aggregation for incoming microblogs keywords. (2) Smart memory optimization and load shedding techniques that adjust in-memory contents based on the expected query load to trade off a significant storage savings with a slight and bounded accuracy loss. (3) Scalable real-time query processing, exploiting Zipf distributed microblogs data for efficient top-k aggregate query processing. In addition, Mars employs a scalable real-time nearest neighbor and range query processing module that employs various pruning techniques so that it serves heavy query workloads in real time. Mars is demonstrated using a stream of real tweets obtained from Twitter firehose with a production query workload obtained from Bing web search. We show that Mars serves incoming queries with an average latency of less than 4 msec and with 99{\%} answer accuracy while saving up to 70{\%} of storage overhead for different query loads.",
author = "Amr Magdy and Aly, {Ahmed M.} and Mohamed Mokbel and Sameh Elnikety and Yuxiong He and Suman Nath",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/ICDE.2014.6816750",
language = "English",
isbn = "9781479925544",
pages = "1238--1241",
booktitle = "2014 IEEE 30th International Conference on Data Engineering, ICDE 2014",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - Mars

T2 - Real-time spatio-temporal queries on microblogs

AU - Magdy, Amr

AU - Aly, Ahmed M.

AU - Mokbel, Mohamed

AU - Elnikety, Sameh

AU - He, Yuxiong

AU - Nath, Suman

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Mars demonstration exploits the microblogs location information to support a wide variety of important spatio-temporal queries on microblogs. Supported queries include range, nearest-neighbor, and aggregate queries. Mars works under a challenging environment where streams of microblogs are arriving with high arrival rates. Mars distinguishes itself with three novel contributions: (1) Efficient in-memory digestion/expiration techniques that can handle microblogs of high arrival rates up to 64,000 microblog/sec. This also includes highly accurate and efficient hopping-window based aggregation for incoming microblogs keywords. (2) Smart memory optimization and load shedding techniques that adjust in-memory contents based on the expected query load to trade off a significant storage savings with a slight and bounded accuracy loss. (3) Scalable real-time query processing, exploiting Zipf distributed microblogs data for efficient top-k aggregate query processing. In addition, Mars employs a scalable real-time nearest neighbor and range query processing module that employs various pruning techniques so that it serves heavy query workloads in real time. Mars is demonstrated using a stream of real tweets obtained from Twitter firehose with a production query workload obtained from Bing web search. We show that Mars serves incoming queries with an average latency of less than 4 msec and with 99% answer accuracy while saving up to 70% of storage overhead for different query loads.

UR - http://www.scopus.com/inward/record.url?scp=84901788549&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84901788549&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2014.6816750

DO - 10.1109/ICDE.2014.6816750

M3 - Conference contribution

SN - 9781479925544

SP - 1238

EP - 1241

BT - 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014

PB - IEEE Computer Society

ER -