OceanRT: Real-time analytics over large temporal data

Shiming Zhang, Yin Yang, Wei Fan, Liang Lan, Mingxuan Yuan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

16 Citations (Scopus)

Abstract

We demonstrate OceanRT, a novel cloud-based infrastructure that performs online analytics in real time, over large-scale temporal data such as call logs from a telecommunication company. Apart from proprietary systems for which few details have been revealed, most existing big-data analytics systems are built on top of an offline, MapReduce-style infrastructure, which inherently limits their efficiency. In contrast, OceanRT employs a novel computing architecture consisting of interconnected Access Query Engines (AQEs), as well as a new storage scheme that ensures data locality and fast access for temporal data. Our preliminary evaluation shows that OceanRT can be up to 10× faster than Impala [10], 12× faster than Shark [5], and 200× faster than Hive [13]. The demo will show how OceanRT manages a real call log dataset (around 5TB per day) from a large mobile network operator in China. Besides presenting the processing of a few preset queries, we also allow the audience to issue ad hoc HiveQL [13] queries, watch how OceanRT answers them, and compare the speed of OceanRT with its competitors.

Original language: English
Title of host publication: SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1099-1102
Number of pages: 4
ISBN (Print): 9781450323765
DOI: https://doi.org/10.1145/2588555.2594513
Publication status: Published - 2014
Externally published: Yes
Event: 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014 - Snowbird, UT, United States
Duration: 22 Jun 2014 - 27 Jun 2014

Other

Other: 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014
Country: United States
City: Snowbird, UT
Period: 22/6/14 - 27/6/14

Fingerprint

Telecommunication
Wireless networks
Engines
Processing
Industry
Big data

Keywords

  • Design
  • Management
  • Performance

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Zhang, S., Yang, Y., Fan, W., Lan, L., & Yuan, M. (2014). OceanRT: Real-time analytics over large temporal data. In SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (pp. 1099-1102). Association for Computing Machinery. https://doi.org/10.1145/2588555.2594513

OceanRT: Real-time analytics over large temporal data. / Zhang, Shiming; Yang, Yin; Fan, Wei; Lan, Liang; Yuan, Mingxuan.

SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, 2014. p. 1099-1102.

Zhang, S, Yang, Y, Fan, W, Lan, L & Yuan, M 2014, OceanRT: Real-time analytics over large temporal data. in SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, pp. 1099-1102, 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, United States, 22/6/14. https://doi.org/10.1145/2588555.2594513
Zhang S, Yang Y, Fan W, Lan L, Yuan M. OceanRT: Real-time analytics over large temporal data. In SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery. 2014. p. 1099-1102. https://doi.org/10.1145/2588555.2594513
Zhang, Shiming; Yang, Yin; Fan, Wei; Lan, Liang; Yuan, Mingxuan. / OceanRT: Real-time analytics over large temporal data. SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, 2014. pp. 1099-1102
@inproceedings{d522d0b811e54860b1455653b816d4c1,
title = "OceanRT: Real-time analytics over large temporal data",
abstract = "We demonstrate OceanRT, a novel cloud-based infrastructure that performs online analytics in real time, over large-scale temporal data such as call logs from a telecommunication company. Apart from proprietary systems for which few details have been revealed, most existing big-data analytics systems are built on top of an offline, MapReduce-style infrastructure, which inherently limits their efficiency. In contrast, OceanRT employs a novel computing architecture consisting of interconnected Access Query Engines (AQEs), as well as a new storage scheme that ensures data locality and fast access for temporal data. Our preliminary evaluation shows that OceanRT can be up to 10× faster than Impala [10], 12× faster than Shark [5], and 200× faster than Hive [13]. The demo will show how OceanRT manages a real call log dataset (around 5TB per day) from a large mobile network operator in China. Besides presenting the processing of a few preset queries, we also allow the audience to issue ad hoc HiveQL [13] queries, watch how OceanRT answers them, and compare the speed of OceanRT with its competitors.",
keywords = "Design, Management, Performance",
author = "Shiming Zhang and Yin Yang and Wei Fan and Liang Lan and Mingxuan Yuan",
year = "2014",
doi = "10.1145/2588555.2594513",
language = "English",
isbn = "9781450323765",
pages = "1099--1102",
booktitle = "SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - OceanRT

T2 - Real-time analytics over large temporal data

AU - Zhang, Shiming

AU - Yang, Yin

AU - Fan, Wei

AU - Lan, Liang

AU - Yuan, Mingxuan

PY - 2014

Y1 - 2014

AB - We demonstrate OceanRT, a novel cloud-based infrastructure that performs online analytics in real time, over large-scale temporal data such as call logs from a telecommunication company. Apart from proprietary systems for which few details have been revealed, most existing big-data analytics systems are built on top of an offline, MapReduce-style infrastructure, which inherently limits their efficiency. In contrast, OceanRT employs a novel computing architecture consisting of interconnected Access Query Engines (AQEs), as well as a new storage scheme that ensures data locality and fast access for temporal data. Our preliminary evaluation shows that OceanRT can be up to 10× faster than Impala [10], 12× faster than Shark [5], and 200× faster than Hive [13]. The demo will show how OceanRT manages a real call log dataset (around 5TB per day) from a large mobile network operator in China. Besides presenting the processing of a few preset queries, we also allow the audience to issue ad hoc HiveQL [13] queries, watch how OceanRT answers them, and compare the speed of OceanRT with its competitors.

KW - Design

KW - Management

KW - Performance

UR - http://www.scopus.com/inward/record.url?scp=84904327816&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84904327816&partnerID=8YFLogxK

U2 - 10.1145/2588555.2594513

DO - 10.1145/2588555.2594513

M3 - Conference contribution

AN - SCOPUS:84904327816

SN - 9781450323765

SP - 1099

EP - 1102

BT - SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

PB - Association for Computing Machinery

ER -