HaCube: Extending MapReduce for efficient OLAP cube materialization and view maintenance

Zhengkui Wang, Yan Chu, Kian Lee Tan, Divyakant Agrawal, Amr El Abbadi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Data cubes are widely used as a powerful tool to provide multi-dimensional views in data warehousing and On-Line Analytical Processing (OLAP). However, with increasing data sizes, data cube analysis is becoming computationally expensive. In this paper, we introduce HaCube, an extension of MapReduce designed for efficient parallel data cube computation on large-scale data. We also provide a general data cube materialization solution that leverages the features of MapReduce-like systems for efficient data cube computation. Furthermore, we demonstrate how HaCube supports view maintenance through either incremental computation (e.g., for SUM or COUNT) or recomputation (e.g., for MEDIAN or CORRELATION). We implement HaCube by extending Hadoop and evaluate it on the TPC-D benchmark over billions of tuples on a cluster with over 320 cores. The experimental results demonstrate the efficiency, scalability and practicality of HaCube for cube computation over a large amount of data in a distributed environment.
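The two view-maintenance modes named in the abstract can be illustrated with a minimal single-machine sketch. This is not HaCube's implementation and uses neither Hadoop nor MapReduce; the function names, the `(key, value)` row layout, and the sample data are all assumptions made for illustration. It only shows why SUM and COUNT can be refreshed by merging aggregates of the new rows into the stored view, while MEDIAN needs the base rows again.

```python
from collections import defaultdict
from statistics import median


def aggregate_sum_count(rows):
    """Group (key, value) rows by key and compute (sum, count) per group."""
    view = defaultdict(lambda: [0, 0])
    for key, value in rows:
        view[key][0] += value
        view[key][1] += 1
    return {k: tuple(v) for k, v in view.items()}


def refresh_incremental(view, delta_rows):
    """Refresh a SUM/COUNT view by merging aggregates of the delta rows only.

    The base rows are never re-read: the stored (sum, count) pairs are enough.
    """
    merged = dict(view)
    for key, (s, c) in aggregate_sum_count(delta_rows).items():
        old_s, old_c = merged.get(key, (0, 0))
        merged[key] = (old_s + s, old_c + c)
    return merged


def refresh_by_recomputation(base_rows, delta_rows):
    """Refresh a MEDIAN view by recomputing over base plus delta rows.

    A stored median alone cannot be combined with new rows, so the
    underlying values are needed again.
    """
    groups = defaultdict(list)
    for key, value in list(base_rows) + list(delta_rows):
        groups[key].append(value)
    return {k: median(v) for k, v in groups.items()}


# Hypothetical data: one grouping dimension (country) and one measure.
base = [("DE", 10), ("DE", 30), ("US", 5)]
delta = [("DE", 20), ("US", 7), ("FR", 4)]

sum_count_view = refresh_incremental(aggregate_sum_count(base), delta)
median_view = refresh_by_recomputation(base, delta)
```

In this sketch the incremental refresh yields the same result as aggregating `base + delta` from scratch, which is the property that makes the cheaper path valid for SUM and COUNT.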

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 113-129
Number of pages: 17
Volume: 9643
ISBN (Print): 9783319320489
DOIs: https://doi.org/10.1007/978-3-319-32049-6_8
Publication status: Published - 2016
Externally published: Yes
Event: 21st International Conference on Database Systems for Advanced Applications, DASFAA 2016 - Dallas, United States
Duration: 16 Apr 2016 → 19 Apr 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9643
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 21st International Conference on Database Systems for Advanced Applications, DASFAA 2016
Country: United States
City: Dallas
Period: 16/4/16 → 19/4/16

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Wang, Z., Chu, Y., Tan, K. L., Agrawal, D., & Abbadi, A. E. (2016). HaCube: Extending MapReduce for efficient OLAP cube materialization and view maintenance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9643, pp. 113-129). Springer Verlag. https://doi.org/10.1007/978-3-319-32049-6_8