CARTILAGE: Adding flexibility to the Hadoop skeleton

Alekh Jindal, Jorge Arnulfo Quiane Ruiz, Samuel Madden

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

Modern enterprises have to deal with a variety of analytical queries over very large datasets. In this respect, Hadoop has gained much popularity since it scales to thousands of nodes and terabytes of data. However, Hadoop suffers from poor performance, especially I/O performance. Several works have proposed alternative data storage for Hadoop in order to improve query performance. However, many of these works end up making deep changes to Hadoop or HDFS. As a result, they are (i) difficult for users to adopt, and (ii) not compatible with future Hadoop releases. In this paper, we present CARTILAGE, a comprehensive data storage framework built on top of HDFS. CARTILAGE gives users full control over their data storage, including data partitioning, data replication, data layouts, and data placement. Furthermore, CARTILAGE can be layered on top of an existing HDFS installation. This means that Hadoop, as well as other query engines, can readily make use of CARTILAGE. We describe several use cases of CARTILAGE and propose to demonstrate its flexibility and efficiency through a set of novel scenarios.

Original language: English
Title of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data
Pages: 1057-1060
Number of pages: 4
DOIs: https://doi.org/10.1145/2463676.2465258
Publication status: Published - 29 Jul 2013
Event: 2013 ACM SIGMOD Conference on Management of Data, SIGMOD 2013 - New York, NY, United States
Duration: 22 Jun 2013 to 27 Jun 2013

Keywords

  • Ease of use
  • Flexible storage
  • HDFS
  • Portability

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Jindal, A., Quiane Ruiz, J. A., & Madden, S. (2013). CARTILAGE: Adding flexibility to the Hadoop skeleton. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 1057-1060). https://doi.org/10.1145/2463676.2465258

@inproceedings{e90db2c71877490686672d9b7db8e882,
title = "CARTILAGE: Adding flexibility to the Hadoop skeleton",
abstract = "Modern enterprises have to deal with a variety of analytical queries over very large datasets. In this respect, Hadoop has gained much popularity since it scales to thousands of nodes and terabytes of data. However, Hadoop suffers from poor performance, especially I/O performance. Several works have proposed alternative data storage for Hadoop in order to improve query performance. However, many of these works end up making deep changes to Hadoop or HDFS. As a result, they are (i) difficult for users to adopt, and (ii) not compatible with future Hadoop releases. In this paper, we present CARTILAGE, a comprehensive data storage framework built on top of HDFS. CARTILAGE gives users full control over their data storage, including data partitioning, data replication, data layouts, and data placement. Furthermore, CARTILAGE can be layered on top of an existing HDFS installation. This means that Hadoop, as well as other query engines, can readily make use of CARTILAGE. We describe several use cases of CARTILAGE and propose to demonstrate its flexibility and efficiency through a set of novel scenarios.",
keywords = "Ease of use, Flexible storage, HDFS, Portability",
author = "Alekh Jindal and {Quiane Ruiz}, {Jorge Arnulfo} and Samuel Madden",
year = "2013",
month = "7",
day = "29",
doi = "10.1145/2463676.2465258",
language = "English",
isbn = "9781450320375",
pages = "1057--1060",
booktitle = "Proceedings of the ACM SIGMOD International Conference on Management of Data",

}

TY - GEN
T1 - CARTILAGE
T2 - Adding flexibility to the Hadoop skeleton
AU - Jindal, Alekh
AU - Quiane Ruiz, Jorge Arnulfo
AU - Madden, Samuel
PY - 2013/7/29
Y1 - 2013/7/29
AB - Modern enterprises have to deal with a variety of analytical queries over very large datasets. In this respect, Hadoop has gained much popularity since it scales to thousands of nodes and terabytes of data. However, Hadoop suffers from poor performance, especially I/O performance. Several works have proposed alternative data storage for Hadoop in order to improve query performance. However, many of these works end up making deep changes to Hadoop or HDFS. As a result, they are (i) difficult for users to adopt, and (ii) not compatible with future Hadoop releases. In this paper, we present CARTILAGE, a comprehensive data storage framework built on top of HDFS. CARTILAGE gives users full control over their data storage, including data partitioning, data replication, data layouts, and data placement. Furthermore, CARTILAGE can be layered on top of an existing HDFS installation. This means that Hadoop, as well as other query engines, can readily make use of CARTILAGE. We describe several use cases of CARTILAGE and propose to demonstrate its flexibility and efficiency through a set of novel scenarios.
KW - Ease of use
KW - Flexible storage
KW - HDFS
KW - Portability
UR - http://www.scopus.com/inward/record.url?scp=84880540611&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880540611&partnerID=8YFLogxK
U2 - 10.1145/2463676.2465258
DO - 10.1145/2463676.2465258
M3 - Conference contribution
AN - SCOPUS:84880540611
SN - 9781450320375
SP - 1057
EP - 1060
BT - Proceedings of the ACM SIGMOD International Conference on Management of Data
ER -