Runtime measurements in the cloud

Observing, analyzing, and reducing variance

Jörg Schad, Jens Dittrich, Jorge Arnulfo Quiane Ruiz

Research output: Chapter in Book/Report/Conference proceeding › Chapter

399 Citations (Scopus)

Abstract

One of the main reasons why cloud computing has gained so much popularity is due to its ease of use and its ability to scale computing resources on demand. As a result, users can now rent computing nodes on large commercial clusters through several vendors, such as Amazon and Rackspace. However, despite the attention paid by Cloud providers, performance unpredictability is a major issue in Cloud computing for (1) database researchers performing wall clock experiments, and (2) database applications providing service-level agreements. In this paper, we carry out a study of the performance variance of the most widely used Cloud infrastructure (Amazon EC2) from different perspectives. We use established microbenchmarks to measure performance variance in CPU, I/O, and network. And, we use a multi-node MapReduce application to quantify the impact on real data-intensive applications. We collected data for an entire month and compare it with the results obtained on a local cluster. Our results show that EC2 performance varies a lot and often falls into two bands having a large performance gap in-between - which is somewhat surprising. We observe in our experiments that these two bands correspond to the different virtual system types provided by Amazon. Moreover, we analyze results considering different availability zones, points in time, and locations. This analysis indicates that, among others, the choice of availability zone also influences the performance variability. A major conclusion of our work is that the variance on EC2 is currently so high that wall clock experiments may only be performed with considerable care. To this end, we provide some hints to users.

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Pages: 460-471
Number of pages: 12
Volume: 3
Edition: 1
Publication status: Published - Sep 2010
Externally published: Yes

Fingerprint

Cloud computing
Clocks
Availability
Experiments
Program processors

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Schad, J., Dittrich, J., & Quiane Ruiz, J. A. (2010). Runtime measurements in the cloud: Observing, analyzing, and reducing variance. In Proceedings of the VLDB Endowment (1 ed., Vol. 3, pp. 460-471).

Runtime measurements in the cloud : Observing, analyzing, and reducing variance. / Schad, Jörg; Dittrich, Jens; Quiane Ruiz, Jorge Arnulfo.

Proceedings of the VLDB Endowment. Vol. 3 1. ed. 2010. p. 460-471.

Schad, J, Dittrich, J & Quiane Ruiz, JA 2010, Runtime measurements in the cloud: Observing, analyzing, and reducing variance. in Proceedings of the VLDB Endowment. 1 edn, vol. 3, pp. 460-471.
Schad J, Dittrich J, Quiane Ruiz JA. Runtime measurements in the cloud: Observing, analyzing, and reducing variance. In Proceedings of the VLDB Endowment. 1 ed. Vol. 3. 2010. p. 460-471.
Schad, Jörg ; Dittrich, Jens ; Quiane Ruiz, Jorge Arnulfo. / Runtime measurements in the cloud : Observing, analyzing, and reducing variance. Proceedings of the VLDB Endowment. Vol. 3 1. ed. 2010. pp. 460-471.
@inbook{1131cf4f47004cfea89401e88a24f967,
title = "Runtime measurements in the cloud: Observing, analyzing, and reducing variance",
abstract = "One of the main reasons why cloud computing has gained so much popularity is due to its ease of use and its ability to scale computing resources on demand. As a result, users can now rent computing nodes on large commercial clusters through several vendors, such as Amazon and Rackspace. However, despite the attention paid by Cloud providers, performance unpredictability is a major issue in Cloud computing for (1) database researchers performing wall clock experiments, and (2) database applications providing service-level agreements. In this paper, we carry out a study of the performance variance of the most widely used Cloud infrastructure (Amazon EC2) from different perspectives. We use established microbenchmarks to measure performance variance in CPU, I/O, and network. And, we use a multi-node MapReduce application to quantify the impact on real data-intensive applications. We collected data for an entire month and compare it with the results obtained on a local cluster. Our results show that EC2 performance varies a lot and often falls into two bands having a large performance gap in-between - which is somewhat surprising. We observe in our experiments that these two bands correspond to the different virtual system types provided by Amazon. Moreover, we analyze results considering different availability zones, points in time, and locations. This analysis indicates that, among others, the choice of availability zone also influences the performance variability. A major conclusion of our work is that the variance on EC2 is currently so high that wall clock experiments may only be performed with considerable care. To this end, we provide some hints to users.",
author = "J{\"o}rg Schad and Jens Dittrich and {Quiane Ruiz}, {Jorge Arnulfo}",
year = "2010",
month = "9",
language = "English",
volume = "3",
pages = "460--471",
booktitle = "Proceedings of the VLDB Endowment",
edition = "1",

}

TY - CHAP

T1 - Runtime measurements in the cloud

T2 - Observing, analyzing, and reducing variance

AU - Schad, Jörg

AU - Dittrich, Jens

AU - Quiane Ruiz, Jorge Arnulfo

PY - 2010/9

Y1 - 2010/9

N2 - One of the main reasons why cloud computing has gained so much popularity is due to its ease of use and its ability to scale computing resources on demand. As a result, users can now rent computing nodes on large commercial clusters through several vendors, such as Amazon and Rackspace. However, despite the attention paid by Cloud providers, performance unpredictability is a major issue in Cloud computing for (1) database researchers performing wall clock experiments, and (2) database applications providing service-level agreements. In this paper, we carry out a study of the performance variance of the most widely used Cloud infrastructure (Amazon EC2) from different perspectives. We use established microbenchmarks to measure performance variance in CPU, I/O, and network. And, we use a multi-node MapReduce application to quantify the impact on real data-intensive applications. We collected data for an entire month and compare it with the results obtained on a local cluster. Our results show that EC2 performance varies a lot and often falls into two bands having a large performance gap in-between - which is somewhat surprising. We observe in our experiments that these two bands correspond to the different virtual system types provided by Amazon. Moreover, we analyze results considering different availability zones, points in time, and locations. This analysis indicates that, among others, the choice of availability zone also influences the performance variability. A major conclusion of our work is that the variance on EC2 is currently so high that wall clock experiments may only be performed with considerable care. To this end, we provide some hints to users.

AB - One of the main reasons why cloud computing has gained so much popularity is due to its ease of use and its ability to scale computing resources on demand. As a result, users can now rent computing nodes on large commercial clusters through several vendors, such as Amazon and Rackspace. However, despite the attention paid by Cloud providers, performance unpredictability is a major issue in Cloud computing for (1) database researchers performing wall clock experiments, and (2) database applications providing service-level agreements. In this paper, we carry out a study of the performance variance of the most widely used Cloud infrastructure (Amazon EC2) from different perspectives. We use established microbenchmarks to measure performance variance in CPU, I/O, and network. And, we use a multi-node MapReduce application to quantify the impact on real data-intensive applications. We collected data for an entire month and compare it with the results obtained on a local cluster. Our results show that EC2 performance varies a lot and often falls into two bands having a large performance gap in-between - which is somewhat surprising. We observe in our experiments that these two bands correspond to the different virtual system types provided by Amazon. Moreover, we analyze results considering different availability zones, points in time, and locations. This analysis indicates that, among others, the choice of availability zone also influences the performance variability. A major conclusion of our work is that the variance on EC2 is currently so high that wall clock experiments may only be performed with considerable care. To this end, we provide some hints to users.

UR - http://www.scopus.com/inward/record.url?scp=80053503082&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053503082&partnerID=8YFLogxK

M3 - Chapter

VL - 3

SP - 460

EP - 471

BT - Proceedings of the VLDB Endowment

ER -