Cloud versus in-house cluster

Evaluating Amazon cluster compute instances for running MPI applications

Yan Zhai, Mingliang Liu, Jidong Zhai, Xiaosong Ma, Wenguang Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

55 Citations (Scopus)

Abstract

The emergence of cloud services brings new possibilities for constructing and using HPC platforms. However, while cloud services provide the flexibility and convenience of customized, pay-as-you-go parallel computing, multiple previous studies in the past three years have indicated that cloud-based clusters need a significant performance boost to become a competitive choice, especially for tightly coupled parallel applications. In this work, we examine the feasibility of running HPC applications in clouds. This study distinguishes itself from existing investigations in several ways: 1) We carry out a comprehensive examination of issues relevant to the HPC community, including performance, cost, user experience, and range of user activities. 2) We compare an Amazon EC2-based platform built upon its newly available HPC-oriented virtual machines with typical local cluster and supercomputer options, using benchmarks and applications with scale and problem size unprecedented in previous cloud HPC studies. 3) We perform detailed performance and scalability analysis to locate the chief limiting factors of the state-of-the-art cloud-based clusters. 4) We present a case study on the impact of per-application parallel I/O system configuration uniquely enabled by cloud services. Our results reveal that though the scalability of EC2-based virtual clusters still lags behind traditional HPC alternatives, they are rapidly gaining in overall performance and cost-effectiveness, making them feasible candidates for performing tightly coupled scientific computing. In addition, our detailed benchmarking and profiling discloses and analyzes several problems regarding the performance and performance stability on EC2.

Original language: English
Title of host publication: State of the Practice Reports, SC'11
DOIs: https://doi.org/10.1145/2063348.2063363
Publication status: Published - 13 Dec 2011
Externally published: Yes
Event: State of the Practice Reports, SC'11 - Seattle, WA, United States
Duration: 12 Nov 2011 to 18 Nov 2011

Other

Other: State of the Practice Reports, SC'11
Country: United States
City: Seattle, WA
Period: 12/11/11 to 18/11/11

Fingerprint

Scalability
Natural sciences computing
Supercomputers
Benchmarking
Parallel processing systems
Cost effectiveness
Costs
Virtual machine

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Cloud versus in-house cluster: Evaluating Amazon cluster compute instances for running MPI applications. / Zhai, Yan; Liu, Mingliang; Zhai, Jidong; Ma, Xiaosong; Chen, Wenguang.

State of the Practice Reports, SC'11. 2011. 11.

Zhai, Y, Liu, M, Zhai, J, Ma, X & Chen, W 2011, Cloud versus in-house cluster: Evaluating Amazon cluster compute instances for running MPI applications. in State of the Practice Reports, SC'11., 11, State of the Practice Reports, SC'11, Seattle, WA, United States, 12/11/11. https://doi.org/10.1145/2063348.2063363
Zhai, Yan; Liu, Mingliang; Zhai, Jidong; Ma, Xiaosong; Chen, Wenguang. / Cloud versus in-house cluster: Evaluating Amazon cluster compute instances for running MPI applications. State of the Practice Reports, SC'11. 2011.
@inproceedings{309ab45345634371a025a8a15d953eb6,
title = "Cloud versus in-house cluster: Evaluating {Amazon} cluster compute instances for running {MPI} applications",
abstract = "The emergence of cloud services brings new possibilities for constructing and using HPC platforms. However, while cloud services provide the flexibility and convenience of customized, pay-as-you-go parallel computing, multiple previous studies in the past three years have indicated that cloud-based clusters need a significant performance boost to become a competitive choice, especially for tightly coupled parallel applications. In this work, we examine the feasibility of running HPC applications in clouds. This study distinguishes itself from existing investigations in several ways: 1) We carry out a comprehensive examination of issues relevant to the HPC community, including performance, cost, user experience, and range of user activities. 2) We compare an Amazon EC2-based platform built upon its newly available HPC-oriented virtual machines with typical local cluster and supercomputer options, using benchmarks and applications with scale and problem size unprecedented in previous cloud HPC studies. 3) We perform detailed performance and scalability analysis to locate the chief limiting factors of the state-of-the-art cloud-based clusters. 4) We present a case study on the impact of per-application parallel I/O system configuration uniquely enabled by cloud services. Our results reveal that though the scalability of EC2-based virtual clusters still lags behind traditional HPC alternatives, they are rapidly gaining in overall performance and cost-effectiveness, making them feasible candidates for performing tightly coupled scientific computing. In addition, our detailed benchmarking and profiling discloses and analyzes several problems regarding the performance and performance stability on EC2.",
author = "Yan Zhai and Mingliang Liu and Jidong Zhai and Xiaosong Ma and Wenguang Chen",
year = "2011",
month = "12",
day = "13",
doi = "10.1145/2063348.2063363",
language = "English",
isbn = "9781450311397",
booktitle = "State of the Practice Reports, SC'11",

}

TY - GEN

T1 - Cloud versus in-house cluster

T2 - Evaluating Amazon cluster compute instances for running MPI applications

AU - Zhai, Yan

AU - Liu, Mingliang

AU - Zhai, Jidong

AU - Ma, Xiaosong

AU - Chen, Wenguang

PY - 2011/12/13

Y1 - 2011/12/13

AB - The emergence of cloud services brings new possibilities for constructing and using HPC platforms. However, while cloud services provide the flexibility and convenience of customized, pay-as-you-go parallel computing, multiple previous studies in the past three years have indicated that cloud-based clusters need a significant performance boost to become a competitive choice, especially for tightly coupled parallel applications. In this work, we examine the feasibility of running HPC applications in clouds. This study distinguishes itself from existing investigations in several ways: 1) We carry out a comprehensive examination of issues relevant to the HPC community, including performance, cost, user experience, and range of user activities. 2) We compare an Amazon EC2-based platform built upon its newly available HPC-oriented virtual machines with typical local cluster and supercomputer options, using benchmarks and applications with scale and problem size unprecedented in previous cloud HPC studies. 3) We perform detailed performance and scalability analysis to locate the chief limiting factors of the state-of-the-art cloud-based clusters. 4) We present a case study on the impact of per-application parallel I/O system configuration uniquely enabled by cloud services. Our results reveal that though the scalability of EC2-based virtual clusters still lags behind traditional HPC alternatives, they are rapidly gaining in overall performance and cost-effectiveness, making them feasible candidates for performing tightly coupled scientific computing. In addition, our detailed benchmarking and profiling discloses and analyzes several problems regarding the performance and performance stability on EC2.

UR - http://www.scopus.com/inward/record.url?scp=83055184887&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=83055184887&partnerID=8YFLogxK

U2 - 10.1145/2063348.2063363

DO - 10.1145/2063348.2063363

M3 - Conference contribution

SN - 9781450311397

BT - State of the Practice Reports, SC'11

ER -