Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning

Shuangcheng Niu, Jidong Zhai, Xiaosong Ma, Xiongchao Tang, Wenguang Chen, Weimin Zheng

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Recent studies have found cloud environments increasingly appealing for executing HPC applications, including tightly coupled parallel simulations. At the same time, while public clouds offer elastic, on-demand resource provisioning and pay-as-you-go pricing, individual users setting up their on-demand virtual clusters may not be able to take full advantage of common cost-saving opportunities, such as reserved instances. In this paper, we propose a Semi-Elastic Cluster (SEC) computing model for organizations to reserve and dynamically resize a virtual cloud-based cluster. We present a set of integrated batch scheduling plus resource scaling strategies uniquely enabled by SEC, as well as an online reserved instance provisioning algorithm based on job history. Our trace-driven simulation results show that such a model achieves a 61.0 percent cost saving compared with individual users acquiring and managing cloud resources, without causing longer average job wait time. Moreover, to exploit the advantages of different public clouds, we also extend SEC to a multi-cloud environment, where SEC can achieve a lower cost than on any single cloud. We design and implement a prototype system of the SEC model and evaluate it in terms of management overhead and average job wait time. Experimental results show that the management overhead is negligible with respect to the job wait time.

Original language: English
Article number: 7239625
Pages (from-to): 1915-1928
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 27
Issue number: 7
DOI: 10.1109/TPDS.2015.2476459
Publication status: Published - 1 Jul 2016

Fingerprint

Costs
Cluster computing
Scheduling

Keywords

  • Cloud Computing
  • Job Scheduling
  • Resource Provisioning
  • Semi-Elastic Cluster
  • Trace-Driven Simulation

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning. / Niu, Shuangcheng; Zhai, Jidong; Ma, Xiaosong; Tang, Xiongchao; Chen, Wenguang; Zheng, Weimin.

In: IEEE Transactions on Parallel and Distributed Systems, Vol. 27, No. 7, 7239625, 01.07.2016, p. 1915-1928.

Niu, Shuangcheng ; Zhai, Jidong ; Ma, Xiaosong ; Tang, Xiongchao ; Chen, Wenguang ; Zheng, Weimin. / Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning. In: IEEE Transactions on Parallel and Distributed Systems. 2016 ; Vol. 27, No. 7. pp. 1915-1928.
@article{b16245cbe09445b9a007e608ead11429,
title = "Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning",
abstract = "Recent studies have found cloud environments increasingly appealing for executing HPC applications, including tightly coupled parallel simulations. At the same time, while public clouds offer elastic, on-demand resource provisioning and pay-as-you-go pricing, individual users setting up their on-demand virtual clusters may not be able to take full advantage of common cost-saving opportunities, such as reserved instances. In this paper, we propose a Semi-Elastic Cluster (SEC) computing model for organizations to reserve and dynamically resize a virtual cloud-based cluster. We present a set of integrated batch scheduling plus resource scaling strategies uniquely enabled by SEC, as well as an online reserved instance provisioning algorithm based on job history. Our trace-driven simulation results show that such a model achieves a 61.0 percent cost saving compared with individual users acquiring and managing cloud resources, without causing longer average job wait time. Moreover, to exploit the advantages of different public clouds, we also extend SEC to a multi-cloud environment, where SEC can achieve a lower cost than on any single cloud. We design and implement a prototype system of the SEC model and evaluate it in terms of management overhead and average job wait time. Experimental results show that the management overhead is negligible with respect to the job wait time.",
keywords = "Cloud Computing, Job Scheduling, Resource Provisioning, Semi-Elastic Cluster, Trace-Driven Simulation",
author = "Shuangcheng Niu and Jidong Zhai and Xiaosong Ma and Xiongchao Tang and Wenguang Chen and Weimin Zheng",
year = "2016",
month = "7",
day = "1",
doi = "10.1109/TPDS.2015.2476459",
language = "English",
volume = "27",
pages = "1915--1928",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "7",

}

TY - JOUR

T1 - Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning

AU - Niu, Shuangcheng

AU - Zhai, Jidong

AU - Ma, Xiaosong

AU - Tang, Xiongchao

AU - Chen, Wenguang

AU - Zheng, Weimin

PY - 2016/7/1

Y1 - 2016/7/1

AB - Recent studies have found cloud environments increasingly appealing for executing HPC applications, including tightly coupled parallel simulations. At the same time, while public clouds offer elastic, on-demand resource provisioning and pay-as-you-go pricing, individual users setting up their on-demand virtual clusters may not be able to take full advantage of common cost-saving opportunities, such as reserved instances. In this paper, we propose a Semi-Elastic Cluster (SEC) computing model for organizations to reserve and dynamically resize a virtual cloud-based cluster. We present a set of integrated batch scheduling plus resource scaling strategies uniquely enabled by SEC, as well as an online reserved instance provisioning algorithm based on job history. Our trace-driven simulation results show that such a model achieves a 61.0 percent cost saving compared with individual users acquiring and managing cloud resources, without causing longer average job wait time. Moreover, to exploit the advantages of different public clouds, we also extend SEC to a multi-cloud environment, where SEC can achieve a lower cost than on any single cloud. We design and implement a prototype system of the SEC model and evaluate it in terms of management overhead and average job wait time. Experimental results show that the management overhead is negligible with respect to the job wait time.

KW - Cloud Computing

KW - Job Scheduling

KW - Resource Provisioning

KW - Semi-Elastic Cluster

KW - Trace-Driven Simulation

UR - http://www.scopus.com/inward/record.url?scp=84976385363&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84976385363&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2015.2476459

DO - 10.1109/TPDS.2015.2476459

M3 - Article

AN - SCOPUS:84976385363

VL - 27

SP - 1915

EP - 1928

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 7

M1 - 7239625

ER -