Reliable MapReduce computing on opportunistic resources

Heshan Lin, Xiaosong Ma, Wu-chun Feng

Research output: Contribution to journal (Article)

12 Citations (Scopus)

Abstract

MapReduce offers an ease-of-use programming paradigm for processing large data sets, making it an attractive model for opportunistic compute resources. However, unlike dedicated resources, where MapReduce has mostly been deployed, opportunistic resources have significantly higher rates of node volatility. As a consequence, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate on such volatile resources. In this paper, we propose MOON, short for MapReduce On Opportunistic eNvironments, which is designed to offer reliable MapReduce service for opportunistic computing. MOON adopts a hybrid resource architecture by supplementing opportunistic compute resources with a small set of dedicated resources, and it extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms to take advantage of the hybrid resource architecture. Our results on an emulated opportunistic computing system running atop a 60-node cluster demonstrate that MOON can deliver significant performance improvements to Hadoop on volatile compute resources and even finish jobs that are not able to complete in Hadoop.
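As a rough illustration of why a fixed replication factor struggles on volatile nodes, consider a back-of-the-envelope model. This is a sketch under stated assumptions, not the model or algorithm used in the paper: each node is assumed to be unavailable independently with probability p, so a block stored on r nodes is unreachable with probability p^r. The function names and the sample values are hypothetical.

```python
import math


def block_unavailability(p: float, replicas: int) -> float:
    """Probability that all replicas of a block are unavailable at once.

    Assumes each node is unavailable independently with probability p
    (an illustrative simplification, not taken from the paper).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return p ** replicas


def replicas_needed(p: float, target: float) -> int:
    """Smallest replica count r such that p**r <= target.

    Requires 0 < p < 1 and 0 < target < 1.
    """
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must both be in (0, 1)")
    r = max(1, math.ceil(math.log(target) / math.log(p)))
    # Correct for floating-point rounding in the logarithm ratio.
    while p ** r > target:
        r += 1
    while r > 1 and p ** (r - 1) <= target:
        r -= 1
    return r


if __name__ == "__main__":
    # Hypothetical node unavailability rates: low (dedicated) vs. high (volatile).
    for p in (0.01, 0.4):
        print(p, block_unavailability(p, 3), replicas_needed(p, 0.001))
```

Under this simplified model, three replicas on nodes that are unavailable 40% of the time leave a block unreachable about 6.4% of the time, and eight replicas are needed to push that below 0.1%. A node that is unavailable only 1% of the time reaches the same target with two replicas, which is one way to see the appeal of placing some copies on a small set of dedicated resources.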

Original language: English
Pages (from-to): 145-161
Number of pages: 17
Journal: Cluster Computing
Volume: 15
Issue number: 2
DOI: 10.1007/s10586-011-0158-7
Publication status: Published - 1 Jun 2012
Externally published: Yes

Fingerprint

  • Scheduling algorithms
  • Processing

Keywords

  • Cloud computing
  • MapReduce
  • Volunteer computing

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

Cite this

Reliable MapReduce computing on opportunistic resources. / Lin, Heshan; Ma, Xiaosong; Feng, Wu-chun.

In: Cluster Computing, Vol. 15, No. 2, 01.06.2012, p. 145-161.

@article{e218bb509f7a4cbebe23ee006dfe8ba7,
title = "Reliable MapReduce computing on opportunistic resources",
keywords = "Cloud computing, MapReduce, Volunteer computing",
author = "Heshan Lin and Xiaosong Ma and Wu-chun Feng",
year = "2012",
month = "6",
day = "1",
doi = "10.1007/s10586-011-0158-7",
language = "English",
volume = "15",
pages = "145--161",
journal = "Cluster Computing",
issn = "1386-7857",
publisher = "Kluwer Academic Publishers",
number = "2",

}

TY - JOUR

T1 - Reliable MapReduce computing on opportunistic resources

AU - Lin, Heshan

AU - Ma, Xiaosong

AU - Feng, Wu-chun

PY - 2012/6/1

Y1 - 2012/6/1

KW - Cloud computing

KW - MapReduce

KW - Volunteer computing

UR - http://www.scopus.com/inward/record.url?scp=84861761434&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84861761434&partnerID=8YFLogxK

U2 - 10.1007/s10586-011-0158-7

DO - 10.1007/s10586-011-0158-7

M3 - Article

VL - 15

SP - 145

EP - 161

JO - Cluster Computing

JF - Cluster Computing

SN - 1386-7857

IS - 2

ER -