DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

Tom Z J Fu, Jianbing Ding, Richard T B Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Citations (Scopus)

Abstract

In a data stream management system (DSMS), users register continuous queries and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled to ensure real-time response. Existing and future systems alike therefore need to schedule resources dynamically according to the current workload, to avoid either wasting resources or failing to deliver correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time, (b) where to best place resources, and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of Jackson open queueing networks and is capable of handling arbitrary operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.
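The abstract names Jackson open queueing networks as the basis of the performance model but gives no formulas. As a rough illustration only, the Python sketch below assumes that each operator behaves like an M/M/k queue (Poisson arrivals, exponential service, k identical processors) and that per-operator arrival rates have already been measured. The function names, the greedy placement rule, and the sample rates are assumptions made for this example and are not taken from the paper.

```python
from math import factorial, inf


def erlang_c_sojourn(lam, mu, k):
    """Expected time (waiting + service) a tuple spends at one M/M/k operator.

    lam: arrival rate, mu: service rate per processor, k: processors.
    Returns inf when the operator is unstable (lam >= k * mu).
    """
    if k <= 0 or lam >= k * mu:
        return inf
    a = lam / mu          # offered load
    rho = a / k           # utilisation per processor
    tail = a ** k / (factorial(k) * (1 - rho))
    # Erlang C: probability that an arriving tuple has to wait.
    p_wait = tail / (sum(a ** n / factorial(n) for n in range(k)) + tail)
    return p_wait / (k * mu - lam) + 1 / mu


def total_sojourn(lams, mus, ks, lam0):
    """Mean end-to-end sojourn time of an open Jackson network.

    By Little's law, E[T] = (1 / lam0) * sum_i lam_i * E[T_i],
    where lam0 is the external arrival rate into the topology.
    """
    return sum(l * erlang_c_sojourn(l, m, k)
               for l, m, k in zip(lams, mus, ks)) / lam0


def min_stable(lam, mu):
    """Smallest processor count that keeps an operator stable."""
    k = 1
    while lam >= k * mu:
        k += 1
    return k


def greedy_allocate(lams, mus, k_max):
    """Hypothetical greedy placement: start from the minimum stable
    allocation, then hand each remaining processor to the operator whose
    weighted sojourn time lam_i * E[T_i] drops the most."""
    ks = [min_stable(l, m) for l, m in zip(lams, mus)]
    if sum(ks) > k_max:
        raise ValueError("not enough processors for a stable allocation")
    while sum(ks) < k_max:
        gains = [l * (erlang_c_sojourn(l, m, k) - erlang_c_sojourn(l, m, k + 1))
                 for l, m, k in zip(lams, mus, ks)]
        best = max(range(len(ks)), key=gains.__getitem__)
        ks[best] += 1
    return ks


if __name__ == "__main__":
    # Made-up rates for a two-operator topology (tuples per second).
    lams, mus = [10.0, 4.0], [3.0, 5.0]
    ks = greedy_allocate(lams, mus, k_max=8)
    print(ks, total_sojourn(lams, mus, ks, lam0=10.0))
```

In a topology with loops, splits, or joins, the per-operator rates would come from solving the network's traffic equations or from direct measurement; the sketch takes them as inputs to stay short.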

Original language: English
Title of host publication: Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 411-420
Number of pages: 10
Volume: 2015-July
ISBN (Electronic): 9781467372145
DOI: https://doi.org/10.1109/ICDCS.2015.49
Publication status: Published - 22 Jul 2015
Externally published: Yes
Event: 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015 - Columbus, United States
Duration: 29 Jun 2015 → 2 Jul 2015

Other

Other: 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015
Country: United States
City: Columbus
Period: 29/6/15 → 2/7/15

Fingerprint

Scheduling
Queueing networks
Mathematical operators
Topology
Experiments

Keywords

  • data stream analytics
  • resource scheduling

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Fu, T. Z. J., Ding, J., Ma, R. T. B., Winslett, M., Yang, Y., & Zhang, Z. (2015). DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams. In Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015 (Vol. 2015-July, pp. 411-420). [7164927] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICDCS.2015.49

@inproceedings{029c1436b7534a7f85c5ef45451c0ba1,
title = "DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams",
abstract = "In a data stream management system (DSMS), users register continuous queries, and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. It is essential, for the existing systems or future developments, to possess the ability of scheduling resources dynamically according to the current workload, in order to avoid wasting resources, or failing in delivering correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time (b) where to best place resources, and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of Jackson open queueing networks and is capable of handling arbitrary operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.",
keywords = "data stream analytics, resource scheduling",
author = "Fu, {Tom Z J} and Jianbing Ding and Ma, {Richard T B} and Marianne Winslett and Yin Yang and Zhenjie Zhang",
year = "2015",
month = "7",
day = "22",
doi = "10.1109/ICDCS.2015.49",
language = "English",
volume = "2015-July",
pages = "411--420",
booktitle = "Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - DRS

T2 - Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

AU - Fu, Tom Z J

AU - Ding, Jianbing

AU - Ma, Richard T B

AU - Winslett, Marianne

AU - Yang, Yin

AU - Zhang, Zhenjie

PY - 2015/7/22

Y1 - 2015/7/22

N2 - In a data stream management system (DSMS), users register continuous queries, and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. It is essential, for the existing systems or future developments, to possess the ability of scheduling resources dynamically according to the current workload, in order to avoid wasting resources, or failing in delivering correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time (b) where to best place resources, and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of Jackson open queueing networks and is capable of handling arbitrary operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.

KW - data stream analytics

KW - resource scheduling

UR - http://www.scopus.com/inward/record.url?scp=84944318294&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944318294&partnerID=8YFLogxK

U2 - 10.1109/ICDCS.2015.49

DO - 10.1109/ICDCS.2015.49

M3 - Conference contribution

AN - SCOPUS:84944318294

VL - 2015-July

SP - 411

EP - 420

BT - Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -