Adaptive request scheduling for parallel scientific web services

Heshan Lin, Xiaosong Ma, Jiangtian Li, Ting Yu, Nagiza Samatova

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Scientific web services often possess data models and query workloads quite different from commercial ones and are much less studied. Individual queries have to be processed in parallel by multiple server nodes, because the processing is both computation- and data-intensive. Meanwhile, each query is performed against portions of a large, common dataset. Existing scheduling policies from traditional environments (namely cluster web servers and supercomputers) consider either the data aspect or the computation aspect alone and are therefore inadequate for this new type of workload. In this paper, we systematically investigate adaptive scheduling for scientific web services, taking into account parallel computation scalability, data locality, and load balancing. Our case study focuses on high-throughput query processing on biological sequence databases, a fundamental task performed daily by millions of scientists, who increasingly prefer to use web services powered by parallel servers. Our research indicates that intelligent resource allocation and scheduling are crucial in improving the overall performance of a parallel sequence database search server. Failure to consider either parallel computation scalability or data locality can significantly hurt system throughput and query response time. Also, no single static strategy works best for all request workloads or all resource settings. In response, we present several dynamic scheduling techniques that automatically adapt to the request workload and system configuration when making scheduling decisions. Experiments on a cluster using 32 processors show that the combination of these techniques delivers a several-fold improvement in average query response time across various workloads.
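
The trade-off between data locality and load balancing described in the abstract can be illustrated with a small Python sketch. This is not the scheduling algorithm from the paper: the `Node` class, the `compute_cost` and `load_cost` constants, and the greedy earliest-finish rule are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical server node (not taken from the paper)."""
    name: str
    cached: set = field(default_factory=set)  # database fragments already in memory
    busy_until: float = 0.0                   # time at which the node becomes free


def estimate_finish(node, fragment, now, compute_cost=1.0, load_cost=4.0):
    """Estimated completion time of searching one fragment on one node.

    The cost constants are made-up values: a node that does not hold the
    fragment pays an extra load_cost before it can start computing.
    """
    start = max(now, node.busy_until)
    penalty = 0.0 if fragment in node.cached else load_cost
    return start + penalty + compute_cost


def schedule(fragments, nodes, now=0.0):
    """Greedily assign each fragment to the node with the earliest estimated finish.

    A cached fragment favors its node (data locality), but a long queue on
    that node can push the work to an idle node instead (load balancing).
    """
    plan = {}
    for fragment in fragments:
        best = min(nodes, key=lambda n: estimate_finish(n, fragment, now))
        best.busy_until = estimate_finish(best, fragment, now)
        best.cached.add(fragment)
        plan[fragment] = best.name
    return plan


if __name__ == "__main__":
    nodes = [Node("n0", {"f0"}), Node("n1", {"f1"})]
    print(schedule(["f0", "f1"], nodes))
```

In this toy model each fragment goes to the node that already caches it as long as that node is not overloaded. A policy that ignored the `cached` set, or one that ignored `busy_until`, would correspond to the data-only or computation-only policies that the abstract describes as inadequate.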

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 276-294
Number of pages: 19
Volume: 5069 LNCS
DOI: https://doi.org/10.1007/978-3-540-69497-7_19
Publication status: Published - 14 Aug 2008
Externally published: Yes
Event: 20th International Conference on Scientific and Statistical Database Management, SSDBM 2008 - Hong Kong, China
Duration: 9 Jul 2008 to 11 Jul 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5069 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Event: 20th International Conference on Scientific and Statistical Database Management, SSDBM 2008
Country: China
City: Hong Kong
Period: 9/7/08 to 11/7/08

Fingerprint

Adaptive Scheduling
Workload
Web services
Scheduling
Query
Servers
Data Locality
Server
Parallel Computation
Response Time
Reaction Time
Resource allocation
Scalability
Throughput
Databases
Resource Scheduling
Dynamic Scheduling
Scheduling Policy

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Lin, H., Ma, X., Li, J., Yu, T., & Samatova, N. (2008). Adaptive request scheduling for parallel scientific web services. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5069 LNCS, pp. 276-294). https://doi.org/10.1007/978-3-540-69497-7_19

@inproceedings{7217e8b74a3c41bd8f7b146fa0fc24d9,
title = "Adaptive request scheduling for parallel scientific web services",
author = "Heshan Lin and Xiaosong Ma and Jiangtian Li and Ting Yu and Nagiza Samatova",
year = "2008",
month = "8",
day = "14",
doi = "10.1007/978-3-540-69497-7_19",
language = "English",
isbn = "3540694765",
volume = "5069 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "276--294",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Adaptive request scheduling for parallel scientific web services

AU - Lin, Heshan

AU - Ma, Xiaosong

AU - Li, Jiangtian

AU - Yu, Ting

AU - Samatova, Nagiza

PY - 2008/8/14

Y1 - 2008/8/14

UR - http://www.scopus.com/inward/record.url?scp=49049086320&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49049086320&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-69497-7_19

DO - 10.1007/978-3-540-69497-7_19

M3 - Conference contribution

AN - SCOPUS:49049086320

SN - 3540694765

SN - 9783540694762

VL - 5069 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 276

EP - 294

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -