Workflow optimization of performance and quality of service for bioinformatics application in high performance computing

Rashid J. Al-Ali, Nagarajan Kathiresan, Mohammed El Anbari, Eric R. Schendel, Tariq Abu Zaid

Research output: Contribution to journal (Article)

8 Citations (Scopus)

Abstract

High Performance Computing (HPC) systems commonly used in bioinformatics, for tasks such as genome sequencing, incorporate multi-processor architectures. Most bioinformatics applications are multi-threaded and dominated by memory-intensive operations, and they are not designed to take full advantage of these HPC capabilities. The application end user is therefore responsible for optimizing application performance and improving scalability using various performance engineering concepts. Additionally, most HPC systems operate in a multi-user (or multi-job) environment; thus, Quality of Service (QoS) methods are essential for balancing application performance, scalability and system utilization. We propose a QoS workflow that optimizes the balance between parallel efficiency and system utilization. The proposed optimization workflow advises the end user on selection criteria for resources and options, given an application and an HPC system architecture. As an example, we consider BWA-MEM, a popular modern algorithm for aligning human genome sequences. We conducted various case studies on BWA-MEM using our optimization workflow. Compared to a state-of-the-art baseline, application performance improved by up to 67%, scalability was extended by up to 200%, parallel efficiency improved by up to 39% and overall system utilization increased by up to 38%.
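The abstract does not define parallel efficiency or describe how the workflow selects resources. As a rough illustration only, the sketch below uses the textbook definitions (speedup is serial runtime divided by parallel runtime; parallel efficiency is speedup divided by thread count) and a hypothetical selection rule that picks the largest thread count whose efficiency stays above a threshold. The function names, the threshold and the timings are assumptions made for this example and are not taken from the paper.

```python
from typing import Dict


def speedup(t_serial: float, t_parallel: float) -> float:
    """Textbook speedup: single-thread runtime divided by parallel runtime."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, threads: int) -> float:
    """Speedup per thread; 1.0 corresponds to perfect linear scaling."""
    return speedup(t_serial, t_parallel) / threads


def pick_thread_count(runtimes: Dict[int, float], min_efficiency: float) -> int:
    """Hypothetical rule: largest thread count with efficiency >= min_efficiency.

    `runtimes` maps thread count -> wall-clock seconds and must contain
    an entry for 1 thread, which serves as the serial reference.
    """
    t_serial = runtimes[1]
    acceptable = [
        n for n, t in runtimes.items()
        if parallel_efficiency(t_serial, t, n) >= min_efficiency
    ]
    return max(acceptable)


# Hypothetical timings in seconds, invented for illustration (not measured data).
example_runtimes = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 140.0}

if __name__ == "__main__":
    print(pick_thread_count(example_runtimes, min_efficiency=0.7))
```

With these made-up timings, 16 threads still shorten the run but at well under half of ideal efficiency, so a threshold-based rule would leave those cores to other jobs. This is the kind of trade-off between per-job performance and shared-system utilization that the abstract describes.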

Original language: English
Pages (from-to): 3-10
Number of pages: 8
Journal: Journal of Computational Science
Volume: 15
DOI: 10.1016/j.jocs.2016.03.005
Publication status: Published - 1 Jul 2016

Fingerprint

Bioinformatics
Work Flow
Quality of Service
High Performance
Optimization
Computing
Scalability
Balancing
Genome
Genes
Multiprocessor
System Architecture
Sequencing
Baseline
Optimise
Engineering
Data storage equipment
Resources

Keywords

  • Application performance and parallel efficiency
  • BWA-MEM algorithm
  • High performance computing
  • Next generation sequencing
  • Quality of service
  • Scalability

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
  • Modelling and Simulation

Cite this

Workflow optimization of performance and quality of service for bioinformatics application in high performance computing. / Al-Ali, Rashid J.; Kathiresan, Nagarajan; El Anbari, Mohammed; Schendel, Eric R.; Abu Zaid, Tariq.

In: Journal of Computational Science, Vol. 15, 01.07.2016, p. 3-10.


@article{2ad43b44416447a082d13f19ff6b933e,
title = "Workflow optimization of performance and quality of service for bioinformatics application in high performance computing",
abstract = "Nowadays, High Performance Computing (HPC) systems commonly used in bioinformatics, such as genome sequencing, incorporate multi-processor architectures. Typically, most bioinformatics applications are multi-threaded and dominated by memory-intensive operations, which are not designed to take full advantage of these HPC capabilities. Therefore, the application end-user is responsible for optimizing the application performance and improving scalability with various performance engineering concepts. Additionally, most of the HPC systems are operated in a multi-user (or multi-job) environment; thus, Quality of Service (QoS) methods are essential for balancing between application performance, scalability and system utilization. We propose a QoS workflow that optimizes the balancing ratio between parallel efficiency and system utilization. Accordingly, our proposed optimization workflow will advise the end user of a selection criteria to apply toward resources and options for a given application and HPC system architecture. For example, the BWA-MEM algorithm is a popular and modern algorithm for aligning human genome sequences. We conducted various case studies on BWA-MEM using our optimization workflow, and as a result compared to a state-of-the-art baseline, the application performance is improved up to 67{\%}, scalability extended up to 200{\%}, parallel efficiency improved up to 39{\%} and overall system utilization increased up to 38{\%}.",
keywords = "Application performance and parallel efficiency, BWA-MEM algorithm, High performance computing, Next generation sequencing, Quality of service, Scalability",
author = "Al-Ali, {Rashid J.} and Kathiresan, Nagarajan and {El Anbari}, Mohammed and Schendel, {Eric R.} and {Abu Zaid}, Tariq",
year = "2016",
month = "7",
day = "1",
doi = "10.1016/j.jocs.2016.03.005",
language = "English",
volume = "15",
pages = "3--10",
journal = "Journal of Computational Science",
issn = "1877-7503",
publisher = "Elsevier",
}

TY - JOUR

T1 - Workflow optimization of performance and quality of service for bioinformatics application in high performance computing

AU - Al-Ali, Rashid J.

AU - Kathiresan, Nagarajan

AU - El Anbari, Mohammed

AU - Schendel, Eric R.

AU - Abu Zaid, Tariq

PY - 2016/7/1

Y1 - 2016/7/1

AB - Nowadays, High Performance Computing (HPC) systems commonly used in bioinformatics, such as genome sequencing, incorporate multi-processor architectures. Typically, most bioinformatics applications are multi-threaded and dominated by memory-intensive operations, which are not designed to take full advantage of these HPC capabilities. Therefore, the application end-user is responsible for optimizing the application performance and improving scalability with various performance engineering concepts. Additionally, most of the HPC systems are operated in a multi-user (or multi-job) environment; thus, Quality of Service (QoS) methods are essential for balancing between application performance, scalability and system utilization. We propose a QoS workflow that optimizes the balancing ratio between parallel efficiency and system utilization. Accordingly, our proposed optimization workflow will advise the end user of a selection criteria to apply toward resources and options for a given application and HPC system architecture. For example, the BWA-MEM algorithm is a popular and modern algorithm for aligning human genome sequences. We conducted various case studies on BWA-MEM using our optimization workflow, and as a result compared to a state-of-the-art baseline, the application performance is improved up to 67%, scalability extended up to 200%, parallel efficiency improved up to 39% and overall system utilization increased up to 38%.

KW - Application performance and parallel efficiency

KW - BWA-MEM algorithm

KW - High performance computing

KW - Next generation sequencing

KW - Quality of service

KW - Scalability

UR - http://www.scopus.com/inward/record.url?scp=84979457182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979457182&partnerID=8YFLogxK

U2 - 10.1016/j.jocs.2016.03.005

DO - 10.1016/j.jocs.2016.03.005

M3 - Article

VL - 15

SP - 3

EP - 10

JO - Journal of Computational Science

JF - Journal of Computational Science

SN - 1877-7503

ER -