Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture

Nagarajan Kathiresan, Rashid J. Al-Ali, Puthen V. Jithesh, Ganesan Narayanasamy, Zaid Al-Ars

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Next Generation Sequencing (NGS) technology produces large volumes of genome data, which are processed using various open source bioinformatics tools. Configuring and compiling some of these tools (e.g. BWAKIT, root) is challenging in its own right, and some architectures (e.g. IBM Power) require more elaborate porting work on top of that. Best practices for application porting should ensure that (i) the semantics of the program or algorithm are not changed, (ii) the output generated from the original source code and from the modified source code (i.e., after porting) is the same, even though the code is ported to a different architecture, and (iii) the output is similar across different architectures after porting. Burrows-Wheeler Aligner (BWA) is the most popular genome mapping application used in the BWAKIT toolset. BWAKIT provides pre-compiled binaries for the x86_64 architecture and an end-to-end solution for genome mapping. In this paper, we show how to port the various pre-built application binaries used in BWAKIT to the OpenPOWER architecture and execute the BWAKIT pipeline successfully. Additionally, we demonstrate the validity of the output results on OpenPOWER and present benchmarking results for the BWAKIT applications that indicate the suitability of the highly multithreaded OpenPOWER architecture for executing these applications.

Original language: English
Title of host publication: High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers
Editors: Michèle Weiland, Sadaf Alam, Rio Yokota, John Shalf
Publisher: Springer Verlag
Pages: 402-410
Number of pages: 9
ISBN (Print): 9783030024642
DOIs: https://doi.org/10.1007/978-3-030-02465-9_27
Publication status: Published - 1 Jan 2018
Event: International Conference on High Performance Computing, ISC High Performance 2018 - Frankfurt, Germany
Duration: 28 Jun 2018 → 28 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11203 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on High Performance Computing, ISC High Performance 2018
Country: Germany
City: Frankfurt
Period: 28/6/18 → 28/6/18

Keywords

  • Burrows-Wheeler Aligner
  • BWAKIT
  • Efficiency
  • Genome mapping
  • Parallelization
  • POWER architecture
  • Scalability

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Kathiresan, N., Al-Ali, R. J., Jithesh, P. V., Narayanasamy, G., & Al-Ars, Z. (2018). Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture. In M. Weiland, S. Alam, R. Yokota, & J. Shalf (Eds.), High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers (pp. 402-410). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11203 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-02465-9_27

@inproceedings{4fdd4359d06841f6ad9b54d626ba936f,
title = "Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture",
keywords = "Burrows-Wheeler Aligner, BWAKIT, Efficiency, Genome mapping, Parallelization, POWER architecture, Scalability",
author = "Nagarajan Kathiresan and Al-Ali, {Rashid J.} and Jithesh, {Puthen V.} and Ganesan Narayanasamy and Zaid Al-Ars",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-02465-9_27",
language = "English",
isbn = "9783030024642",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "402--410",
editor = "Mich{\`e}le Weiland and Sadaf Alam and Rio Yokota and John Shalf",
booktitle = "High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers",

}

TY  - GEN
T1  - Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture
AU  - Kathiresan, Nagarajan
AU  - Al-Ali, Rashid J.
AU  - Jithesh, Puthen V.
AU  - Narayanasamy, Ganesan
AU  - Al-Ars, Zaid
PY  - 2018/1/1
Y1  - 2018/1/1
KW  - Burrows-Wheeler Aligner
KW  - BWAKIT
KW  - Efficiency
KW  - Genome mapping
KW  - Parallelization
KW  - POWER architecture
KW  - Scalability
UR  - http://www.scopus.com/inward/record.url?scp=85066112593&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85066112593&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-02465-9_27
DO  - 10.1007/978-3-030-02465-9_27
M3  - Conference contribution
SN  - 9783030024642
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 402
EP  - 410
BT  - High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers
A2  - Weiland, Michèle
A2  - Alam, Sadaf
A2  - Yokota, Rio
A2  - Shalf, John
PB  - Springer Verlag
ER  - 