Performance improvement of BWA MEM algorithm using data-parallel with concurrent parallelization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The Burrows-Wheeler Transform (BWT) is a widely used data compression technique in next-generation sequencing (NGS) analysis. With advances in NGS technology, genome data sizes have grown rapidly, and these larger volumes of genome data need to be processed by empirical parallelism. Generally, NGS data are processed using traditional parallel processing approaches such as (i) thread parallelization, (ii) data parallelization, and (iii) concurrent parallelization, each of which has its own performance bottleneck: thread scalability, scattering/gathering of data, and memory bandwidth limitations, respectively. To eliminate these drawbacks, we introduce a hybrid parallelization approach called 'data-parallel with concurrent parallelization' for genome alignment. We used the BWA MEM algorithm to align human genome sequences; this alignment is dominated by memory-intensive operations, and its performance is limited by cache/TLB misses. To eliminate the cache/TLB misses, the genome data is partitioned into multiple pieces (i.e., reducing the read size) using data parallelization, and these pieces are processed concurrently within the same cache/memory hierarchy. As a result, the proposed data-parallel with concurrent parallelization performs 45% better than traditional parallelization approaches. Additionally, we provide a proof of concept for processing larger volumes of genome data using the BWA MEM algorithm on high-end desktop machines.
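
The partition-then-process-concurrently idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`partition_fastq`, `bwa_mem_command`, `align_chunks`), the chunk size, and the placeholder aligner are all assumptions made for illustration. The sketch splits FASTQ reads into smaller pieces (the data-parallel step) and hands the pieces to a bounded pool of concurrent workers (the concurrent step). The only BWA detail relied on is the standard `bwa mem -t <threads> <reference> <reads>` invocation, which is built here but never executed.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")

FASTQ_LINES_PER_READ = 4  # header, sequence, '+', quality


def partition_fastq(lines: Iterable[str], reads_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTQ lines, each holding at most reads_per_chunk reads."""
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be >= 1")
    it = iter(lines)
    size = reads_per_chunk * FASTQ_LINES_PER_READ
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        if len(chunk) % FASTQ_LINES_PER_READ:
            raise ValueError("truncated FASTQ record")
        yield chunk


def bwa_mem_command(reference: str, chunk_path: str, threads: int) -> List[str]:
    """Build a `bwa mem` command line for one chunk (hypothetical paths; not run here)."""
    return ["bwa", "mem", "-t", str(threads), reference, chunk_path]


def align_chunks(
    chunks: Iterable[List[str]],
    align_one: Callable[[List[str]], T],
    max_concurrent: int,
) -> List[T]:
    """Apply align_one to every chunk with at most max_concurrent in flight.

    Results are returned in the same order as the input chunks.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(align_one, chunks))


if __name__ == "__main__":
    # Five toy reads, split into chunks of two reads each.
    reads: List[str] = []
    for i in range(5):
        reads += [f"@read{i}", "ACGT", "+", "IIII"]
    pieces = list(partition_fastq(reads, reads_per_chunk=2))
    # Placeholder aligner (assumption): counts reads instead of calling bwa.
    counts = align_chunks(pieces, lambda c: len(c) // FASTQ_LINES_PER_READ, max_concurrent=2)
    print(counts)
    print(bwa_mem_command("ref.fa", "chunk0.fq", threads=4))
```

In a real run, `align_one` would write each chunk to a file and launch the command from `bwa_mem_command` with `subprocess.run`. A thread pool is sufficient for that because the work happens in the child processes. The chunk size and the number of concurrent instances would need tuning against the machine's cache and memory bandwidth, which is the trade-off the abstract describes.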

Original language: English
Title of host publication: Proceedings of 2014 3rd International Conference on Parallel, Distributed and Grid Computing, PDGC 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 406-411
Number of pages: 6
ISBN (Electronic): 9781479976836
DOI: https://doi.org/10.1109/PDGC.2014.7030780
Publication status: Published - 3 Feb 2015
Event: 2014 3rd IEEE International Conference on Parallel, Distributed and Grid Computing, PDGC 2014 - Solan, Himachal Pradesh, India
Duration: 11 Dec 2014 to 13 Dec 2014

Other

Other: 2014 3rd IEEE International Conference on Parallel, Distributed and Grid Computing, PDGC 2014
Country: India
City: Solan, Himachal Pradesh
Period: 11/12/14 to 13/12/14

Fingerprint

Genes
Data storage equipment
Cache memory
Data compression
Processing
Scalability
Mathematical transformations
Scattering
Bandwidth

Keywords

  • Burrows-Wheeler Transform
  • BWA
  • Data Parallelization and Concurrent Parallelization
  • High Performance Computing
  • Human Genome Sequence
  • Threads Scalability

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computational Theory and Mathematics
  • Software

Cite this

Kathiresan, N., Temanni, R., & Al-Ali, R. J. (2015). Performance improvement of BWA MEM algorithm using data-parallel with concurrent parallelization. In Proceedings of 2014 3rd International Conference on Parallel, Distributed and Grid Computing, PDGC 2014 (pp. 406-411). [7030780] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/PDGC.2014.7030780

@inproceedings{0cef5d91f3434eb098fdb850f01c4642,
title = "Performance improvement of BWA MEM algorithm using data-parallel with concurrent parallelization",
abstract = "The Burrows-Wheeler Transform (BWT) is a widely used data compression technique in next-generation sequencing (NGS) analysis. With advances in NGS technology, genome data sizes have grown rapidly, and these larger volumes of genome data need to be processed by empirical parallelism. Generally, NGS data are processed using traditional parallel processing approaches such as (i) thread parallelization, (ii) data parallelization, and (iii) concurrent parallelization, each of which has its own performance bottleneck: thread scalability, scattering/gathering of data, and memory bandwidth limitations, respectively. To eliminate these drawbacks, we introduce a hybrid parallelization approach called 'data-parallel with concurrent parallelization' for genome alignment. We used the BWA MEM algorithm to align human genome sequences; this alignment is dominated by memory-intensive operations, and its performance is limited by cache/TLB misses. To eliminate the cache/TLB misses, the genome data is partitioned into multiple pieces (i.e., reducing the read size) using data parallelization, and these pieces are processed concurrently within the same cache/memory hierarchy. As a result, the proposed data-parallel with concurrent parallelization performs 45{\%} better than traditional parallelization approaches. Additionally, we provide a proof of concept for processing larger volumes of genome data using the BWA MEM algorithm on high-end desktop machines.",
keywords = "Burrows-Wheeler Transform, BWA, Data Parallelization and Concurrent Parallelization, High Performance Computing, Human Genome Sequence, Threads Scalability",
author = "Nagarajan Kathiresan and Ramzi Temanni and Al-Ali, {Rashid J.}",
year = "2015",
month = "2",
day = "3",
doi = "10.1109/PDGC.2014.7030780",
language = "English",
pages = "406--411",
booktitle = "Proceedings of 2014 3rd International Conference on Parallel, Distributed and Grid Computing, PDGC 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}
