Hybrid programming model for implicit PDE simulations on multicore architectures

Dinesh Kaushik, David Keyes, Satish Balay, Barry Smith

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

The complexity of programming modern multicore-processor-based clusters is rising rapidly, with GPUs adding further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured mesh CFD code. At the implementation level, the effects of cache locality, update management, work division, and synchronization frequency are studied. The hybrid model presents interesting algorithmic opportunities as well: the linear system solver converges faster than in the pure MPI case because the parallel preconditioner stays stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), good performance requires attention to data partitioning issues similar to those in the message-passing case.
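The abstract's algorithmic claim can be illustrated with a toy experiment. The sketch below is not the paper's CFD code: it assumes a 1D Poisson matrix, a block Jacobi preconditioner, and conjugate gradients, with one diagonal block standing in for one MPI subdomain. All function names (`thomas_solve`, `apply_A`, `block_jacobi`, `pcg_block_jacobi`) are hypothetical and chosen only for this illustration. Under these assumptions, using fewer, larger blocks (as a hybrid run with fewer MPI processes per node would) leaves less coupling outside the preconditioner, so the solver should need fewer iterations.

```python
def thomas_solve(m, rhs):
    """Solve tridiag(-1, 2, -1) x = rhs for one m-by-m block (Thomas algorithm)."""
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def apply_A(x):
    """Multiply by the global matrix A = tridiag(-1, 2, -1)."""
    n = len(x)
    y = [2.0 * xi for xi in x]
    for i in range(n):
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def block_jacobi(r, nblocks):
    """Apply M^{-1}: solve each diagonal block independently, ignoring coupling
    between blocks. Each block plays the role of one subdomain (assumption)."""
    size = len(r) // nblocks  # assumes len(r) is divisible by nblocks
    z = []
    for b in range(nblocks):
        z.extend(thomas_solve(size, r[b * size:(b + 1) * size]))
    return z


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg_block_jacobi(n, nblocks, tol=1e-10, max_iter=500):
    """Preconditioned CG on A x = b with b = ones; returns (iterations, x)."""
    b = [1.0] * n
    x = [0.0] * n
    r = b[:]
    z = block_jacobi(r, nblocks)
    p = z[:]
    rz = dot(r, z)
    bnorm = dot(b, b) ** 0.5
    for k in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * bnorm:
            return k, x
        z = block_jacobi(r, nblocks)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return max_iter, x


if __name__ == "__main__":
    n = 64
    for nblocks in (8, 4, 2, 1):
        iters, _ = pcg_block_jacobi(n, nblocks)
        print(f"{nblocks} block(s): {iters} PCG iteration(s)")
```

With a single block the preconditioner equals the matrix itself, so in exact arithmetic the iteration converges in one step; that is the limiting case of the "stronger preconditioner" effect. A real hybrid code would distribute the blocks across MPI processes and thread the work inside each block, which this serial sketch does not attempt.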

Original language: English
Title of host publication: OpenMP in the Petascale Era - 7th International Workshop on OpenMP, IWOMP 2011, Proceedings
Pages: 12-21
Number of pages: 10
Volume: 6665 LNCS
DOIs: https://doi.org/10.1007/978-3-642-21487-5_2
Publication status: Published - 2011
Externally published: Yes
Event: 7th International Workshop on OpenMP, IWOMP 2011 - Chicago, IL, United States
Duration: 13 Jun 2011 to 15 Jun 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6665 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th International Workshop on OpenMP, IWOMP 2011
Country: United States
City: Chicago, IL
Period: 13/6/11 to 15/6/11

Fingerprint

OpenMP
Hybrid Model
Programming Model
Parallelism
Synchronization
Data Partitioning
Multi-core Processor
Unstructured Mesh
Multicore programming
Message Passing
Locality
Preconditioner
Cache
Division
Simplicity
Simulation
Programming
Update
Linear Systems
Message passing

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Kaushik, D., Keyes, D., Balay, S., & Smith, B. (2011). Hybrid programming model for implicit PDE simulations on multicore architectures. In OpenMP in the Petascale Era - 7th International Workshop on OpenMP, IWOMP 2011, Proceedings (Vol. 6665 LNCS, pp. 12-21). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6665 LNCS). https://doi.org/10.1007/978-3-642-21487-5_2

@inproceedings{26f02071f3d149ffb44209d59f844a24,
title = "Hybrid programming model for implicit PDE simulations on multicore architectures",
abstract = "The complexity of programming modern multicore-processor-based clusters is rising rapidly, with GPUs adding further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured mesh CFD code. At the implementation level, the effects of cache locality, update management, work division, and synchronization frequency are studied. The hybrid model presents interesting algorithmic opportunities as well: the linear system solver converges faster than in the pure MPI case because the parallel preconditioner stays stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), good performance requires attention to data partitioning issues similar to those in the message-passing case.",
author = "Dinesh Kaushik and David Keyes and Satish Balay and Barry Smith",
year = "2011",
doi = "10.1007/978-3-642-21487-5_2",
language = "English",
isbn = "9783642214868",
volume = "6665 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "12--21",
booktitle = "OpenMP in the Petascale Era - 7th International Workshop on OpenMP, IWOMP 2011, Proceedings",

}

TY  - GEN
T1  - Hybrid programming model for implicit PDE simulations on multicore architectures
AU  - Kaushik, Dinesh
AU  - Keyes, David
AU  - Balay, Satish
AU  - Smith, Barry
PY  - 2011
Y1  - 2011
N2  - The complexity of programming modern multicore-processor-based clusters is rising rapidly, with GPUs adding further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured mesh CFD code. At the implementation level, the effects of cache locality, update management, work division, and synchronization frequency are studied. The hybrid model presents interesting algorithmic opportunities as well: the linear system solver converges faster than in the pure MPI case because the parallel preconditioner stays stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), good performance requires attention to data partitioning issues similar to those in the message-passing case.
UR  - http://www.scopus.com/inward/record.url?scp=79959198742&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79959198742&partnerID=8YFLogxK
U2  - 10.1007/978-3-642-21487-5_2
DO  - 10.1007/978-3-642-21487-5_2
M3  - Conference contribution
AN  - SCOPUS:79959198742
SN  - 9783642214868
VL  - 6665 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 12
EP  - 21
BT  - OpenMP in the Petascale Era - 7th International Workshop on OpenMP, IWOMP 2011, Proceedings
ER  -