DMA-assisted, intranode communication in GPU accelerated systems

Feng Ji, Ashwin M. Aji, James Dinan, Darius Buntinas, Pavan Balaji, Rajeev Thakur, Wu Chun Feng, Xiaosong Ma

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Citations (Scopus)

Abstract

Accelerator awareness has become a pressing issue in data movement models, such as MPI, because of the rapid deployment of systems that utilize accelerators. In our previous work, we developed techniques to enhance MPI with accelerator awareness, thus allowing applications to easily and efficiently communicate data between accelerator memories. In this paper, we extend this work with techniques to perform efficient data movement between accelerators within the same node using a DMA-assisted, peer-to-peer intranode communication technique that was recently introduced for NVIDIA GPUs. We present a detailed design of our new approach to intranode communication and evaluate its improvement to communication and application performance using micro-kernel benchmarks and a 2D stencil application kernel.

Original language: English
Title of host publication: Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
Pages: 461-468
Number of pages: 8
DOIs: https://doi.org/10.1109/HPCC.2012.69
Publication status: Published - 7 Dec 2012
Externally published: Yes
Event: 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012 - Liverpool, United Kingdom
Duration: 25 Jun 2012 to 27 Jun 2012

Other

Other: 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
Country: United Kingdom
City: Liverpool
Period: 25/6/12 to 27/6/12

Keywords

  • GPU
  • Intranode communication
  • MPI

ASJC Scopus subject areas

  • Software

Cite this

Ji, F., Aji, A. M., Dinan, J., Buntinas, D., Balaji, P., Thakur, R., ... Ma, X. (2012). DMA-assisted, intranode communication in GPU accelerated systems. In Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012 (pp. 461-468). [6332208] https://doi.org/10.1109/HPCC.2012.69

DMA-assisted, intranode communication in GPU accelerated systems. / Ji, Feng; Aji, Ashwin M.; Dinan, James; Buntinas, Darius; Balaji, Pavan; Thakur, Rajeev; Feng, Wu Chun; Ma, Xiaosong.

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012. 2012. p. 461-468 6332208.

Ji, F, Aji, AM, Dinan, J, Buntinas, D, Balaji, P, Thakur, R, Feng, WC & Ma, X 2012, DMA-assisted, intranode communication in GPU accelerated systems. in Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012., 6332208, pp. 461-468, 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012, Liverpool, United Kingdom, 25/6/12. https://doi.org/10.1109/HPCC.2012.69
Ji F, Aji AM, Dinan J, Buntinas D, Balaji P, Thakur R et al. DMA-assisted, intranode communication in GPU accelerated systems. In Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012. 2012. p. 461-468. 6332208 https://doi.org/10.1109/HPCC.2012.69
Ji, Feng ; Aji, Ashwin M. ; Dinan, James ; Buntinas, Darius ; Balaji, Pavan ; Thakur, Rajeev ; Feng, Wu Chun ; Ma, Xiaosong. / DMA-assisted, intranode communication in GPU accelerated systems. Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012. 2012. pp. 461-468
@inproceedings{e6c5eb8bce444518a2c2a0d3f3a54fcf,
title = "{DMA}-assisted, intranode communication in {GPU} accelerated systems",
abstract = "Accelerator awareness has become a pressing issue in data movement models, such as MPI, because of the rapid deployment of systems that utilize accelerators. In our previous work, we developed techniques to enhance MPI with accelerator awareness, thus allowing applications to easily and efficiently communicate data between accelerator memories. In this paper, we extend this work with techniques to perform efficient data movement between accelerators within the same node using a DMA-assisted, peer-to-peer intranode communication technique that was recently introduced for NVIDIA GPUs. We present a detailed design of our new approach to intranode communication and evaluate its improvement to communication and application performance using micro-kernel benchmarks and a 2D stencil application kernel.",
keywords = "GPU, Intranode communication, MPI",
author = "Feng Ji and Aji, {Ashwin M.} and James Dinan and Darius Buntinas and Pavan Balaji and Rajeev Thakur and Feng, {Wu Chun} and Xiaosong Ma",
year = "2012",
month = "12",
day = "7",
doi = "10.1109/HPCC.2012.69",
language = "English",
isbn = "9780769547497",
pages = "461--468",
booktitle = "Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012",

}

TY - GEN

T1 - DMA-assisted, intranode communication in GPU accelerated systems

AU - Ji, Feng

AU - Aji, Ashwin M.

AU - Dinan, James

AU - Buntinas, Darius

AU - Balaji, Pavan

AU - Thakur, Rajeev

AU - Feng, Wu Chun

AU - Ma, Xiaosong

PY - 2012/12/7

Y1 - 2012/12/7

AB - Accelerator awareness has become a pressing issue in data movement models, such as MPI, because of the rapid deployment of systems that utilize accelerators. In our previous work, we developed techniques to enhance MPI with accelerator awareness, thus allowing applications to easily and efficiently communicate data between accelerator memories. In this paper, we extend this work with techniques to perform efficient data movement between accelerators within the same node using a DMA-assisted, peer-to-peer intranode communication technique that was recently introduced for NVIDIA GPUs. We present a detailed design of our new approach to intranode communication and evaluate its improvement to communication and application performance using micro-kernel benchmarks and a 2D stencil application kernel.

KW - GPU

KW - Intranode communication

KW - MPI

UR - http://www.scopus.com/inward/record.url?scp=84870460850&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870460850&partnerID=8YFLogxK

U2 - 10.1109/HPCC.2012.69

DO - 10.1109/HPCC.2012.69

M3 - Conference contribution

SN - 9780769547497

SP - 461

EP - 468

BT - Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012

ER -