Stream experiments

Toward latency hiding in GPGPU

Supada Laosooksathit, Chokchai Leangsuksun, Abdelkader Baggag, Clayton Chandler

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In multithreaded programming on GPUs, data transfer between the CPU and the GPU is a major impediment that prevents the GPU from achieving its full potential. This has drawn our attention to the stream management framework, a latency-hiding strategy introduced by CUDA. Streaming allows kernel execution to overlap with the transfer of independent data between the CPU and the GPU, so the total execution time can potentially be reduced. In this paper, we introduce performance models to study the utilization of streams. Moreover, we study two methods for timing operations in CUDA, namely CUDA calls and CUDA events. CUDA call functions are implemented in C++, while the CUDA events method is an API. Our findings show that the CUDA events method is more accurate than CUDA call functions for timing operations running on the GPU.
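The overlap and timing techniques the abstract describes can be sketched with the standard CUDA runtime API roughly as follows. This is an illustrative sketch, not code from the paper: the kernel `scale`, the chunk count, and the block size are hypothetical choices made for the example.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical kernel used only to have some device work to overlap with.
__global__ void scale(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main() {
    const int N = 1 << 20, CHUNKS = 4, CHUNK = N / CHUNKS;
    float *h, *d;
    cudaMallocHost(&h, N * sizeof(float));  // pinned host memory is required for async copies
    cudaMalloc(&d, N * sizeof(float));

    cudaStream_t s[CHUNKS];
    for (int c = 0; c < CHUNKS; ++c) cudaStreamCreate(&s[c]);

    // CUDA events record timestamps on the GPU itself, so they capture
    // device-side work that a host-side timer around async calls would miss.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);

    // Each chunk's copy-in, kernel, and copy-out are issued in its own stream,
    // so chunk c's transfers can overlap with chunk c-1's kernel execution.
    for (int c = 0; c < CHUNKS; ++c) {
        int off = c * CHUNK;
        cudaMemcpyAsync(d + off, h + off, CHUNK * sizeof(float),
                        cudaMemcpyHostToDevice, s[c]);
        scale<<<(CHUNK + 255) / 256, 256, 0, s[c]>>>(d + off, CHUNK);
        cudaMemcpyAsync(h + off, d + off, CHUNK * sizeof(float),
                        cudaMemcpyDeviceToHost, s[c]);
    }

    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("elapsed: %.3f ms\n", ms);

    for (int c = 0; c < CHUNKS; ++c) cudaStreamDestroy(s[c]);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFreeHost(h);
    cudaFree(d);
    return 0;
}
```

Whether the copies and kernels actually overlap depends on the device having separate copy and compute engines; the performance models in the paper quantify the reduction in total execution time when they do.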

Original language: English
Title of host publication: Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010
Pages: 240-248
Number of pages: 9
Publication status: Published - 2010
Externally published: Yes
Event: 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010 - Innsbruck
Duration: 16 Feb 2010 - 18 Feb 2010



Keywords

  • GPGPU
  • High performance computing
  • Latency hiding

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Software

Cite this

Laosooksathit, S., Leangsuksun, C., Baggag, A., & Chandler, C. (2010). Stream experiments: Toward latency hiding in GPGPU. In Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010 (pp. 240-248).

@inproceedings{ad8d1c527f50415ebb630b45480987e6,
title = "Stream experiments: Toward latency hiding in GPGPU",
abstract = "In multithreaded programming on GPUs, data transfer between the CPU and the GPU is a major impediment that prevents the GPU from achieving its full potential. This has drawn our attention to the stream management framework, a latency-hiding strategy introduced by CUDA. Streaming allows kernel execution to overlap with the transfer of independent data between the CPU and the GPU, so the total execution time can potentially be reduced. In this paper, we introduce performance models to study the utilization of streams. Moreover, we study two methods for timing operations in CUDA, namely CUDA calls and CUDA events. CUDA call functions are implemented in C++, while the CUDA events method is an API. Our findings show that the CUDA events method is more accurate than CUDA call functions for timing operations running on the GPU.",
keywords = "GPGPU, High performance computing, Latency hiding",
author = "Supada Laosooksathit and Chokchai Leangsuksun and Abdelkader Baggag and Clayton Chandler",
year = "2010",
language = "English",
isbn = "9780889868205",
pages = "240--248",
booktitle = "Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010",

}
