To fully exploit the computational power of the GPU, a large amount of data parallelism must generally be expressed. With accelerated libraries such as cuBLAS, cuFFT, and cuSPARSE, if an individual operation does not expose enough data parallelism, another option is to batch many smaller operations into a single large operation. This tutorial demonstrates how to use batched cuBLAS operations to improve GPU utilization. It also expands on the GPU concurrency topics from the first tutorial through the use of streams and Hyper-Q. The full source can be viewed or downloaded from the OLCF GitHub. Please direct any questions or comments to [email protected].
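To make the idea of batching concrete, the following is a minimal CPU-only sketch of what a batched matrix multiply computes. It is not taken from the tutorial source and does not call cuBLAS. The function names `gemm_ref` and `gemm_batched_ref` are hypothetical, and the argument order only loosely mirrors the pointer-array style of `cublasDgemmBatched`, where every matrix in the batch shares the same dimensions and each is reached through an array of pointers. Column-major storage is assumed, as in BLAS.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for one column-major GEMM: C = alpha*A*B + beta*C.
 * A is m x k, B is k x n, C is m x n; lda, ldb, ldc are leading dimensions.
 * Hypothetical helper for illustration only. */
static void gemm_ref(int m, int n, int k, double alpha,
                     const double *A, int lda,
                     const double *B, int ldb,
                     double beta, double *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p)
                acc += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            double *c = &C[i + (size_t)j * ldc];
            /* As in BLAS, C is not read when beta is zero. */
            *c = (beta == 0.0) ? alpha * acc : alpha * acc + beta * (*c);
        }
    }
}

/* Batched reference: applies the same-sized GEMM to every entry of the
 * pointer arrays. On a GPU, a batched library call would replace this loop
 * with a single launch covering all batch_count small problems.
 * Hypothetical helper, assumed interface. */
void gemm_batched_ref(int m, int n, int k, double alpha,
                      const double *const Aarray[], int lda,
                      const double *const Barray[], int ldb,
                      double beta, double *const Carray[], int ldc,
                      int batch_count)
{
    assert(m >= 0 && n >= 0 && k >= 0 && batch_count >= 0);
    for (int b = 0; b < batch_count; ++b)
        gemm_ref(m, n, k, alpha, Aarray[b], lda, Barray[b], ldb,
                 beta, Carray[b], ldc);
}
```

A reference like this can be useful for checking the results of a batched GPU call on small inputs: fill the pointer arrays with host copies of the same matrices and compare the outputs element by element.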

Repository contents (2 commits):

- C
- F90
- README.md

zheliu137/Batched_cuBLAS
