Group of introductory exercises to CUDA
Use Makefile to compile project. Requires gcc/g++ 4.4 (or nvcc compatible version) CUDA 5.5 or later (latest version recommended)
Use Makefile to compile the project. You will need to re-target the compiler to clang to avoid clang unsupported flags.
Use included Visual Studio 2012 project. Should run out of the box. If you do not have Visual Studio 2012, follow the following instructions:
-
- Create an empty project in this repository.
-
- Change the build customizations to target the correct version of CUDA on your machine.
-
- Add cudaBenchmark.cu
-
- Add paths to NvTools headers and libraries. If you have installed NSight properly, VS should have a macro $(NVTOOLSEXT_PATH) that points to the folder. The headers are in include, and the libraries are in lib.
-
- You should now be able to build the solution contained cudaBenchmark.cu. NOTE : To run, you may need to copy the nvtoolsext_32.dll to the System32 folder or the same folder as the built executable.
-
- Repeat steps 2 through 5 for reduction and transpose.
This executable runs a benchmark for pageable and pinned Host->Device and Device->Host memory transfer using cudaMemcpy.It also runs a benchmark for global memory Device->Device memory copies.
It is a good way to assess the maximum practical bandwidth limits of GPUS.
Executes matrix transpose operations. Each stage advances the performance of the kernel.
Executes sum of all elements in an array. Each stage advances the performance of the kernel.
- Exercises run CUDA BENCHMARK internally and display output
- The VS projects and Makefiles are built for 64-bit machines. You can modify it for 32-bit fairly easily.
Email: [email protected]
===================================================