diff --git a/README.md b/README.md index 6e02afa..117a432 100644 --- a/README.md +++ b/README.md @@ -1,133 +1,16 @@ -Project-2 -========= - -A Study in Parallel Algorithms : Stream Compaction - -# INTRODUCTION -Many of the algorithms you have learned thus far in your career have typically -been developed from a serial standpoint. When it comes to GPUs, we are mainly -looking at massively parallel work. Thus, it is necessary to reorient our -thinking. In this project, we will be implementing a couple different versions -of prefix sum. We will start with a simple single thread serial CPU version, -and then move to a naive GPU version. Each part of this homework is meant to -follow the logic of the previous parts, so please do not do this homework out of -order. - -This project will serve as a stream compaction library that you may use (and -will want to use) in your -future projects. For that reason, we suggest you create proper header and CUDA -files so that you can reuse this code later. You may want to create a separate -cpp file that contains your main function so that you can test the code you -write. - -# OVERVIEW -Stream compaction is broken down into two parts: (1) scan, and (2) scatter. - -## SCAN -Scan or prefix sum is the summation of the elements in an array such that the -resulting array is the summation of the terms before it. Prefix sum can either -be inclusive, meaning the current term is a summation of all the elements before -it and itself, or exclusive, meaning the current term is a summation of all -elements before it excluding itself. - -Inclusive: - -In : [ 3 4 6 7 9 10 ] - -Out : [ 3 7 13 20 29 39 ] - -Exclusive - -In : [ 3 4 6 7 9 10 ] - -Out : [ 0 3 7 13 20 29 ] - -Note that the resulting prefix sum will always be n + 1 elements if the input -array is of length n. Similarly, the first element of the exclusive prefix sum -will always be 0. In the following sections, all references to prefix sum will -be to the exclusive version of prefix sum. 
- -## SCATTER -The scatter section of stream compaction takes the results of the previous scan -in order to reorder the elements to form a compact array. - -For example, let's say we have the following array: -[ 0 0 3 4 0 6 6 7 0 1 ] - -We would only like to consider the non-zero elements in this zero, so we would -like to compact it into the following array: -[ 3 4 6 6 7 1 ] - -We can perform a transform on input array to transform it into a boolean array: - -In : [ 0 0 3 4 0 6 6 7 0 1 ] - -Out : [ 0 0 1 1 0 1 1 1 0 1 ] - -Performing a scan on the output, we get the following array : - -In : [ 0 0 1 1 0 1 1 1 0 1 ] - -Out : [ 0 0 0 1 2 2 3 4 5 5 ] - -Notice that the output array produces a corresponding index array that we can -use to create the resulting array for stream compaction. - -# PART 1 : REVIEW OF PREFIX SUM -Given the definition of exclusive prefix sum, please write a serial CPU version -of prefix sum. You may write this in the cpp file to separate this from the -CUDA code you will be writing in your .cu file. - -# PART 2 : NAIVE PREFIX SUM -We will now parallelize this the previous section's code. Recall from lecture -that we can parallelize this using a series of kernel calls. In this portion, -you are NOT allowed to use shared memory. - -### Questions -* Compare this version to the serial version of exclusive prefix scan. Please - include a table of how the runtimes compare on different lengths of arrays. -* Plot a graph of the comparison and write a short explanation of the phenomenon you - see here. - -# PART 3 : OPTIMIZING PREFIX SUM -In the previous section we did not take into account shared memory. In the -previous section, we kept everything in global memory, which is much slower than -shared memory. - -## PART 3a : Write prefix sum for a single block -Shared memory is accessible to threads of a block. Please write a version of -prefix sum that works on a single block. - -## PART 3b : Generalizing to arrays of any length. 
-Taking the previous portion, please write a version that generalizes prefix sum -to arbitrary length arrays, this includes arrays that will not fit on one block. - -### Questions -* Compare this version to the parallel prefix sum using global memory. -* Plot a graph of the comparison and write a short explanation of the phenomenon - you see here. - -# PART 4 : ADDING SCATTER -First create a serial version of scatter by expanding the serial version of -prefix sum. Then create a GPU version of scatter. Combine the function call -such that, given an array, you can call stream compact and it will compact the -array for you. Finally, write a version using thrust. - -### Questions -* Compare your version of stream compact to your version using thrust. How do - they compare? How might you optimize yours more, or how might thrust's stream - compact be optimized. - -# EXTRA CREDIT (+10) -For extra credit, please optimize your prefix sum for work parallelism and to -deal with bank conflicts. Information on this can be found in the GPU Gems -chapter listed in the references. - -# SUBMISSION -Please answer all the questions in each of the subsections above and write your -answers in the README by overwriting the README file. In future projects, we -expect your analysis to be similar to the one we have led you through in this -project. Like other projects, please open a pull request and email Harmony. - -# REFERENCES -"Parallel Prefix Sum (Scan) with CUDA." GPU Gems 3. +Project-2 +========= + +Serial vs. Naïve prefix scan (please see readme.pdf) + + + +Size 1000 5000 10000 50000 100000 500000 1000000 5000000 10000000 Serial 0.006842 0.028225 0.058161 0.253599 0.49223 1.98731 4.28381 22.123 43.947 Naïve 0.380992 0.75904 0.497632 1.1176 1.35654 3.50685 6.65933 35.5035 0.145952 +Runtime (microseconds) of serial vs. 
naïve GPU implementation of prefix scan + +It’s difficult to see from the graph itself, but the table makes it clear that the serial implementation’s runtime grows much faster with array size than that of my naïve GPU implementation. This is because the GPU implementation needs roughly log(n) passes, one wave per depth, while the serial implementation is on the order of n, since it makes n calculations on the array to get the final result. + + +Naïve vs. Shared Memory Prefix Scan + +-N/A- diff --git a/README.md.pdf b/README.md.pdf new file mode 100644 index 0000000..3943cf4 Binary files /dev/null and b/README.md.pdf differ diff --git a/data.xlsx b/data.xlsx new file mode 100644 index 0000000..bef6868 Binary files /dev/null and b/data.xlsx differ diff --git a/project_2/Debug/Project-1_part-2.Build.CppClean.log b/project_2/Debug/Project-1_part-2.Build.CppClean.log new file mode 100644 index 0000000..b00ab8d --- /dev/null +++ b/project_2/Debug/Project-1_part-2.Build.CppClean.log @@ -0,0 +1,7 @@ +D:\Documents\CIS 565\Project-1_part-2\Debug\Project-1_part-2.pdb +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\link.command.1.tlog +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\link.read.1.tlog +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\link.write.1.tlog +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\matrix_math.cu.cache +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\Project-1_part-2.exe.intermediate.manifest +D:\Documents\CIS 565\Project-1_part-2\Project-1_part-2\Debug\Project-1_part-2.write.1.tlog diff --git a/project_2/Debug/Project-1_part-2.lastbuildstate b/project_2/Debug/Project-1_part-2.lastbuildstate new file mode 100644 index 0000000..4b37382 --- /dev/null +++ b/project_2/Debug/Project-1_part-2.lastbuildstate @@ -0,0 +1,2 @@ +#v4.0:v100 +Debug|Win32|D:\Documents\CIS 565\Project-1_part-2\| diff --git a/project_2/Debug/Project-1_part-2.log b/project_2/Debug/Project-1_part-2.log new file mode 100644 index 
0000000..54be084 --- /dev/null +++ b/project_2/Debug/Project-1_part-2.log @@ -0,0 +1,14 @@ +Build started 9/27/2014 2:57:45 PM. + 1>Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\Project-1_part-2.vcxproj" on node 2 (clean target(s)). + 1>CudaClean: + cmd.exe /C "C:\Users\Jiatong\AppData\Local\Temp\tmp5bfc9457b194476bb8e71d7240a8faf8.cmd" + "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -g -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\Win32/Debug/matrix_math.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\matrix_math.cu" -clean + + D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -g -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\Win32/Debug/matrix_math.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\matrix_math.cu" -clean + matrix_math.cu + Deleting file "Debug\matrix_math.cu.deps". 
+ 1>Done Building Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Project-1_part-2\Project-1_part-2.vcxproj" (clean target(s)). + +Build succeeded. + +Time Elapsed 00:00:00.71 diff --git a/project_2/Debug/matrix_math.cu.deps b/project_2/Debug/matrix_math.cu.deps new file mode 100644 index 0000000..6baae49 --- /dev/null +++ b/project_2/Debug/matrix_math.cu.deps @@ -0,0 +1,320 @@ +C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\cuda_runtime.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_config.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\sal.h +c:\program files (x86)\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vadefs.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\limits.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stddef.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_device_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_functions.h +c:\program files\nvidia gpu computing 
toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\common_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program 
files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\time.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\wtime.inl +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\time.inl +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\math_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cmath +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\yvals.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\use_ansi.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstdlib +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_surface_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing 
toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_11_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_12_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_13_double_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h 
+c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_20_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_35_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_atomic_functions.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\sm_20_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_30_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h 
+c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_35_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_fetch_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_texture_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_indirect_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_indirect_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_launch_parameters.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +C:\Program Files (x86)\Microsoft Visual Studio 
10.0\VC\include\stdio.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\swprintf.inl +C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\cuda.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\iostream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\istream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ostream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ios +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocnum +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\climits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstdio +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\streambuf +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xiosbase +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocale +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstring +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdexcept +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\exception +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstddef +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstddef +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\eh.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\malloc.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xmemory +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\new +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\utility +C:\Program Files 
(x86)\Microsoft Visual Studio 10.0\VC\include\iosfwd +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cwchar +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\wchar.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdbg.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\type_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\limits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ymath.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cfloat +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\float.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtwrn.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xtr1common +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files 
(x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\typeinfo +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocinfo +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocinfo.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ctype.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\locale.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xdebug +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\system_error +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cerrno +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\errno.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\share.h diff --git a/project_2/Debug/project_2.Build.CppClean.log b/project_2/Debug/project_2.Build.CppClean.log new file mode 100644 index 0000000..d8ec635 --- /dev/null +++ b/project_2/Debug/project_2.Build.CppClean.log @@ -0,0 +1,19 @@ 
+D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\cl.command.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\CL.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\CL.write.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\compaction.cu.cache +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\link.command.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\link.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\link.write.1.tlog +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\MAIN.OBJ +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\mt.command.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\mt.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\mt.write.1.tlog +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\PROJECT_2.EXE +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\PROJECT_2.EXE.INTERMEDIATE.MANIFEST +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\PROJECT_2.ILK +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\PROJECT_2.PDB +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\project_2.vcxprojResolveAssemblyReference.cache +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\project_2.write.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Debug\vc100.idb +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\DEBUG\VC100.PDB diff --git a/project_2/Debug/project_2.log b/project_2/Debug/project_2.log new file mode 100644 index 0000000..b6472ae --- /dev/null +++ b/project_2/Debug/project_2.log @@ -0,0 +1,16 @@ +Build started 9/29/2014 7:46:58 AM. + 1>Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\project_2.vcxproj" on node 2 (clean target(s)). + 1>_PrepareForClean: + Deleting file "Debug\project_2.lastbuildstate". 
+ CudaClean: + cmd.exe /C "C:\Users\Jiatong\AppData\Local\Temp\tmp9026fa5e4e394dbeb0bc5537db5d87e9.cmd" + "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -g -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Win32/Debug/compaction.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\compaction.cu" -clean + + D:\Documents\CIS 565\Project2-StreamCompaction\project_2>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -G --keep-dir Debug -maxrregcount=0 --machine 32 --compile -g -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Win32/Debug/compaction.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\compaction.cu" -clean + compaction.cu + Deleting file "Debug\compaction.cu.deps". + 1>Done Building Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\project_2.vcxproj" (clean target(s)). + +Build succeeded. 
+ +Time Elapsed 00:00:00.40 diff --git a/project_2/Release/matrix_math.cu.deps b/project_2/Release/matrix_math.cu.deps new file mode 100644 index 0000000..6baae49 --- /dev/null +++ b/project_2/Release/matrix_math.cu.deps @@ -0,0 +1,320 @@ +C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\cuda_runtime.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_config.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\sal.h +c:\program files (x86)\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vadefs.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\limits.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stddef.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_device_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_runtime_api.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing 
toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\common_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h +C:\Program Files 
(x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\time.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\wtime.inl +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\time.inl +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\math_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cmath +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\yvals.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\use_ansi.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstdlib +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\channel_descriptor.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing 
toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_11_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_12_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_13_double_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h 
+c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_20_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_35_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_atomic_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_20_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia 
gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_30_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h 
+c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_35_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\sm_32_intrinsics.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_fetch_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\cuda_texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu 
computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_indirect_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_indirect_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\builtin_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\driver_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\surface_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\texture_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\host_defines.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_functions.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\device_launch_parameters.h +c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\vector_types.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdio.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 
10.0\VC\include\swprintf.inl +C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include\cuda.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\iostream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\istream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ostream +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ios +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocnum +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\climits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstdio +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\streambuf +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xiosbase +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocale +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstring +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdexcept +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\exception +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstddef +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cstddef +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\eh.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\malloc.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xmemory +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\new +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\utility +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\iosfwd +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cwchar +C:\Program Files 
(x86)\Microsoft Visual Studio 10.0\VC\include\wchar.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdbg.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\type_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\limits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ymath.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cfloat +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\float.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtwrn.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xtr1common +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files 
(x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xfwrap1 +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xxtype_traits +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\typeinfo +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocinfo +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xlocinfo.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\ctype.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\locale.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xdebug +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\system_error +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\cerrno +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\errno.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\crtdefs.h +C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\share.h diff --git a/project_2/Release/project_2.Build.CppClean.log b/project_2/Release/project_2.Build.CppClean.log new file mode 100644 index 0000000..f5b04e9 --- /dev/null +++ b/project_2/Release/project_2.Build.CppClean.log @@ -0,0 +1,16 @@ +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\cl.command.1.tlog +D:\Documents\CIS 
565\Project2-StreamCompaction\project_2\Release\CL.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\CL.write.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\compaction.cu.cache +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\link.command.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\link.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\link.write.1.tlog +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\RELEASE\MAIN.OBJ +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\mt.command.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\mt.read.1.tlog +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\mt.write.1.tlog +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\RELEASE\PROJECT_2.EXE +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\RELEASE\PROJECT_2.EXE.INTERMEDIATE.MANIFEST +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\RELEASE\PROJECT_2.PDB +D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Release\project_2.write.1.tlog +D:\DOCUMENTS\CIS 565\PROJECT2-STREAMCOMPACTION\PROJECT_2\RELEASE\VC100.PDB diff --git a/project_2/Release/project_2.log b/project_2/Release/project_2.log new file mode 100644 index 0000000..8ff1889 --- /dev/null +++ b/project_2/Release/project_2.log @@ -0,0 +1,16 @@ +Build started 9/29/2014 7:46:55 AM. + 1>Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\project_2.vcxproj" on node 2 (clean target(s)). + 1>_PrepareForClean: + Deleting file "Release\project_2.lastbuildstate". 
+ CudaClean: + cmd.exe /C "C:\Users\Jiatong\AppData\Local\Temp\tmp5832a1b695664d72af78c79b91c4a636.cmd" + "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" --keep-dir Release -maxrregcount=0 --machine 32 --compile -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MD " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Win32/Release/compaction.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\compaction.cu" -clean + + D:\Documents\CIS 565\Project2-StreamCompaction\project_2>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" --keep-dir Release -maxrregcount=0 --machine 32 --compile -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MD " -o "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\Win32/Release/compaction.cu.obj" "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\compaction.cu" -clean + compaction.cu + Deleting file "Release\compaction.cu.deps". + 1>Done Building Project "D:\Documents\CIS 565\Project2-StreamCompaction\project_2\project_2.vcxproj" (clean target(s)). + +Build succeeded. 
+ +Time Elapsed 00:00:00.41 diff --git a/project_2/compaction.cu b/project_2/compaction.cu new file mode 100644 index 0000000..9e809bc --- /dev/null +++ b/project_2/compaction.cu @@ -0,0 +1,139 @@ +#include "compaction.cuh" +#include + +int maxThreadsPerBlock = 128; +cudaEvent_t beginEvent; +cudaEvent_t endEvent; + +// global calls +void initCuda (int N) { + cudaEventCreate(&beginEvent); + cudaEventCreate(&endEvent); +} + +__global__ void naive_scan (float* in_arr, float* scan_arr, int size, int depth) { + int index = threadIdx.x + blockIdx.x * blockDim.x; + + int val = 0; + + int in_index = index; + + if (depth == 1) { + in_index--; + } + + if (in_index >= 0 && index < size) { + int exp_2 = 1; + for (int i = 1; i < depth; i++) { + exp_2 *= 2; + } + val = in_arr[in_index]; + if (in_index >= exp_2) { + val += in_arr[in_index - exp_2]; + } + } + + if (index < size) { + scan_arr[index] = val; + } +} + +__global__ void shared_scan (float* in_arr, float* scan_arr, int size, int depth) { + __shared__ float in_arr_s1 [1]; //contains the lower numbers + //__shared__ float in_arr_s2 [blockDim.x]; //contains the higher numbers + + int index = threadIdx.x + blockIdx.x * blockDim.x; + + int exp_2 = 1; + for (int i = 1; i < depth; i++) { + exp_2 *= 2; + } + + float sValue = 0; + + if (index < size) { + in_arr_s1[index] = in_arr[index]; + } + __syncthreads(); + + int in_index = index; + if (depth == 1) { + in_index--; + } + + if (in_index >= 0 && index < size) { + sValue += in_arr_s1[in_index]; + if (in_index >= exp_2) { + sValue += in_arr_s1[in_index - exp_2]; + } + } + //in_arr_s2[index] = in_arr[index]; + if (index < size) { + scan_arr[index] = sValue; + } + __syncthreads(); +} + +void cudaScan (float* in_arr, float* out_arr, int size) { + int numBlocks = ceil(size/(float)maxThreadsPerBlock); + int threadsPerBlock = min(size, maxThreadsPerBlock); + + float* arr1, * arr2; + cudaMalloc((void**)&arr1, size*sizeof(float)); + cudaMalloc((void**)&arr2, size*sizeof(float)); + + 
float time; + int max_depth = ceil(log2((float)size)); + cudaMemcpy(arr1, in_arr, size*sizeof(float), cudaMemcpyHostToDevice); + cudaEventRecord(beginEvent, 0); + for (int i = 1; i <= max_depth; i++) { // not sure why it's ceil(log2(size)) but it works. + shared_scan<<<numBlocks, threadsPerBlock>>>(arr1, arr2, size, i); + //cudaThreadSynchronize(); // taking these out causes it to fail occasionally. + float* temp = arr1; + arr1 = arr2; + arr2 = temp; + } + cudaEventRecord(endEvent, 0); + cudaEventSynchronize(endEvent); + + cudaEventElapsedTime(&time, beginEvent, endEvent); + std::cout << "cudaGPUTime for size " << size << " was " << time << "ms" << std::endl; + + + cudaMemcpy(out_arr, arr1, size*sizeof(float), cudaMemcpyDeviceToHost); +} + +__global__ void scatter (float* in_arr, float* temp_arr, float* scan_arr, float* out_arr, int size) { + int index = threadIdx.x + blockIdx.x * blockDim.x; + + if (index < size && temp_arr[index] == 1) { + out_arr[(int)scan_arr[index]] = in_arr[index]; + } +} + +__global__ void compute (float* in_arr, float* out_arr, int size) { + //compute this array based on some function + int index = threadIdx.x + blockIdx.x * blockDim.x; + + out_arr[index] = index % 2; +} + +void cudaStreamCompaction (float* in_arr, float* out_arr, int size) { + int numBlocks = ceil(size/(float)maxThreadsPerBlock); + int threadsPerBlock = min(size, maxThreadsPerBlock); + float* temp_arr, *scan_arr; + float* arr, *compact_arr; + + cudaMalloc((void**)&temp_arr, size*sizeof(int)); + cudaMalloc((void**)&scan_arr, size*sizeof(int)); + cudaMalloc((void**)&arr, size*sizeof(float)); + cudaMalloc((void**)&compact_arr, size*sizeof(float)); + + cudaMemcpy(arr, in_arr, size*sizeof(float), cudaMemcpyHostToDevice); + + compute<<<numBlocks, threadsPerBlock>>>(arr, temp_arr, size); + cudaScan(arr, scan_arr, size); + scatter<<<numBlocks, threadsPerBlock>>>(arr, temp_arr, scan_arr, out_arr, size); + + cudaMemcpy(out_arr, compact_arr, size*sizeof(float), cudaMemcpyDeviceToHost); +} \ No newline at end of file diff --git a/project_2/compaction.cuh
b/project_2/compaction.cuh new file mode 100644 index 0000000..3843bfb --- /dev/null +++ b/project_2/compaction.cuh @@ -0,0 +1,16 @@ +#ifndef COMPACTION_H +#define COMPATION_H + +void initCuda (int N); +/** + * Calls an internal function to perform an exclusive prefix sum on the array +**/ +void cudaScan (float* in_arr, float* out_arr, int size); +/** + * Calls an internal function to perform a scatter on the array +**/ +void cudaScatter (float* in_arr, float* out_arr, int size); + +float* prefixSum (float* arr, int size); + +#endif \ No newline at end of file diff --git a/project_2/main.cpp b/project_2/main.cpp new file mode 100644 index 0000000..6468401 --- /dev/null +++ b/project_2/main.cpp @@ -0,0 +1,83 @@ +#include +#include +#include +#include +#include "compaction.cuh" + +const int size = 65; + +float* prefixSum(float* arr, int size); +float* scatter(float* arr, float* temp_arr, float* scan_arr, int size); +void printArray(float* arr, int size); + +int main(int argc, char** argv) { + + initCuda(size); + + float* arr = new float[size]; + + for (int i = 0; i < size; i++) { + arr[i] = i; + } + + LARGE_INTEGER li; + QueryPerformanceFrequency(&li); + double PCFreq = double(li.QuadPart)/1000.0; + QueryPerformanceCounter(&li); + __int64 CounterStart = li.QuadPart; + + prefixSum(arr, size); + + QueryPerformanceCounter(&li); + double time = double(li.QuadPart-CounterStart)/PCFreq; + printArray(prefixSum(arr, size), size); + + //float time; + std::cout << "cudaCPUTime for size " << size << " was " << time << "ms" << std::endl; + + float* arr_gpu = new float[size]; + + cudaScan(arr, arr_gpu, size); + printArray(arr_gpu, size); + int a; + std::cin>>a; +} + +float* prefixSum (float* arr, int size) { + if (size < 1) { + return NULL; + } + + float* scan_arr = new float[size]; + + scan_arr[0] = 0; + + for (int i = 1; i < size; i++) { + scan_arr[i] = scan_arr[i-1] + arr[i-1]; + } + + return scan_arr; +} + +float* scatter (float* arr, int* temp_arr, int* scan_arr, int size) { 
+ int c = 0; + for (int i = 0; i < size; i++) { + if (temp_arr[i] == 1 && scan_arr[i] > c) + c = scan_arr[i]; + } + float* scat_arr = new float[c]; + + for (int i = 0; i < size; i++) { + if (temp_arr[i] == 1) { + scat_arr[scan_arr[i]] = arr[i]; + } + } + return scat_arr; +} + +void printArray (float* arr, int size) { + for (int i = 0; i < size; i++) { + std::cout << arr[i] << " "; + } + std::cout << std::endl; +} diff --git a/project_2/project_2.sln b/project_2/project_2.sln new file mode 100644 index 0000000..12a696e --- /dev/null +++ b/project_2/project_2.sln @@ -0,0 +1,20 @@ + +Microsoft Visual Studio Solution File, Format Version 11.00 +# Visual Studio 2010 +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "project_2", "project_2.vcxproj", "{B9F2B494-1DFB-43C4-A456-00C4DCF651F9}" +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Win32 = Debug|Win32 + Release|Win32 = Release|Win32 + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {B9F2B494-1DFB-43C4-A456-00C4DCF651F9}.Debug|Win32.ActiveCfg = Debug|Win32 + {B9F2B494-1DFB-43C4-A456-00C4DCF651F9}.Debug|Win32.Build.0 = Debug|Win32 + {B9F2B494-1DFB-43C4-A456-00C4DCF651F9}.Release|Win32.ActiveCfg = Release|Win32 + {B9F2B494-1DFB-43C4-A456-00C4DCF651F9}.Release|Win32.Build.0 = Release|Win32 + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection +EndGlobal diff --git a/project_2/project_2.suo b/project_2/project_2.suo new file mode 100644 index 0000000..3baeb19 Binary files /dev/null and b/project_2/project_2.suo differ diff --git a/project_2/project_2.vcxproj b/project_2/project_2.vcxproj new file mode 100644 index 0000000..4e3bc83 --- /dev/null +++ b/project_2/project_2.vcxproj @@ -0,0 +1,91 @@ + + + + + Debug + Win32 + + + Release + Win32 + + + + {B9F2B494-1DFB-43C4-A456-00C4DCF651F9} + Project1_part2 + project_2 + + + + Application + true + MultiByte + + + Application + false + true 
+ MultiByte + + + + + + + + + + + + + + + + Level3 + Disabled + + + true + kernel32.lib;user32.lib;gdi32.lib;cudart.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies) + + + $(ProjectDir)$(Platform)/$(Configuration)/%(Filename)%(Extension).obj + + + $(CudaToolkitIncludeDir) + + + + + Level3 + MaxSpeed + true + true + + + true + true + true + kernel32.lib;user32.lib;gdi32.lib;cudart.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies) + + + $(CudaToolkitIncludeDir) + $(IntDir)%(Filename)%(Extension).obj + + + + + $(ProjectDir)$(Platform)/$(Configuration)/%(Filename)%(Extension).obj + + + + + + + + + + + + + \ No newline at end of file diff --git a/project_2/project_2.vcxproj.filters b/project_2/project_2.vcxproj.filters new file mode 100644 index 0000000..09159be --- /dev/null +++ b/project_2/project_2.vcxproj.filters @@ -0,0 +1,32 @@ + + + + + {4FC737F1-C7A5-4376-A066-2A32D752A2FF} + cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx + + + {93995380-89BD-4b04-88EB-625FBE52EBFB} + h;hpp;hxx;hm;inl;inc;xsd + + + {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} + rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms + + + + + Source Files + + + + + Source Files + + + + + Header Files + + + \ No newline at end of file diff --git a/project_2/project_2.vcxproj.user b/project_2/project_2.vcxproj.user new file mode 100644 index 0000000..ace9a86 --- /dev/null +++ b/project_2/project_2.vcxproj.user @@ -0,0 +1,3 @@ + + + \ No newline at end of file diff --git a/project_2/vc100.pdb b/project_2/vc100.pdb new file mode 100644 index 0000000..d6c4dac Binary files /dev/null and b/project_2/vc100.pdb differ