This sample application performs general matrix multiplication using DPC++ on CPU or GPU, so it can be used as a target for OpenCL(TM) and Level Zero profiling and tracing tools.
DPC++ Matrix Multiplication (matrix size: 1024 x 1024, repeats 4 times)
Target device: Intel(R) Gen9
Matrix multiplication time: 0.0429941 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.0431165 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.0433001 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.0428462 sec
Results are CORRECT with accuracy: 4.90573e-06
Total execution time: 0.373728 sec
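The "accuracy" line in the output above can be read as a numerical error check on the product matrix. The following is a minimal CPU-only sketch of that idea in plain C++, not the sample's actual DPC++ source: the function names `Gemm` and `MeanRelativeError`, the constant-filled inputs, and the mean-relative-error metric are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the sample): naive row-major GEMM,
// c = a * b, for square matrices of dimension size x size.
std::vector<float> Gemm(const std::vector<float>& a,
                        const std::vector<float>& b,
                        std::size_t size) {
  std::vector<float> c(size * size, 0.0f);
  for (std::size_t i = 0; i < size; ++i) {
    for (std::size_t k = 0; k < size; ++k) {
      const float a_ik = a[i * size + k];
      for (std::size_t j = 0; j < size; ++j) {
        c[i * size + j] += a_ik * b[k * size + j];
      }
    }
  }
  return c;
}

// Hypothetical helper (not from the sample): mean relative error of all
// elements of c against a single expected value. This assumes both inputs
// were filled with constants, so every element of the product is the same.
float MeanRelativeError(const std::vector<float>& c, float expected) {
  double sum = 0.0;
  for (float x : c) {
    sum += std::fabs((x - expected) / expected);
  }
  return static_cast<float>(sum / c.size());
}
```

With inputs filled with the constants `va` and `vb`, every element of the product should equal `va * vb * size`, so a caller could report the result as correct when `MeanRelativeError(c, va * vb * size)` stays below a chosen threshold. The threshold is likewise an assumption and would need to be taken from the sample's source.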
- Linux
- Windows
- CMake (version 3.12 and above)
- Git (version 1.8 and above)
- Python (version 2.7 and above)
- Intel(R) oneAPI Base Toolkit
Run the following commands to build the sample (make sure the oneAPI DPC++ Compiler is in PATH for building):
source <inteloneapi>/setvars.sh
cd <pti>/samples/dpc_gemm
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Use this command line to run the application:
./dpc_gemm [cpu|gpu|host] [matrix_size] [repeats_count]
Use the Microsoft* Visual Studio x64 command prompt to run the following commands and build the sample (make sure the oneAPI DPC++ Compiler is in PATH for building):
<inteloneapi>\setvars.bat
cd <pti>\samples\dpc_gemm
mkdir build
cd build
cmake ..
cmake --build . --config Release
Use this command line to run the application:
cd Release
dpc_gemm.exe [cpu|gpu|host] [matrix_size] [repeats_count]