From 11321946022da4bca44edecda0233ebb55b1f8ad Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Tue, 17 Dec 2024 21:13:04 +0100 Subject: [PATCH 01/17] [HIPIFY][doc] Introduce `cuTensor` to `hipTensor` hipification support in the documentation --- docs/hipify-clang.rst | 118 +++++++++++++++++++++++++---------------- docs/supported_apis.md | 1 + 2 files changed, 72 insertions(+), 47 deletions(-) diff --git a/docs/hipify-clang.rst b/docs/hipify-clang.rst index a193a4c8..12e1b77c 100644 --- a/docs/hipify-clang.rst +++ b/docs/hipify-clang.rst @@ -535,7 +535,23 @@ LLVM >= 10.0.0 -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.6" -4. Install `cuDNN `_ belonging to the version corresponding +4. [Optional] Install `cuTensor `_: + + * To specify the path to `cuTensor `_, use the ``CUDA_TENSOR_ROOT_DIR`` option: + + **Linux**: + + .. code-block:: bash + + -DCUDA_TENSOR_ROOT_DIR=/usr/include + + **Windows**: + + .. code-block:: shell + + -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 + +5. [Optional] Install `cuDNN `_ belonging to the version corresponding to the CUDA version: * To specify the path to `cuDNN `_, use the ``CUDA_DNN_ROOT_DIR`` option: @@ -549,10 +565,10 @@ LLVM >= 10.0.0 **Windows**: .. code-block:: shell - + -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 -5. [Optional] Install `CUB 1.9.8 `_ for ``CUDA < 11.0`` only; +6. [Optional] Install `CUB 1.9.8 `_ for ``CUDA < 11.0`` only; for ``CUDA >= 11.0``, the CUB shipped with CUDA will be used for testing. * To specify the path to CUB, use the ``CUDA_CUB_ROOT_DIR`` option (only for ``CUDA < 11.0``): @@ -569,9 +585,9 @@ LLVM >= 10.0.0 -DCUDA_CUB_ROOT_DIR=D:/CUDA/CUB -6. Install `Python `_ version 3.0 or greater. +7. Install `Python `_ version 3.0 or greater. -7. Install ``lit`` and ``FileCheck``; these are distributed with LLVM. +8. Install ``lit`` and ``FileCheck``; these are distributed with LLVM. 
* Install ``lit`` into ``Python``: @@ -619,7 +635,7 @@ LLVM >= 10.0.0 Alternatively, specify the path to ``FileCheck`` in the ``CMAKE_INSTALL_PREFIX`` option. -8. To run OpenGL tests successfully on: +9. To run OpenGL tests successfully on: **Linux**: @@ -631,9 +647,9 @@ LLVM >= 10.0.0 No installation required. All the required headers are shipped with the Windows SDK. -9. Set the ``HIPIFY_CLANG_TESTS`` option to ``ON``: ``-DHIPIFY_CLANG_TESTS=ON`` +10. Set the ``HIPIFY_CLANG_TESTS`` option to ``ON``: ``-DHIPIFY_CLANG_TESTS=ON`` -10. Build and run tests. +11. Build and run tests. Linux testing ====================================================== @@ -642,8 +658,8 @@ On Linux, the following configurations are tested: * Ubuntu 14: LLVM 4.0.0 - 7.1.0, CUDA 7.0 - 9.0, cuDNN 5.0.5 - 7.6.5 * Ubuntu 16-19: LLVM 8.0.0 - 14.0.6, CUDA 7.0 - 10.2, cuDNN 5.1.10 - 8.0.5 -* Ubuntu 20-21: LLVM 9.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0 -* Ubuntu 22-23: LLVM 13.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0 +* Ubuntu 20-21: LLVM 9.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 +* Ubuntu 22-23: LLVM 13.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 Minimum build system requirements for the above configurations: @@ -664,6 +680,7 @@ Here's how to build ``hipify-clang`` with testing support on ``Ubuntu 23.10.01`` -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.5/dist \ -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6.3 \ -DCUDA_DNN_ROOT_DIR=/usr/local/cudnn-9.6.0 \ + -DCUDA_TENSOR_ROOT_DIR=/usr/local/cutensor-2.0.2.1 \ -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.5/build/bin/llvm-lit \ ../hipify @@ -684,35 +701,38 @@ The corresponding successful output is: -- Detecting CXX compile features -- Detecting CXX compile features - done -- HIPIFY config: - -- - Build hipify-clang : ON - -- - Test hipify-clang : ON - -- - Is part of HIP SDK : OFF + -- - Build hipify-clang : ON + -- - Test hipify-clang : ON + -- - Is part of HIP SDK 
: OFF + -- - Install clang headers : ON -- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.13") -- Found LLVM 19.1.5: - -- - CMake module path : /usr/llvm/19.1.5/dist/lib/cmake/llvm - -- - Clang include path : /usr/llvm/19.1.5/dist/include - -- - LLVM Include path : /usr/llvm/19.1.5/dist/include - -- - Binary path : /usr/llvm/19.1.5/dist/bin + -- - CMake module path : /usr/llvm/19.1.5/dist/lib/cmake/llvm + -- - Clang include path : /usr/llvm/19.1.5/dist/include + -- - LLVM Include path : /usr/llvm/19.1.5/dist/include + -- - Binary path : /usr/llvm/19.1.5/dist/bin -- Linker detection: GNU ld -- ---- The below configuring for hipify-clang testing only ---- -- Found Python: /usr/bin/python3.13 (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter -- Found lit: /usr/local/bin/lit -- Found FileCheck: /GIT/LLVM/trunk/dist/FileCheck -- Initial CUDA to configure: - -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 - -- - CUDA Samples path : - -- - cuDNN path : /usr/local/cudnn-9.6.0 - -- - CUB path : + -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 + -- - CUDA Samples path : + -- - cuDNN path : /usr/local/cudnn-9.6.0 + -- - cuTENSOR path : /usr/local/cuTensor/2.0.2.1 + -- - CUB path : -- Found CUDAToolkit: /usr/local/cuda-12.6.3/targets/x86_64-linux/include (found version "12.6.85") -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Found CUDA config: - -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 - -- - CUDA Samples path : OFF - -- - cuDNN path : /usr/local/cudnn-9.6.0 - -- - CUB path : /usr/local/cuda-12.6.3/include/cub - -- Configuring done (0.5s) + -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 + -- - CUDA Samples path : OFF + -- - cuDNN path : /usr/local/cudnn-9.6.0 + -- - CUB path : /usr/local/cuda-12.6.3/include/cub + -- - cuTENSOR path : /usr/local/cuTensor/2.0.2.1 + -- Configuring done (0.6s) -- Generating done (0.0s) -- Build files 
have been written to: /usr/hipify/build @@ -861,6 +881,7 @@ Building with testing support using ``Visual Studio 17 2022`` on ``Windows 11``: -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" \ -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5" \ -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 \ + -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 \ -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.5/build/Release/bin/llvm-lit.py \ ../hipify @@ -869,43 +890,46 @@ The corresponding successful output is: .. code-block:: shell -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631. - -- The C compiler identification is MSVC 19.41.34120.0 - -- The CXX compiler identification is MSVC 19.41.34120.0 + -- The C compiler identification is MSVC 19.42.34435.0 + -- The CXX compiler identification is MSVC 19.42.34435.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done - -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.41.34120/bin/Hostx64/x64/cl.exe - skipped + -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done - -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.41.34120/bin/Hostx64/x64/cl.exe - skipped + -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- HIPIFY config: - -- - Build hipify-clang : ON - -- - Test hipify-clang : ON - -- - Is part of HIP SDK : OFF + -- - Build hipify-clang : ON + -- - Test hipify-clang : ON + -- - Is part of HIP SDK : OFF + -- - Install clang headers 
: ON -- Found LLVM 19.1.5: - -- - CMake module path : D:/LLVM/19.1.5/dist/lib/cmake/llvm - -- - Clang include path : D:/LLVM/19.1.5/dist/include - -- - LLVM Include path : D:/LLVM/19.1.5/dist/include - -- - Binary path : D:/LLVM/19.1.5/dist/bin + -- - CMake module path : D:/LLVM/19.1.5/dist/lib/cmake/llvm + -- - Clang include path : D:/LLVM/19.1.5/dist/include + -- - LLVM Include path : D:/LLVM/19.1.5/dist/include + -- - Binary path : D:/LLVM/19.1.5/dist/bin -- ---- The below configuring for hipify-clang testing only ---- - -- Found Python: C:/Users/evgen/AppData/Local/Programs/Python/Python313/python.exe (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter + -- Found Python: C:/Users/TT/AppData/Local/Programs/Python/Python313/python.exe (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter -- Found lit: C:/Users/TT/AppData/Local/Programs/Python/Python313/Scripts/lit.exe -- Found FileCheck: D:/LLVM/19.1.5/dist/bin/FileCheck.exe -- Initial CUDA to configure: - -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 - -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 - -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 - -- - CUB path : + -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 + -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 + -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 + -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 + -- - CUB path : -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include (found version "12.6.85") -- Found CUDA config: - -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 - -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 - -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 - -- - CUB path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include/cub - -- Configuring done 
(2.1s) + -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 + -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 + -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 + -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 + -- - CUB path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include/cub + -- Configuring done (4.4s) -- Generating done (0.1s) -- Build files have been written to: D:/HIPIFY/build diff --git a/docs/supported_apis.md b/docs/supported_apis.md index dee56e83..34b470d5 100644 --- a/docs/supported_apis.md +++ b/docs/supported_apis.md @@ -13,6 +13,7 @@ | CURAND API | [HIP RAND API](tables/CURAND_API_supported_by_HIP.md) | [ROC RAND API](tables/CURAND_API_supported_by_ROC.md) | [HIP + ROC RAND API](tables/CURAND_API_supported_by_HIP_and_ROC.md) | | CUFFT API | [HIP FFT API](tables/CUFFT_API_supported_by_HIP.md) | | | | CUDNN API | [HIP DNN API](tables/CUDNN_API_supported_by_HIP.md) | [MIOPEN API](tables/CUDNN_API_supported_by_MIOPEN.md) | [HIP + MIOPEN API](tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md) | +| CUTENSOR API | [HIP TENSOR API](tables/CUTENSOR_API_supported_by_HIP.md) | | | | CUB API | [HIP CUB API](tables/CUB_API_supported_by_HIP.md) | | | To generate the above documentation with the information about all supported CUDA APIs in Markdown format, run `hipify-clang --md --doc-format=full` with or without specifying the output directory (`-o`), for HIP and ROC separately `--doc-roc=separate` or in the joint format (ROC & HIP) `--doc-roc=joint`. 
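As a sketch of the documentation command described in the paragraph above, assuming `hipify-clang` is already on `PATH` and using a hypothetical output directory `./hipify-docs` (only the flags quoted in that paragraph are used):

```shell
# Hypothetical output directory; it is an illustration, not part of the patch.
OUT_DIR=./hipify-docs

# Generate the Markdown tables with HIP and ROC listed separately.
hipify-clang --md --doc-format=full --doc-roc=separate -o "$OUT_DIR"

# Alternatively, generate the joint (ROC & HIP) tables.
hipify-clang --md --doc-format=full --doc-roc=joint -o "$OUT_DIR"
```

Per the paragraph above, the `-o` argument is optional and can be dropped from either command.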
From 12495f2d5c67a4129bc8538d34b0febfeda267f6 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Tue, 17 Dec 2024 22:18:00 +0100 Subject: [PATCH 02/17] [HIPIFY][Tensor][tests][fix] Guard `cudaDataType` with `#if CUDA_VERSION >= 8000` --- tests/unit_tests/synthetic/libraries/cutensor2hiptensor.cu | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tests/unit_tests/synthetic/libraries/cutensor2hiptensor.cu b/tests/unit_tests/synthetic/libraries/cutensor2hiptensor.cu index 509cc6a3..d09521f3 100644 --- a/tests/unit_tests/synthetic/libraries/cutensor2hiptensor.cu +++ b/tests/unit_tests/synthetic/libraries/cutensor2hiptensor.cu @@ -31,8 +31,10 @@ int main() { cutensorTensorDescriptor_t *descC = nullptr; cutensorTensorDescriptor_t *descD = nullptr; +#if CUDA_VERSION >= 8000 // CHECK: hipDataType dataType; cudaDataType dataType; +#endif // CHECK: hipStream_t stream_t; cudaStream_t stream_t; @@ -205,10 +207,12 @@ int main() { // CHECK: hiptensorWorksizePreference_t TENSOR_WORKSPACE_RECOMMENDED = HIPTENSOR_WORKSPACE_RECOMMENDED; cutensorWorksizePreference_t TENSOR_WORKSPACE_RECOMMENDED = CUTENSOR_WORKSPACE_RECOMMENDED; +#if CUDA_VERSION >= 8000 // CUDA: cutensorStatus_t cutensorInitTensorDescriptor(const cutensorHandle_t* handle, cutensorTensorDescriptor_t* desc, const uint32_t numModes, const int64_t extent[], const int64_t stride[], cudaDataType_t dataType, cutensorOperator_t unaryOp); // HIP: hiptensorStatus_t hiptensorInitTensorDescriptor(const hiptensorHandle_t* handle, hiptensorTensorDescriptor_t* desc, const uint32_t numModes, const int64_t lens[], const int64_t strides[], hipDataType dataType, hiptensorOperator_t unaryOp); // CHECK: status = hiptensorInitTensorDescriptor(handle_c, tensorDescriptor, numModes, extent, stride, dataType, tensorOperator_t); status = cutensorInitTensorDescriptor(handle_c, tensorDescriptor, numModes, extent, stride, dataType, tensorOperator_t); +#endif // CUDA: cutensorStatus_t cutensorPermutation(const cutensorHandle_t* handle, const 
void* alpha, const void* A, const cutensorTensorDescriptor_t* descA, const int32_t modeA[], void* B, const cutensorTensorDescriptor_t* descB, const int32_t modeB[], const cudaDataType_t typeScalar, const cudaStream_t stream); // HIP: hiptensorStatus_t hiptensorPermutation(const hiptensorHandle_t* handle, const void* alpha, const void* A, const hiptensorTensorDescriptor_t* descA, const int32_t modeA[], void* B, const hiptensorTensorDescriptor_t* descB, const int32_t modeB[], const hipDataType typeScalar, const hipStream_t stream); From b30e5a448e830a436e3d228d08c3cbec9637e4c4 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Wed, 18 Dec 2024 16:56:52 +0100 Subject: [PATCH 03/17] [HIPIFY][#1800][doc][Linux][clang][fix] Fix instructions on building `LLVM` for `HIPIFY` on `Linux` + [Reason] It turned out that the `X86` target needed to be specified for LLVM build on Linux, whereas there is no such obligatory dependence on Windows + There is no dependency on the `NVPTX` target on both OSs --- docs/hipify-clang.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/hipify-clang.rst b/docs/hipify-clang.rst index 12e1b77c..f1367032 100644 --- a/docs/hipify-clang.rst +++ b/docs/hipify-clang.rst @@ -450,7 +450,7 @@ LLVM <= 9.0.1 cmake \ -DCMAKE_INSTALL_PREFIX=../dist \ -DLLVM_SOURCE_DIR=../llvm \ - -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \ + -DLLVM_TARGETS_TO_BUILD="X86" \ -DLLVM_INCLUDE_TESTS=OFF \ -DCMAKE_BUILD_TYPE=Release \ ../llvm @@ -466,7 +466,7 @@ LLVM <= 9.0.1 -Thost=x64 \ -DCMAKE_INSTALL_PREFIX=../dist \ -DLLVM_SOURCE_DIR=../llvm \ - -DLLVM_TARGETS_TO_BUILD="NVPTX" \ + -DLLVM_TARGETS_TO_BUILD="" \ -DLLVM_INCLUDE_TESTS=OFF \ -DCMAKE_BUILD_TYPE=Release \ ../llvm @@ -492,7 +492,7 @@ LLVM >= 10.0.0 cmake \ -DCMAKE_INSTALL_PREFIX=../dist \ - -DLLVM_TARGETS_TO_BUILD="" \ + -DLLVM_TARGETS_TO_BUILD="X86" \ -DLLVM_ENABLE_PROJECTS="clang" \ -DLLVM_INCLUDE_TESTS=OFF \ -DCMAKE_BUILD_TYPE=Release \ From d0125f7c26565bbe3606398e5a1b212489f41edd Mon Sep 17 
00:00:00 2001 From: Evgeny Mankov Date: Wed, 18 Dec 2024 17:41:07 +0100 Subject: [PATCH 04/17] [HIPIFY][Tensor] Minor fixes and formatting --- bin/hipify-perl | 2 +- src/CUDA2HIP_TENSOR_API_functions.cpp | 199 ++++---- src/CUDA2HIP_TENSOR_API_types.cpp | 667 +++++++++++++------------- 3 files changed, 433 insertions(+), 435 deletions(-) diff --git a/bin/hipify-perl b/bin/hipify-perl index 06181010..36006f3f 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -7384,6 +7384,7 @@ sub simpleSubstitutions { subst("cusparseSparseToDenseAlg_t", "hipsparseSparseToDenseAlg_t", "type"); subst("cusparseStatus_t", "hipsparseStatus_t", "type"); subst("cutensorAlgo_t", "hiptensorAlgo_t", "type"); + subst("cutensorComputeType_t", "hiptensorComputeType_t", "type"); subst("cutensorContractionPlan_t", "hiptensorContractionPlan_t", "type"); subst("cutensorDataType_t", "hiptensorComputeType_t", "type"); subst("cutensorHandle_t", "hiptensorHandle_t", "type"); @@ -8694,7 +8695,6 @@ sub simpleSubstitutions { subst("cudaSuccess", "hipSuccess", "numeric_literal"); subst("cudaUserObjectNoDestructorSync", "hipUserObjectNoDestructorSync", "numeric_literal"); subst("cusolver_int_t", "int", "numeric_literal"); - subst("cutensorComputeType_t", "hiptensorComputeType_t", "numeric_literal"); subst("CUB_MAX", "CUB_MAX", "define"); subst("CUB_MIN", "CUB_MIN", "define"); subst("CUB_NAMESPACE_BEGIN", "BEGIN_HIPCUB_NAMESPACE", "define"); diff --git a/src/CUDA2HIP_TENSOR_API_functions.cpp b/src/CUDA2HIP_TENSOR_API_functions.cpp index 4bb5cc85..a6440012 100644 --- a/src/CUDA2HIP_TENSOR_API_functions.cpp +++ b/src/CUDA2HIP_TENSOR_API_functions.cpp @@ -1,5 +1,5 @@ /* -Copyright (c) 2015 - present Advanced Micro Devices, Inc. All rights reserved. +Copyright (c) 2024 - present Advanced Micro Devices, Inc. All rights reserved. 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal @@ -23,111 +23,110 @@ THE SOFTWARE. #include "CUDA2HIP.h" const std::map CUDA_TENSOR_FUNCTION_MAP { - {"cutensorCreate", {"hiptensorCreate", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorDestroy", {"hiptensorDestroy", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorHandleResizePlanCache", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorHandleWritePlanCacheToFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorHandleReadPlanCacheFromFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorWriteKernelCacheToFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorReadKernelCacheFromFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreateTensorDescriptor", {"hiptensorInitTensorDescriptor", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorInitTensorDescriptor", {"hiptensorInitTensorDescriptor", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorDestroyTensorDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreateElementwiseTrinary", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorElementwiseTrinaryExecute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreateElementwiseBinary", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorElementwiseBinaryExecute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreatePermutation", {"hiptensorPermutation", "", CONV_LIB_FUNC, API_TENSOR, 2, HIP_UNSUPPORTED}}, - {"cutensorPermutation", {"hiptensorPermutation", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorPermute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreateContraction", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorContraction", {"hiptensorContraction", "", CONV_LIB_FUNC, 
API_TENSOR, 2}}, - {"cutensorDestroyOperationDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorOperationDescriptorSetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorOperationDescriptorGetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreatePlanPreference", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorDestroyPlanPreference", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorPlanPreferenceSetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorPlanGetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorEstimateWorkspaceSize", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorCreatePlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorDestroyPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorContract", {"hiptensorContraction", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorReduction", {"hiptensorReduction", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorCreateReduction", {"hiptensorReduction", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorReduce", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorGetErrorString", {"hiptensorGetErrorString", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorGetVersion", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, - {"cutensorGetCudartVersion", {"hiptensorGetHiprtVersion", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorLoggerSetCallback", {"hiptensorLoggerSetCallback", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorLoggerSetFile", {"hiptensorLoggerSetFile", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorLoggerOpenFile", {"hiptensorLoggerOpenFile", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorLoggerSetLevel", {"hiptensorLoggerSetLevel", "", CONV_LIB_FUNC, API_TENSOR, 2}}, - {"cutensorLoggerSetMask", {"hiptensorLoggerSetMask", "", CONV_LIB_FUNC, API_TENSOR, 2}}, 
- {"cutensorLoggerForceDisable", {"hiptensorLoggerForceDisable", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorCreate", {"hiptensorCreate", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorDestroy", {"hiptensorDestroy", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorHandleResizePlanCache", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorHandleWritePlanCacheToFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorHandleReadPlanCacheFromFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorWriteKernelCacheToFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorReadKernelCacheFromFile", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreateTensorDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorInitTensorDescriptor", {"hiptensorInitTensorDescriptor", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorDestroyTensorDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreateElementwiseTrinary", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorElementwiseTrinaryExecute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreateElementwiseBinary", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorElementwiseBinaryExecute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreatePermutation", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorPermutation", {"hiptensorPermutation", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorPermute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreateContraction", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorContraction", {"hiptensorContraction", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorDestroyOperationDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorOperationDescriptorSetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + 
{"cutensorOperationDescriptorGetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreatePlanPreference", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorDestroyPlanPreference", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorPlanPreferenceSetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorPlanGetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorEstimateWorkspaceSize", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorCreatePlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorDestroyPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorContract", {"hiptensorContraction", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorReduction", {"hiptensorReduction", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorCreateReduction", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorReduce", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorGetErrorString", {"hiptensorGetErrorString", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorGetVersion", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorGetCudartVersion", {"hiptensorGetHiprtVersion", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerSetCallback", {"hiptensorLoggerSetCallback", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerSetFile", {"hiptensorLoggerSetFile", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerOpenFile", {"hiptensorLoggerOpenFile", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerSetLevel", {"hiptensorLoggerSetLevel", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerSetMask", {"hiptensorLoggerSetMask", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorLoggerForceDisable", {"hiptensorLoggerForceDisable", "", CONV_LIB_FUNC, API_TENSOR, 2}}, }; - const std::map CUDA_TENSOR_FUNCTION_VER_MAP { - {"cutensorCreate", {CUTENSOR_1700, CUDA_0, CUDA_0 }}, - {"cutensorDestroy", 
{CUTENSOR_1700, CUDA_0, CUDA_0 }}, - {"cutensorHandleResizePlanCache", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorHandleWritePlanCacheToFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorHandleReadPlanCacheFromFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorWriteKernelCacheToFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorReadKernelCacheFromFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreateTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorInitTensorDescriptor", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000}}, - {"cutensorDestroyTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreateElementwiseTrinary", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorElementwiseTrinaryExecute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreateElementwiseBinary", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorElementwiseBinaryExecute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreatePermutation", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorPermutation", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000}}, - {"cutensorPermute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreateContraction", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorContraction", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000}}, - {"cutensorDestroyOperationDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorOperationDescriptorSetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorOperationDescriptorGetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreatePlanPreference", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorDestroyPlanPreference", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorPlanPreferenceSetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorPlanGetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorEstimateWorkspaceSize", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorCreatePlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorDestroyPlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorContract", {CUTENSOR_2000, 
CUDA_0, CUDA_0 }}, - {"cutensorCreateReduction", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorReduction", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000}}, - {"cutensorReduce", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, - {"cutensorGetErrorString", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, - {"cutensorGetVersion", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, - {"cutensorGetCudartVersion", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, - {"cutensorLoggerSetCallback", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, - {"cutensorLoggerSetFile", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, - {"cutensorLoggerOpenFile", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, - {"cutensorLoggerSetLevel", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, - {"cutensorLoggerSetMask", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, - {"cutensorLoggerForceDisable", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorCreate", {CUTENSOR_1700, CUDA_0, CUDA_0 }}, + {"cutensorDestroy", {CUTENSOR_1700, CUDA_0, CUDA_0 }}, + {"cutensorHandleResizePlanCache", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorHandleWritePlanCacheToFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorHandleReadPlanCacheFromFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorWriteKernelCacheToFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorReadKernelCacheFromFile", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreateTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorInitTensorDescriptor", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"cutensorDestroyTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreateElementwiseTrinary", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorElementwiseTrinaryExecute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreateElementwiseBinary", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorElementwiseBinaryExecute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreatePermutation", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorPermutation", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"cutensorPermute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + 
{"cutensorCreateContraction", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorContraction", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"cutensorDestroyOperationDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorOperationDescriptorSetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorOperationDescriptorGetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreatePlanPreference", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorDestroyPlanPreference", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorPlanPreferenceSetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorPlanGetAttribute", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorEstimateWorkspaceSize", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreatePlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorDestroyPlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorContract", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCreateReduction", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorReduction", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"cutensorReduce", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorGetErrorString", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorGetVersion", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorGetCudartVersion", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorLoggerSetCallback", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorLoggerSetFile", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorLoggerOpenFile", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorLoggerSetLevel", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorLoggerSetMask", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorLoggerForceDisable", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_FUNCTION_VER_MAP { - {"hiptensorCreate", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorDestroy", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorInitTensorDescriptor", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorPermutation", {HIP_6010, HIP_0, HIP_0, }}, - {"hiptensorContraction", {HIP_6010, HIP_0, HIP_0, }}, - 
{"hiptensorReduction", {HIP_6030, HIP_0, HIP_0, }}, - {"hiptensorGetErrorString", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorGetHiprtVersion", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerSetCallback", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerSetFile", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerOpenFile", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerSetLevel", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerSetMask", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerForceDisable", {HIP_5070, HIP_0, HIP_0, }}, + {"hiptensorCreate", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorDestroy", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorInitTensorDescriptor", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorPermutation", {HIP_6010, HIP_0, HIP_0 }}, + {"hiptensorContraction", {HIP_6010, HIP_0, HIP_0 }}, + {"hiptensorReduction", {HIP_6030, HIP_0, HIP_0 }}, + {"hiptensorGetErrorString", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorGetHiprtVersion", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerSetCallback", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerSetFile", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerOpenFile", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerSetLevel", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerSetMask", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorLoggerForceDisable", {HIP_5070, HIP_0, HIP_0 }}, }; const std::map CUDA_TENSOR_API_SECTION_MAP { diff --git a/src/CUDA2HIP_TENSOR_API_types.cpp b/src/CUDA2HIP_TENSOR_API_types.cpp index de0f266f..59fb34d4 100644 --- a/src/CUDA2HIP_TENSOR_API_types.cpp +++ b/src/CUDA2HIP_TENSOR_API_types.cpp @@ -25,356 +25,355 @@ THE SOFTWARE. 
// Map of all functions const std::map CUDA_TENSOR_TYPE_NAME_MAP { // cuTENSOR enums - {"cutensorDataType_t", {"hiptensorComputeType_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"CUTENSOR_R_16F", {"HIPTENSOR_COMPUTE_16F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_16F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_16BF", {"HIPTENSOR_COMPUTE_16BF", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_16BF", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_32F", {"HIPTENSOR_COMPUTE_32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_32F", {"HIPTENSOR_COMPUTE_C32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_R_64F", {"HIPTENSOR_COMPUTE_64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_64F", {"HIPTENSOR_COMPUTE_C64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_R_4I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_4I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_4U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_4U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_8I", {"HIPTENSOR_COMPUTE_8I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_8I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_8U", {"HIPTENSOR_COMPUTE_8U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_8U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_16I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_16I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_16U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_16U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_32I", {"HIPTENSOR_COMPUTE_32I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_32I", {"", "", 
CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_32U", {"HIPTENSOR_COMPUTE_32U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_C_32U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_64I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_64I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_64U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_64U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorDataType_t", {"hiptensorComputeType_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_R_16F", {"HIPTENSOR_COMPUTE_16F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_16F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_16BF", {"HIPTENSOR_COMPUTE_16BF", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_16BF", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_32F", {"HIPTENSOR_COMPUTE_32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_32F", {"HIPTENSOR_COMPUTE_C32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_R_64F", {"HIPTENSOR_COMPUTE_64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_64F", {"HIPTENSOR_COMPUTE_C64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_R_4I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_4I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_4U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_4U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_8I", {"HIPTENSOR_COMPUTE_8I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_8I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_8U", {"HIPTENSOR_COMPUTE_8U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_8U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, 
UNSUPPORTED}}, + {"CUTENSOR_R_16I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_16I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_16U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_16U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_32I", {"HIPTENSOR_COMPUTE_32I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_32I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_32U", {"HIPTENSOR_COMPUTE_32U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_C_32U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_64I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_64I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_64U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_64U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorComputeType_t", {"hiptensorComputeType_t", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_16F", {"HIPTENSOR_COMPUTE_16F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_16BF", {"HIPTENSOR_COMPUTE_16BF", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_COMPUTE_32F", {"HIPTENSOR_COMPUTE_32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_64F", {"HIPTENSOR_COMPUTE_64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_8U", {"HIPTENSOR_COMPUTE_8U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_8I", {"HIPTENSOR_COMPUTE_8I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_32U", {"HIPTENSOR_COMPUTE_32U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_COMPUTE_32I", {"HIPTENSOR_COMPUTE_32I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_R_MIN_16F", {"", 
"", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_MIN_16F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_32F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_MIN_32F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_64F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_MIN_64F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_8U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_32U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_16BF", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_R_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_C_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorComputeType_t", {"hiptensorComputeType_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_16F", {"HIPTENSOR_COMPUTE_16F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_16BF", {"HIPTENSOR_COMPUTE_16BF", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_COMPUTE_32F", {"HIPTENSOR_COMPUTE_32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_64F", {"HIPTENSOR_COMPUTE_64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_8U", {"HIPTENSOR_COMPUTE_8U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_8I", {"HIPTENSOR_COMPUTE_8I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_32U", {"HIPTENSOR_COMPUTE_32U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_COMPUTE_32I", {"HIPTENSOR_COMPUTE_32I", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_R_MIN_16F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_MIN_16F", {"", "", 
CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_32F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_MIN_32F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_64F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_MIN_64F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_8U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_32U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_16BF", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_C_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorOperator_t", {"hiptensorOperator_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"CUTENSOR_OP_IDENTITY", {"HIPTENSOR_OP_IDENTITY", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_SQRT", {"HIPTENSOR_OP_SQRT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_RELU", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_CONJ", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_RCP", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SIGMOID", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_TANH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_EXP", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_LOG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ABS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_NEG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SIN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_COS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, 
UNSUPPORTED}}, - {"CUTENSOR_OP_TAN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SINH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_COSH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ASIN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ACOS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ATAN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ASINH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ACOSH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ATANH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_CEIL", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_FLOOR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_MISH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SWISH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SOFT_PLUS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_SOFT_SIGN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OP_ADD", {"HIPTENSOR_OP_ADD", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_MUL", {"HIPTENSOR_OP_MUL", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_MAX", {"HIPTENSOR_OP_MAX", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_MIN", {"HIPTENSOR_OP_MIN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_OP_UNKNOWN", {"HIPTENSOR_OP_UNKNOWN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"cutensorOperator_t", {"hiptensorOperator_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_OP_IDENTITY", {"HIPTENSOR_OP_IDENTITY", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_OP_SQRT", {"HIPTENSOR_OP_SQRT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + 
{"CUTENSOR_OP_RELU", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_CONJ", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_RCP", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_SIGMOID", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_TANH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_EXP", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_LOG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ABS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_NEG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_SIN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_COS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_TAN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_SINH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_COSH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ASIN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ACOS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ATAN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ASINH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ACOSH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ATANH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_CEIL", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_FLOOR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_MISH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_SWISH", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, 
UNSUPPORTED}}, + {"CUTENSOR_OP_SOFT_PLUS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_SOFT_SIGN", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OP_ADD", {"HIPTENSOR_OP_ADD", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_OP_MUL", {"HIPTENSOR_OP_MUL", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_OP_MAX", {"HIPTENSOR_OP_MAX", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_OP_MIN", {"HIPTENSOR_OP_MIN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_OP_UNKNOWN", {"HIPTENSOR_OP_UNKNOWN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"cutensorStatus_t", {"hiptensorStatus_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_SUCCESS", {"HIPTENSOR_STATUS_SUCCESS", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_NOT_INITIALIZED", {"HIPTENSOR_STATUS_NOT_INITIALIZED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_ALLOC_FAILED", {"HIPTENSOR_STATUS_ALLOC_FAILED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_INVALID_VALUE", {"HIPTENSOR_STATUS_INVALID_VALUE", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_ARCH_MISMATCH", {"HIPTENSOR_STATUS_ARCH_MISMATCH", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_MAPPING_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_STATUS_EXECUTION_FAILED", {"HIPTENSOR_STATUS_EXECUTION_FAILED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_INTERNAL_ERROR", {"HIPTENSOR_STATUS_INTERNAL_ERROR", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_NOT_SUPPORTED", {"HIPTENSOR_STATUS_NOT_SUPPORTED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_LICENSE_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_STATUS_CUBLAS_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_STATUS_CUDA_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, 
UNSUPPORTED}}, - {"CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {"HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_INSUFFICIENT_DRIVER", {"HIPTENSOR_STATUS_INSUFFICIENT_DRIVER", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_STATUS_IO_ERROR", {"HIPTENSOR_STATUS_IO_ERROR", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"cutensorStatus_t", {"hiptensorStatus_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_SUCCESS", {"HIPTENSOR_STATUS_SUCCESS", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_NOT_INITIALIZED", {"HIPTENSOR_STATUS_NOT_INITIALIZED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_ALLOC_FAILED", {"HIPTENSOR_STATUS_ALLOC_FAILED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_INVALID_VALUE", {"HIPTENSOR_STATUS_INVALID_VALUE", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_ARCH_MISMATCH", {"HIPTENSOR_STATUS_ARCH_MISMATCH", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_MAPPING_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_STATUS_EXECUTION_FAILED", {"HIPTENSOR_STATUS_EXECUTION_FAILED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_INTERNAL_ERROR", {"HIPTENSOR_STATUS_INTERNAL_ERROR", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_NOT_SUPPORTED", {"HIPTENSOR_STATUS_NOT_SUPPORTED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_LICENSE_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_STATUS_CUBLAS_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_STATUS_CUDA_ERROR", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {"HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_STATUS_INSUFFICIENT_DRIVER", {"HIPTENSOR_STATUS_INSUFFICIENT_DRIVER", "", CONV_NUMERIC_LITERAL, 
API_TENSOR, 1}}, + {"CUTENSOR_STATUS_IO_ERROR", {"HIPTENSOR_STATUS_IO_ERROR", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"cutensorAlgo_t", {"hiptensorAlgo_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"CUTENSOR_ALGO_DEFAULT_PATIENT", {"HIPTENSOR_ALGO_DEFAULT_PATIENT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_ALGO_GETT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_ALGO_TGETT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_ALGO_TTGT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_ALGO_DEFAULT", {"HIPTENSOR_ALGO_DEFAULT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"cutensorAlgo_t", {"hiptensorAlgo_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_ALGO_DEFAULT_PATIENT", {"HIPTENSOR_ALGO_DEFAULT_PATIENT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_ALGO_GETT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_ALGO_TGETT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_ALGO_TTGT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_ALGO_DEFAULT", {"HIPTENSOR_ALGO_DEFAULT", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"cutensorWorksizePreference_t", {"hiptensorWorksizePreference_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"CUTENSOR_WORKSPACE_MIN", {"HIPTENSOR_WORKSPACE_MIN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_WORKSPACE_DEFAULT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_WORKSPACE_RECOMMENDED", {"HIPTENSOR_WORKSPACE_RECOMMENDED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"CUTENSOR_WORKSPACE_MAX", {"HIPTENSOR_WORKSPACE_MAX", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"cutensorWorksizePreference_t", {"hiptensorWorksizePreference_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"CUTENSOR_WORKSPACE_MIN", {"HIPTENSOR_WORKSPACE_MIN", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_WORKSPACE_DEFAULT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, 
UNSUPPORTED}}, + {"CUTENSOR_WORKSPACE_RECOMMENDED", {"HIPTENSOR_WORKSPACE_RECOMMENDED", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, + {"CUTENSOR_WORKSPACE_MAX", {"HIPTENSOR_WORKSPACE_MAX", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, - {"cutensorOperationDescriptorAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_TAG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_SCALAR_TYPE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_RIGHT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_VALUE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorOperationDescriptorAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_TAG", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_SCALAR_TYPE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_RIGHT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_VALUE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorPlanPreferenceAttribute_t", {"", "", CONV_TYPE, 
API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_AUTOTUNE_MODE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_CACHE_MODE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_INCREMENTAL_COUNT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_ALGO", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_KERNEL_RANK", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_PREFERENCE_JIT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorPlanPreferenceAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_AUTOTUNE_MODE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_CACHE_MODE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_INCREMENTAL_COUNT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_ALGO", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_KERNEL_RANK", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_PREFERENCE_JIT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorAutotuneMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_AUTOTUNE_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_AUTOTUNE_MODE_INCREMENTAL", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorAutotuneMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_AUTOTUNE_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_AUTOTUNE_MODE_INCREMENTAL", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorJitMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - 
{"CUTENSOR_JIT_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_JIT_MODE_DEFAULT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorJitMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_JIT_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_JIT_MODE_DEFAULT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorCacheMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_CACHE_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_CACHE_MODE_PEDANTIC", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorCacheMode_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_CACHE_MODE_NONE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_CACHE_MODE_PEDANTIC", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorPlanAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"CUTENSOR_PLAN_REQUIRED_WORKSPACE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, - - {"cutensorHandle_t", {"hiptensorHandle_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"cutensorHandle", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorTensorDescriptor_t", {"hiptensorTensorDescriptor_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"cutensorTensorDescriptor", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorContractionPlan_t", {"hiptensorContractionPlan_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"cutensorPlan_t", {"hiptensorContractionPlan_t", "", CONV_TYPE, API_TENSOR, 1}}, - {"cutensorPlan", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, - {"cutensorLoggerCallback_t", {"hiptensorLoggerCallback_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorPlanAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_PLAN_REQUIRED_WORKSPACE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + 
{"cutensorHandle_t", {"hiptensorHandle_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorHandle", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorTensorDescriptor_t", {"hiptensorTensorDescriptor_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorTensorDescriptor", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorContractionPlan_t", {"hiptensorContractionPlan_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorPlan_t", {"hiptensorContractionPlan_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorPlan", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorLoggerCallback_t", {"hiptensorLoggerCallback_t", "", CONV_TYPE, API_TENSOR, 1}}, }; const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { - {"cutensorDataType_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_16F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_16F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_16BF", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_16BF", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_32F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_32F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_64F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_64F", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_4I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_4I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_4U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_4U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_8I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_8I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_8U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_8U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_16I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_16I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_16U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_16U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_32I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_32I", 
{CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_32U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_32U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_64I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_64I", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_R_64U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_C_64U", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_COMPUTE_16F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_16BF", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_TF32", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_32F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_64F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_8U", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_8I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_32U", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_COMPUTE_32I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_C_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_32F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_C_MIN_32F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_64F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_C_MIN_64F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_8U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_32U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_16BF", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_R_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"CUTENSOR_C_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000, }}, - {"cutensorOperator_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_IDENTITY", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SQRT", {CUTENSOR_1010, 
CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_RELU", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_CONJ", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_RCP", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SIGMOID", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_TANH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_EXP", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_LOG", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ABS", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_NEG", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SIN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_COS", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_TAN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SINH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_COSH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ASIN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ACOS", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ATAN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ASINH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ACOSH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ATANH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_CEIL", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_FLOOR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_MISH", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SWISH", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SOFT_PLUS", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_SOFT_SIGN", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_ADD", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_MUL", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_MAX", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_MIN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OP_UNKNOWN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"cutensorStatus_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_SUCCESS", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - 
{"CUTENSOR_STATUS_NOT_INITIALIZED", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_ALLOC_FAILED", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_INVALID_VALUE", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_ARCH_MISMATCH", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_MAPPING_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_EXECUTION_FAILED", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_INTERNAL_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_NOT_SUPPORTED", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_LICENSE_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_CUBLAS_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_CUDA_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_INSUFFICIENT_DRIVER", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_STATUS_IO_ERROR", {CUTENSOR_1200, CUDA_0, CUDA_0, }}, - {"cutensorAlgo_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_ALGO_DEFAULT_PATIENT", {CUTENSOR_1400, CUDA_0, CUDA_0, }}, - {"CUTENSOR_ALGO_GETT", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_ALGO_TGETT", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_ALGO_TTGT", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_ALGO_DEFAULT", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"cutensorWorksizePreference_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_WORKSPACE_MIN", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"CUTENSOR_WORKSPACE_DEFAULT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_WORKSPACE_RECOMMENDED", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_WORKSPACE_MAX", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"cutensorOperationDescriptorAttribute_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_TAG", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_SCALAR_TYPE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - 
{"CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_RIGHT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_VALUE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorPlanPreferenceAttribute_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_AUTOTUNE_MODE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_CACHE_MODE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_INCREMENTAL_COUNT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_ALGO", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_KERNEL_RANK", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_PREFERENCE_JIT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorAutotuneMode_t", {CUTENSOR_1200, CUDA_0, CUDA_0, }}, - {"CUTENSOR_AUTOTUNE_MODE_NONE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_AUTOTUNE_MODE_INCREMENTAL", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_AUTOTUNE_NONE", {CUTENSOR_1200, CUDA_0, CUTENSOR_2000, }}, - {"CUTENSOR_AUTOTUNE_INCREMENTAL", {CUTENSOR_1200, CUDA_0, CUTENSOR_2000, }}, - {"cutensorJitMode_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_JIT_MODE_NONE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_JIT_MODE_DEFAULT", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorCacheMode_t", {CUTENSOR_1200, CUDA_0, CUDA_0, }}, - {"CUTENSOR_CACHE_MODE_NONE", {CUTENSOR_1200, CUDA_0, CUDA_0, }}, - {"CUTENSOR_CACHE_MODE_PEDANTIC", {CUTENSOR_1200, CUDA_0, CUDA_0, }}, - {"cutensorPlanAttribute_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"CUTENSOR_PLAN_REQUIRED_WORKSPACE", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorHandle_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"cutensorHandle", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - 
{"cutensorTensorDescriptor_t", {CUTENSOR_1010, CUDA_0, CUDA_0, }}, - {"cutensorTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorContractionPlan_t", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000, }}, - {"cutensorPlan_t", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorPlan", {CUTENSOR_2000, CUDA_0, CUDA_0, }}, - {"cutensorLoggerCallback_t", {CUTENSOR_1320, CUDA_0, CUDA_0, }}, + {"cutensorDataType_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_16F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_16F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_16BF", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_16BF", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_32F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_32F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_64F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_64F", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_4I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_4I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_4U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_4U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_8I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_8I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_8U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_8U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_16I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_16I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_16U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_16U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_32I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_32I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_32U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_32U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_64I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_64I", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_R_64U", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_C_64U", {CUTENSOR_2000, CUDA_0, 
CUDA_0 }}, + {"CUTENSOR_COMPUTE_16F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_16BF", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_TF32", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_32F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_64F", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_8U", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_8I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_32U", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_32I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_C_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_32F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_C_MIN_32F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_64F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_C_MIN_64F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_8U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_32U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_16BF", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_C_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"cutensorOperator_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_IDENTITY", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SQRT", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_RELU", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_CONJ", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_RCP", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SIGMOID", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_TANH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_EXP", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_LOG", 
{CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ABS", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_NEG", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SIN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_COS", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_TAN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SINH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_COSH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ASIN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ACOS", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ATAN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ASINH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ACOSH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ATANH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_CEIL", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_FLOOR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_MISH", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SWISH", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SOFT_PLUS", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_SOFT_SIGN", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_ADD", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_MUL", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_MAX", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_MIN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OP_UNKNOWN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorStatus_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_SUCCESS", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_NOT_INITIALIZED", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_ALLOC_FAILED", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_INVALID_VALUE", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_ARCH_MISMATCH", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_MAPPING_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_EXECUTION_FAILED", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + 
{"CUTENSOR_STATUS_INTERNAL_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_NOT_SUPPORTED", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_LICENSE_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_CUBLAS_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_CUDA_ERROR", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_INSUFFICIENT_DRIVER", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_STATUS_IO_ERROR", {CUTENSOR_1200, CUDA_0, CUDA_0 }}, + {"cutensorAlgo_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_ALGO_DEFAULT_PATIENT", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"CUTENSOR_ALGO_GETT", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_ALGO_TGETT", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_ALGO_TTGT", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_ALGO_DEFAULT", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorWorksizePreference_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_WORKSPACE_MIN", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"CUTENSOR_WORKSPACE_DEFAULT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_WORKSPACE_RECOMMENDED", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_WORKSPACE_MAX", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorOperationDescriptorAttribute_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_TAG", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_SCALAR_TYPE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_RIGHT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_OPERATION_DESCRIPTOR_PADDING_VALUE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorPlanPreferenceAttribute_t", {CUTENSOR_2000, 
CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_AUTOTUNE_MODE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_CACHE_MODE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_INCREMENTAL_COUNT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_ALGO", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_KERNEL_RANK", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_PREFERENCE_JIT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorAutotuneMode_t", {CUTENSOR_1200, CUDA_0, CUDA_0 }}, + {"CUTENSOR_AUTOTUNE_MODE_NONE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_AUTOTUNE_MODE_INCREMENTAL", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_AUTOTUNE_NONE", {CUTENSOR_1200, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_AUTOTUNE_INCREMENTAL", {CUTENSOR_1200, CUDA_0, CUTENSOR_2000 }}, + {"cutensorJitMode_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_JIT_MODE_NONE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_JIT_MODE_DEFAULT", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorCacheMode_t", {CUTENSOR_1200, CUDA_0, CUDA_0 }}, + {"CUTENSOR_CACHE_MODE_NONE", {CUTENSOR_1200, CUDA_0, CUDA_0 }}, + {"CUTENSOR_CACHE_MODE_PEDANTIC", {CUTENSOR_1200, CUDA_0, CUDA_0 }}, + {"cutensorPlanAttribute_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"CUTENSOR_PLAN_REQUIRED_WORKSPACE", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorHandle_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorHandle", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorTensorDescriptor_t", {CUTENSOR_1010, CUDA_0, CUDA_0 }}, + {"cutensorTensorDescriptor", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorContractionPlan_t", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"cutensorPlan_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorPlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, + {"cutensorLoggerCallback_t", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_TYPE_NAME_VER_MAP { - {"hiptensorComputeType_t", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_16F", 
{HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_16BF", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_32F", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_C32F", {HIP_6010, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_64F", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_C64F", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_8I", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_8U", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_32I", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_COMPUTE_32U", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorOperator_t", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_IDENTITY", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_SQRT", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_ADD", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_MUL", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_MAX", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_MIN", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPTENSOR_OP_UNKNOWN", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorStatus_t", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_SUCCESS", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_NOT_INITIALIZED", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_ALLOC_FAILED", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_INVALID_VALUE", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_ARCH_MISMATCH", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_EXECUTION_FAILED", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_INTERNAL_ERROR", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_NOT_SUPPORTED", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_INSUFFICIENT_DRIVER", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_STATUS_IO_ERROR", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorAlgo_t", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_ALGO_DEFAULT", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_ALGO_DEFAULT_PATIENT", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorWorksizePreference_t", {HIP_5070, HIP_0, HIP_0, }}, 
- {"HIPTENSOR_WORKSPACE_MIN", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_WORKSPACE_RECOMMENDED", {HIP_5070, HIP_0, HIP_0, }}, - {"HIPTENSOR_WORKSPACE_MAX", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorHandle_t", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorTensorDescriptor_t", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorContractionPlan_t", {HIP_5070, HIP_0, HIP_0, }}, - {"hiptensorLoggerCallback_t", {HIP_5070, HIP_0, HIP_0, }}, + {"hiptensorComputeType_t", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_16F", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_16BF", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_32F", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_C32F", {HIP_6010, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_64F", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_C64F", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_8I", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_8U", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_32I", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_COMPUTE_32U", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorOperator_t", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_IDENTITY", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_SQRT", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_ADD", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_MUL", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_MAX", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_MIN", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPTENSOR_OP_UNKNOWN", {HIP_5070, HIP_0, HIP_0 }}, + {"hiptensorStatus_t", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_SUCCESS", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_NOT_INITIALIZED", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_ALLOC_FAILED", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_INVALID_VALUE", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_ARCH_MISMATCH", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_EXECUTION_FAILED", {HIP_5070, HIP_0, HIP_0 }}, + {"HIPTENSOR_STATUS_INTERNAL_ERROR", {HIP_5070, HIP_0, HIP_0 }}, + 
{"HIPTENSOR_STATUS_NOT_SUPPORTED", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_STATUS_INSUFFICIENT_DRIVER", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_STATUS_IO_ERROR", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorAlgo_t", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_ALGO_DEFAULT", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_ALGO_DEFAULT_PATIENT", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorWorksizePreference_t", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_WORKSPACE_MIN", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_WORKSPACE_RECOMMENDED", {HIP_5070, HIP_0, HIP_0 }},
+  {"HIPTENSOR_WORKSPACE_MAX", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorHandle_t", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorTensorDescriptor_t", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorContractionPlan_t", {HIP_5070, HIP_0, HIP_0 }},
+  {"hiptensorLoggerCallback_t", {HIP_5070, HIP_0, HIP_0 }},
 };

From ec11cdd37cb7f9f0a1f06ab221ef59c4f71a891d Mon Sep 17 00:00:00 2001
From: Evgeny Mankov
Date: Thu, 19 Dec 2024 19:08:57 +0100
Subject: [PATCH 05/17] [HIPIFY][RT][6.4.0] Sync with `HIP LRT 6.4.0`

+ Updated synthetic tests, the regenerated `hipify-perl`, and `Driver` and `Runtime` `CUDA2HIP` docs accordingly
---
 bin/hipify-perl                               | 26 +++++++++++++++++++
 ...A_Driver_API_functions_supported_by_HIP.md | 20 +++++++-------
 ..._Runtime_API_functions_supported_by_HIP.md |  6 ++---
 src/CUDA2HIP_Driver_API_functions.cpp         |  3 ++-
 src/CUDA2HIP_Driver_API_types.cpp             | 26 ++++++++++++-------
 src/CUDA2HIP_Runtime_API_functions.cpp        |  2 +-
 src/CUDA2HIP_Runtime_API_types.cpp            |  6 +++--
 tests/unit_tests/synthetic/driver_enums.cu    | 22 ++++++++++++++++
 .../unit_tests/synthetic/driver_functions.cu  |  8 ++++++
 tests/unit_tests/synthetic/runtime_enums.cu   |  4 +++
 .../unit_tests/synthetic/runtime_functions.cu |  5 ++++
 11 files changed, 102 insertions(+), 26 deletions(-)

diff --git a/bin/hipify-perl b/bin/hipify-perl
index 36006f3f..fb11cec5 100755
--- a/bin/hipify-perl
+++ b/bin/hipify-perl
@@ -1398,15 +1398,28 @@ my %removed_funcs = (
 );
 
 my %experimental_funcs = (
+    "cudaEventRecordWithFlags" => "6.4.0",
+    "cudaErrorInvalidTexture" => "6.4.0",
+    "cudaErrorInvalidChannelDescriptor" => "6.4.0",
     "cuStreamBatchMemOp_v2" => "6.4.0",
     "cuStreamBatchMemOp" => "6.4.0",
     "cuGraphExecBatchMemOpNodeSetParams" => "6.4.0",
     "cuGraphBatchMemOpNodeSetParams" => "6.4.0",
     "cuGraphBatchMemOpNodeGetParams" => "6.4.0",
     "cuGraphAddBatchMemOpNode" => "6.4.0",
+    "cuEventRecordWithFlags" => "6.4.0",
+    "CUstreamBatchMemOpType_enum" => "6.4.0",
+    "CUstreamBatchMemOpType" => "6.4.0",
     "CUstreamBatchMemOpParams_v1" => "6.4.0",
     "CUstreamBatchMemOpParams_union" => "6.4.0",
     "CUstreamBatchMemOpParams" => "6.4.0",
+    "CU_STREAM_MEM_OP_WRITE_VALUE_64" => "6.4.0",
+    "CU_STREAM_MEM_OP_WRITE_VALUE_32" => "6.4.0",
+    "CU_STREAM_MEM_OP_WAIT_VALUE_64" => "6.4.0",
+    "CU_STREAM_MEM_OP_WAIT_VALUE_32" => "6.4.0",
+    "CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES" => "6.4.0",
+    "CU_STREAM_MEM_OP_BARRIER" => "6.4.0",
+    "CU_GRAPH_NODE_TYPE_BATCH_MEM_OP" => "6.4.0",
     "CUDA_BATCH_MEM_OP_NODE_PARAMS_v2_st" => "6.4.0",
     "CUDA_BATCH_MEM_OP_NODE_PARAMS_v2" => "6.4.0",
     "CUDA_BATCH_MEM_OP_NODE_PARAMS_v1_st" => "6.4.0",
@@ -1550,6 +1563,8 @@ sub subst {
 }
 
 sub experimentalSubstitutions {
+    subst("cuEventRecordWithFlags", "hipEventRecordWithFlags", "event");
+    subst("cudaEventRecordWithFlags", "hipEventRecordWithFlags", "event");
     subst("cuStreamBatchMemOp", "hipStreamBatchMemOp", "stream_memory");
     subst("cuStreamBatchMemOp_v2", "hipStreamBatchMemOp", "stream_memory");
     subst("cuGraphAddBatchMemOpNode", "hipGraphAddBatchMemOpNode", "graph");
@@ -1565,6 +1580,17 @@ sub experimentalSubstitutions {
     subst("CUstreamBatchMemOpParams", "hipStreamBatchMemOpParams", "type");
     subst("CUstreamBatchMemOpParams_union", "hipStreamBatchMemOpParams_union", "type");
     subst("CUstreamBatchMemOpParams_v1", "hipStreamBatchMemOpParams", "type");
+    subst("CUstreamBatchMemOpType", "hipStreamBatchMemOpType", "type");
+
subst("CUstreamBatchMemOpType_enum", "hipStreamBatchMemOpType", "type"); + subst("CU_GRAPH_NODE_TYPE_BATCH_MEM_OP", "hipGraphNodeTypeBatchMemOp", "numeric_literal"); + subst("CU_STREAM_MEM_OP_BARRIER", "hipStreamMemOpBarrier", "numeric_literal"); + subst("CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES", "hipStreamMemOpFlushRemoteWrites", "numeric_literal"); + subst("CU_STREAM_MEM_OP_WAIT_VALUE_32", "hipStreamMemOpWaitValue32", "numeric_literal"); + subst("CU_STREAM_MEM_OP_WAIT_VALUE_64", "hipStreamMemOpWaitValue64", "numeric_literal"); + subst("CU_STREAM_MEM_OP_WRITE_VALUE_32", "hipStreamMemOpWriteValue32", "numeric_literal"); + subst("CU_STREAM_MEM_OP_WRITE_VALUE_64", "hipStreamMemOpWriteValue64", "numeric_literal"); + subst("cudaErrorInvalidChannelDescriptor", "hipErrorInvalidChannelDescriptor", "numeric_literal"); + subst("cudaErrorInvalidTexture", "hipErrorInvalidTexture", "numeric_literal"); } sub rocSubstitutions { diff --git a/docs/tables/CUDA_Driver_API_functions_supported_by_HIP.md b/docs/tables/CUDA_Driver_API_functions_supported_by_HIP.md index 2b948d6c..6a58880c 100644 --- a/docs/tables/CUDA_Driver_API_functions_supported_by_HIP.md +++ b/docs/tables/CUDA_Driver_API_functions_supported_by_HIP.md @@ -725,7 +725,7 @@ |`CU_GRAPH_MEM_ATTR_RESERVED_MEM_HIGH`|11.4| | | |`hipGraphMemAttrReservedMemHigh`|5.3.0| | | | | |`CU_GRAPH_MEM_ATTR_USED_MEM_CURRENT`|11.4| | | |`hipGraphMemAttrUsedMemCurrent`|5.3.0| | | | | |`CU_GRAPH_MEM_ATTR_USED_MEM_HIGH`|11.4| | | |`hipGraphMemAttrUsedMemHigh`|5.3.0| | | | | -|`CU_GRAPH_NODE_TYPE_BATCH_MEM_OP`|11.7| | | | | | | | | | +|`CU_GRAPH_NODE_TYPE_BATCH_MEM_OP`|11.7| | | |`hipGraphNodeTypeBatchMemOp`|6.4.0| | | |6.4.0| |`CU_GRAPH_NODE_TYPE_CONDITIONAL`|12.3| | | | | | | | | | |`CU_GRAPH_NODE_TYPE_COUNT`|10.0| | |11.0|`hipGraphNodeTypeCount`|4.3.0| | | | | |`CU_GRAPH_NODE_TYPE_EMPTY`|10.0| | | |`hipGraphNodeTypeEmpty`|4.3.0| | | | | @@ -991,12 +991,12 @@ |`CU_STREAM_LEGACY`| | | | |`hipStreamLegacy`|6.2.0| | | | | 
|`CU_STREAM_MEMORY_BARRIER_TYPE_GPU`|11.7| | | | | | | | | | |`CU_STREAM_MEMORY_BARRIER_TYPE_SYS`|11.7| | | | | | | | | | -|`CU_STREAM_MEM_OP_BARRIER`|11.7| | | | | | | | | | -|`CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES`|8.0| | | | | | | | | | -|`CU_STREAM_MEM_OP_WAIT_VALUE_32`|8.0| | | | | | | | | | -|`CU_STREAM_MEM_OP_WAIT_VALUE_64`|9.0| | | | | | | | | | -|`CU_STREAM_MEM_OP_WRITE_VALUE_32`|8.0| | | | | | | | | | -|`CU_STREAM_MEM_OP_WRITE_VALUE_64`|9.0| | | | | | | | | | +|`CU_STREAM_MEM_OP_BARRIER`|11.7| | | |`hipStreamMemOpBarrier`|6.4.0| | | |6.4.0| +|`CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES`|8.0| | | |`hipStreamMemOpFlushRemoteWrites`|6.4.0| | | |6.4.0| +|`CU_STREAM_MEM_OP_WAIT_VALUE_32`|8.0| | | |`hipStreamMemOpWaitValue32`|6.4.0| | | |6.4.0| +|`CU_STREAM_MEM_OP_WAIT_VALUE_64`|9.0| | | |`hipStreamMemOpWaitValue64`|6.4.0| | | |6.4.0| +|`CU_STREAM_MEM_OP_WRITE_VALUE_32`|8.0| | | |`hipStreamMemOpWriteValue32`|6.4.0| | | |6.4.0| +|`CU_STREAM_MEM_OP_WRITE_VALUE_64`|9.0| | | |`hipStreamMemOpWriteValue64`|6.4.0| | | |6.4.0| |`CU_STREAM_NON_BLOCKING`| | | | |`hipStreamNonBlocking`|1.6.0| | | | | |`CU_STREAM_PER_THREAD`| | | | |`hipStreamPerThread`|4.5.0| | | | | |`CU_STREAM_SET_CAPTURE_DEPENDENCIES`|11.3| | | |`hipStreamSetCaptureDependencies`|5.0.0| | | | | @@ -1378,8 +1378,8 @@ |`CUstreamBatchMemOpParams`|8.0| | | |`hipStreamBatchMemOpParams`|6.4.0| | | |6.4.0| |`CUstreamBatchMemOpParams_union`|8.0| | | |`hipStreamBatchMemOpParams_union`|6.4.0| | | |6.4.0| |`CUstreamBatchMemOpParams_v1`|11.3| | | |`hipStreamBatchMemOpParams`|6.4.0| | | |6.4.0| -|`CUstreamBatchMemOpType`|8.0| | | | | | | | | | -|`CUstreamBatchMemOpType_enum`|8.0| | | | | | | | | | +|`CUstreamBatchMemOpType`|8.0| | | |`hipStreamBatchMemOpType`|6.4.0| | | |6.4.0| +|`CUstreamBatchMemOpType_enum`|8.0| | | |`hipStreamBatchMemOpType`|6.4.0| | | |6.4.0| |`CUstreamCallback`| | | | |`hipStreamCallback_t`|1.6.0| | | | | |`CUstreamCaptureMode`|10.1| | | |`hipStreamCaptureMode`|4.3.0| | | | | 
|`CUstreamCaptureMode_enum`|10.1| | | |`hipStreamCaptureMode`|4.3.0| | | | | @@ -1817,7 +1817,7 @@ |`cuEventElapsedTime`| | | | |`hipEventElapsedTime`|1.6.0| | | | | |`cuEventQuery`| | | | |`hipEventQuery`|1.6.0| | | | | |`cuEventRecord`| | | | |`hipEventRecord`|1.6.0| | | | | -|`cuEventRecordWithFlags`|11.1| | | | | | | | | | +|`cuEventRecordWithFlags`|11.1| | | |`hipEventRecordWithFlags`|6.4.0| | | |6.4.0| |`cuEventSynchronize`| | | | |`hipEventSynchronize`|1.6.0| | | | | ## **20. External Resource Interoperability** diff --git a/docs/tables/CUDA_Runtime_API_functions_supported_by_HIP.md b/docs/tables/CUDA_Runtime_API_functions_supported_by_HIP.md index 6eba24f0..4ccadf3f 100644 --- a/docs/tables/CUDA_Runtime_API_functions_supported_by_HIP.md +++ b/docs/tables/CUDA_Runtime_API_functions_supported_by_HIP.md @@ -103,7 +103,7 @@ |`cudaEventElapsedTime`| | | | |`hipEventElapsedTime`|1.6.0| | | | | |`cudaEventQuery`| | | | |`hipEventQuery`|1.6.0| | | | | |`cudaEventRecord`| | | | |`hipEventRecord`|1.6.0| | | | | -|`cudaEventRecordWithFlags`|11.1| | | | | | | | | | +|`cudaEventRecordWithFlags`|11.1| | | |`hipEventRecordWithFlags`|6.4.0| | | |6.4.0| |`cudaEventSynchronize`| | | | |`hipEventSynchronize`|1.6.0| | | | | ## **7. 
External Resource Interoperability** @@ -1045,7 +1045,7 @@ |`cudaErrorInitializationError`| | | | |`hipErrorNotInitialized`|1.6.0| | | | | |`cudaErrorInsufficientDriver`| | | | |`hipErrorInsufficientDriver`|1.7.0| | | | | |`cudaErrorInvalidAddressSpace`| | | | | | | | | | | -|`cudaErrorInvalidChannelDescriptor`| | | | | | | | | | | +|`cudaErrorInvalidChannelDescriptor`| | | | |`hipErrorInvalidChannelDescriptor`|6.4.0| | | |6.4.0| |`cudaErrorInvalidClusterSize`|11.8| | | | | | | | | | |`cudaErrorInvalidConfiguration`| | | | |`hipErrorInvalidConfiguration`|1.6.0| | | | | |`cudaErrorInvalidDevice`| | | | |`hipErrorInvalidDevice`|1.6.0| | | | | @@ -1066,7 +1066,7 @@ |`cudaErrorInvalidSource`|10.1| | | |`hipErrorInvalidSource`|1.6.0| | | | | |`cudaErrorInvalidSurface`| | | | | | | | | | | |`cudaErrorInvalidSymbol`| | | | |`hipErrorInvalidSymbol`|1.6.0| | | | | -|`cudaErrorInvalidTexture`| | | | | | | | | | | +|`cudaErrorInvalidTexture`| | | | |`hipErrorInvalidTexture`|6.4.0| | | |6.4.0| |`cudaErrorInvalidTextureBinding`| | | | | | | | | | | |`cudaErrorInvalidValue`| | | | |`hipErrorInvalidValue`|1.6.0| | | | | |`cudaErrorJitCompilationDisabled`|11.2| | | | | | | | | | diff --git a/src/CUDA2HIP_Driver_API_functions.cpp b/src/CUDA2HIP_Driver_API_functions.cpp index 1d4b9819..4f09c001 100644 --- a/src/CUDA2HIP_Driver_API_functions.cpp +++ b/src/CUDA2HIP_Driver_API_functions.cpp @@ -534,7 +534,7 @@ const std::map CUDA_DRIVER_FUNCTION_MAP { // cudaEventSynchronize {"cuEventSynchronize", {"hipEventSynchronize", "", CONV_EVENT, API_DRIVER, SEC::EVENT}}, // cudaEventRecordWithFlags - {"cuEventRecordWithFlags", {"hipEventRecordWithFlags", "", CONV_EVENT, API_DRIVER, SEC::EVENT, HIP_UNSUPPORTED}}, + {"cuEventRecordWithFlags", {"hipEventRecordWithFlags", "", CONV_EVENT, API_DRIVER, SEC::EVENT, HIP_EXPERIMENTAL}}, // 20. 
External Resource Interoperability // cudaDestroyExternalMemory @@ -1670,6 +1670,7 @@ const std::map HIP_DRIVER_FUNCTION_VER_MAP { {"hipGraphBatchMemOpNodeGetParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipGraphBatchMemOpNodeSetParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipGraphExecBatchMemOpNodeSetParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipEventRecordWithFlags", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, }; const std::map CUDA_DRIVER_FUNCTION_CHANGED_VER_MAP { diff --git a/src/CUDA2HIP_Driver_API_types.cpp b/src/CUDA2HIP_Driver_API_types.cpp index 1d77e091..4fb6d8e3 100644 --- a/src/CUDA2HIP_Driver_API_types.cpp +++ b/src/CUDA2HIP_Driver_API_types.cpp @@ -1326,7 +1326,7 @@ const std::map CUDA_DRIVER_TYPE_NAME_MAP { // cudaGraphNodeTypeMemFree {"CU_GRAPH_NODE_TYPE_MEM_FREE", {"hipGraphNodeTypeMemFree", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES}}, // 11 // - {"CU_GRAPH_NODE_TYPE_BATCH_MEM_OP", {"hipGraphNodeTypeBatchMemOp", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 12 + {"CU_GRAPH_NODE_TYPE_BATCH_MEM_OP", {"hipGraphNodeTypeBatchMemOp", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 12 // cudaGraphNodeTypeConditional {"CU_GRAPH_NODE_TYPE_CONDITIONAL", {"hipGraphNodeTypeConditional", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 13 // cudaGraphNodeTypeCount @@ -1914,15 +1914,15 @@ const std::map CUDA_DRIVER_TYPE_NAME_MAP { {"CU_STREAM_NON_BLOCKING", {"hipStreamNonBlocking", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES}}, // 0x1 // no analogue - {"CUstreamBatchMemOpType", {"hipStreamBatchMemOpType", "", CONV_TYPE, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, - {"CUstreamBatchMemOpType_enum", {"hipStreamBatchMemOpType", "", CONV_TYPE, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, + {"CUstreamBatchMemOpType", {"hipStreamBatchMemOpType", "", CONV_TYPE, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, + 
{"CUstreamBatchMemOpType_enum", {"hipStreamBatchMemOpType", "", CONV_TYPE, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // CUstreamBatchMemOpType enum values - {"CU_STREAM_MEM_OP_WAIT_VALUE_32", {"hipStreamBatchMemOpWaitValue32", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 1 - {"CU_STREAM_MEM_OP_WRITE_VALUE_32", {"hipStreamBatchMemOpWriteValue32", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 2 - {"CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES", {"hipStreamBatchMemOpFlushRemoteWrites", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 3 - {"CU_STREAM_MEM_OP_WAIT_VALUE_64", {"hipStreamBatchMemOpWaitValue64", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 4 - {"CU_STREAM_MEM_OP_WRITE_VALUE_64", {"hipStreamBatchMemOpWriteValue64", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 5 - {"CU_STREAM_MEM_OP_BARRIER", {"hipStreamBatchMemOpBarrier", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 6 + {"CU_STREAM_MEM_OP_WAIT_VALUE_32", {"hipStreamMemOpWaitValue32", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 1 + {"CU_STREAM_MEM_OP_WRITE_VALUE_32", {"hipStreamMemOpWriteValue32", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 2 + {"CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES", {"hipStreamMemOpFlushRemoteWrites", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 3 + {"CU_STREAM_MEM_OP_WAIT_VALUE_64", {"hipStreamMemOpWaitValue64", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 4 + {"CU_STREAM_MEM_OP_WRITE_VALUE_64", {"hipStreamMemOpWriteValue64", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 5 + {"CU_STREAM_MEM_OP_BARRIER", {"hipStreamMemOpBarrier", "", CONV_NUMERIC_LITERAL, API_DRIVER, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 6 // cudaStreamCaptureStatus 
{"CUstreamCaptureStatus", {"hipStreamCaptureStatus", "", CONV_TYPE, API_DRIVER, SEC::DATA_TYPES}}, @@ -4307,4 +4307,12 @@ const std::map HIP_DRIVER_TYPE_NAME_VER_MAP { {"hipStreamBatchMemOpParams_union", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipStreamBatchMemOpParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipBatchMemOpNodeParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamBatchMemOpType", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpWaitValue32", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpWriteValue32", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpFlushRemoteWrites", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpWaitValue64", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpWriteValue64", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipStreamMemOpBarrier", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipGraphNodeTypeBatchMemOp", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, }; diff --git a/src/CUDA2HIP_Runtime_API_functions.cpp b/src/CUDA2HIP_Runtime_API_functions.cpp index 256b5ab5..6550e309 100644 --- a/src/CUDA2HIP_Runtime_API_functions.cpp +++ b/src/CUDA2HIP_Runtime_API_functions.cpp @@ -194,7 +194,7 @@ const std::map CUDA_RUNTIME_FUNCTION_MAP { // cuEventSynchronize {"cudaEventSynchronize", {"hipEventSynchronize", "", CONV_EVENT, API_RUNTIME, SEC::EVENT}}, // cuEventRecordWithFlags - {"cudaEventRecordWithFlags", {"hipEventRecordWithFlags", "", CONV_EVENT, API_RUNTIME, SEC::EVENT, HIP_UNSUPPORTED}}, + {"cudaEventRecordWithFlags", {"hipEventRecordWithFlags", "", CONV_EVENT, API_RUNTIME, SEC::EVENT, HIP_EXPERIMENTAL}}, // 7. 
External Resource Interoperability // cuDestroyExternalMemory diff --git a/src/CUDA2HIP_Runtime_API_types.cpp b/src/CUDA2HIP_Runtime_API_types.cpp index dc0aebc6..bf2c801f 100644 --- a/src/CUDA2HIP_Runtime_API_types.cpp +++ b/src/CUDA2HIP_Runtime_API_types.cpp @@ -886,11 +886,11 @@ const std::map CUDA_RUNTIME_TYPE_NAME_MAP { // no analogue {"cudaErrorInvalidDevicePointer", {"hipErrorInvalidDevicePointer", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, CUDA_DEPRECATED}}, // 17 // no analogue - {"cudaErrorInvalidTexture", {"hipErrorInvalidTexture", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 18 + {"cudaErrorInvalidTexture", {"hipErrorInvalidTexture", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 18 // no analogue {"cudaErrorInvalidTextureBinding", {"hipErrorInvalidTextureBinding", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 19 // no analogue - {"cudaErrorInvalidChannelDescriptor", {"hipErrorInvalidChannelDescriptor", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, HIP_UNSUPPORTED}}, // 20 + {"cudaErrorInvalidChannelDescriptor", {"hipErrorInvalidChannelDescriptor", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES, HIP_EXPERIMENTAL}}, // 20 // no analogue {"cudaErrorInvalidMemcpyDirection", {"hipErrorInvalidMemcpyDirection", "", CONV_NUMERIC_LITERAL, API_RUNTIME, SEC::DATA_TYPES}}, // 21 // Deprecated since CUDA 3.1 @@ -3202,4 +3202,6 @@ const std::map HIP_RUNTIME_TYPE_NAME_VER_MAP { {"HIP_TWO_TO_M1022", {HIP_5070, HIP_0, HIP_0 }}, {"HIP_TRIG_PLOSS", {HIP_5070, HIP_0, HIP_0 }}, {"HIP_DBL2INT_CVT", {HIP_5070, HIP_0, HIP_0 }}, + {"hipErrorInvalidChannelDescriptor", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, + {"hipErrorInvalidTexture", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, }; diff --git a/tests/unit_tests/synthetic/driver_enums.cu b/tests/unit_tests/synthetic/driver_enums.cu index 7ef80ee6..f243d46e 100644 --- 
a/tests/unit_tests/synthetic/driver_enums.cu +++ b/tests/unit_tests/synthetic/driver_enums.cu @@ -687,6 +687,17 @@ int main() { int STREAM_WAIT_VALUE_GEQ = CU_STREAM_WAIT_VALUE_GEQ; int STREAM_WAIT_VALUE_EQ = CU_STREAM_WAIT_VALUE_EQ; int STREAM_WAIT_VALUE_AND = CU_STREAM_WAIT_VALUE_AND; + + // CHECK: hipStreamBatchMemOpType streamBatchMemOpType; + // CHECK-NEXT: hipStreamBatchMemOpType streamBatchMemOpType_enum; + // CHECK-NEXT: hipStreamBatchMemOpType STREAM_MEM_OP_WAIT_VALUE_32 = hipStreamMemOpWaitValue32; + // CHECK-NEXT: hipStreamBatchMemOpType STREAM_MEM_OP_WRITE_VALUE_32 = hipStreamMemOpWriteValue32; + // CHECK-NEXT: hipStreamBatchMemOpType STREAM_MEM_OP_FLUSH_REMOTE_WRITES = hipStreamMemOpFlushRemoteWrites; + CUstreamBatchMemOpType streamBatchMemOpType; + CUstreamBatchMemOpType_enum streamBatchMemOpType_enum; + CUstreamBatchMemOpType STREAM_MEM_OP_WAIT_VALUE_32 = CU_STREAM_MEM_OP_WAIT_VALUE_32; + CUstreamBatchMemOpType STREAM_MEM_OP_WRITE_VALUE_32 = CU_STREAM_MEM_OP_WRITE_VALUE_32; + CUstreamBatchMemOpType STREAM_MEM_OP_FLUSH_REMOTE_WRITES = CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES; #endif #if CUDA_VERSION >= 9000 @@ -707,6 +718,11 @@ int main() { // CHECK: int STREAM_WAIT_VALUE_NOR = hipStreamWaitValueNor; int STREAM_WAIT_VALUE_NOR = CU_STREAM_WAIT_VALUE_NOR; + + // CHECK: hipStreamBatchMemOpType STREAM_MEM_OP_WAIT_VALUE_64 = hipStreamMemOpWaitValue64; + // CHECK-NEXT: hipStreamBatchMemOpType STREAM_MEM_OP_WRITE_VALUE_64 = hipStreamMemOpWriteValue64; + CUstreamBatchMemOpType STREAM_MEM_OP_WAIT_VALUE_64 = CU_STREAM_MEM_OP_WAIT_VALUE_64; + CUstreamBatchMemOpType STREAM_MEM_OP_WRITE_VALUE_64 = CU_STREAM_MEM_OP_WRITE_VALUE_64; #endif #if CUDA_VERSION >= 9000 && CUDA_VERSION < 12000 @@ -1125,6 +1141,12 @@ int main() { // CHECK: hipKernelNodeAttrID KernelNodeAttributePriority = hipKernelNodeAttributePriority; CUkernelNodeAttrID KernelNodeAttributePriority = CU_KERNEL_NODE_ATTRIBUTE_PRIORITY; + + // CHECK: hipGraphNodeType GRAPH_NODE_TYPE_BATCH_MEM_OP = 
hipGraphNodeTypeBatchMemOp; + CUgraphNodeType GRAPH_NODE_TYPE_BATCH_MEM_OP = CU_GRAPH_NODE_TYPE_BATCH_MEM_OP; + + // CHECK: hipStreamBatchMemOpType STREAM_MEM_OP_BARRIER = hipStreamMemOpBarrier; + CUstreamBatchMemOpType STREAM_MEM_OP_BARRIER = CU_STREAM_MEM_OP_BARRIER; #endif #if CUDA_VERSION >= 11080 diff --git a/tests/unit_tests/synthetic/driver_functions.cu b/tests/unit_tests/synthetic/driver_functions.cu index ec8fa722..af597ca9 100644 --- a/tests/unit_tests/synthetic/driver_functions.cu +++ b/tests/unit_tests/synthetic/driver_functions.cu @@ -1595,6 +1595,11 @@ int main() { // HIP: hipError_t hipGraphUpload(hipGraphExec_t graphExec, hipStream_t stream); // CHECK: result = hipGraphUpload(graphExec, stream); result = cuGraphUpload(graphExec, stream); + + // CUDA:CUresult CUDAAPI cuEventRecordWithFlags(CUevent hEvent, CUstream hStream, unsigned int flags); + // HIP: hipError_t hipEventRecordWithFlags(hipEvent_t event, hipStream_t stream __dparm(0), unsigned int flags __dparm(0)); + // CHECK: result = hipEventRecordWithFlags(event_, stream, flags); + result = cuEventRecordWithFlags(event_, stream, flags); #endif #if CUDA_VERSION >= 11020 @@ -1881,6 +1886,9 @@ int main() { // CHECK: hipBatchMemOpNodeParams BATCH_MEM_OP_NODE_PARAMS; CUDA_BATCH_MEM_OP_NODE_PARAMS BATCH_MEM_OP_NODE_PARAMS; + // CHECK: hipGraphNodeType GRAPH_NODE_TYPE_BATCH_MEM_OP = hipGraphNodeTypeBatchMemOp; + CUgraphNodeType GRAPH_NODE_TYPE_BATCH_MEM_OP = CU_GRAPH_NODE_TYPE_BATCH_MEM_OP; + // CUDA: CUresult CUDAAPI cuGraphAddBatchMemOpNode(CUgraphNode *phGraphNode, CUgraph hGraph, const CUgraphNode *dependencies, size_t numDependencies, const CUDA_BATCH_MEM_OP_NODE_PARAMS *nodeParams); // HIP: hipError_t hipGraphAddBatchMemOpNode(hipGraphNode_t *phGraphNode, hipGraph_t hGraph, const hipGraphNode_t* dependencies, size_t numDependencies, const hipBatchMemOpNodeParams* nodeParams); // CHECK: result = hipGraphAddBatchMemOpNode(&graphNode, graph, &graphNode2, bytes, &BATCH_MEM_OP_NODE_PARAMS); diff --git 
a/tests/unit_tests/synthetic/runtime_enums.cu b/tests/unit_tests/synthetic/runtime_enums.cu index 9b0ff8a3..e6ae3cae 100644 --- a/tests/unit_tests/synthetic/runtime_enums.cu +++ b/tests/unit_tests/synthetic/runtime_enums.cu @@ -240,6 +240,8 @@ int main() { // CHECK-NEXT: hipError_t ErrorHostMemoryNotRegistered = hipErrorHostMemoryNotRegistered; // CHECK-NEXT: hipError_t ErrorLaunchFailure = hipErrorLaunchFailure; // CHECK-NEXT: hipError_t ErrorNotSupported = hipErrorNotSupported; + // CHECK-NEXT: hipError_t ErrorInvalidTexture = hipErrorInvalidTexture; + // CHECK-NEXT: hipError_t ErrorInvalidChannelDescriptor = hipErrorInvalidChannelDescriptor; cudaError Error; cudaError_t Error_t; cudaError_t Success = cudaSuccess; @@ -288,6 +290,8 @@ int main() { cudaError_t ErrorHostMemoryNotRegistered = cudaErrorHostMemoryNotRegistered; cudaError_t ErrorLaunchFailure = cudaErrorLaunchFailure; cudaError_t ErrorNotSupported = cudaErrorNotSupported; + cudaError_t ErrorInvalidTexture = cudaErrorInvalidTexture; + cudaError_t ErrorInvalidChannelDescriptor = cudaErrorInvalidChannelDescriptor; // CHECK: hipError_t ErrorUnknown = hipErrorUnknown; cudaError_t ErrorUnknown = cudaErrorUnknown; diff --git a/tests/unit_tests/synthetic/runtime_functions.cu b/tests/unit_tests/synthetic/runtime_functions.cu index f9c92a39..6e066522 100644 --- a/tests/unit_tests/synthetic/runtime_functions.cu +++ b/tests/unit_tests/synthetic/runtime_functions.cu @@ -1316,6 +1316,11 @@ int main() { // HIP: hipError_t hipGraphUpload(hipGraphExec_t graphExec, hipStream_t stream); // CHECK: result = hipGraphUpload(GraphExec_t, stream); result = cudaGraphUpload(GraphExec_t, stream); + + // CUDA:extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaEventRecordWithFlags(cudaEvent_t event, cudaStream_t stream __dv(0), unsigned int flags __dv(0)); + // HIP: hipError_t hipEventRecordWithFlags(hipEvent_t event, hipStream_t stream __dparm(0), unsigned int flags __dparm(0)); + // CHECK: result = 
hipEventRecordWithFlags(Event_t, stream, flags); + result = cudaEventRecordWithFlags(Event_t, stream, flags); #endif #if CUDA_VERSION >= 11020 From ac3e807ccc9600960283505db6a48565249858ce Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Thu, 19 Dec 2024 20:23:01 +0100 Subject: [PATCH 06/17] [HIPIFY][doc] `LLVM 19.1.6` is the latest supported LLVM release + No patches are needed + Updated the `README.md` accordingly + `hipify-clang` built with `LLVM 19.1.6` works correctly with the latest supported `CUDA 12.6.3`, even though clang may report that `CUDA 12.6.3` is not fully supported + Tested on `Windows 11` (`VS 2019` and `VS 2022`) and `Ubuntu 23.10` --- docs/hipify-clang.rst | 63 ++++++++++++++++++++++--------------------- 1 file changed, 32 insertions(+), 31 deletions(-) diff --git a/docs/hipify-clang.rst b/docs/hipify-clang.rst index f1367032..d7323d25 100644 --- a/docs/hipify-clang.rst +++ b/docs/hipify-clang.rst @@ -37,7 +37,7 @@ Dependencies * `LLVM+Clang `_ of at least version `4.0.0 `_; the latest stable and recommended release: - `19.1.5 `_. + `19.1.6 `_. * `CUDA `_ of at least version `7.0 `_, the latest supported version is @@ -189,7 +189,8 @@ Dependencies `19.1.2 `_, `19.1.3 `_, `19.1.4 `_, - `19.1.5 `_:sup:`4` + `19.1.5 `_, + `19.1.6 `_:sup:`4` - `12.6.3 `_:sup:`4` - **Latest stable config** - **Latest stable config** @@ -232,7 +233,7 @@ Dependencies In most cases, you can get a suitable version of ``LLVM+Clang`` with your package manager. However, you can also `download a release archive `_ and build or install it. In case of multiple versions of ``LLVM`` installed, set `CMAKE_PREFIX_PATH `_ so that -``CMake`` can find the desired version of ``LLVM``. For example, ``-DCMAKE_PREFIX_PATH=D:\LLVM\19.1.5\dist``. +``CMake`` can find the desired version of ``LLVM``. For example, ``-DCMAKE_PREFIX_PATH=D:\LLVM\19.1.6\dist``. 
Usage ============================================================ @@ -265,7 +266,7 @@ header files used during the hipification process: .. code:: shell - ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 --clang-resource-directory=/usr/llvm/19.1.5/dist/lib/clang/19 + ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 --clang-resource-directory=/usr/llvm/19.1.6/dist/lib/clang/19 For more information, refer to the `Clang manual for compiling CUDA `_. @@ -402,7 +403,7 @@ To ensure LLVM being found or in case of multiple LLVM instances, specify the pa .. code-block:: bash - -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.5/dist + -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist On Windows, specify the following option for CMake in the first place: ``-G "Visual Studio 17 2022"``. @@ -476,7 +477,7 @@ LLVM <= 9.0.1 LLVM >= 10.0.0 ----------------- -1. Download `LLVM project `_ sources. +1. Download `LLVM project `_ sources. 2. Build `LLVM project `_: @@ -595,13 +596,13 @@ LLVM >= 10.0.0 .. code-block:: bash - python /usr/llvm/19.1.5/llvm-project/llvm/utils/lit/setup.py install + python /usr/llvm/19.1.6/llvm-project/llvm/utils/lit/setup.py install **Windows**: .. code-block:: shell - python D:/LLVM/19.1.5/llvm-project/llvm/utils/lit/setup.py install + python D:/LLVM/19.1.6/llvm-project/llvm/utils/lit/setup.py install In case of errors similar to ``ModuleNotFoundError: No module named 'setuptools'``, upgrade the ``setuptools`` package: @@ -615,23 +616,23 @@ LLVM >= 10.0.0 .. code-block:: bash - -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.5/build/bin/llvm-lit + -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit **Windows**: .. code-block:: shell - -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.5/build/Release/bin/llvm-lit.py + -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py * ``FileCheck``: **Linux**: - Copy from ``/usr/llvm/19.1.5/build/bin/`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. + Copy from ``/usr/llvm/19.1.6/build/bin/`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. 
**Windows**: - Copy from ``D:/LLVM/19.1.5/build/Release/bin`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. + Copy from ``D:/LLVM/19.1.6/build/Release/bin`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. Alternatively, specify the path to ``FileCheck`` in the ``CMAKE_INSTALL_PREFIX`` option. @@ -658,8 +659,8 @@ On Linux, the following configurations are tested: * Ubuntu 14: LLVM 4.0.0 - 7.1.0, CUDA 7.0 - 9.0, cuDNN 5.0.5 - 7.6.5 * Ubuntu 16-19: LLVM 8.0.0 - 14.0.6, CUDA 7.0 - 10.2, cuDNN 5.1.10 - 8.0.5 -* Ubuntu 20-21: LLVM 9.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 -* Ubuntu 22-23: LLVM 13.0.0 - 19.1.5, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 +* Ubuntu 20-21: LLVM 9.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 +* Ubuntu 22-23: LLVM 13.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 Minimum build system requirements for the above configurations: @@ -677,11 +678,11 @@ Here's how to build ``hipify-clang`` with testing support on ``Ubuntu 23.10.01`` -DHIPIFY_CLANG_TESTS=ON \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=../dist \ - -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.5/dist \ + -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist \ -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6.3 \ -DCUDA_DNN_ROOT_DIR=/usr/local/cudnn-9.6.0 \ -DCUDA_TENSOR_ROOT_DIR=/usr/local/cutensor-2.0.2.1 \ - -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.5/build/bin/llvm-lit \ + -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit \ ../hipify The corresponding successful output is: @@ -706,11 +707,11 @@ The corresponding successful output is: -- - Is part of HIP SDK : OFF -- - Install clang headers : ON -- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.13") - -- Found LLVM 19.1.5: - -- - CMake module path : /usr/llvm/19.1.5/dist/lib/cmake/llvm - -- - Clang include path : /usr/llvm/19.1.5/dist/include - -- - LLVM Include path : /usr/llvm/19.1.5/dist/include - -- - Binary path 
: /usr/llvm/19.1.5/dist/bin + -- Found LLVM 19.1.6: + -- - CMake module path : /usr/llvm/19.1.6/dist/lib/cmake/llvm + -- - Clang include path : /usr/llvm/19.1.6/dist/include + -- - LLVM Include path : /usr/llvm/19.1.6/dist/include + -- - Binary path : /usr/llvm/19.1.6/dist/bin -- Linker detection: GNU ld -- ---- The below configuring for hipify-clang testing only ---- -- Found Python: /usr/bin/python3.13 (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter @@ -747,7 +748,7 @@ The corresponding successful output is: Running HIPify regression tests =============================================================== CUDA 12.6.85 - will be used for testing - LLVM 19.1.5 - will be used for testing + LLVM 19.1.6 - will be used for testing x86_64 - Platform architecture Linux 6.5.0-15-generic - Platform OS 64 - hipify-clang binary bitness @@ -847,7 +848,7 @@ Tested configurations: - ``2019.16.11.42, 2022.17.12.3`` - ``3.31.2`` - ``3.13.1`` - * - ``19.1.0 - 19.1.5`` + * - ``19.1.0 - 19.1.6`` - ``7.0 - 12.6.3`` - ``8.0.5 - 9.6.0`` - ``2019.16.11.42, 2022.17.12.3`` @@ -877,12 +878,12 @@ Building with testing support using ``Visual Studio 17 2022`` on ``Windows 11``: -DHIPIFY_CLANG_TESTS=ON \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=../dist \ - -DCMAKE_PREFIX_PATH=D:/LLVM/19.1.5/dist \ + -DCMAKE_PREFIX_PATH=D:/LLVM/19.1.6/dist \ -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" \ -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5" \ -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 \ -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 \ - -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.5/build/Release/bin/llvm-lit.py \ + -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py \ ../hipify The corresponding successful output is: @@ -907,15 +908,15 @@ The corresponding successful output is: -- - Test hipify-clang : ON -- - Is part of HIP SDK : OFF -- - Install clang headers : ON - -- Found LLVM 
19.1.5: - -- - CMake module path : D:/LLVM/19.1.5/dist/lib/cmake/llvm - -- - Clang include path : D:/LLVM/19.1.5/dist/include - -- - LLVM Include path : D:/LLVM/19.1.5/dist/include - -- - Binary path : D:/LLVM/19.1.5/dist/bin + -- Found LLVM 19.1.6: + -- - CMake module path : D:/LLVM/19.1.6/dist/lib/cmake/llvm + -- - Clang include path : D:/LLVM/19.1.6/dist/include + -- - LLVM Include path : D:/LLVM/19.1.6/dist/include + -- - Binary path : D:/LLVM/19.1.6/dist/bin -- ---- The below configuring for hipify-clang testing only ---- -- Found Python: C:/Users/TT/AppData/Local/Programs/Python/Python313/python.exe (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter -- Found lit: C:/Users/TT/AppData/Local/Programs/Python/Python313/Scripts/lit.exe - -- Found FileCheck: D:/LLVM/19.1.5/dist/bin/FileCheck.exe + -- Found FileCheck: D:/LLVM/19.1.6/dist/bin/FileCheck.exe -- Initial CUDA to configure: -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 From bf96ec6e2751b66a5d2d139de60279a1f76bb6e0 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Mon, 23 Dec 2024 21:54:36 +0100 Subject: [PATCH 07/17] [HIPIFY][TensorMg][feature] `cuTensorMg` support - Part 1 + Added the missing compute types + Added support for `cutensorMg.h` + Updated the regenerated `hipify-perl` and `TENSOR` `CUDA2HIP` docs accordingly --- bin/hipify-perl | 7 +++++++ docs/tables/CUTENSOR_API_supported_by_HIP.md | 3 +++ src/CUDA2HIP.cpp | 1 + src/CUDA2HIP_TENSOR_API_types.cpp | 6 ++++++ 4 files changed, 17 insertions(+) diff --git a/bin/hipify-perl b/bin/hipify-perl index fb11cec5..22b87f19 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -6797,6 +6797,7 @@ sub simpleSubstitutions { subst("curand_poisson.h", "hiprand\/hiprand_kernel.h", "include"); subst("curand_precalc.h", "hiprand\/hiprand_kernel.h", "include"); subst("curand_uniform.h", 
"hiprand\/hiprand_kernel.h", "include"); + subst("cutensorMg.h", "hiptensor.h", "include"); subst("device_functions.h", "hip\/device_functions.h", "include"); subst("device_launch_parameters.h", "", "include"); subst("driver_types.h", "hip\/driver_types.h", "include"); @@ -10272,8 +10273,10 @@ sub warnHipOnlyUnsupportedFunctions { "CUTENSOR_STATUS_CUBLAS_ERROR", "CUTENSOR_R_MIN_TF32", "CUTENSOR_R_MIN_8U", + "CUTENSOR_R_MIN_8I", "CUTENSOR_R_MIN_64F", "CUTENSOR_R_MIN_32U", + "CUTENSOR_R_MIN_32I", "CUTENSOR_R_MIN_32F", "CUTENSOR_R_MIN_16F", "CUTENSOR_R_MIN_16BF", @@ -10342,6 +10345,7 @@ sub warnHipOnlyUnsupportedFunctions { "CUTENSOR_C_16F", "CUTENSOR_C_16BF", "CUTENSOR_COMPUTE_TF32", + "CUTENSOR_COMPUTE_3XTF32", "CUTENSOR_CACHE_MODE_PEDANTIC", "CUTENSOR_CACHE_MODE_NONE", "CUTENSOR_AUTOTUNE_MODE_NONE", @@ -11724,8 +11728,10 @@ sub warnRocOnlyUnsupportedFunctions { "CUTENSOR_STATUS_CUBLAS_ERROR", "CUTENSOR_R_MIN_TF32", "CUTENSOR_R_MIN_8U", + "CUTENSOR_R_MIN_8I", "CUTENSOR_R_MIN_64F", "CUTENSOR_R_MIN_32U", + "CUTENSOR_R_MIN_32I", "CUTENSOR_R_MIN_32F", "CUTENSOR_R_MIN_16F", "CUTENSOR_R_MIN_16BF", @@ -11794,6 +11800,7 @@ sub warnRocOnlyUnsupportedFunctions { "CUTENSOR_C_16F", "CUTENSOR_C_16BF", "CUTENSOR_COMPUTE_TF32", + "CUTENSOR_COMPUTE_3XTF32", "CUTENSOR_CACHE_MODE_PEDANTIC", "CUTENSOR_CACHE_MODE_NONE", "CUTENSOR_AUTOTUNE_MODE_NONE", diff --git a/docs/tables/CUTENSOR_API_supported_by_HIP.md b/docs/tables/CUTENSOR_API_supported_by_HIP.md index 4c06dff0..157ed1a6 100644 --- a/docs/tables/CUTENSOR_API_supported_by_HIP.md +++ b/docs/tables/CUTENSOR_API_supported_by_HIP.md @@ -18,6 +18,7 @@ |`CUTENSOR_COMPUTE_32F`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_32F`|5.7.0| | | | | |`CUTENSOR_COMPUTE_32I`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_32I`|5.7.0| | | | | |`CUTENSOR_COMPUTE_32U`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_32U`|5.7.0| | | | | +|`CUTENSOR_COMPUTE_3XTF32`|2.0.0.0| | | | | | | | | | |`CUTENSOR_COMPUTE_64F`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_64F`|5.7.0| | | | | 
|`CUTENSOR_COMPUTE_8I`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_8I`|5.7.0| | | | | |`CUTENSOR_COMPUTE_8U`|1.0.1.0| | |2.0.0.0|`HIPTENSOR_COMPUTE_8U`|5.7.0| | | | | @@ -106,8 +107,10 @@ |`CUTENSOR_R_MIN_16BF`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_16F`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_32F`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | +|`CUTENSOR_R_MIN_32I`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_32U`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_64F`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | +|`CUTENSOR_R_MIN_8I`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_8U`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_R_MIN_TF32`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_STATUS_ALLOC_FAILED`|1.0.1.0| | | |`HIPTENSOR_STATUS_ALLOC_FAILED`|5.7.0| | | | | diff --git a/src/CUDA2HIP.cpp b/src/CUDA2HIP.cpp index 56650fce..85e09176 100644 --- a/src/CUDA2HIP.cpp +++ b/src/CUDA2HIP.cpp @@ -70,6 +70,7 @@ const std::map CUDA_INCLUDE_MAP { {"cudnn.h", {"hipDNN.h", "miopen/miopen.h", CONV_INCLUDE_CUDA_MAIN_H, API_DNN, 0}}, // cuTensor includes {"cutensor.h", {"hiptensor.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_TENSOR, 0}}, + {"cutensorMg.h", {"hiptensor.h", "", CONV_INCLUDE, API_TENSOR, 0}}, // cuFFT includes {"cufft.h", {"hipfft/hipfft.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_FFT, 0}}, {"cufftXt.h", {"hipfft/hipfftXt.h", "", CONV_INCLUDE, API_FFT, 0}}, diff --git a/src/CUDA2HIP_TENSOR_API_types.cpp b/src/CUDA2HIP_TENSOR_API_types.cpp index 59fb34d4..b1b8898f 100644 --- a/src/CUDA2HIP_TENSOR_API_types.cpp +++ b/src/CUDA2HIP_TENSOR_API_types.cpp @@ -59,6 +59,7 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"CUTENSOR_COMPUTE_16F", {"HIPTENSOR_COMPUTE_16F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, {"CUTENSOR_COMPUTE_16BF", {"HIPTENSOR_COMPUTE_16BF", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, {"CUTENSOR_COMPUTE_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_COMPUTE_3XTF32", {"", "", 
CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_COMPUTE_32F", {"HIPTENSOR_COMPUTE_32F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, {"CUTENSOR_COMPUTE_64F", {"HIPTENSOR_COMPUTE_64F", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, {"CUTENSOR_COMPUTE_8U", {"HIPTENSOR_COMPUTE_8U", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1}}, @@ -73,6 +74,8 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"CUTENSOR_C_MIN_64F", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_R_MIN_8U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_R_MIN_32U", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_8I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_R_MIN_32I", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_R_MIN_16BF", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_R_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_C_MIN_TF32", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, @@ -223,6 +226,7 @@ const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { {"CUTENSOR_COMPUTE_8I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, {"CUTENSOR_COMPUTE_32U", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, {"CUTENSOR_COMPUTE_32I", {CUTENSOR_1010, CUDA_0, CUTENSOR_2000 }}, + {"CUTENSOR_COMPUTE_3XTF32", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, {"CUTENSOR_R_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_C_MIN_16F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_R_MIN_32F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, @@ -231,6 +235,8 @@ const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { {"CUTENSOR_C_MIN_64F", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_R_MIN_8U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_R_MIN_32U", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_8I", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, + {"CUTENSOR_R_MIN_32I", 
{CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_R_MIN_16BF", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_R_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, {"CUTENSOR_C_MIN_TF32", {CUTENSOR_1010, CUTENSOR_1200, CUTENSOR_2000 }}, From ce6295256a0aca2486c5c33933a5259859693c07 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Tue, 24 Dec 2024 11:33:54 +0100 Subject: [PATCH 08/17] [HIPIFY][TensorMg][feature] `cuTensorMg` support - Part 2 + Updated the regenerated `hipify-perl` and `TENSOR` `CUDA2HIP` docs accordingly --- bin/hipify-perl | 26 +++++++++++++++++++ docs/tables/CUTENSOR_API_supported_by_HIP.md | 13 ++++++++++ src/CUDA2HIP_TENSOR_API_types.cpp | 27 ++++++++++++++++++++ 3 files changed, 66 insertions(+) diff --git a/bin/hipify-perl b/bin/hipify-perl index 22b87f19..2526c0dd 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -9934,6 +9934,17 @@ sub warnHipOnlyUnsupportedFunctions { "cutensorOperationDescriptorSetAttribute", "cutensorOperationDescriptorGetAttribute", "cutensorOperationDescriptorAttribute_t", + "cutensorMgTensorDescriptor_t", + "cutensorMgTensorDescriptor_s", + "cutensorMgHostDevice_t", + "cutensorMgHandle_t", + "cutensorMgHandle_s", + "cutensorMgCopyPlan_t", + "cutensorMgCopyPlan_s", + "cutensorMgCopyDescriptor_t", + "cutensorMgCopyDescriptor_s", + "cutensorMgContractionDescriptor_t", + "cutensorMgContractionDescriptor_s", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", "cutensorHandleResizePlanCache", @@ -10326,6 +10337,8 @@ sub warnHipOnlyUnsupportedFunctions { "CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", "CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", "CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", + "CUTENSOR_MG_DEVICE_HOST_PINNED", + "CUTENSOR_MG_DEVICE_HOST", "CUTENSOR_JIT_MODE_NONE", "CUTENSOR_JIT_MODE_DEFAULT", "CUTENSOR_C_MIN_TF32", @@ -11299,6 +11312,17 @@ sub warnRocOnlyUnsupportedFunctions { "cutensorOperationDescriptorSetAttribute", "cutensorOperationDescriptorGetAttribute", 
"cutensorOperationDescriptorAttribute_t", + "cutensorMgTensorDescriptor_t", + "cutensorMgTensorDescriptor_s", + "cutensorMgHostDevice_t", + "cutensorMgHandle_t", + "cutensorMgHandle_s", + "cutensorMgCopyPlan_t", + "cutensorMgCopyPlan_s", + "cutensorMgCopyDescriptor_t", + "cutensorMgCopyDescriptor_s", + "cutensorMgContractionDescriptor_t", + "cutensorMgContractionDescriptor_s", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", "cutensorHandleResizePlanCache", @@ -11781,6 +11805,8 @@ sub warnRocOnlyUnsupportedFunctions { "CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT", "CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES", "CUTENSOR_OPERATION_DESCRIPTOR_FLOPS", + "CUTENSOR_MG_DEVICE_HOST_PINNED", + "CUTENSOR_MG_DEVICE_HOST", "CUTENSOR_JIT_MODE_NONE", "CUTENSOR_JIT_MODE_DEFAULT", "CUTENSOR_C_MIN_TF32", diff --git a/docs/tables/CUTENSOR_API_supported_by_HIP.md b/docs/tables/CUTENSOR_API_supported_by_HIP.md index 157ed1a6..2c282eb7 100644 --- a/docs/tables/CUTENSOR_API_supported_by_HIP.md +++ b/docs/tables/CUTENSOR_API_supported_by_HIP.md @@ -43,6 +43,8 @@ |`CUTENSOR_C_MIN_TF32`|1.0.1.0|1.2.0.0| |2.0.0.0| | | | | | | |`CUTENSOR_JIT_MODE_DEFAULT`|2.0.0.0| | | | | | | | | | |`CUTENSOR_JIT_MODE_NONE`|2.0.0.0| | | | | | | | | | +|`CUTENSOR_MG_DEVICE_HOST`|1.4.0.0| | | | | | | | | | +|`CUTENSOR_MG_DEVICE_HOST_PINNED`|1.4.0.0| | | | | | | | | | |`CUTENSOR_OPERATION_DESCRIPTOR_FLOPS`|2.0.0.0| | | | | | | | | | |`CUTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES`|2.0.0.0| | | | | | | | | | |`CUTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT`|2.0.0.0| | | | | | | | | | @@ -142,6 +144,17 @@ |`cutensorHandle_t`|1.0.1.0| | | |`hiptensorHandle_t`|5.7.0| | | | | |`cutensorJitMode_t`|2.0.0.0| | | | | | | | | | |`cutensorLoggerCallback_t`|1.3.2.0| | | |`hiptensorLoggerCallback_t`|5.7.0| | | | | +|`cutensorMgContractionDescriptor_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionDescriptor_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgCopyDescriptor_s`|1.4.0.0| | | | | | | | | | 
+|`cutensorMgCopyDescriptor_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgCopyPlan_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgCopyPlan_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgHandle_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgHandle_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgHostDevice_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgTensorDescriptor_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgTensorDescriptor_t`|1.4.0.0| | | | | | | | | | |`cutensorOperationDescriptorAttribute_t`|2.0.0.0| | | | | | | | | | |`cutensorOperator_t`|1.0.1.0| | | |`hiptensorOperator_t`|5.7.0| | | | | |`cutensorPlan`|2.0.0.0| | | | | | | | | | diff --git a/src/CUDA2HIP_TENSOR_API_types.cpp b/src/CUDA2HIP_TENSOR_API_types.cpp index b1b8898f..ce93f9e1 100644 --- a/src/CUDA2HIP_TENSOR_API_types.cpp +++ b/src/CUDA2HIP_TENSOR_API_types.cpp @@ -177,6 +177,10 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"cutensorPlanAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_PLAN_REQUIRED_WORKSPACE", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgHostDevice_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_MG_DEVICE_HOST", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSOR_MG_DEVICE_HOST_PINNED", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorHandle_t", {"hiptensorHandle_t", "", CONV_TYPE, API_TENSOR, 1}}, {"cutensorHandle", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"cutensorTensorDescriptor_t", {"hiptensorTensorDescriptor_t", "", CONV_TYPE, API_TENSOR, 1}}, @@ -185,6 +189,16 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"cutensorPlan_t", {"hiptensorContractionPlan_t", "", CONV_TYPE, API_TENSOR, 1}}, {"cutensorPlan", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"cutensorLoggerCallback_t", {"hiptensorLoggerCallback_t", "", CONV_TYPE, API_TENSOR, 1}}, + {"cutensorMgHandle_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgHandle_s", {"", "", CONV_TYPE, 
API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgTensorDescriptor_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgTensorDescriptor_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgCopyDescriptor_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgCopyDescriptor_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgCopyPlan_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgCopyPlan_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionDescriptor_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionDescriptor_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, }; const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { @@ -337,6 +351,19 @@ const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { {"cutensorPlan_t", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, {"cutensorPlan", {CUTENSOR_2000, CUDA_0, CUDA_0 }}, {"cutensorLoggerCallback_t", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorMgHostDevice_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"CUTENSOR_MG_DEVICE_HOST", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"CUTENSOR_MG_DEVICE_HOST_PINNED", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgHandle_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgHandle_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgTensorDescriptor_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgTensorDescriptor_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopyDescriptor_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopyDescriptor_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopyPlan_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopyPlan_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionDescriptor_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionDescriptor_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_TYPE_NAME_VER_MAP { From 0eeb128afea4eef41a1c910b9827a6d79aeec7ed Mon Sep 17 00:00:00 2001 From: Evgeny Mankov 
Date: Wed, 25 Dec 2024 13:50:40 +0100 Subject: [PATCH 09/17] [HIPIFY][TensorMg][feature] `cuTensorMg` support - Part 3 + Updated the regenerated `hipify-perl` and `TENSOR` `CUDA2HIP` docs accordingly --- bin/hipify-perl | 20 ++++++++++++++++++++ docs/tables/CUTENSOR_API_supported_by_HIP.md | 10 ++++++++++ src/CUDA2HIP_TENSOR_API_functions.cpp | 4 ++++ src/CUDA2HIP_TENSOR_API_types.cpp | 18 ++++++++++++++++++ 4 files changed, 52 insertions(+) diff --git a/bin/hipify-perl b/bin/hipify-perl index 2526c0dd..a31b286a 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -9939,12 +9939,20 @@ sub warnHipOnlyUnsupportedFunctions { "cutensorMgHostDevice_t", "cutensorMgHandle_t", "cutensorMgHandle_s", + "cutensorMgDestroy", + "cutensorMgCreate", "cutensorMgCopyPlan_t", "cutensorMgCopyPlan_s", "cutensorMgCopyDescriptor_t", "cutensorMgCopyDescriptor_s", + "cutensorMgContractionPlan_t", + "cutensorMgContractionPlan_s", + "cutensorMgContractionFind_t", + "cutensorMgContractionFind_s", + "cutensorMgContractionFindAttribute_t", "cutensorMgContractionDescriptor_t", "cutensorMgContractionDescriptor_s", + "cutensorMgAlgo_t", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", "cutensorHandleResizePlanCache", @@ -10366,6 +10374,8 @@ sub warnHipOnlyUnsupportedFunctions { "CUTENSOR_ALGO_TTGT", "CUTENSOR_ALGO_TGETT", "CUTENSOR_ALGO_GETT", + "CUTENSORMG_CONTRACTION_FIND_ATTRIBUTE_MAX", + "CUTENSORMG_ALGO_DEFAULT", "CUSPARSE_SPSV_UPDATE_GENERAL", "CUSPARSE_SPSV_UPDATE_DIAGONAL", "CUSPARSE_SPSM_UPDATE_GENERAL", @@ -11317,12 +11327,20 @@ sub warnRocOnlyUnsupportedFunctions { "cutensorMgHostDevice_t", "cutensorMgHandle_t", "cutensorMgHandle_s", + "cutensorMgDestroy", + "cutensorMgCreate", "cutensorMgCopyPlan_t", "cutensorMgCopyPlan_s", "cutensorMgCopyDescriptor_t", "cutensorMgCopyDescriptor_s", + "cutensorMgContractionPlan_t", + "cutensorMgContractionPlan_s", + "cutensorMgContractionFind_t", + "cutensorMgContractionFind_s", + "cutensorMgContractionFindAttribute_t", 
"cutensorMgContractionDescriptor_t", "cutensorMgContractionDescriptor_s", + "cutensorMgAlgo_t", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", "cutensorHandleResizePlanCache", @@ -11834,6 +11852,8 @@ sub warnRocOnlyUnsupportedFunctions { "CUTENSOR_ALGO_TTGT", "CUTENSOR_ALGO_TGETT", "CUTENSOR_ALGO_GETT", + "CUTENSORMG_CONTRACTION_FIND_ATTRIBUTE_MAX", + "CUTENSORMG_ALGO_DEFAULT", "CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED", "CUSPARSE_STATUS_MAPPING_ERROR", "CUSPARSE_STATUS_INSUFFICIENT_RESOURCES", diff --git a/docs/tables/CUTENSOR_API_supported_by_HIP.md b/docs/tables/CUTENSOR_API_supported_by_HIP.md index 2c282eb7..d593116a 100644 --- a/docs/tables/CUTENSOR_API_supported_by_HIP.md +++ b/docs/tables/CUTENSOR_API_supported_by_HIP.md @@ -4,6 +4,8 @@ |**CUDA**|**A**|**D**|**C**|**R**|**HIP**|**A**|**D**|**C**|**R**|**E**| |:--|:-:|:-:|:-:|:-:|:--|:-:|:-:|:-:|:-:|:-:| +|`CUTENSORMG_ALGO_DEFAULT`|1.4.0.0| | | | | | | | | | +|`CUTENSORMG_CONTRACTION_FIND_ATTRIBUTE_MAX`|1.5.0.0| | | | | | | | | | |`CUTENSOR_ALGO_DEFAULT`|1.0.1.0| | | |`HIPTENSOR_ALGO_DEFAULT`|5.7.0| | | | | |`CUTENSOR_ALGO_DEFAULT_PATIENT`|1.4.0.0| | | |`HIPTENSOR_ALGO_DEFAULT_PATIENT`|5.7.0| | | | | |`CUTENSOR_ALGO_GETT`|1.0.1.0| | | | | | | | | | @@ -144,8 +146,14 @@ |`cutensorHandle_t`|1.0.1.0| | | |`hiptensorHandle_t`|5.7.0| | | | | |`cutensorJitMode_t`|2.0.0.0| | | | | | | | | | |`cutensorLoggerCallback_t`|1.3.2.0| | | |`hiptensorLoggerCallback_t`|5.7.0| | | | | +|`cutensorMgAlgo_t`|1.4.0.0| | | | | | | | | | |`cutensorMgContractionDescriptor_s`|1.4.0.0| | | | | | | | | | |`cutensorMgContractionDescriptor_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionFindAttribute_t`|1.5.0.0| | | | | | | | | | +|`cutensorMgContractionFind_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionFind_t`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionPlan_s`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionPlan_t`|1.4.0.0| | | | | | | | | | |`cutensorMgCopyDescriptor_s`|1.4.0.0| | | | | | | | | | 
|`cutensorMgCopyDescriptor_t`|1.4.0.0| | | | | | | | | | |`cutensorMgCopyPlan_s`|1.4.0.0| | | | | | | | | | @@ -202,6 +210,8 @@ |`cutensorLoggerSetFile`|1.3.2.0| | | |`hiptensorLoggerSetFile`|5.7.0| | | | | |`cutensorLoggerSetLevel`|1.3.2.0| | | |`hiptensorLoggerSetLevel`|5.7.0| | | | | |`cutensorLoggerSetMask`|1.3.2.0| | | |`hiptensorLoggerSetMask`|5.7.0| | | | | +|`cutensorMgCreate`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroy`|1.4.0.0| | | | | | | | | | |`cutensorOperationDescriptorGetAttribute`|2.0.0.0| | | | | | | | | | |`cutensorOperationDescriptorSetAttribute`|2.0.0.0| | | | | | | | | | |`cutensorPermutation`|1.0.1.0| | |2.0.0.0|`hiptensorPermutation`|6.1.0| | | | | diff --git a/src/CUDA2HIP_TENSOR_API_functions.cpp b/src/CUDA2HIP_TENSOR_API_functions.cpp index a6440012..677d3c35 100644 --- a/src/CUDA2HIP_TENSOR_API_functions.cpp +++ b/src/CUDA2HIP_TENSOR_API_functions.cpp @@ -65,6 +65,8 @@ const std::map CUDA_TENSOR_FUNCTION_MAP { {"cutensorLoggerSetLevel", {"hiptensorLoggerSetLevel", "", CONV_LIB_FUNC, API_TENSOR, 2}}, {"cutensorLoggerSetMask", {"hiptensorLoggerSetMask", "", CONV_LIB_FUNC, API_TENSOR, 2}}, {"cutensorLoggerForceDisable", {"hiptensorLoggerForceDisable", "", CONV_LIB_FUNC, API_TENSOR, 2}}, + {"cutensorMgCreate", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroy", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, }; const std::map CUDA_TENSOR_FUNCTION_VER_MAP { @@ -110,6 +112,8 @@ const std::map CUDA_TENSOR_FUNCTION_VER_MAP { {"cutensorLoggerSetLevel", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, {"cutensorLoggerSetMask", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, {"cutensorLoggerForceDisable", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, + {"cutensorMgCreate", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroy", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_FUNCTION_VER_MAP { diff --git a/src/CUDA2HIP_TENSOR_API_types.cpp b/src/CUDA2HIP_TENSOR_API_types.cpp index ce93f9e1..9ef0aa0b 100644 --- 
a/src/CUDA2HIP_TENSOR_API_types.cpp +++ b/src/CUDA2HIP_TENSOR_API_types.cpp @@ -181,6 +181,12 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"CUTENSOR_MG_DEVICE_HOST", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, {"CUTENSOR_MG_DEVICE_HOST_PINNED", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgAlgo_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSORMG_ALGO_DEFAULT", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + + {"cutensorMgContractionFindAttribute_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"CUTENSORMG_CONTRACTION_FIND_ATTRIBUTE_MAX", {"", "", CONV_NUMERIC_LITERAL, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorHandle_t", {"hiptensorHandle_t", "", CONV_TYPE, API_TENSOR, 1}}, {"cutensorHandle", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"cutensorTensorDescriptor_t", {"hiptensorTensorDescriptor_t", "", CONV_TYPE, API_TENSOR, 1}}, @@ -199,6 +205,10 @@ const std::map CUDA_TENSOR_TYPE_NAME_MAP { {"cutensorMgCopyPlan_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"cutensorMgContractionDescriptor_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, {"cutensorMgContractionDescriptor_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionFind_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionFind_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionPlan_t", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, + {"cutensorMgContractionPlan_s", {"", "", CONV_TYPE, API_TENSOR, 1, UNSUPPORTED}}, }; const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { @@ -364,6 +374,14 @@ const std::map CUDA_TENSOR_TYPE_NAME_VER_MAP { {"cutensorMgCopyPlan_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, {"cutensorMgContractionDescriptor_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, {"cutensorMgContractionDescriptor_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionFind_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + 
{"cutensorMgContractionFind_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionPlan_s", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionPlan_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgAlgo_t", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"CUTENSORMG_ALGO_DEFAULT", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionFindAttribute_t", {CUTENSOR_1500, CUDA_0, CUDA_0 }}, + {"CUTENSORMG_CONTRACTION_FIND_ATTRIBUTE_MAX", {CUTENSOR_1500, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_TYPE_NAME_VER_MAP { From 3e2e7da5bc918c9cb7d0874ed208e2a263d33d9f Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Thu, 26 Dec 2024 21:14:55 +0100 Subject: [PATCH 10/17] [HIPIFY][TensorMg][feature] `cuTensorMg` support - Part 4 - final + Updated the regenerated `hipify-perl` and `TENSOR` `CUDA2HIP` docs accordingly --- bin/hipify-perl | 34 ++++++++++++++++++++ docs/tables/CUTENSOR_API_supported_by_HIP.md | 17 ++++++++++ src/CUDA2HIP_TENSOR_API_functions.cpp | 34 ++++++++++++++++++++ 3 files changed, 85 insertions(+) diff --git a/bin/hipify-perl b/bin/hipify-perl index a31b286a..7021777d 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -9939,19 +9939,36 @@ sub warnHipOnlyUnsupportedFunctions { "cutensorMgHostDevice_t", "cutensorMgHandle_t", "cutensorMgHandle_s", + "cutensorMgDestroyTensorDescriptor", + "cutensorMgDestroyCopyPlan", + "cutensorMgDestroyCopyDescriptor", + "cutensorMgDestroyContractionPlan", + "cutensorMgDestroyContractionFind", + "cutensorMgDestroyContractionDescriptor", "cutensorMgDestroy", + "cutensorMgCreateTensorDescriptor", + "cutensorMgCreateCopyPlan", + "cutensorMgCreateCopyDescriptor", + "cutensorMgCreateContractionPlan", + "cutensorMgCreateContractionFind", + "cutensorMgCreateContractionDescriptor", "cutensorMgCreate", "cutensorMgCopyPlan_t", "cutensorMgCopyPlan_s", + "cutensorMgCopyGetWorkspace", "cutensorMgCopyDescriptor_t", "cutensorMgCopyDescriptor_s", + "cutensorMgCopy", "cutensorMgContractionPlan_t", 
"cutensorMgContractionPlan_s", + "cutensorMgContractionGetWorkspace", "cutensorMgContractionFind_t", "cutensorMgContractionFind_s", + "cutensorMgContractionFindSetAttribute", "cutensorMgContractionFindAttribute_t", "cutensorMgContractionDescriptor_t", "cutensorMgContractionDescriptor_s", + "cutensorMgContraction", "cutensorMgAlgo_t", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", @@ -11327,19 +11344,36 @@ sub warnRocOnlyUnsupportedFunctions { "cutensorMgHostDevice_t", "cutensorMgHandle_t", "cutensorMgHandle_s", + "cutensorMgDestroyTensorDescriptor", + "cutensorMgDestroyCopyPlan", + "cutensorMgDestroyCopyDescriptor", + "cutensorMgDestroyContractionPlan", + "cutensorMgDestroyContractionFind", + "cutensorMgDestroyContractionDescriptor", "cutensorMgDestroy", + "cutensorMgCreateTensorDescriptor", + "cutensorMgCreateCopyPlan", + "cutensorMgCreateCopyDescriptor", + "cutensorMgCreateContractionPlan", + "cutensorMgCreateContractionFind", + "cutensorMgCreateContractionDescriptor", "cutensorMgCreate", "cutensorMgCopyPlan_t", "cutensorMgCopyPlan_s", + "cutensorMgCopyGetWorkspace", "cutensorMgCopyDescriptor_t", "cutensorMgCopyDescriptor_s", + "cutensorMgCopy", "cutensorMgContractionPlan_t", "cutensorMgContractionPlan_s", + "cutensorMgContractionGetWorkspace", "cutensorMgContractionFind_t", "cutensorMgContractionFind_s", + "cutensorMgContractionFindSetAttribute", "cutensorMgContractionFindAttribute_t", "cutensorMgContractionDescriptor_t", "cutensorMgContractionDescriptor_s", + "cutensorMgContraction", "cutensorMgAlgo_t", "cutensorJitMode_t", "cutensorHandleWritePlanCacheToFile", diff --git a/docs/tables/CUTENSOR_API_supported_by_HIP.md b/docs/tables/CUTENSOR_API_supported_by_HIP.md index d593116a..7406dbdb 100644 --- a/docs/tables/CUTENSOR_API_supported_by_HIP.md +++ b/docs/tables/CUTENSOR_API_supported_by_HIP.md @@ -210,8 +210,25 @@ |`cutensorLoggerSetFile`|1.3.2.0| | | |`hiptensorLoggerSetFile`|5.7.0| | | | | |`cutensorLoggerSetLevel`|1.3.2.0| | | 
|`hiptensorLoggerSetLevel`|5.7.0| | | | | |`cutensorLoggerSetMask`|1.3.2.0| | | |`hiptensorLoggerSetMask`|5.7.0| | | | | +|`cutensorMgContraction`|1.4.0.0| | | | | | | | | | +|`cutensorMgContractionFindSetAttribute`|1.5.0.0| | | | | | | | | | +|`cutensorMgContractionGetWorkspace`|1.4.0.0| | | | | | | | | | +|`cutensorMgCopy`|1.4.0.0| | | | | | | | | | +|`cutensorMgCopyGetWorkspace`|1.4.0.0| | | | | | | | | | |`cutensorMgCreate`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateContractionDescriptor`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateContractionFind`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateContractionPlan`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateCopyDescriptor`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateCopyPlan`|1.4.0.0| | | | | | | | | | +|`cutensorMgCreateTensorDescriptor`|1.4.0.0| | | | | | | | | | |`cutensorMgDestroy`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyContractionDescriptor`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyContractionFind`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyContractionPlan`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyCopyDescriptor`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyCopyPlan`|1.4.0.0| | | | | | | | | | +|`cutensorMgDestroyTensorDescriptor`|1.4.0.0| | | | | | | | | | |`cutensorOperationDescriptorGetAttribute`|2.0.0.0| | | | | | | | | | |`cutensorOperationDescriptorSetAttribute`|2.0.0.0| | | | | | | | | | |`cutensorPermutation`|1.0.1.0| | |2.0.0.0|`hiptensorPermutation`|6.1.0| | | | | diff --git a/src/CUDA2HIP_TENSOR_API_functions.cpp b/src/CUDA2HIP_TENSOR_API_functions.cpp index 677d3c35..e802cf4f 100644 --- a/src/CUDA2HIP_TENSOR_API_functions.cpp +++ b/src/CUDA2HIP_TENSOR_API_functions.cpp @@ -67,6 +67,23 @@ const std::map CUDA_TENSOR_FUNCTION_MAP { {"cutensorLoggerForceDisable", {"hiptensorLoggerForceDisable", "", CONV_LIB_FUNC, API_TENSOR, 2}}, {"cutensorMgCreate", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, {"cutensorMgDestroy", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, 
UNSUPPORTED}}, + {"cutensorMgCreateTensorDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyTensorDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCreateCopyDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyCopyDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCopyGetWorkspace", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCreateCopyPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyCopyPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCopy", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCreateContractionFind", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyContractionFind", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgContractionFindSetAttribute", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCreateContractionDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyContractionDescriptor", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgContractionGetWorkspace", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgCreateContractionPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgDestroyContractionPlan", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, + {"cutensorMgContraction", {"", "", CONV_LIB_FUNC, API_TENSOR, 2, UNSUPPORTED}}, }; const std::map CUDA_TENSOR_FUNCTION_VER_MAP { @@ -114,6 +131,23 @@ const std::map CUDA_TENSOR_FUNCTION_VER_MAP { {"cutensorLoggerForceDisable", {CUTENSOR_1320, CUDA_0, CUDA_0 }}, {"cutensorMgCreate", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, {"cutensorMgDestroy", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCreateTensorDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyTensorDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 
}}, + {"cutensorMgCreateCopyDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyCopyDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopyGetWorkspace", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCreateCopyPlan", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyCopyPlan", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCopy", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCreateContractionFind", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyContractionFind", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionFindSetAttribute", {CUTENSOR_1500, CUDA_0, CUDA_0 }}, + {"cutensorMgCreateContractionDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyContractionDescriptor", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContractionGetWorkspace", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgCreateContractionPlan", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgDestroyContractionPlan", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, + {"cutensorMgContraction", {CUTENSOR_1400, CUDA_0, CUDA_0 }}, }; const std::map HIP_TENSOR_FUNCTION_VER_MAP { From 39866b41587860591ea68f7245e98f83b04c844a Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Fri, 27 Dec 2024 14:33:01 +0100 Subject: [PATCH 11/17] [HIPIFY][MIOpen][6.3.0] Added the missing `MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR` + Updated the regenerated `hipify-perl`, synthetic test, and `MIOPEN` `CUDA2HIP` docs accordingly --- bin/hipify-perl | 2 +- docs/tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md | 2 +- docs/tables/CUDNN_API_supported_by_MIOPEN.md | 2 +- src/CUDA2HIP_DNN_API_types.cpp | 3 ++- tests/unit_tests/synthetic/libraries/cudnn2miopen.cu | 2 ++ 5 files changed, 7 insertions(+), 4 deletions(-) diff --git a/bin/hipify-perl b/bin/hipify-perl index 7021777d..b3b16e75 100755 --- a/bin/hipify-perl +++ b/bin/hipify-perl @@ -3564,6 +3564,7 @@ sub MIOpenSubstitutions { subst("CUDNN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR", 
"MIOPEN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR", "numeric_literal"); subst("CUDNN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR", "numeric_literal"); subst("CUDNN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR", "numeric_literal"); + subst("CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", "numeric_literal"); subst("CUDNN_BACKEND_OPERATION_RNG_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RNG_DESCRIPTOR", "numeric_literal"); subst("CUDNN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR", "numeric_literal"); subst("CUDNN_BACKEND_POINTWISE_DESCRIPTOR", "MIOPEN_BACKEND_POINTWISE_DESCRIPTOR", "numeric_literal"); @@ -13487,7 +13488,6 @@ sub warnMIOpenOnlyUnsupportedFunctions { "CUDNN_BATCHNORM_OPS_BN_ADD_ACTIVATION", "CUDNN_BATCHNORM_OPS_BN_ACTIVATION", "CUDNN_BATCHNORM_OPS_BN", - "CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", "CUDNN_BACKEND_OPERATION_PAGED_CACHE_LOAD_DESCRIPTOR", "CUDNN_BACKEND_OPERATION_BN_FINALIZE_STATISTICS_DESCRIPTOR", "CUDNN_BACKEND_OPERATION_BN_BWD_WEIGHTS_DESCRIPTOR", diff --git a/docs/tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md b/docs/tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md index 64adaf7f..7cde5c9d 100644 --- a/docs/tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md +++ b/docs/tables/CUDNN_API_supported_by_HIP_and_MIOPEN.md @@ -270,7 +270,7 @@ |`CUDNN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR`|8.1.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR`|8.3.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR`|8.3.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR`|6.2.0| | | | | -|`CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|8.7.0| | | | | | | | | | | | | | | | 
+|`CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|8.7.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|6.3.0| | | | | |`CUDNN_BACKEND_OPERATION_RNG_DESCRIPTOR`|8.7.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_RNG_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR`|8.5.0| | | | | | | | | |`MIOPEN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_POINTWISE_DESCRIPTOR`|8.0.1| | | | | | | | | |`MIOPEN_BACKEND_POINTWISE_DESCRIPTOR`|6.2.0| | | | | diff --git a/docs/tables/CUDNN_API_supported_by_MIOPEN.md b/docs/tables/CUDNN_API_supported_by_MIOPEN.md index b7432d2e..c920c9ea 100644 --- a/docs/tables/CUDNN_API_supported_by_MIOPEN.md +++ b/docs/tables/CUDNN_API_supported_by_MIOPEN.md @@ -270,7 +270,7 @@ |`CUDNN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR`|8.1.0| | | |`MIOPEN_BACKEND_OPERATION_REDUCTION_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR`|8.3.0| | | |`MIOPEN_BACKEND_OPERATION_RESAMPLE_BWD_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR`|8.3.0| | | |`MIOPEN_BACKEND_OPERATION_RESAMPLE_FWD_DESCRIPTOR`|6.2.0| | | | | -|`CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|8.7.0| | | | | | | | | | +|`CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|8.7.0| | | |`MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR`|6.3.0| | | | | |`CUDNN_BACKEND_OPERATION_RNG_DESCRIPTOR`|8.7.0| | | |`MIOPEN_BACKEND_OPERATION_RNG_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR`|8.5.0| | | |`MIOPEN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR`|6.2.0| | | | | |`CUDNN_BACKEND_POINTWISE_DESCRIPTOR`|8.0.1| | | |`MIOPEN_BACKEND_POINTWISE_DESCRIPTOR`|6.2.0| | | | | diff --git a/src/CUDA2HIP_DNN_API_types.cpp b/src/CUDA2HIP_DNN_API_types.cpp index 218bbeab..5884c5f9 100644 --- a/src/CUDA2HIP_DNN_API_types.cpp +++ b/src/CUDA2HIP_DNN_API_types.cpp @@ -729,7 +729,7 @@ const std::map CUDA_DNN_TYPE_NAME_MAP { {"CUDNN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR", 
"MIOPEN_BACKEND_OPERATION_SIGNAL_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, {"CUDNN_BACKEND_OPERATION_NORM_FORWARD_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_NORM_FORWARD_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_NORM_FORWARD_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, {"CUDNN_BACKEND_OPERATION_NORM_BACKWARD_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_NORM_BACKWARD_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_NORM_BACKWARD_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, - {"CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", "", CONV_NUMERIC_LITERAL, API_DNN, 1, UNSUPPORTED}}, + {"CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, {"CUDNN_BACKEND_RNG_DESCRIPTOR", {"HIPDNN_BACKEND_RNG_DESCRIPTOR", "MIOPEN_BACKEND_RNG_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, {"CUDNN_BACKEND_OPERATION_RNG_DESCRIPTOR", {"HIPDNN_BACKEND_OPERATION_RNG_DESCRIPTOR", "MIOPEN_BACKEND_OPERATION_RNG_DESCRIPTOR", CONV_NUMERIC_LITERAL, API_DNN, 1, HIP_UNSUPPORTED}}, {"CUDNN_BACKEND_KERNEL_CACHE_DESCRIPTOR", {"HIPDNN_BACKEND_KERNEL_CACHE_DESCRIPTOR", "", CONV_NUMERIC_LITERAL, API_DNN, 1, UNSUPPORTED}}, @@ -2269,4 +2269,5 @@ const std::map HIP_DNN_TYPE_NAME_VER_MAP { {"miopenPaddingDefault", {HIP_2010, HIP_0, HIP_0 }}, {"miopenPaddingSame", {HIP_2010, HIP_0, HIP_0 }}, {"miopenPaddingValid", {HIP_2010, HIP_0, HIP_0 }}, + {"MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR", {HIP_6030, HIP_0, HIP_0 }}, }; diff --git a/tests/unit_tests/synthetic/libraries/cudnn2miopen.cu b/tests/unit_tests/synthetic/libraries/cudnn2miopen.cu index 366a5bd4..a76e51c8 100644 --- a/tests/unit_tests/synthetic/libraries/cudnn2miopen.cu +++ b/tests/unit_tests/synthetic/libraries/cudnn2miopen.cu @@ -1589,8 +1589,10 @@ int main() { #if CUDNN_VERSION >= 8700 // CHECK: 
miopenBackendDescriptorType_t BACKEND_RNG_DESCRIPTOR = MIOPEN_BACKEND_RNG_DESCRIPTOR; // CHECK-NEXT: miopenBackendDescriptorType_t BACKEND_OPERATION_RNG_DESCRIPTOR = MIOPEN_BACKEND_OPERATION_RNG_DESCRIPTOR; + // CHECK-NEXT: miopenBackendDescriptorType_t BACKEND_OPERATION_RESHAPE_DESCRIPTOR = MIOPEN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR; cudnnBackendDescriptorType_t BACKEND_RNG_DESCRIPTOR = CUDNN_BACKEND_RNG_DESCRIPTOR; cudnnBackendDescriptorType_t BACKEND_OPERATION_RNG_DESCRIPTOR = CUDNN_BACKEND_OPERATION_RNG_DESCRIPTOR; + cudnnBackendDescriptorType_t BACKEND_OPERATION_RESHAPE_DESCRIPTOR = CUDNN_BACKEND_OPERATION_RESHAPE_DESCRIPTOR; // CHECK: miopenBackendAttributeType_t TYPE_RNG_DISTRIBUTION = MIOPEN_TYPE_RNG_DISTRIBUTION; cudnnBackendAttributeType_t TYPE_RNG_DISTRIBUTION = CUDNN_TYPE_RNG_DISTRIBUTION; From 6c3f1c05ce3717e2789c51a3368f4dd3ed99eb48 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Fri, 27 Dec 2024 14:38:46 +0100 Subject: [PATCH 12/17] [HIPIFY][HIP][tests] Added the missing test for `hipDrvGraphExecMemsetNodeSetParams` --- tests/unit_tests/synthetic/driver_functions.cu | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/unit_tests/synthetic/driver_functions.cu b/tests/unit_tests/synthetic/driver_functions.cu index af597ca9..263b5c86 100644 --- a/tests/unit_tests/synthetic/driver_functions.cu +++ b/tests/unit_tests/synthetic/driver_functions.cu @@ -523,7 +523,7 @@ int main() { result = cuMemGetInfo_v2(&bytes, &bytes_2); // CUDA: CUresult CUDAAPI cuMemHostAlloc(void **pp, size_t bytesize, unsigned int Flags); - // HIP: DEPRECATED("use hipHostMalloc instead") hipError_t hipHostAlloc(void** ptr, size_t size, unsigned int flags); + // HIP: hipError_t hipHostAlloc(void** ptr, size_t size, unsigned int flags); // CHECK: result = hipHostAlloc(&image, bytes, flags); result = cuMemHostAlloc(&image, bytes, flags); @@ -1487,6 +1487,11 @@ int main() { // HIP: hipError_t hipDrvGraphExecMemcpyNodeSetParams(hipGraphExec_t hGraphExec, 
hipGraphNode_t hNode, const HIP_MEMCPY3D* copyParams, hipCtx_t ctx); // CHECK: result = hipDrvGraphExecMemcpyNodeSetParams(graphExec, graphNode, &MEMCPY3D, context); result = cuGraphExecMemcpyNodeSetParams(graphExec, graphNode, &MEMCPY3D, context); + + // CUDA: CUresult CUDAAPI cuGraphExecMemsetNodeSetParams(CUgraphExec hGraphExec, CUgraphNode hNode, const CUDA_MEMSET_NODE_PARAMS *memsetParams, CUcontext ctx); + // HIP: hipError_t hipDrvGraphExecMemsetNodeSetParams(hipGraphExec_t hGraphExec, hipGraphNode_t hNode, const HIP_MEMSET_NODE_PARAMS* memsetParams, hipCtx_t ctx); + // CHECK: result = hipDrvGraphExecMemsetNodeSetParams(graphExec, graphNode, &MEMSET_NODE_PARAMS, context); + result = cuGraphExecMemsetNodeSetParams(graphExec, graphNode, &MEMSET_NODE_PARAMS, context); #endif #if CUDA_VERSION >= 10020 && CUDA_VERSION < 12000 From c0b08824a0035fa85b47c9dfcf80d7ce5d3487a3 Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Mon, 30 Dec 2024 20:45:38 +0100 Subject: [PATCH 13/17] [HIPIFY] Formatting --- src/CUDA2HIP.cpp | 90 +++--- src/CUDA2HIP_BLAS_API_functions.cpp | 405 ++++++++++++------------- src/CUDA2HIP_BLAS_API_types.cpp | 2 +- src/CUDA2HIP_CAFFE2_API_types.cpp | 1 - src/CUDA2HIP_DNN_API_functions.cpp | 1 - src/CUDA2HIP_Driver_API_functions.cpp | 20 +- src/CUDA2HIP_Driver_API_types.cpp | 10 +- src/CUDA2HIP_FFT_API_types.cpp | 1 - src/CUDA2HIP_RAND_API_functions.cpp | 204 ++++++------- src/CUDA2HIP_RAND_API_types.cpp | 240 +++++++-------- src/CUDA2HIP_Runtime_API_functions.cpp | 20 +- src/CUDA2HIP_Runtime_API_types.cpp | 41 ++- src/CUDA2HIP_SOLVER_API_functions.cpp | 26 +- src/CUDA2HIP_SOLVER_API_types.cpp | 18 +- src/StringUtils.cpp | 1 - 15 files changed, 537 insertions(+), 543 deletions(-) diff --git a/src/CUDA2HIP.cpp b/src/CUDA2HIP.cpp index 85e09176..75e72e22 100644 --- a/src/CUDA2HIP.cpp +++ b/src/CUDA2HIP.cpp @@ -25,29 +25,29 @@ THE SOFTWARE. 
// Maps CUDA header names to HIP header names const std::map CUDA_INCLUDE_MAP { // CUDA includes - {"cuda.h", {"hip/hip_runtime.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_DRIVER, 0}}, - {"cuda_runtime.h", {"hip/hip_runtime.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_RUNTIME, 0}}, - {"device_launch_parameters.h", {"", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cuda_runtime_api.h", {"hip/hip_runtime_api.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"channel_descriptor.h", {"hip/channel_descriptor.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"device_functions.h", {"hip/device_functions.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"driver_types.h", {"hip/driver_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cuda_fp16.h", {"hip/hip_fp16.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cuda_fp8.h", {"hip/hip_fp8.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cuda_texture_types.h", {"hip/hip_texture_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"texture_fetch_functions.h", {"", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"vector_types.h", {"hip/hip_vector_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cuda_profiler_api.h", {"hip/hip_runtime_api.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"cooperative_groups.h", {"hip/hip_cooperative_groups.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"library_types.h", {"hip/library_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, - {"math_constants.h", {"hip/hip_math_constants.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda.h", {"hip/hip_runtime.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_DRIVER, 0}}, + {"cuda_runtime.h", {"hip/hip_runtime.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_RUNTIME, 0}}, + {"device_launch_parameters.h", {"", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda_runtime_api.h", {"hip/hip_runtime_api.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"channel_descriptor.h", {"hip/channel_descriptor.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"device_functions.h", {"hip/device_functions.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"driver_types.h", {"hip/driver_types.h", "", 
CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda_fp16.h", {"hip/hip_fp16.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda_fp8.h", {"hip/hip_fp8.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda_texture_types.h", {"hip/hip_texture_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"texture_fetch_functions.h", {"", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"vector_types.h", {"hip/hip_vector_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cuda_profiler_api.h", {"hip/hip_runtime_api.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"cooperative_groups.h", {"hip/hip_cooperative_groups.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"library_types.h", {"hip/library_types.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, + {"math_constants.h", {"hip/hip_math_constants.h", "", CONV_INCLUDE, API_RUNTIME, 0}}, // cuComplex includes - {"cuComplex.h", {"hip/hip_complex.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_COMPLEX, 0}}, + {"cuComplex.h", {"hip/hip_complex.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_COMPLEX, 0}}, // cuBLAS includes - {"cublas.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE_CUDA_MAIN_H, API_BLAS, 0}}, - {"cublas_v2.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE_CUDA_MAIN_V2_H, API_BLAS, 0}}, - {"cublas_api.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE, API_BLAS, 0}}, - {"cublasLt.h", {"hipblaslt.h", "", CONV_INCLUDE, API_BLAS, 0}}, + {"cublas.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE_CUDA_MAIN_H, API_BLAS, 0}}, + {"cublas_v2.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE_CUDA_MAIN_V2_H, API_BLAS, 0}}, + {"cublas_api.h", {"hipblas.h", "rocblas.h", CONV_INCLUDE, API_BLAS, 0}}, + {"cublasLt.h", {"hipblaslt.h", "", CONV_INCLUDE, API_BLAS, 0}}, // cuRAND includes {"curand.h", {"hiprand/hiprand.h", "rocrand/rocrand.h", CONV_INCLUDE_CUDA_MAIN_H, API_RAND, 0}}, {"curand_kernel.h", {"hiprand/hiprand_kernel.h", "rocrand/rocrand_kernel.h", CONV_INCLUDE, API_RAND, 0}}, @@ -67,36 +67,36 @@ const std::map CUDA_INCLUDE_MAP { {"curand_precalc.h", {"hiprand/hiprand_kernel.h", "rocrand/rocrand_xorwow_precomputed.h", 
CONV_INCLUDE, API_RAND, 0}}, {"curand_uniform.h", {"hiprand/hiprand_kernel.h", "rocrand/rocrand_uniform.h", CONV_INCLUDE, API_RAND, 0}}, // cuDNN includes - {"cudnn.h", {"hipDNN.h", "miopen/miopen.h", CONV_INCLUDE_CUDA_MAIN_H, API_DNN, 0}}, + {"cudnn.h", {"hipDNN.h", "miopen/miopen.h", CONV_INCLUDE_CUDA_MAIN_H, API_DNN, 0}}, // cuTensor includes - {"cutensor.h", {"hiptensor.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_TENSOR, 0}}, - {"cutensorMg.h", {"hiptensor.h", "", CONV_INCLUDE, API_TENSOR, 0}}, + {"cutensor.h", {"hiptensor.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_TENSOR, 0}}, + {"cutensorMg.h", {"hiptensor.h", "", CONV_INCLUDE, API_TENSOR, 0}}, // cuFFT includes - {"cufft.h", {"hipfft/hipfft.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_FFT, 0}}, - {"cufftXt.h", {"hipfft/hipfftXt.h", "", CONV_INCLUDE, API_FFT, 0}}, + {"cufft.h", {"hipfft/hipfft.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_FFT, 0}}, + {"cufftXt.h", {"hipfft/hipfftXt.h", "", CONV_INCLUDE, API_FFT, 0}}, // cuSPARSE includes - {"cusparse.h", {"hipsparse.h", "rocsparse.h", CONV_INCLUDE_CUDA_MAIN_H, API_SPARSE, 0}}, - {"cusparse_v2.h", {"hipsparse.h", "rocsparse.h", CONV_INCLUDE_CUDA_MAIN_V2_H, API_SPARSE, 0}}, + {"cusparse.h", {"hipsparse.h", "rocsparse.h", CONV_INCLUDE_CUDA_MAIN_H, API_SPARSE, 0}}, + {"cusparse_v2.h", {"hipsparse.h", "rocsparse.h", CONV_INCLUDE_CUDA_MAIN_V2_H, API_SPARSE, 0}}, // cuSOLVER includes - {"cusolverDn.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, - {"cusolverMg.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, - {"cusolverRf.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, - {"cusolverSp.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, - {"cusolverSp_LOWLEVEL_PREVIEW.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, - {"cusolver_common.h", {"hipsolver.h", "rocsolver/rocsolver.h", 
CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolverDn.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolverMg.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolverRf.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolverSp.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolverSp_LOWLEVEL_PREVIEW.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, + {"cusolver_common.h", {"hipsolver.h", "rocsolver/rocsolver.h", CONV_INCLUDE_CUDA_MAIN_H, API_SOLVER, 0}}, // CUB includes - {"cub/cub.cuh", {"hipcub/hipcub.hpp", "", CONV_INCLUDE_CUDA_MAIN_H, API_CUB, 0}}, + {"cub/cub.cuh", {"hipcub/hipcub.hpp", "", CONV_INCLUDE_CUDA_MAIN_H, API_CUB, 0}}, // CAFFE2 includes - {"caffe2/core/common_gpu.h", {"caffe2/core/hip/common_gpu.h", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/core/context_gpu.h", {"caffe2/core/hip/context_gpu.h", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/operators/operator_fallback_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/operators/spatial_batch_norm_op.h", {"caffe2/operators/hip/spatial_batch_norm_op_miopen.hip", "", CONV_INCLUDE, API_CAFFE2, 0}}, - {"caffe2/operators/generate_proposals_op_util_nms_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/operators/max_pool_with_index_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/operators/rnn/recurrent_network_executor_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/utils/math/reduce.cuh", {"caffe2/utils/math/hip/reduce.cuh", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/operators/gather_op.cuh", {"caffe2/operators/math/gather_op.cuh", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, - {"caffe2/core/common_cudnn.h", {"caffe2/core/hip/common_miopen.h", "", 
CONV_INCLUDE, API_CAFFE2, 0}}, + {"caffe2/core/common_gpu.h", {"caffe2/core/hip/common_gpu.h", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/core/context_gpu.h", {"caffe2/core/hip/context_gpu.h", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/operators/operator_fallback_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/operators/spatial_batch_norm_op.h", {"caffe2/operators/hip/spatial_batch_norm_op_miopen.hip", "", CONV_INCLUDE, API_CAFFE2, 0}}, + {"caffe2/operators/generate_proposals_op_util_nms_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/operators/max_pool_with_index_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/operators/rnn/recurrent_network_executor_gpu.h", {"", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/utils/math/reduce.cuh", {"caffe2/utils/math/hip/reduce.cuh", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/operators/gather_op.cuh", {"caffe2/operators/math/gather_op.cuh", "", CONV_INCLUDE, API_CAFFE2, 0, UNSUPPORTED}}, + {"caffe2/core/common_cudnn.h", {"caffe2/core/hip/common_miopen.h", "", CONV_INCLUDE, API_CAFFE2, 0}}, // RTC includes {"nvrtc.h", {"hiprtc.h", "", CONV_INCLUDE_CUDA_MAIN_H, API_RTC, 0}}, }; diff --git a/src/CUDA2HIP_BLAS_API_functions.cpp b/src/CUDA2HIP_BLAS_API_functions.cpp index d7cc90bb..537a535a 100644 --- a/src/CUDA2HIP_BLAS_API_functions.cpp +++ b/src/CUDA2HIP_BLAS_API_functions.cpp @@ -26,7 +26,6 @@ using SEC = blas::BLAS_API_SECTIONS; // Map of all functions const std::map CUDA_BLAS_FUNCTION_MAP { - // Blas management functions {"cublasInit", {"hipblasInit", "rocblas_initialize", CONV_LIB_FUNC, API_BLAS, SEC::BLAS_HELPER, HIP_UNSUPPORTED}}, {"cublasShutdown", {"hipblasShutdown", "", CONV_LIB_FUNC, API_BLAS, SEC::BLAS_HELPER, UNSUPPORTED}}, @@ -1933,158 +1932,158 @@ const std::map HIP_BLAS_FUNCTION_VER_MAP { {"hipblasLtMatmulPreferenceSetAttribute", {HIP_5050, HIP_0, HIP_0 }}, 
{"hipblasLtMatmulPreferenceGetAttribute", {HIP_5050, HIP_0, HIP_0 }}, {"hipblasLtMatmulAlgoGetHeuristic", {HIP_5050, HIP_0, HIP_0 }}, - {"hipblasSgbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDgbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSgemv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDgemv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgemv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgemv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSgemvBatched_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDgemvBatched_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgemvBatched_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgemvBatched_v2_64", {HIP_6020, HIP_0, HIP_0, }}, + {"hipblasSgbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDgbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSgemv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDgemv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgemv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgemv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSgemvBatched_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDgemvBatched_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgemvBatched_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgemvBatched_v2_64", {HIP_6020, HIP_0, HIP_0 }}, {"hipblasSgemvStridedBatched", {HIP_3000, HIP_0, HIP_0 }}, {"hipblasDgemvStridedBatched", {HIP_3000, HIP_0, HIP_0 }}, {"hipblasSgemvBatched", {HIP_1060, HIP_0, HIP_0 }}, {"hipblasDgemvBatched", {HIP_3000, HIP_0, HIP_0 }}, - {"hipblasSgemvStridedBatched_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDgemvStridedBatched_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgemvStridedBatched_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgemvStridedBatched_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSger_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDger_64", 
{HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgeru_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCgerc_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgeru_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZgerc_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasChbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZhbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasChemv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZhemv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCher_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZher_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCher2_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZher2_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasChpmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZhpmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasChpr_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZhpr_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasChpr2_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZhpr2_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSsbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDsbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSspmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDspmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSspr_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDspr_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSspr2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDspr2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSsymv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDsymv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCsymv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZsymv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSsyr_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDsyr_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCsyr_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZsyr_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasSsyr2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDsyr2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCsyr2_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZsyr2_v2_64", 
{HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtbmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtbmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStbsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtbsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtbsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtbsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStpmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtpmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtpmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtpmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStpsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtpsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtpsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtpsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStrmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtrmv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtrmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtrmv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasStrsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDtrsv_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasCtrsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasZtrsv_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasAxpyEx_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDotEx_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasDotcEx_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasNrm2Ex_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasRotEx_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasScalEx_v2_64", {HIP_6020, HIP_0, HIP_0, }}, - {"hipblasHgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCgemm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZgemm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasHgemmBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSgemmBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - 
{"hipblasDgemmBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCgemmBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZgemmBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasHgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCgemmStridedBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZgemmStridedBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCherk_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZherk_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCherkx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZherkx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCher2k_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZher2k_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSsymm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDsymm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCsymm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZsymm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSsyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDsyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCsyrk_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZsyrk_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSsyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDsyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCsyr2k_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZsyr2k_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSsyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDsyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCsyrkx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZsyrkx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCgeam_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZgeam_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasChemm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZhemm_v2_64", {HIP_6030, HIP_0, HIP_0, 
}}, - {"hipblasStrmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDtrmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCtrmm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZtrmm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasStrsm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDtrsm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCtrsm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZtrsm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasStrsmBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDtrsmBatched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCtrsmBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZtrsmBatched_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasSdgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasDdgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasCdgmm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasZdgmm_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasGemmEx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasGemmBatchedEx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, - {"hipblasGemmStridedBatchedEx_v2_64", {HIP_6030, HIP_0, HIP_0, }}, + {"hipblasSgemvStridedBatched_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDgemvStridedBatched_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgemvStridedBatched_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgemvStridedBatched_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSger_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDger_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgeru_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCgerc_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgeru_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZgerc_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasChbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZhbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasChemv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZhemv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCher_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZher_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCher2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, 
+ {"hipblasZher2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasChpmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZhpmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasChpr_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZhpr_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasChpr2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZhpr2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSsbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDsbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSspmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDspmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSspr_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDspr_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSspr2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDspr2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSsymv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDsymv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCsymv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZsymv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSsyr_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDsyr_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCsyr_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZsyr_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasSsyr2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDsyr2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCsyr2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZsyr2_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasStbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtbmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtbmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasStbsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtbsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtbsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtbsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasStpmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtpmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtpmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtpmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, 
+ {"hipblasStpsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtpsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtpsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtpsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasStrmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtrmv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtrmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtrmv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasStrsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDtrsv_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasCtrsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasZtrsv_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasAxpyEx_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDotEx_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasDotcEx_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasNrm2Ex_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasRotEx_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasScalEx_v2_64", {HIP_6020, HIP_0, HIP_0 }}, + {"hipblasHgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCgemm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZgemm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasHgemmBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSgemmBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDgemmBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCgemmBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZgemmBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasHgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDgemmStridedBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCgemmStridedBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZgemmStridedBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCherk_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZherk_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCherkx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZherkx_v2_64", 
{HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCher2k_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZher2k_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSsymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDsymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCsymm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZsymm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSsyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDsyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCsyrk_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZsyrk_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSsyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDsyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCsyr2k_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZsyr2k_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSsyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDsyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCsyrkx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZsyrkx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCgeam_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZgeam_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasChemm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZhemm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasStrmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDtrmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCtrmm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZtrmm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasStrsm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDtrsm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCtrsm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZtrsm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasStrsmBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDtrsmBatched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCtrsmBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZtrsmBatched_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasSdgmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasDdgmm_64", 
{HIP_6030, HIP_0, HIP_0 }}, + {"hipblasCdgmm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasZdgmm_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasGemmEx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasGemmBatchedEx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, + {"hipblasGemmStridedBatchedEx_v2_64", {HIP_6030, HIP_0, HIP_0 }}, {"rocblas_status_to_string", {HIP_3050, HIP_0, HIP_0 }}, {"rocblas_sscal", {HIP_1050, HIP_0, HIP_0 }}, @@ -2473,60 +2472,60 @@ const std::map HIP_BLAS_FUNCTION_VER_MAP { {"rocblas_dtrsm_batched_64", {HIP_6020, HIP_0, HIP_0 }}, {"rocblas_ctrsm_batched_64", {HIP_6020, HIP_0, HIP_0 }}, {"rocblas_ztrsm_batched_64", {HIP_6020, HIP_0, HIP_0 }}, - {"rocblas_hgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_sgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zgemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_hgemm_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_sgemm_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dgemm_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cgemm_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zgemm_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_hgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_sgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cherk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zherk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cherkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zherkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cher2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zher2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ssymm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dsymm_64", {HIP_6030, HIP_0, HIP_0, }}, - 
{"rocblas_csymm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zsymm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ssyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dsyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_csyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zsyrk_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ssyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dsyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_csyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zsyr2k_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ssyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dsyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_csyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zsyrkx_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_sgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zgeam_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_chemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zhemm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_strmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_dtrmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ctrmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ztrmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_sdgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_ddgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_cdgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_zdgmm_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_gemm_ex_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_gemm_batched_ex_64", {HIP_6030, HIP_0, HIP_0, }}, - {"rocblas_gemm_strided_batched_ex_64", {HIP_6030, HIP_0, HIP_0, }}, + {"rocblas_hgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_sgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zgemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_hgemm_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_sgemm_batched_64", 
{HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dgemm_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cgemm_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zgemm_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_hgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_sgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zgemm_strided_batched_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cherk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zherk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cherkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zherkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cher2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zher2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ssymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dsymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_csymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zsymm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ssyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dsyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_csyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zsyrk_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ssyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dsyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_csyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zsyr2k_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ssyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dsyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_csyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zsyrkx_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_sgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zgeam_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_chemm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zhemm_64", {HIP_6030, HIP_0, HIP_0 }}, + 
{"rocblas_strmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_dtrmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ctrmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ztrmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_sdgmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_ddgmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_cdgmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_zdgmm_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_gemm_ex_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_gemm_batched_ex_64", {HIP_6030, HIP_0, HIP_0 }}, + {"rocblas_gemm_strided_batched_ex_64", {HIP_6030, HIP_0, HIP_0 }}, }; const std::map HIP_BLAS_FUNCTION_CHANGED_VER_MAP { diff --git a/src/CUDA2HIP_BLAS_API_types.cpp b/src/CUDA2HIP_BLAS_API_types.cpp index 6ec55479..587ed381 100644 --- a/src/CUDA2HIP_BLAS_API_types.cpp +++ b/src/CUDA2HIP_BLAS_API_types.cpp @@ -2241,7 +2241,7 @@ const std::map HIP_BLAS_TYPE_NAME_VER_MAP { {"HIPBLASLT_MATMUL_PREF_SEARCH_MODE", {HIP_5050, HIP_0, HIP_0 }}, {"HIPBLASLT_MATMUL_PREF_MAX_WORKSPACE_BYTES", {HIP_5050, HIP_0, HIP_0 }}, {"hipblasLtMatmulHeuristicResult_t", {HIP_5050, HIP_0, HIP_0 }}, - {"HIPBLASLT_MATMUL_DESC_AMAX_D_POINTER", {HIP_6020, HIP_0, HIP_0, }}, + {"HIPBLASLT_MATMUL_DESC_AMAX_D_POINTER", {HIP_6020, HIP_0, HIP_0 }}, {"rocblas_handle", {HIP_1050, HIP_0, HIP_0 }}, {"_rocblas_handle", {HIP_1050, HIP_0, HIP_0 }}, diff --git a/src/CUDA2HIP_CAFFE2_API_types.cpp b/src/CUDA2HIP_CAFFE2_API_types.cpp index 8b81e171..df37ff9d 100644 --- a/src/CUDA2HIP_CAFFE2_API_types.cpp +++ b/src/CUDA2HIP_CAFFE2_API_types.cpp @@ -38,4 +38,3 @@ const std::map CUDA_CAFFE2_TYPE_NAME_VER_MAP { const std::map HIP_CAFFE2_TYPE_NAME_VER_MAP { }; - diff --git a/src/CUDA2HIP_DNN_API_functions.cpp b/src/CUDA2HIP_DNN_API_functions.cpp index 96c9f6d8..b3dbac0b 100644 --- a/src/CUDA2HIP_DNN_API_functions.cpp +++ b/src/CUDA2HIP_DNN_API_functions.cpp @@ -24,7 +24,6 @@ THE SOFTWARE. 
// Map of all functions const std::map CUDA_DNN_FUNCTION_MAP { - // NOTE: MIOPEN_EXPORT miopenStatus_t miopenGetVersion(size_t* major, size_t* minor, size_t* patch) and size_t CUDNNWINAPI cudnnGetVersion(void) have different signatures {"cudnnGetVersion", {"hipdnnGetVersion", "", CONV_LIB_FUNC, API_DNN, 2, ROC_UNSUPPORTED}}, {"cudnnGetCudartVersion", {"hipdnnGetCudartVersion", "", CONV_LIB_FUNC, API_DNN, 2, UNSUPPORTED}}, diff --git a/src/CUDA2HIP_Driver_API_functions.cpp b/src/CUDA2HIP_Driver_API_functions.cpp index 4f09c001..c327e60b 100644 --- a/src/CUDA2HIP_Driver_API_functions.cpp +++ b/src/CUDA2HIP_Driver_API_functions.cpp @@ -1655,16 +1655,16 @@ const std::map HIP_DRIVER_FUNCTION_VER_MAP { {"hipDrvGraphAddMemcpyNode", {HIP_6000, HIP_0, HIP_0 }}, {"hipDrvGraphAddMemsetNode", {HIP_6010, HIP_0, HIP_0 }}, {"hipTexRefGetBorderColor", {HIP_6010, HIP_6010, HIP_0 }}, - {"hipMemcpyAtoD", {HIP_6020, HIP_0, HIP_0, }}, - {"hipMemcpyDtoA", {HIP_6020, HIP_0, HIP_0, }}, - {"hipMemcpyAtoA", {HIP_6020, HIP_0, HIP_0, }}, - {"hipMemcpyAtoHAsync", {HIP_6020, HIP_0, HIP_0, }}, - {"hipMemcpyHtoAAsync", {HIP_6020, HIP_0, HIP_0, }}, - {"hipDrvGraphAddMemFreeNode", {HIP_6030, HIP_0, HIP_0, }}, - {"hipDrvGraphMemcpyNodeGetParams", {HIP_6030, HIP_0, HIP_0, }}, - {"hipDrvGraphMemcpyNodeSetParams", {HIP_6030, HIP_0, HIP_0, }}, - {"hipDrvGraphExecMemcpyNodeSetParams", {HIP_6030, HIP_0, HIP_0, }}, - {"hipDrvGraphExecMemsetNodeSetParams", {HIP_6030, HIP_0, HIP_0, }}, + {"hipMemcpyAtoD", {HIP_6020, HIP_0, HIP_0 }}, + {"hipMemcpyDtoA", {HIP_6020, HIP_0, HIP_0 }}, + {"hipMemcpyAtoA", {HIP_6020, HIP_0, HIP_0 }}, + {"hipMemcpyAtoHAsync", {HIP_6020, HIP_0, HIP_0 }}, + {"hipMemcpyHtoAAsync", {HIP_6020, HIP_0, HIP_0 }}, + {"hipDrvGraphAddMemFreeNode", {HIP_6030, HIP_0, HIP_0 }}, + {"hipDrvGraphMemcpyNodeGetParams", {HIP_6030, HIP_0, HIP_0 }}, + {"hipDrvGraphMemcpyNodeSetParams", {HIP_6030, HIP_0, HIP_0 }}, + {"hipDrvGraphExecMemcpyNodeSetParams", {HIP_6030, HIP_0, HIP_0 }}, + 
{"hipDrvGraphExecMemsetNodeSetParams", {HIP_6030, HIP_0, HIP_0 }}, {"hipStreamBatchMemOp", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipGraphAddBatchMemOpNode", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipGraphBatchMemOpNodeGetParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, diff --git a/src/CUDA2HIP_Driver_API_types.cpp b/src/CUDA2HIP_Driver_API_types.cpp index 4fb6d8e3..978d32e2 100644 --- a/src/CUDA2HIP_Driver_API_types.cpp +++ b/src/CUDA2HIP_Driver_API_types.cpp @@ -4298,12 +4298,12 @@ const std::map HIP_DRIVER_TYPE_NAME_VER_MAP { {"hipDeviceAttributeHostRegisterSupported", {HIP_6000, HIP_0, HIP_0 }}, {"hipExternalSemaphoreSignalNodeParams", {HIP_6000, HIP_0, HIP_0 }}, {"hipExternalSemaphoreWaitNodeParams", {HIP_6000, HIP_0, HIP_0 }}, - {"hipDriverProcAddressQueryResult", {HIP_6020, HIP_0, HIP_0, }}, - {"HIP_GET_PROC_ADDRESS_SUCCESS", {HIP_6020, HIP_0, HIP_0, }}, - {"HIP_GET_PROC_ADDRESS_SYMBOL_NOT_FOUND", {HIP_6020, HIP_0, HIP_0, }}, - {"HIP_GET_PROC_ADDRESS_VERSION_NOT_SUFFICIENT", {HIP_6020, HIP_0, HIP_0, }}, + {"hipDriverProcAddressQueryResult", {HIP_6020, HIP_0, HIP_0 }}, + {"HIP_GET_PROC_ADDRESS_SUCCESS", {HIP_6020, HIP_0, HIP_0 }}, + {"HIP_GET_PROC_ADDRESS_SYMBOL_NOT_FOUND", {HIP_6020, HIP_0, HIP_0 }}, + {"HIP_GET_PROC_ADDRESS_VERSION_NOT_SUFFICIENT", {HIP_6020, HIP_0, HIP_0 }}, {"HIP_MEMSET_NODE_PARAMS", {HIP_6010, HIP_0, HIP_0 }}, - {"hipStreamLegacy", {HIP_6020, HIP_0, HIP_0, }}, + {"hipStreamLegacy", {HIP_6020, HIP_0, HIP_0 }}, {"hipStreamBatchMemOpParams_union", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipStreamBatchMemOpParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, {"hipBatchMemOpNodeParams", {HIP_6040, HIP_0, HIP_0, HIP_LATEST}}, diff --git a/src/CUDA2HIP_FFT_API_types.cpp b/src/CUDA2HIP_FFT_API_types.cpp index 9011dd23..b2b86555 100644 --- a/src/CUDA2HIP_FFT_API_types.cpp +++ b/src/CUDA2HIP_FFT_API_types.cpp @@ -24,7 +24,6 @@ THE SOFTWARE. 
// Map of all functions const std::map CUDA_FFT_TYPE_NAME_MAP { - // cuFFT defines {"CUFFT_FORWARD", {"HIPFFT_FORWARD", "", CONV_NUMERIC_LITERAL, API_FFT, 1}}, // -1 {"CUFFT_INVERSE", {"HIPFFT_BACKWARD", "", CONV_NUMERIC_LITERAL, API_FFT, 1}}, // 1 diff --git a/src/CUDA2HIP_RAND_API_functions.cpp b/src/CUDA2HIP_RAND_API_functions.cpp index 4d1647cb..06597805 100644 --- a/src/CUDA2HIP_RAND_API_functions.cpp +++ b/src/CUDA2HIP_RAND_API_functions.cpp @@ -87,112 +87,112 @@ const std::map CUDA_RAND_FUNCTION_MAP { }; const std::map CUDA_RAND_FUNCTION_VER_MAP { - {"curandGetProperty", {CUDA_80, CUDA_0, CUDA_0, }}, - {"__curand_umul", {CUDA_115, CUDA_0, CUDA_0, }}, + {"curandGetProperty", {CUDA_80, CUDA_0, CUDA_0 }}, + {"__curand_umul", {CUDA_115, CUDA_0, CUDA_0 }}, }; const std::map HIP_RAND_FUNCTION_VER_MAP { - {"hiprandCreateGenerator", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandCreateGeneratorHost", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandCreatePoissonDistribution", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandDestroyDistribution", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandDestroyGenerator", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerate", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateLogNormal", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateLogNormalDouble", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateNormal", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateNormalDouble", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGeneratePoisson", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateSeeds", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateUniform", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerateUniformDouble", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGetVersion", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandSetGeneratorOffset", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandSetPseudoRandomGeneratorSeed", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandSetQuasiRandomGeneratorDimensions", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandSetStream", {HIP_1050, HIP_0, HIP_0, }}, - 
{"hiprandMakeMTGP32Constants", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandMakeMTGP32KernelState", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_init", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal2", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal2_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal4", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_log_normal4_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal2", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal2_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal4", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_normal4_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_uniform", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_uniform_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_uniform2_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_uniform4", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_uniform4_double", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_discrete", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_discrete4", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_poisson", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprand_poisson4", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGetDirectionVectors32", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandGetDirectionVectors64", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandGetScrambleConstants32", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandGetScrambleConstants64", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandSetGeneratorOrdering", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandGenerateLongLong", {HIP_5050, HIP_0, HIP_0, }}, + {"hiprandCreateGenerator", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandCreateGeneratorHost", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandCreatePoissonDistribution", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandDestroyDistribution", {HIP_1050, HIP_0, HIP_0 
}}, + {"hiprandDestroyGenerator", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerate", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateLogNormal", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateLogNormalDouble", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateNormal", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateNormalDouble", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGeneratePoisson", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateSeeds", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateUniform", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerateUniformDouble", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGetVersion", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandSetGeneratorOffset", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandSetPseudoRandomGeneratorSeed", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandSetQuasiRandomGeneratorDimensions", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandSetStream", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandMakeMTGP32Constants", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandMakeMTGP32KernelState", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_init", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal2", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal2_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal4", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_log_normal4_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal2", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal2_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal4", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_normal4_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_uniform", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_uniform_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_uniform2_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_uniform4", {HIP_1050, HIP_0, HIP_0 }}, + 
{"hiprand_uniform4_double", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_discrete", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_discrete4", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_poisson", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprand_poisson4", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGetDirectionVectors32", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandGetDirectionVectors64", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandGetScrambleConstants32", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandGetScrambleConstants64", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandSetGeneratorOrdering", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandGenerateLongLong", {HIP_5050, HIP_0, HIP_0 }}, - {"rocrand_create_generator", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_create_generator_host_blocking", {HIP_6020, HIP_0, HIP_0, }}, - {"rocrand_destroy_generator", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_long_long", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_generate_uniform", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_uniform_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_log_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_log_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generate_poisson", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_initialize_generator", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_set_stream", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_set_seed", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_set_offset", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_set_ordering", {HIP_5050, HIP_0, HIP_0, }}, - {"rocrand_set_quasi_random_generator_dimensions", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_get_version", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_create_poisson_distribution", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_get_direction_vectors32", {HIP_6000, HIP_0, HIP_0, }}, - {"rocrand_get_direction_vectors64", {HIP_6000, HIP_0, 
HIP_0, }}, - {"rocrand_get_scramble_constants32", {HIP_6000, HIP_0, HIP_0, }}, - {"rocrand_get_scramble_constants64", {HIP_6000, HIP_0, HIP_0, }}, - {"rocrand_destroy_discrete_distribution", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_make_constant", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_make_state_mtgp32", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_init", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal2", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal_double2", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_log_normal_double4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal2", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal_double2", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_normal_double4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_uniform", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_uniform_double", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_uniform_double2", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_uniform4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_uniform_double4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_discrete", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_discrete4", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_poisson", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_poisson4", {HIP_1050, HIP_0, HIP_0, }}, + {"rocrand_create_generator", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_create_generator_host_blocking", {HIP_6020, HIP_0, HIP_0 }}, + {"rocrand_destroy_generator", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_long_long", {HIP_5040, HIP_0, HIP_0 }}, + {"rocrand_generate_uniform", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_uniform_double", 
{HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_log_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_log_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generate_poisson", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_initialize_generator", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_set_stream", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_set_seed", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_set_offset", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_set_ordering", {HIP_5050, HIP_0, HIP_0 }}, + {"rocrand_set_quasi_random_generator_dimensions", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_get_version", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_create_poisson_distribution", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_get_direction_vectors32", {HIP_6000, HIP_0, HIP_0 }}, + {"rocrand_get_direction_vectors64", {HIP_6000, HIP_0, HIP_0 }}, + {"rocrand_get_scramble_constants32", {HIP_6000, HIP_0, HIP_0 }}, + {"rocrand_get_scramble_constants64", {HIP_6000, HIP_0, HIP_0 }}, + {"rocrand_destroy_discrete_distribution", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_make_constant", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_make_state_mtgp32", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_init", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal2", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal_double2", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_log_normal_double4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal_double", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal2", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal_double2", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_normal_double4", 
{HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_uniform", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_uniform_double", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_uniform_double2", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_uniform4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_uniform_double4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_discrete", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_discrete4", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_poisson", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_poisson4", {HIP_1050, HIP_0, HIP_0 }}, }; const std::map CUDA_RAND_API_SECTION_MAP { diff --git a/src/CUDA2HIP_RAND_API_types.cpp b/src/CUDA2HIP_RAND_API_types.cpp index b236fee1..fb39a89d 100644 --- a/src/CUDA2HIP_RAND_API_types.cpp +++ b/src/CUDA2HIP_RAND_API_types.cpp @@ -139,128 +139,128 @@ const std::map CUDA_RAND_TYPE_NAME_MAP { }; const std::map CUDA_RAND_TYPE_NAME_VER_MAP { - {"CURAND_ORDERING_PSEUDO_LEGACY", {CUDA_110, CUDA_0, CUDA_0, }}, // A: CUDA_VERSION 11001, CURAND_VERSION 10200, CURAND_VER_MAJOR 10 CURAND_VER_MINOR 2 CURAND_VER_PATCH 0 - {"CURAND_ORDERING_PSEUDO_DYNAMIC", {CUDA_115, CUDA_0, CUDA_0, }}, // A: CUDA_VERSION 11052, CURAND_VERSION 10207, CURAND_VER_MAJOR 10 CURAND_VER_MINOR 2 CURAND_VER_PATCH 7 + {"CURAND_ORDERING_PSEUDO_LEGACY", {CUDA_110, CUDA_0, CUDA_0 }}, // A: CUDA_VERSION 11001, CURAND_VERSION 10200, CURAND_VER_MAJOR 10 CURAND_VER_MINOR 2 CURAND_VER_PATCH 0 + {"CURAND_ORDERING_PSEUDO_DYNAMIC", {CUDA_115, CUDA_0, CUDA_0 }}, // A: CUDA_VERSION 11052, CURAND_VERSION 10207, CURAND_VER_MAJOR 10 CURAND_VER_MINOR 2 CURAND_VER_PATCH 7 }; const std::map HIP_RAND_TYPE_NAME_VER_MAP { - {"hiprandStatus", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandStatus_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandRngType_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerator_st", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandGenerator_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandDiscreteDistribution_st", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandDiscreteDistribution_t", {HIP_1050, HIP_0, HIP_0, }}, - 
{"hiprandDirectionVectors32_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandStateMtgp32", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStateMtgp32_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandStateSobol32", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStateSobol32_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandStateMRG32k3a", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStateMRG32k3a_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandStatePhilox4_32_10", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStatePhilox4_32_10_t", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStateXORWOW", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandStateXORWOW_t", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandState", {HIP_1080, HIP_0, HIP_0, }}, - {"hiprandState_t", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_SUCCESS", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_VERSION_MISMATCH", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_NOT_INITIALIZED", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_ALLOCATION_FAILED", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_TYPE_ERROR", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_OUT_OF_RANGE", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_LENGTH_NOT_MULTIPLE", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_DOUBLE_PRECISION_REQUIRED", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_LAUNCH_FAILURE", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_PREEXISTING_FAILURE", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_INITIALIZATION_FAILED", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_ARCH_MISMATCH", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_STATUS_INTERNAL_ERROR", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_TEST", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_DEFAULT", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_XORWOW", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_MRG32K3A", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_MTGP32", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_MT19937", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_PSEUDO_PHILOX4_32_10", 
{HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_QUASI_DEFAULT", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_QUASI_SOBOL32", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_QUASI_SCRAMBLED_SOBOL32", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_QUASI_SOBOL64", {HIP_1050, HIP_0, HIP_0, }}, - {"HIPRAND_RNG_QUASI_SCRAMBLED_SOBOL64", {HIP_1050, HIP_0, HIP_0, }}, - {"hiprandDirectionVectorSet_t", {HIP_6000, HIP_0, HIP_0, }}, - {"HIPRAND_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"HIPRAND_SCRAMBLED_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"HIPRAND_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"HIPRAND_SCRAMBLED_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandDirectionVectors64_t", {HIP_6000, HIP_0, HIP_0, }}, - {"hiprandOrdering", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandOrdering_t", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_PSEUDO_BEST", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_PSEUDO_DEFAULT", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_PSEUDO_SEEDED", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_PSEUDO_LEGACY", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_PSEUDO_DYNAMIC", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPRAND_ORDERING_QUASI_DEFAULT", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateScrambledSobol32", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateScrambledSobol32_t", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateScrambledSobol64", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateScrambledSobol64_t", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateSobol64", {HIP_6020, HIP_0, HIP_0, }}, - {"hiprandStateSobol64_t", {HIP_6020, HIP_0, HIP_0, }}, + {"hiprandStatus", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandStatus_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandRngType_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerator_st", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandGenerator_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandDiscreteDistribution_st", {HIP_1050, HIP_0, HIP_0 }}, + 
{"hiprandDiscreteDistribution_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandDirectionVectors32_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandStateMtgp32", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStateMtgp32_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandStateSobol32", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStateSobol32_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandStateMRG32k3a", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStateMRG32k3a_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandStatePhilox4_32_10", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStatePhilox4_32_10_t", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStateXORWOW", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandStateXORWOW_t", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandState", {HIP_1080, HIP_0, HIP_0 }}, + {"hiprandState_t", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_SUCCESS", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_VERSION_MISMATCH", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_NOT_INITIALIZED", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_ALLOCATION_FAILED", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_TYPE_ERROR", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_OUT_OF_RANGE", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_LENGTH_NOT_MULTIPLE", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_DOUBLE_PRECISION_REQUIRED", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_LAUNCH_FAILURE", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_PREEXISTING_FAILURE", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_INITIALIZATION_FAILED", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_ARCH_MISMATCH", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_STATUS_INTERNAL_ERROR", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_TEST", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_PSEUDO_DEFAULT", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_PSEUDO_XORWOW", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_PSEUDO_MRG32K3A", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_PSEUDO_MTGP32", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_PSEUDO_MT19937", {HIP_1050, HIP_0, HIP_0 }}, + 
{"HIPRAND_RNG_PSEUDO_PHILOX4_32_10", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_QUASI_DEFAULT", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_QUASI_SOBOL32", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_QUASI_SCRAMBLED_SOBOL32", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_QUASI_SOBOL64", {HIP_1050, HIP_0, HIP_0 }}, + {"HIPRAND_RNG_QUASI_SCRAMBLED_SOBOL64", {HIP_1050, HIP_0, HIP_0 }}, + {"hiprandDirectionVectorSet_t", {HIP_6000, HIP_0, HIP_0 }}, + {"HIPRAND_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"HIPRAND_SCRAMBLED_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"HIPRAND_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"HIPRAND_SCRAMBLED_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandDirectionVectors64_t", {HIP_6000, HIP_0, HIP_0 }}, + {"hiprandOrdering", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandOrdering_t", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_PSEUDO_BEST", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_PSEUDO_DEFAULT", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_PSEUDO_SEEDED", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_PSEUDO_LEGACY", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_PSEUDO_DYNAMIC", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPRAND_ORDERING_QUASI_DEFAULT", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateScrambledSobol32", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateScrambledSobol32_t", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateScrambledSobol64", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateScrambledSobol64_t", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateSobol64", {HIP_6020, HIP_0, HIP_0 }}, + {"hiprandStateSobol64_t", {HIP_6020, HIP_0, HIP_0 }}, - {"rocrand_status", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_SUCCESS", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_VERSION_MISMATCH", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_NOT_CREATED", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_ALLOCATION_FAILED", {HIP_1050, HIP_0, HIP_0, }}, - 
{"ROCRAND_STATUS_TYPE_ERROR", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_OUT_OF_RANGE", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_LENGTH_NOT_MULTIPLE", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_DOUBLE_PRECISION_REQUIRED", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_LAUNCH_FAILURE", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_STATUS_INTERNAL_ERROR", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_rng_type", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_DEFAULT", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_XORWOW", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_MRG32K3A", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_MTGP32", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_MT19937", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_PSEUDO_PHILOX4_32_10", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_QUASI_DEFAULT", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_QUASI_SOBOL32", {HIP_1050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL32", {HIP_5040, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_QUASI_SOBOL64", {HIP_4050, HIP_0, HIP_0, }}, - {"ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL64", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_ordering", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_PSEUDO_BEST", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_PSEUDO_DEFAULT", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_PSEUDO_SEEDED", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_PSEUDO_LEGACY", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_PSEUDO_DYNAMIC", {HIP_5050, HIP_0, HIP_0, }}, - {"ROCRAND_ORDERING_QUASI_DEFAULT", {HIP_5050, HIP_0, HIP_0, }}, - {"rocrand_direction_vector_set", {HIP_6000, HIP_0, HIP_0, }}, - {"ROCRAND_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"ROCRAND_SCRAMBLED_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"ROCRAND_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - {"ROCRAND_SCRAMBLED_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0, }}, - 
{"rocrand_generator_base_type", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_generator", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_discrete_distribution_st", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_discrete_distribution", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_device::philox4x32_10_engine", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_state_philox4x32_10", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_device::mtgp32_engine", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_state_mtgp32", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_device::scrambled_sobol32_engine", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_state_scrambled_sobol32", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_device::scrambled_sobol64_engine", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_state_scrambled_sobol64", {HIP_5040, HIP_0, HIP_0, }}, - {"rocrand_device::sobol32_engine", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_state_sobol32", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_device::sobol64_engine", {HIP_4050, HIP_0, HIP_0, }}, - {"rocrand_state_sobol64", {HIP_4050, HIP_0, HIP_0, }}, - {"rocrand_device::mrg32k3a_engine", {HIP_1050, HIP_0, HIP_0, }}, - {"rocrand_state_mrg32k3a", {HIP_1050, HIP_0, HIP_0, }}, + {"rocrand_status", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_SUCCESS", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_VERSION_MISMATCH", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_NOT_CREATED", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_ALLOCATION_FAILED", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_TYPE_ERROR", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_OUT_OF_RANGE", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_LENGTH_NOT_MULTIPLE", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_DOUBLE_PRECISION_REQUIRED", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_LAUNCH_FAILURE", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_STATUS_INTERNAL_ERROR", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_rng_type", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_DEFAULT", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_XORWOW", 
{HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_MRG32K3A", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_MTGP32", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_MT19937", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_PSEUDO_PHILOX4_32_10", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_QUASI_DEFAULT", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_QUASI_SOBOL32", {HIP_1050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL32", {HIP_5040, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_QUASI_SOBOL64", {HIP_4050, HIP_0, HIP_0 }}, + {"ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL64", {HIP_5040, HIP_0, HIP_0 }}, + {"rocrand_ordering", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_PSEUDO_BEST", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_PSEUDO_DEFAULT", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_PSEUDO_SEEDED", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_PSEUDO_LEGACY", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_PSEUDO_DYNAMIC", {HIP_5050, HIP_0, HIP_0 }}, + {"ROCRAND_ORDERING_QUASI_DEFAULT", {HIP_5050, HIP_0, HIP_0 }}, + {"rocrand_direction_vector_set", {HIP_6000, HIP_0, HIP_0 }}, + {"ROCRAND_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"ROCRAND_SCRAMBLED_DIRECTION_VECTORS_32_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"ROCRAND_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"ROCRAND_SCRAMBLED_DIRECTION_VECTORS_64_JOEKUO6", {HIP_6000, HIP_0, HIP_0 }}, + {"rocrand_generator_base_type", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_generator", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_discrete_distribution_st", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_discrete_distribution", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_device::philox4x32_10_engine", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_state_philox4x32_10", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_device::mtgp32_engine", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_state_mtgp32", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_device::scrambled_sobol32_engine", {HIP_5040, HIP_0, HIP_0 }}, + 
{"rocrand_state_scrambled_sobol32", {HIP_5040, HIP_0, HIP_0 }}, + {"rocrand_device::scrambled_sobol64_engine", {HIP_5040, HIP_0, HIP_0 }}, + {"rocrand_state_scrambled_sobol64", {HIP_5040, HIP_0, HIP_0 }}, + {"rocrand_device::sobol32_engine", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_state_sobol32", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_device::sobol64_engine", {HIP_4050, HIP_0, HIP_0 }}, + {"rocrand_state_sobol64", {HIP_4050, HIP_0, HIP_0 }}, + {"rocrand_device::mrg32k3a_engine", {HIP_1050, HIP_0, HIP_0 }}, + {"rocrand_state_mrg32k3a", {HIP_1050, HIP_0, HIP_0 }}, }; diff --git a/src/CUDA2HIP_Runtime_API_functions.cpp b/src/CUDA2HIP_Runtime_API_functions.cpp index 6550e309..17cf7a19 100644 --- a/src/CUDA2HIP_Runtime_API_functions.cpp +++ b/src/CUDA2HIP_Runtime_API_functions.cpp @@ -1427,16 +1427,16 @@ const std::map HIP_RUNTIME_FUNCTION_VER_MAP { {"hipGraphExternalSemaphoresWaitNodeSetParams", {HIP_5070, HIP_0, HIP_0 }}, {"hipGraphExecExternalSemaphoresSignalNodeSetParams", {HIP_5070, HIP_0, HIP_0 }}, {"hipGraphExecExternalSemaphoresWaitNodeSetParams", {HIP_5070, HIP_0, HIP_0 }}, - {"hipGraphInstantiateWithParams", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphAddNode", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGetProcAddress", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGetFuncBySymbol", {HIP_6020, HIP_0, HIP_0, }}, - {"hipStreamBeginCaptureToGraph", {HIP_6020, HIP_0, HIP_0, }}, - {"hipSetValidDevices", {HIP_6020, HIP_0, HIP_0, }}, - {"hipMemcpy2DArrayToArray", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphExecGetFlags", {HIP_6030, HIP_0, HIP_0, }}, - {"hipGraphNodeSetParams", {HIP_6030, HIP_0, HIP_0, }}, - {"hipGraphExecNodeSetParams", {HIP_6030, HIP_0, HIP_0, }}, + {"hipGraphInstantiateWithParams", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphAddNode", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGetProcAddress", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGetFuncBySymbol", {HIP_6020, HIP_0, HIP_0 }}, + {"hipStreamBeginCaptureToGraph", {HIP_6020, HIP_0, HIP_0 }}, + {"hipSetValidDevices", {HIP_6020, HIP_0, 
HIP_0 }}, + {"hipMemcpy2DArrayToArray", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphExecGetFlags", {HIP_6030, HIP_0, HIP_0 }}, + {"hipGraphNodeSetParams", {HIP_6030, HIP_0, HIP_0 }}, + {"hipGraphExecNodeSetParams", {HIP_6030, HIP_0, HIP_0 }}, }; const std::map CUDA_RUNTIME_FUNCTION_CHANGED_VER_MAP { diff --git a/src/CUDA2HIP_Runtime_API_types.cpp b/src/CUDA2HIP_Runtime_API_types.cpp index bf2c801f..e76af61b 100644 --- a/src/CUDA2HIP_Runtime_API_types.cpp +++ b/src/CUDA2HIP_Runtime_API_types.cpp @@ -26,7 +26,6 @@ using SEC = runtime::CUDA_RUNTIME_API_SECTIONS; // Maps the names of CUDA RUNTIME API types to the corresponding HIP types const std::map CUDA_RUNTIME_TYPE_NAME_MAP { - // 1. Structs // no analogue @@ -3081,32 +3080,32 @@ const std::map HIP_RUNTIME_TYPE_NAME_VER_MAP { {"hipGPUDirectRDMAWritesOrderingNone", {HIP_6010, HIP_0, HIP_0 }}, {"hipGPUDirectRDMAWritesOrderingOwner", {HIP_6010, HIP_0, HIP_0 }}, {"hipGPUDirectRDMAWritesOrderingAllDevices", {HIP_6010, HIP_0, HIP_0 }}, - {"hipGraphInstantiateResult", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateSuccess", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateError", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateInvalidStructure", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateNodeOperationNotSupported", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateMultipleDevicesNotSupported", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphInstantiateParams", {HIP_6020, HIP_0, HIP_0, }}, + {"hipGraphInstantiateResult", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateSuccess", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateError", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateInvalidStructure", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateNodeOperationNotSupported", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateMultipleDevicesNotSupported", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphInstantiateParams", {HIP_6020, HIP_0, HIP_0 }}, {"hipMemcpyNodeParams", {HIP_6010, HIP_0, HIP_0 }}, 
{"hipChildGraphNodeParams", {HIP_6010, HIP_0, HIP_0 }}, {"hipEventWaitNodeParams", {HIP_6010, HIP_0, HIP_0 }}, {"hipEventRecordNodeParams", {HIP_6010, HIP_0, HIP_0 }}, {"hipMemFreeNodeParams", {HIP_6010, HIP_0, HIP_0 }}, {"hipGraphNodeParams", {HIP_6010, HIP_0, HIP_0 }}, - {"hipLaunchAttributeID", {HIP_6020, HIP_0, HIP_0, }}, - {"hipLaunchAttributeAccessPolicyWindow", {HIP_6020, HIP_0, HIP_0, }}, - {"hipLaunchAttributeCooperative", {HIP_6020, HIP_0, HIP_0, }}, - {"hipLaunchAttributePriority", {HIP_6020, HIP_0, HIP_0, }}, - {"hipLaunchAttributeValue", {HIP_6020, HIP_0, HIP_0, }}, - {"hipKernelNodeAttributePriority", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphKernelNodePortDefault", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphKernelNodePortLaunchCompletion", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphKernelNodePortProgrammatic", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphDependencyType", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphDependencyTypeDefault", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphDependencyTypeProgrammatic", {HIP_6020, HIP_0, HIP_0, }}, - {"hipGraphEdgeData", {HIP_6020, HIP_0, HIP_0, }}, + {"hipLaunchAttributeID", {HIP_6020, HIP_0, HIP_0 }}, + {"hipLaunchAttributeAccessPolicyWindow", {HIP_6020, HIP_0, HIP_0 }}, + {"hipLaunchAttributeCooperative", {HIP_6020, HIP_0, HIP_0 }}, + {"hipLaunchAttributePriority", {HIP_6020, HIP_0, HIP_0 }}, + {"hipLaunchAttributeValue", {HIP_6020, HIP_0, HIP_0 }}, + {"hipKernelNodeAttributePriority", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphKernelNodePortDefault", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphKernelNodePortLaunchCompletion", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphKernelNodePortProgrammatic", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphDependencyType", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphDependencyTypeDefault", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphDependencyTypeProgrammatic", {HIP_6020, HIP_0, HIP_0 }}, + {"hipGraphEdgeData", {HIP_6020, HIP_0, HIP_0 }}, {"HIP_INF_F", {HIP_5030, HIP_0, HIP_0 }}, {"HIP_NAN_F", {HIP_5030, 
HIP_0, HIP_0 }}, {"HIP_MIN_DENORM_F", {HIP_5030, HIP_0, HIP_0 }}, diff --git a/src/CUDA2HIP_SOLVER_API_functions.cpp b/src/CUDA2HIP_SOLVER_API_functions.cpp index 4de55ef1..f873fb0e 100644 --- a/src/CUDA2HIP_SOLVER_API_functions.cpp +++ b/src/CUDA2HIP_SOLVER_API_functions.cpp @@ -1438,19 +1438,19 @@ const std::map HIP_SOLVER_FUNCTION_VER_MAP { {"hipsolverSpDcsrlsvchol", {HIP_6010, HIP_0, HIP_0 }}, {"hipsolverSpScsrlsvcholHost", {HIP_6010, HIP_0, HIP_0 }}, {"hipsolverSpDcsrlsvcholHost", {HIP_6010, HIP_0, HIP_0 }}, - {"hipsolverDnCreateParams", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnDestroyParams", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnSetAdvOptions", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnXgetrf", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnXgetrf_bufferSize", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnXgetrs", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnSetDeterministicMode", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnGetDeterministicMode", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnXgeqrf_bufferSize", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnXgeqrf", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnXpotrf_bufferSize", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnXpotrf", {HIP_6030, HIP_0, HIP_0, }}, - {"hipsolverDnXpotrs", {HIP_6030, HIP_0, HIP_0, }}, + {"hipsolverDnCreateParams", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnDestroyParams", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnSetAdvOptions", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnXgetrf", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnXgetrf_bufferSize", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnXgetrs", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnSetDeterministicMode", {HIP_6030, HIP_0, HIP_0 }}, + {"hipsolverDnGetDeterministicMode", {HIP_6030, HIP_0, HIP_0 }}, + {"hipsolverDnXgeqrf_bufferSize", {HIP_6030, HIP_0, HIP_0 }}, + {"hipsolverDnXgeqrf", {HIP_6030, HIP_0, HIP_0 }}, + {"hipsolverDnXpotrf_bufferSize", {HIP_6030, HIP_0, HIP_0 }}, + {"hipsolverDnXpotrf", {HIP_6030, HIP_0, HIP_0 }}, + 
{"hipsolverDnXpotrs", {HIP_6030, HIP_0, HIP_0 }}, {"rocsolver_spotrf", {HIP_3020, HIP_0, HIP_0 }}, {"rocsolver_dpotrf", {HIP_3020, HIP_0, HIP_0 }}, diff --git a/src/CUDA2HIP_SOLVER_API_types.cpp b/src/CUDA2HIP_SOLVER_API_types.cpp index 87001718..5e6023f2 100644 --- a/src/CUDA2HIP_SOLVER_API_types.cpp +++ b/src/CUDA2HIP_SOLVER_API_types.cpp @@ -323,15 +323,15 @@ const std::map HIP_SOLVER_TYPE_NAME_VER_MAP { {"hipsolverRfHandle_t", {HIP_5060, HIP_0, HIP_0 }}, {"HIPSOLVER_STATUS_MATRIX_TYPE_NOT_SUPPORTED", {HIP_6010, HIP_0, HIP_0 }}, {"hipsolverSpHandle_t", {HIP_6010, HIP_0, HIP_0 }}, - {"hipsolverDnParams_t", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverAlgMode_t", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPSOLVER_ALG_0", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPSOLVER_ALG_1", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDnFunction_t", {HIP_6020, HIP_0, HIP_0, }}, - {"HIPSOLVERDN_GETRF", {HIP_6020, HIP_0, HIP_0, }}, - {"hipsolverDeterministicMode_t", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPSOLVER_DETERMINISTIC_RESULTS", {HIP_6030, HIP_0, HIP_0, }}, - {"HIPSOLVER_ALLOW_NON_DETERMINISTIC_RESULTS", {HIP_6030, HIP_0, HIP_0, }}, + {"hipsolverDnParams_t", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverAlgMode_t", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPSOLVER_ALG_0", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPSOLVER_ALG_1", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDnFunction_t", {HIP_6020, HIP_0, HIP_0 }}, + {"HIPSOLVERDN_GETRF", {HIP_6020, HIP_0, HIP_0 }}, + {"hipsolverDeterministicMode_t", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPSOLVER_DETERMINISTIC_RESULTS", {HIP_6030, HIP_0, HIP_0 }}, + {"HIPSOLVER_ALLOW_NON_DETERMINISTIC_RESULTS", {HIP_6030, HIP_0, HIP_0 }}, {"rocblas_int", {HIP_3000, HIP_0, HIP_0 }}, {"rocblas_status", {HIP_3000, HIP_0, HIP_0 }}, diff --git a/src/StringUtils.cpp b/src/StringUtils.cpp index 29f7e5fb..01d47c35 100644 --- a/src/StringUtils.cpp +++ b/src/StringUtils.cpp @@ -88,4 +88,3 @@ std::string getAbsoluteDirectoryPath(const std::string &sDir, std::error_code &E } return dirAbsPath.c_str(); } 
- From 59ee744067d8947af1f96ff497daa691de120846 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Fri, 3 Jan 2025 14:04:53 +0000 Subject: [PATCH 14/17] Bump rocm-docs-core from 1.12.0 to 1.12.1 in /docs/sphinx Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.12.0 to 1.12.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.12.0...v1.12.1) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] --- docs/sphinx/requirements.in | 2 +- docs/sphinx/requirements.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sphinx/requirements.in b/docs/sphinx/requirements.in index 89fa3e55..8d3ce811 100644 --- a/docs/sphinx/requirements.in +++ b/docs/sphinx/requirements.in @@ -1 +1 @@ -rocm-docs-core==1.12.0 +rocm-docs-core==1.12.1 diff --git a/docs/sphinx/requirements.txt b/docs/sphinx/requirements.txt index 3da6c8bf..f7df9c4d 100644 --- a/docs/sphinx/requirements.txt +++ b/docs/sphinx/requirements.txt @@ -92,7 +92,7 @@ requests==2.32.2 # via # pygithub # sphinx -rocm-docs-core==1.12.0 +rocm-docs-core==1.12.1 # via -r requirements.in smmap==5.0.1 # via gitdb From cd1402e61fd3485e25240fae32a3bac9a3c36065 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Fri, 3 Jan 2025 15:45:36 +0000 Subject: [PATCH 15/17] Bump jinja2 from 3.1.4 to 3.1.5 in /docs/sphinx Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. 
- [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5) --- updated-dependencies: - dependency-name: jinja2 dependency-type: indirect ... Signed-off-by: dependabot[bot] --- docs/sphinx/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sphinx/requirements.txt b/docs/sphinx/requirements.txt index f7df9c4d..225be42f 100644 --- a/docs/sphinx/requirements.txt +++ b/docs/sphinx/requirements.txt @@ -46,7 +46,7 @@ idna==3.7 # via requests imagesize==1.4.1 # via sphinx -jinja2==3.1.4 +jinja2==3.1.5 # via # myst-parser # sphinx From 06e543a733cd98cccf65d4e5272c68a415925eee Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Mon, 6 Jan 2025 19:27:46 +0100 Subject: [PATCH 16/17] [HIPIFY][doc] Updated the `LICENSE.txt` and `CHANGELOG.md` --- CHANGELOG.md | 21 +++++++++++++++++++++ LICENSE.txt | 2 +- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index b94c1348..f06363a7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,27 @@ Documentation for HIPIFY is available at [https://rocmdocs.amd.com/projects/HIPIFY/en/latest/](https://rocmdocs.amd.com/projects/HIPIFY/en/latest/). 
+## HIPIFY for ROCm 6.4.0 + +### Added + +* CUDA 12.6.3 support +* cuDNN 9.6.0 support +* cuTENSOR 2.0.2.1 support +* LLVM 19.1.6 support +* Full support for direct hipification of `cuRAND` into `rocRAND` under the `--roc` option +* [#1617] Support for `fp8` math device/host API + +### Resolved issues + +* `MIOpen` support in hipify-perl under the `-miopen` option +* [#1769] Support for `fp16` device/host API +* [#1800] Fix instructions on building LLVM for HIPIFY on Linux + +### Known issues + +* [#833] `hipify-clang` build failure against LLVM 15-18 on `Ubuntu`, `CentOS`, and `Fedora` + ## HIPIFY for ROCm 6.3.1 ### Added diff --git a/LICENSE.txt b/LICENSE.txt index 179fbd0e..17b270dd 100644 --- a/LICENSE.txt +++ b/LICENSE.txt @@ -1,4 +1,4 @@ -Copyright (c) 2024 Advanced Micro Devices, Inc. +Copyright (c) 2025 Advanced Micro Devices, Inc. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal From 7ffac99243aa70e7ea30c2aa03e6f0fb142f262b Mon Sep 17 00:00:00 2001 From: Evgeny Mankov Date: Mon, 6 Jan 2025 20:05:51 +0100 Subject: [PATCH 17/17] [HIPIFY][doc] Added hyperlinks to `CHANGELOG.md` --- CHANGELOG.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index f06363a7..e6d1b6ad 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,17 +12,17 @@ Documentation for HIPIFY is available at * cuTENSOR 2.0.2.1 support * LLVM 19.1.6 support * Full support for direct hipification of `cuRAND` into `rocRAND` under the `--roc` option -* [#1617] Support for `fp8` math device/host API +* [#1617](https://github.com/ROCm/HIPIFY/issues/1617) Support for `fp8` math device/host API ### Resolved issues * `MIOpen` support in hipify-perl under the `-miopen` option -* [#1769] Support for `fp16` device/host API -* [#1800] Fix instructions on building LLVM for HIPIFY on Linux +* 
[#1769](https://github.com/ROCm/HIPIFY/issues/1769) Support for `fp16` device/host API +* [#1800](https://github.com/ROCm/HIPIFY/issues/1800) Fix instructions on building LLVM for HIPIFY on Linux ### Known issues -* [#833] `hipify-clang` build failure against LLVM 15-18 on `Ubuntu`, `CentOS`, and `Fedora` +* [#833](https://github.com/ROCm/HIPIFY/issues/833) `hipify-clang` build failure against LLVM 15-18 on `Ubuntu`, `CentOS`, and `Fedora` ## HIPIFY for ROCm 6.3.1 @@ -48,7 +48,7 @@ Documentation for HIPIFY is available at * `rocBLAS` 64-bit APIs support * Initial support for direct hipification of `cuDNN` into `MIOpen` under the `--roc` option * Initial support for direct hipification of `cuRAND` into `rocRAND` under the `--roc` option -* [#1650] Added a filtering ability for the supplementary hipification scripts +* [#1650](https://github.com/ROCm/HIPIFY/pull/1650) Added a filtering ability for the supplementary hipification scripts ### Resolved issues @@ -56,7 +56,7 @@ Documentation for HIPIFY is available at ### Known issues -* [#1617] Support for `fp8` data types +* [#1617](https://github.com/ROCm/HIPIFY/issues/1617) Support for `fp8` data types ## HIPIFY for ROCm 6.2.4 @@ -143,8 +143,8 @@ Documentation for HIPIFY is available at ### Known issues -* [#837] Added a new function to call transformation type "additional non-const arg" -* [#1014] Added a new function to call transformation type "replace argument with a const" +* [#837](https://github.com/ROCm/HIPIFY/issues/837) Added a new function to call transformation type "additional non-const arg" +* [#1014](https://github.com/ROCm/HIPIFY/issues/1014) Added a new function to call transformation type "replace argument with a const" ## HIPIFY for ROCm 5.7.0 @@ -164,8 +164,8 @@ Documentation for HIPIFY is available at ### Known issues -* [#822] Added a new function to call transformation type "additional const by value arg" -* [#830] Added a new function to call transformation type "move arg from place X to place 
Y" +* [#822](https://github.com/ROCm/HIPIFY/issues/822) Added a new function to call transformation type "additional const by value arg" +* [#830](https://github.com/ROCm/HIPIFY/issues/830) Added a new function to call transformation type "move arg from place X to place Y" ## HIPIFY for ROCm 5.6.0