PR: Refine ggml-hexagon backend (Qualcomm Hexagon NPU backend) for latest ggml, whisper.cpp, llama.cpp #12326

Open. Wants to merge 118 commits into base: master.
82e0af4
ggml-qnn: add Qualcomm QNN backend for GGML
zhouwg Feb 14, 2025
a142068
ggml-qnn: santiy check
zhouwg Feb 15, 2025
a754622
ggml-qnn: update script build-run-android.sh to compare peformance of…
zhouwg Feb 16, 2025
ea3f3d5
ggml-qnn: fix minor issue in test-backend-ops.cpp
zhouwg Feb 17, 2025
971fe40
ggml-qnn: merge QNN RPC feature from https://github.com/zhouwg/kantv/…
zhouwg Feb 18, 2025
4f392f0
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 18, 2025
57ca3c1
ggml-qnn: a concise approach to offload mulmat to QNN backend(sync fr…
zhouwg Feb 19, 2025
a776d1b
ggml-qnn: remove redundant codes
zhouwg Feb 20, 2025
46d2b3d
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 20, 2025
0bc3aaa
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 20, 2025
aecca29
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 21, 2025
8c44769
ggml-qnn: add Qualcomm QNN backend for GGML
zhouwg Feb 14, 2025
f9b0219
ggml-qnn: merge QNN RPC feature from https://github.com/zhouwg/kantv/…
zhouwg Feb 18, 2025
646aa79
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 18, 2025
3e784e8
ggml-qnn: a concise approach to offload mulmat to QNN backend(sync fr…
zhouwg Feb 19, 2025
2f4f4e3
ggml-qnn: remove redundant codes
zhouwg Feb 20, 2025
ad68230
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 20, 2025
99b4c68
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 20, 2025
985302e
ggml-qnn: sync from branch kantvai-ggmlqnn-npurpc
zhouwg Feb 21, 2025
a4290e2
ggml-qnn: fix a minior typo in internal doc
zhouwg Feb 23, 2025
aee48b0
ggml-qnn: refine function ggml_qnn_create_general_tensor() to avoid c…
zhouwg Feb 23, 2025
fdc2c2c
ggml-qnn: fix a minor typo in source code
zhouwg Feb 24, 2025
cf938a2
build: avoid ggml-qnn backend breaking other backend's builds
zhouwg Feb 24, 2025
83df41f
ggml-qnn: remove redundant codes to make PR reviewers happy
zhouwg Feb 25, 2025
9bd734f
ggml-qnn: refine code format
zhouwg Feb 25, 2025
6071221
ggml-qnn: offload quantized type mulmat to QNN backend
zhouwg Feb 26, 2025
3b1fb7f
ggml-qnn: refine source code structure to make code more clearly
zhouwg Feb 27, 2025
e874cb0
ggml-qnn: enable release build with necessary logs to make reviewers …
zhouwg Feb 27, 2025
14a2a04
ggml-qnn: enable all quantize type with 2d mulmat
zhouwg Feb 27, 2025
7b58fd2
ggml-qnn: enable log output of GGMLQNN_LOG_INFO in command line mode …
zhouwg Feb 28, 2025
218de3b
ggml-qnn: Windows port --- step2
zhouwg Feb 28, 2025
b1d9e88
ggml-qnn: merge UT code and corresponding script from local dev branc…
zhouwg Mar 2, 2025
596d9c1
ggml-qnn: merge ggml_qnn_mul_mat_4d from local dev branch to make wor…
zhouwg Mar 2, 2025
6a6d2d8
ggml-qnn: submit AI-assisted ggml_qnn_mul_mat_4d(not worked currently…
zhouwg Mar 2, 2025
8181665
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step2
zhouwg Mar 2, 2025
65c88f3
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step3
zhouwg Mar 2, 2025
9b4e356
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step4
zhouwg Mar 2, 2025
88f5901
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step5
zhouwg Mar 2, 2025
ce51094
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step6
zhouwg Mar 2, 2025
4ca8081
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step7
zhouwg Mar 2, 2025
3f3bdb2
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step8
zhouwg Mar 2, 2025
7933032
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- good in step9
zhouwg Mar 2, 2025
8349ff6
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- narrow down t…
zhouwg Mar 2, 2025
f7d0fbf
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step10
zhouwg Mar 2, 2025
60f71da
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- narrow down t…
zhouwg Mar 2, 2025
cfce884
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- step11
zhouwg Mar 2, 2025
1328b30
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 --- both ok in st…
zhouwg Mar 2, 2025
00dd246
ggml-qnn: AI-assisted ggml_qnn_mul_mat_4d by Grok 3 ---finalizing ver…
zhouwg Mar 2, 2025
f883549
ggml-qnn: refine ggml_qnn_mul_mat and ggml_qnn_general_node according…
zhouwg Mar 2, 2025
5b708be
ggml-qnn: remove no-needed comments
zhouwg Mar 2, 2025
ac8fa22
ggml-qnn: Windows port --- step3
zhouwg Mar 3, 2025
906b6e5
ggml-qnn: remove un-needed function
zhouwg Mar 4, 2025
f3d3f1e
ggml-qnn:rebase to upstream
zhouwg Mar 4, 2025
18000b2
ggml-qnn: fix a minior issue during rebase to upstream
zhouwg Mar 4, 2025
807e81f
ggml-qnn: update script according to https://github.com/ggml-org/llam…
zhouwg Mar 4, 2025
6a35665
ggml-qnn: fix a minior issue in ggmlqnn_create_general_tensor()
zhouwg Mar 4, 2025
c4378df
ggml-qnn: active member variable _device_id in class qnn_instance
zhouwg Mar 4, 2025
2efa6d3
ggml-qnn: refine ggml_qnn_general_node and ggml_qnn_mul_mat to make c…
zhouwg Mar 4, 2025
53e6796
ggml-qnn: Windows port --- step4
zhouwg Mar 6, 2025
5d450dc
ggml-qnn: Windows port -- step5
zhouwg Mar 7, 2025
97a683e
ggml-qnn: WoA(Windows on ARM) -- step6
zhouwg Mar 8, 2025
5e451c1
ggml-qnn: rebase to upstream
zhouwg Mar 9, 2025
29e1c94
ggml-qnn: pr to upstream
zhouwg Mar 11, 2025
18aa66d
ggml-qnn: rebase to upstream
zhouwg Mar 18, 2025
8e54e17
ggml-qnn: self code-review
zhouwg Mar 18, 2025
10a2ce8
ggml-qnn: rebase upstream
zhouwg Mar 19, 2025
db955a6
ggml-qnn: add approach through Hexagon cDSP
zhouwg Mar 22, 2025
0911a0b
ggml-qnn: refine general approach through Hexagon cDSP
zhouwg Mar 23, 2025
963d4a1
ggml-qnn: refine the entire ggml-qnn.cpp to make code more clear
zhouwg Mar 24, 2025
360ded9
ggml-qnn: refine the entire ggml-qnn.cpp to make code more clear
zhouwg Mar 24, 2025
c0a7e36
ggml-qnn: add build script for libggmlop_skel.so
zhouwg Mar 24, 2025
9bf6aaf
ggml-qnn: remove redundant functions in this PR and make codes more c…
zhouwg Mar 25, 2025
fc564dc
ggml-qnn: original ggml_compute_forward_add and ggml_compute_forward_…
zhouwg Mar 25, 2025
6b89ca7
ggml-qnn: modify build-run-android.sh to verify mulmat and validate m…
zhouwg Mar 25, 2025
15f29c0
ggml-qnn: make host code(ggml-qnn.cpp) more clear and more stable
zhouwg Mar 26, 2025
4fd90af
ggml-qnn: refine code according to self code-review and make code mor…
zhouwg Mar 26, 2025
519e01e
ggml-qnn: offload more ggml op to Hexagon cDSP
zhouwg Mar 27, 2025
cb9c113
ggml-hexagon: code on AP(arm-cpu) side is stable now
zhouwg Mar 28, 2025
9f8dbb2
ggml-hexagon: optimize GGML_OP_ADD on cDSP side
zhouwg Mar 28, 2025
8442cc9
ggml-hexagon: simplify hexagon-kernel build logic in CMakeLists.txt
zhouwg Mar 29, 2025
3d1d83d
ggml-hexagon: release ggml-hexagon v0.98
zhouwg Mar 29, 2025
b8e1dd9
ggml-hexagon: release ggml-hexagon v0.99
zhouwg Mar 29, 2025
1652c24
ggml-hexagon: try to offload q6_k mulmat to cDSP
zhouwg Mar 29, 2025
7c9f9b2
ggml-hexagon: fix minior issue in ggml-hexagon.cpp after self code-re…
zhouwg Mar 29, 2025
c3eb7dc
ggml-hexagon: check validation of ggml-hexagon.cfg before create appr…
zhouwg Mar 30, 2025
d1ad39b
ggml-hexagon: fix all compiler warnings in ggml-hexagon.cpp
zhouwg Mar 30, 2025
07e7f0e
ggml-hexagon: enable only one backend device for HWACCEL_CDSP and ena…
zhouwg Mar 31, 2025
7961da1
ggml-hexagon: rpc ion memory pool and test-backend-ops works fine in …
zhouwg Mar 31, 2025
df4003c
ggml-hexagon: make comprision of mulmat performance between HWACCEL_Q…
zhouwg Mar 31, 2025
16c1b65
ggml-hexagon: release ggml-hexagon v1.00
zhouwg Mar 31, 2025
7a03f7e
ggml-hexagon: rebase to upstream
zhouwg Apr 1, 2025
c6b7737
ggml-hexagon: check configuration of enable_rpc_dma_mempool in functi…
zhouwg Apr 1, 2025
dc60f78
ggml-hexagon: uniform rpc_ion_memsize and rpc_ion_usage between HWACC…
zhouwg Apr 1, 2025
0bc1a25
ggml-hexagon: make buffer mechanism more clear in HWACCEL_CDSP approach
zhouwg Apr 1, 2025
47419b3
ggml-hexagon: add perf function in hexagon kernerls on cDSP side
zhouwg Apr 2, 2025
35eda9d
ggml-hexagon: fix a stupid issue of why set rpc latency failure and i…
zhouwg Apr 2, 2025
d71bc92
ggml-hexagon: make helper function ggmlhexagon_get_timestring() threa…
zhouwg Apr 2, 2025
266c585
ggml-hexagon: fix a typo in ggml-hexagon.cpp
zhouwg Apr 2, 2025
0499662
ggml-hexagon: list all known todo and fixme tasks in ggml-hexagon.cpp
zhouwg Apr 2, 2025
4ca02b7
ggml-hexagon: fix units MB -> MiB
zhouwg Apr 2, 2025
3208c72
ggml-hexagon: try to make ggml-hexagon backend works fine in a standa…
zhouwg Apr 3, 2025
16dae82
ggml-hexagon: remove reduament code and make debug log more clear
zhouwg Apr 3, 2025
7b29b49
ggml-hexagon: add gemma-3-4b-it-Q8_0.gguf to verify q8_0 mulmat on cDSP
zhouwg Apr 3, 2025
abefec9
ggml-hexagon:add skeleton code of offload GGML_OP_SOFT_MAX/GGML_OP_RM…
zhouwg Apr 3, 2025
fbce56a
ggml-hexagon: release ggml-dsp v0.60 on cDSP side
zhouwg Apr 4, 2025
7c2154c
ggml-hexagon: merge build logic in kernels/Makefile to ggml-hexagon/C…
zhouwg Apr 5, 2025
f084217
ggml-hexagon: fix a typo in ggml-hexagon.cpp
zhouwg Apr 5, 2025
39ad12f
ggml-hexagon: uniform NDEBUG usage in ggml-hexagon.cpp and ggml-dsp.c
zhouwg Apr 6, 2025
3e742d5
ggml-hexagon: add profiler feature for purpose of visualize NPU perfo…
zhouwg Apr 7, 2025
72246e3
ggml-hexagon: remove so-called dma memory pool to avoid confusion and…
zhouwg Apr 8, 2025
456ed6d
ggml-hexagon: make function ggmlhexagon_init_rpcmempool in ggml-hexag…
zhouwg Apr 8, 2025
c9bc945
ggml-hexagon: fix potential resource leak in class hexagon_profiler
zhouwg Apr 8, 2025
79a12b1
ggml-hexagon: enable multi-threading feature on cDSP side
zhouwg Apr 8, 2025
01bc3cb
ggml-hexagon: upgrade QNN SDK to v2.33.0.250327
zhouwg Apr 9, 2025
154513f
ggml-hexagon: fix typo in ggml-hexagon.cpp
zhouwg Apr 9, 2025
f852cd4
ggml-dsp: probe QuRT RTOS information in function ggmlop_dsp_open
zhouwg Apr 9, 2025
e00955e
ggml-hexagon: setting enable_rpc_ion_mempool to 1 and make test-backe…
zhouwg Apr 10, 2025
a4ce74c
ggml-hexagon: check whether user's specified htp arch is valid in CMa…
zhouwg Apr 10, 2025
11 changes: 11 additions & 0 deletions CMakeLists.txt
@@ -7,6 +7,16 @@ set(CMAKE_WARN_UNUSED_CLI YES)

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

if(CMAKE_SYSTEM_NAME STREQUAL "Android")
    set(TARGET_SNAPDRAGON8GEN3 ON)
    if(TARGET_SNAPDRAGON8GEN3)
        # works fine on Snapdragon 8 Gen 3, with a 1.5x (45+ tokens/second) to 3x (70+ tokens/second) performance gain with the default ggml backend
        add_definitions(-march=armv8.7-a)
        add_definitions(-mcpu=cortex-x1)
        add_definitions(-mtune=cortex-x1)
    endif()
endif()

if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)
set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS "Debug" "Release" "MinSizeRel" "RelWithDebInfo")
@@ -119,6 +129,7 @@ llama_option_depr(WARNING LLAMA_RPC GGML_RPC)
llama_option_depr(WARNING LLAMA_SYCL GGML_SYCL)
llama_option_depr(WARNING LLAMA_SYCL_F16 GGML_SYCL_F16)
llama_option_depr(WARNING LLAMA_CANN GGML_CANN)
llama_option_depr(WARNING LLAMA_HEXAGON GGML_HEXAGON)

if (NOT MSVC)
if (LLAMA_SANITIZE_THREAD)
2 changes: 2 additions & 0 deletions ggml/CMakeLists.txt
@@ -204,6 +204,7 @@ option(GGML_OPENCL_EMBED_KERNELS "ggml: embed kernels"
option(GGML_OPENCL_USE_ADRENO_KERNELS "ggml: use optimized kernels for Adreno" ON)
set (GGML_OPENCL_TARGET_VERSION "300" CACHE STRING
"gmml: OpenCL API version to target")
option(GGML_HEXAGON "ggml: use HEXAGON" OFF)

# toolchain for vulkan-shaders-gen
set (GGML_VULKAN_SHADERS_GEN_TOOLCHAIN "" CACHE FILEPATH "ggml: toolchain file for vulkan-shaders-gen")
@@ -269,6 +270,7 @@ set(GGML_PUBLIC_HEADERS
include/ggml-rpc.h
include/ggml-sycl.h
include/ggml-vulkan.h
include/ggml-hexagon.h
include/gguf.h)

set_target_properties(ggml PROPERTIES PUBLIC_HEADER "${GGML_PUBLIC_HEADERS}")
54 changes: 54 additions & 0 deletions ggml/include/ggml-hexagon.h
@@ -0,0 +1,54 @@
/*
* Copyright (c) 2023-2025 The ggml authors
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#pragma once

#include "ggml.h"
#include "ggml-backend.h"

#ifdef __cplusplus
extern "C" {
#endif

#define GGML_HEXAGON_MAX_DEVICES 3
#define GGML_HEXAGON_BACKEND_NAME "hexagon"

enum HEXAGONBackend {
    HEXAGON_BACKEND_QNNCPU  = 0,
    HEXAGON_BACKEND_QNNGPU  = 1,
    HEXAGON_BACKEND_QNNNPU  = 2,
    HEXAGON_BACKEND_CDSP    = 2,
    HEXAGON_BACKEND_GGML    = 3, // "fake" QNN backend, used to compare performance between the HEXAGON backend and the ggml backend
};

GGML_BACKEND_API ggml_backend_t ggml_backend_hexagon_init(size_t dev_num, const char * qnn_lib_path);

GGML_BACKEND_API bool ggml_backend_is_hexagon(ggml_backend_t backend);

GGML_BACKEND_API int ggml_backend_hexagon_get_device_count(void);

GGML_BACKEND_API ggml_backend_reg_t ggml_backend_hexagon_reg(void);

const char * ggml_backend_hexagon_get_devname(size_t dev_num);

#ifdef __cplusplus
}
#endif
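
Review note on the enum in this header: `HEXAGON_BACKEND_QNNNPU` and `HEXAGON_BACKEND_CDSP` share the value 2, and `HEXAGON_BACKEND_GGML` is 3 while `GGML_HEXAGON_MAX_DEVICES` is also 3. The following is a minimal sketch of what that would mean for code that treats these values as indices into a per-device array. The enum and macro are copied from the header above; `hexagon_is_device_index` is a hypothetical helper for illustration only, and whether the backend ever indexes with `HEXAGON_BACKEND_GGML` is not visible in this excerpt.

```cpp
#include <cstddef>

// Values copied from ggml-hexagon.h as added by this PR.
#define GGML_HEXAGON_MAX_DEVICES 3

enum HEXAGONBackend {
    HEXAGON_BACKEND_QNNCPU = 0,
    HEXAGON_BACKEND_QNNGPU = 1,
    HEXAGON_BACKEND_QNNNPU = 2,
    HEXAGON_BACKEND_CDSP   = 2,  // same value as HEXAGON_BACKEND_QNNNPU
    HEXAGON_BACKEND_GGML   = 3,
};

// Hypothetical helper (not part of the PR): true when dev_num can safely
// index an array of GGML_HEXAGON_MAX_DEVICES elements.
constexpr bool hexagon_is_device_index(std::size_t dev_num) {
    return dev_num < GGML_HEXAGON_MAX_DEVICES;
}

// The two NPU-related enumerators are indistinguishable at run time.
static_assert(HEXAGON_BACKEND_QNNNPU == HEXAGON_BACKEND_CDSP,
              "QNNNPU and CDSP alias the same device slot");
```

Under these assumptions, a `switch` over `HEXAGONBackend` cannot have separate `case` labels for `HEXAGON_BACKEND_QNNNPU` and `HEXAGON_BACKEND_CDSP`, and `HEXAGON_BACKEND_GGML` falls outside the range accepted by the helper.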
1 change: 1 addition & 0 deletions ggml/src/CMakeLists.txt
@@ -310,6 +310,7 @@ ggml_add_backend(RPC)
ggml_add_backend(SYCL)
ggml_add_backend(Vulkan)
ggml_add_backend(OpenCL)
ggml_add_backend(HEXAGON)

foreach (target ggml-base ggml)
target_include_directories(${target} PUBLIC $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/../include> $<INSTALL_INTERFACE:include>)
8 changes: 8 additions & 0 deletions ggml/src/ggml-backend-reg.cpp
@@ -65,6 +65,10 @@
#include "ggml-kompute.h"
#endif

#ifdef GGML_USE_HEXAGON
#include "ggml-hexagon.h"
#endif

// disable C++17 deprecation warning for std::codecvt_utf8
#if defined(__clang__)
# pragma clang diagnostic push
@@ -187,6 +191,9 @@ struct ggml_backend_registry {
#ifdef GGML_USE_KOMPUTE
register_backend(ggml_backend_kompute_reg());
#endif
#ifdef GGML_USE_HEXAGON
register_backend(ggml_backend_hexagon_reg());
#endif
#ifdef GGML_USE_CPU
register_backend(ggml_backend_cpu_reg());
#endif
@@ -577,6 +584,7 @@ void ggml_backend_load_all_from_path(const char * dir_path) {
ggml_backend_load_best("vulkan", silent, dir_path);
ggml_backend_load_best("opencl", silent, dir_path);
ggml_backend_load_best("musa", silent, dir_path);
ggml_backend_load_best("hexagon", silent, dir_path);
ggml_backend_load_best("cpu", silent, dir_path);
// check the environment variable GGML_BACKEND_PATH to load an out-of-tree backend
const char * backend_path = std::getenv("GGML_BACKEND_PATH");
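
Review note on ordering: in both hunks of this file the Hexagon entry is placed immediately before the CPU entry, so it is registered (or loaded) ahead of the CPU backend. A minimal sketch of that ordering follows. `mini_registry` and `make_registry` are hypothetical stand-ins that only model insertion order; they are not the real `ggml_backend_registry`, and what the order means for device selection is not shown in this excerpt.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the registry: it records names in call order.
struct mini_registry {
    std::vector<std::string> backends;

    void register_backend(const std::string & name) {
        backends.push_back(name);
    }
};

// Mirrors the order of the #ifdef blocks in the diff: Hexagon first, CPU last.
// The boolean parameters stand in for GGML_USE_HEXAGON and GGML_USE_CPU.
inline mini_registry make_registry(bool use_hexagon, bool use_cpu) {
    mini_registry reg;
    if (use_hexagon) {
        reg.register_backend("hexagon");
    }
    if (use_cpu) {
        reg.register_backend("cpu");
    }
    return reg;
}
```

With both flags enabled the sketch yields `hexagon` followed by `cpu`; with the Hexagon flag disabled only `cpu` remains.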
115 changes: 115 additions & 0 deletions ggml/src/ggml-hexagon/CMakeLists.txt
@@ -0,0 +1,115 @@
project(ggml-hexagon)
message(STATUS "Using HEXAGON backend")
message("CMAKE_SYSTEM_NAME : ${CMAKE_SYSTEM_NAME}")

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

if(NOT DEFINED QNN_SDK_PATH)
    message(FATAL_ERROR "QNN_SDK_PATH not defined")
endif()

if(NOT DEFINED HEXAGON_SDK_PATH)
    message(FATAL_ERROR "HEXAGON_SDK_PATH not defined")
endif()

message("QNN_SDK_PATH : ${QNN_SDK_PATH}")
message("HEXAGON_SDK_PATH: ${HEXAGON_SDK_PATH}")
message("HTP_ARCH_VERSION: ${HTP_ARCH_VERSION}")

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(DEBUG_FLAG "-Wall")
    message("Debug mode:${DEBUG_FLAG}")
else()
    set(DEBUG_FLAG "-DNDEBUG -Wall")
    message("Release mode:${DEBUG_FLAG}")
endif()


#v68 --- Snapdragon 888
#v69 --- Snapdragon 8 Gen1
#v73 --- Snapdragon 8 Gen2
#v75 --- Snapdragon 8 Gen3
#v79 --- Snapdragon 8 Elite(aka Gen4)
if(NOT DEFINED HTP_ARCH_VERSION)
message(FATAL_ERROR "HTP_ARCH_VERSION not defined, valid htp arch: v68,v69,v73,v75,v79")
endif()

# check whether the user-specified htp arch is valid
set(CHECK_HTP_ARCH "WRONG")
foreach (feat v68 v69 v73 v75 v79)
    if (${feat} STREQUAL ${HTP_ARCH_VERSION})
        set(CHECK_HTP_ARCH "GOOD")
    endif()
endforeach()
if (${CHECK_HTP_ARCH} STREQUAL "WRONG")
    message(FATAL_ERROR "ggml-hexagon backend only supports htp arch v68,v69,v73,v75,v79")
endif()
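
Review note: the check above accepts exactly five strings and compares them case-sensitively, as `STREQUAL` does. As a rough illustration of the same rule outside CMake, here is a C++ sketch. `k_htp_archs` and `is_supported_htp_arch` are hypothetical names introduced for this note; they do not appear in the PR.

```cpp
#include <array>
#include <string_view>

// The five HTP architecture strings accepted by the CMake check.
constexpr std::array<std::string_view, 5> k_htp_archs = {
    "v68", "v69", "v73", "v75", "v79",
};

// Hypothetical helper: exact, case-sensitive membership test.
constexpr bool is_supported_htp_arch(std::string_view arch) {
    for (std::string_view known : k_htp_archs) {
        if (known == arch) {
            return true;
        }
    }
    return false;
}
```

Under this rule a value such as `V75` or `75` would be rejected, so the value passed as `HTP_ARCH_VERSION` has to match one of the listed spellings exactly.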

#cross compiling for hexagon kernels on cDSP side
set(HEXAGON_CC "${HEXAGON_SDK_PATH}/tools/HEXAGON_Tools/8.8.06/Tools/bin/hexagon-clang")
set(HEXAGON_CXX "${HEXAGON_SDK_PATH}/tools/HEXAGON_Tools/8.8.06/Tools/bin/hexagon-clang")
set(HEXAGON_TARGET libggmlop_skel${HTP_ARCH_VERSION}.so)
set(HEXAGON_KERNELS_PATH "${CMAKE_CURRENT_LIST_DIR}/kernels")
set(HEXAGON_COMPUTE "compute${HTP_ARCH_VERSION}")

if(CMAKE_SYSTEM_NAME STREQUAL "Android")
    find_library(LOG_LIB log)

    add_library(cdsprpc
        SHARED
        IMPORTED)
    set_target_properties(cdsprpc
        PROPERTIES
        IMPORTED_LOCATION
        ${HEXAGON_SDK_PATH}/ipc/fastrpc/remote/ship/android_aarch64/libcdsprpc.so)

    set(QNN_LINK_LIBRARIES ${LOG_LIB} cdsprpc)
    set(QNN_DEFAULT_LIB_SEARCH_PATH "/data/local/tmp/" CACHE STRING "customized library search path for QNN backend")

    include_directories(${HEXAGON_SDK_PATH}/incs)
    include_directories(${HEXAGON_SDK_PATH}/incs/stddef)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/incs)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/rpcmem/inc)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/remote/ship/android_Debug_aarch64)
    include_directories(${HEXAGON_SDK_PATH}/utils/examples)
    include_directories(${HEXAGON_SDK_PATH}/ipc/fastrpc/rtld/ship/android_aarch64)
    include_directories(${HEXAGON_SDK_PATH}/libs/atomic/inc)
    include_directories(${HEXAGON_SDK_PATH}/libs/atomic/android_Debug_aarch64/ship)
    include_directories(${CMAKE_SOURCE_DIR}/ggml/src/ggml-hexagon/)
    include_directories(${CMAKE_SOURCE_DIR}/ggml/src/ggml-hexagon/kernels/)
elseif(CMAKE_SYSTEM_NAME STREQUAL "Windows")
    set(QNN_DEFAULT_LIB_SEARCH_PATH "C:\\" CACHE STRING "customized library search path for QNN backend")
else()
    message(FATAL_ERROR "ggml-hexagon is currently only available on Android and Windows (Windows on ARM)")
endif()

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DGGML_USE_HEXAGON ${DEBUG_FLAG}")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -O3")

file(GLOB HEXAGON_SOURCES "${CMAKE_CURRENT_LIST_DIR}/*.cpp" "${CMAKE_CURRENT_LIST_DIR}/kernels/ggmlop_ap_skel.c")
ggml_add_backend_library(ggml-hexagon ${HEXAGON_SOURCES})

target_include_directories(ggml-hexagon PRIVATE ${QNN_SDK_PATH}/include/QNN ${HEXAGON_SDK_PATH} ${CMAKE_CURRENT_LIST_DIR})
target_link_libraries(ggml-hexagon PRIVATE ${QNN_LINK_LIBRARIES})

string(REGEX REPLACE "/$" "" QNN_DEFAULT_LIB_SEARCH_PATH "${QNN_DEFAULT_LIB_SEARCH_PATH}")
target_compile_definitions(ggml-hexagon PRIVATE QNN_DEFAULT_LIB_SEARCH_PATH="${QNN_DEFAULT_LIB_SEARCH_PATH}/")

function(ggml_hexagon_build_kernel KNAME)
    message(STATUS "ggml_hexagon: build hexagon-kernel ${KNAME}")

    add_custom_command(
        TARGET ${PROJECT_NAME}
        POST_BUILD
        COMMAND echo "current working path:`pwd`\n"
        COMMAND ${HEXAGON_CC} -o ${HEXAGON_KERNELS_PATH}/ggml-dsp.o -c ${HEXAGON_KERNELS_PATH}/ggml-dsp.c -m${HTP_ARCH_VERSION} -c -Ofast -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fdata-sections -fpic ${DEBUG_FLAG} -D__V_DYNAMIC__ -mhvx -mhvx-length=128B -fno-finite-math-only -I${HEXAGON_SDK_PATH}/incs -I${HEXAGON_SDK_PATH}/libs/qprintf/inc -I${HEXAGON_SDK_PATH}/incs/stddef -I${HEXAGON_SDK_PATH}/ipc/fastrpc/incs -I${HEXAGON_SDK_PATH}/ipc/fastrpc/rpcmem/inc -I${HEXAGON_SDK_PATH}/utils/examples -I${HEXAGON_SDK_PATH}/ipc/fastrpc/rtld/ship/inc -I${HEXAGON_SDK_PATH}/libs/atomic/inc -I${HEXAGON_SDK_PATH}/utils/sim_utils/inc -I${HEXAGON_SDK_PATH}/rtos/qurt/${HEXAGON_COMPUTE}/include/posix -I${HEXAGON_SDK_PATH}/rtos/qurt/${HEXAGON_COMPUTE}/include/qurt/
        COMMAND ${HEXAGON_CC} -o ${HEXAGON_KERNELS_PATH}/ggmlop_cdsp_skel.o -c ${HEXAGON_KERNELS_PATH}/ggmlop_cdsp_skel.c -m${HTP_ARCH_VERSION} -c -Ofast -Wall -Wstrict-prototypes -fno-zero-initialized-in-bss -fdata-sections -fpic -D__V_DYNAMIC__ -mhvx -mhvx-length=128B -fno-finite-math-only -I${HEXAGON_SDK_PATH}/incs -I${HEXAGON_SDK_PATH}/libs/qprintf/inc -I${HEXAGON_SDK_PATH}/incs/stddef -I${HEXAGON_SDK_PATH}/ipc/fastrpc/incs -I${HEXAGON_SDK_PATH}/ipc/fastrpc/rpcmem/inc -I${HEXAGON_SDK_PATH}/utils/examples -I${HEXAGON_SDK_PATH}/ipc/fastrpc/rtld/ship/inc -I${HEXAGON_SDK_PATH}/libs/atomic/inc -I${HEXAGON_SDK_PATH}/utils/sim_utils/inc
        COMMAND ${HEXAGON_CC} -m${HTP_ARCH_VERSION} -Wl,--defsym=ISDB_TRUSTED_FLAG=2 -Wl,--defsym=ISDB_SECURE_FLAG=2 -Wl,--no-threads -fpic -shared -Wl,-Bsymbolic -Wl,--wrap=malloc -Wl,--wrap=calloc -Wl,--wrap=free -Wl,--wrap=realloc -Wl,--wrap=memalign -lc -Wl,-soname=${HEXAGON_TARGET} -o ../../../bin/${HEXAGON_TARGET} -Wl,--start-group ${HEXAGON_KERNELS_PATH}/ggmlop_cdsp_skel.o ${HEXAGON_KERNELS_PATH}/ggml-dsp.o -Wl,--end-group
        COMMAND ls -l ../../../bin/${HEXAGON_TARGET}
        COMMAND /bin/cp -fv ../../../bin/${HEXAGON_TARGET} ../../../bin/libggmlop_skel.so
        COMMENT "build hexagon-kernel"
    )
endfunction()

ggml_hexagon_build_kernel("cdsp")
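
Review note on `QNN_DEFAULT_LIB_SEARCH_PATH` in this file: the `string(REGEX REPLACE "/$" "" ...)` call strips one trailing forward slash, and the compile definition then appends `/` again. The sketch below mirrors only that string transformation in C++ so its effect on the two default values is easy to see. `normalize_lib_search_path` is a hypothetical name used for this note, and how the resulting value is escaped on the compiler command line is outside what the sketch models.

```cpp
#include <string>

// Hypothetical helper mirroring the CMake logic: remove at most one trailing
// '/', then append exactly one '/'.
inline std::string normalize_lib_search_path(std::string path) {
    if (!path.empty() && path.back() == '/') {
        path.pop_back();
    }
    path += '/';
    return path;
}
```

For the Android default `/data/local/tmp/` the result is unchanged. For the Windows default, which ends in a backslash, nothing is stripped and a forward slash is still appended, giving a mixed-separator path; it may be worth confirming that the Windows loader code expects that form.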