
ci: add Linux cross-compile build #12428

Merged — 27 commits merged into ggml-org:master on Apr 4, 2025

Conversation

@bandoti (Collaborator) commented Mar 17, 2025

This change introduces a cross-compile build targeting RISC-V and ARM on Linux. The goal is to reduce regressions associated with cross-compiling and to serve as an example of how to cross-compile using Ubuntu. In the future we can update this to store artifacts if there is larger demand to run on RISC-V (or other) hardware.
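As an illustration of the kind of setup such a CI job performs (a sketch only, not the PR's exact workflow — the package names and CMake flags below are assumptions based on Ubuntu's stock cross toolchains and ggml's build options):

```shell
# Sketch: cross-compile llama.cpp for riscv64 on Ubuntu with the distro
# cross toolchain. Illustrative, not the PR's exact steps.
sudo apt-get update
sudo apt-get install -y gcc-14-riscv64-linux-gnu g++-14-riscv64-linux-gnu

cmake -B build-riscv64 \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
  -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
  -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
  -DGGML_NATIVE=OFF   # don't probe the build host's CPU features
cmake --build build-riscv64 --config Release
```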

github-actions bot added labels — Mar 17, 2025: devops (improvements to build systems and github actions), Vulkan (issues specific to the Vulkan backend), ggml (changes relating to the ggml tensor library for machine learning)
@bandoti requested a review from @jeffbolznv — Mar 17, 2025, 19:45
@bandoti (Collaborator, Author) commented Mar 17, 2025

@jeffbolznv I am having an issue with the cross-compile build missing an extension. I arbitrarily selected RISC-V as the cross-compile target, and although I can pull libvulkan-dev:riscv64, I am wondering whether the version in the LunarG SDK is further along and the build depends on newer features than the distro package provides.

/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:940:9: error: ‘PipelineRobustnessCreateInfoEXT’ is not a member of ‘vk’; did you mean ‘PipelineColorWriteCreateInfoEXT’?
  940 |     vk::PipelineRobustnessCreateInfoEXT rci;
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |         PipelineColorWriteCreateInfoEXT
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:943:9: error: ‘rci’ was not declared in this scope
  943 |         rci.storageBuffers = vk::PipelineRobustnessBufferBehaviorEXT::eDisabled;
      |         ^~~
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:943:34: error: ‘vk::PipelineRobustnessBufferBehaviorEXT’ has not been declared
  943 |         rci.storageBuffers = vk::PipelineRobustnessBufferBehaviorEXT::eDisabled;
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@jeffbolznv (Collaborator) commented:
Pipeline robustness is a pretty old extension at this point — the changelog says "July 14, 2022 Vulkan 1.3.221". I recently saw the 1.3.261.1 SDK used in one of the CI jobs, so maybe an even older version is used in another job?

We can potentially change the code to make it not rely on this, but it would be better to get all builds on a more recent SDK.
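One way to sanity-check an SDK version against that 1.3.221 cutoff (a hypothetical helper for illustration, not code from the PR):

```shell
# Hypothetical helper: true if dotted version $1 >= version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# VK_EXT_pipeline_robustness landed in Vulkan 1.3.221 (July 14, 2022), so an
# SDK at or past that point should expose vk::PipelineRobustnessCreateInfoEXT.
version_ge "1.3.261.1" "1.3.221" && echo "pipeline robustness available"
version_ge "1.3.204"   "1.3.221" || echo "SDK too old for pipeline robustness"
```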

@bandoti (Collaborator, Author) commented Mar 17, 2025

Looks like updating to "ubuntu-latest" may have fixed that issue. I'm now seeing a different one, which might indicate a problem with the generated shaders.

[ 20%] Building CXX object ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan-shaders.cpp.o
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp: In function ‘void ggml_vk_load_shaders(vk_device&)’:
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1876:55: error: ‘matmul_f32_f32_coopmat_len’ was not declared in this scope; did you mean ‘matmul_f32_f32_fp32_len’?
 1876 |         CREATE_MM(GGML_TYPE_F32, pipeline_matmul_f32, matmul_f32_f32, , wg_denoms, warptile, vk_mat_mat_push_constants, 3, );
      |                                                       ^~~~~~~~~~~~~~
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1855:95: note: in definition of macro ‘CREATE_MM’
 1855 |             ggml_vk_create_pipeline(device, device-> PIPELINE_NAME ->l, #NAMELC #F16ACC "_l", NAMELC ## F16ACC ## _coopmat_len, NAMELC ## F16ACC ## _coopmat_data, "main", PARAMCOUNT, sizeof(PUSHCONST), l_ ## WG_DENOMS, l_ ## WARPTILE, 1, false, true);   \
      |                                                                                               ^~~~~~
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1876:55: error: ‘matmul_f32_f32_coopmat_data’ was not declared in this scope; did you mean ‘matmul_f32_f32_fp32_data’?
 1876 |         CREATE_MM(GGML_TYPE_F32, pipeline_matmul_f32, matmul_f32_f32, , wg_denoms, warptile, vk_mat_mat_push_constants, 3, );
      |                                                       ^~~~~~~~~~~~~~
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1855:129: note: in definition of macro ‘CREATE_MM’
 1855 |             ggml_vk_create_pipeline(device, device-> PIPELINE_NAME ->l, #NAMELC #F16ACC "_l", NAMELC ## F16ACC ## _coopmat_len, NAMELC ## F16ACC ## _coopmat_data, "main", PARAMCOUNT, sizeof(PUSHCONST), l_ ## WG_DENOMS, l_ ## WARPTILE, 1, false, true);   \
      |                                                                                                                                 ^~~~~~

@jeffbolznv (Collaborator) commented:

This looks similar to issues like #11695. I suspect these are due to using a different glslc in the two environments (i.e. compilation actually happens with an older glslc than the feature detection does). Maybe there's some logging you can add to the cmake to see if that's happening?
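A low-cost way to surface such a mismatch is a diagnostic step in the workflow before the build (illustrative lines, not part of this PR):

```shell
# Diagnostic step: list every glslc on PATH and report the versions in use,
# so a stale compiler shadowing the SDK's copy shows up in the CI log.
which -a glslc || true
glslc --version
cmake --version
```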

@bandoti (Collaborator, Author) commented Apr 3, 2025

@Icenowy Your recommendation to use gcc/g++ 14 fixed the RISC-V build issue. Thank you.

@bandoti (Collaborator, Author) commented Apr 3, 2025

@jeffbolznv When you are able, this change is ready for review. The Vulkan cross-compiles are now working as expected. Thank you.

@jeffbolznv (Collaborator) commented:

I don't have any experience with github workflows and haven't built in these cross-compile environments, so I'm probably not an appropriate reviewer. But if you can't find anybody more appropriate, I can try.

@bandoti requested a review from @slaren — Apr 3, 2025, 17:48
@bandoti (Collaborator, Author) commented Apr 3, 2025

Looks like the remaining build failure is a sync issue between concurrent jobs. I will retry once the other jobs finish.

@bandoti merged commit 1be76e4 into ggml-org:master on Apr 4, 2025 — 89 of 92 checks passed