-
Notifications
You must be signed in to change notification settings - Fork 11.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vulkan: fix coopmat shader generation when cross-compiling #12272
Conversation
I wonder if this will fix #11695. In that case it seemed like there were two different glslcs on the system with different coopmat(2) support. |
@jeffbolznv I am not sure, this PR can never fix anything about "Vulkan_GLSLC_EXECUTABLE-NOTFOUND". It only fixes the issue that the detected glslc isn't properly tested in the vulkan-shaders-gen project. However, the original issue might be really fixed, because that issue looks like coopmat(2) detection issue. |
8c76cc2
to
74159e1
Compare
Rebased and added fallback OFF value. I am still not sure whether to make these options. |
There's a similar PR happening at ggml-org/whisper.cpp#2914. |
@jeffbolznv After patching #12428 with this change it appears to have resolved the cross-compile there, consistent with reporting in whisper.cpp 2914. I think if we go ahead and merge this PR in before revisiting the whisper one, it may resolve the issue on whisper side. This change is needed before #12428 can be merged. And, as you mention, it may also resolve #11695 as well. cc: @sandrohanea |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change LGTM, though I am not an expert on cmake. Let's give @0cc4m another chance to review before merging.
Thanks, but I'm not much help with cmake. I can check that it still works for my use cases, but usually I just trust @bandoti with it. |
74159e1
to
2c15c4c
Compare
@bandoti added CACHE BOOL to the code setting it to true. I am dumb on CMake, so please check whether this is expected by you. |
Previously the status of coopmat{,2} support isn't passed to the vulkan-shaders-gen project building on the host, which leads to build failure because of the cross-compiling code expecting coopmat{,2} shaders that didn't get generated. Fix this by passing the coopmat{,2} support status to vulkan-shaders subproject. Signed-off-by: Icenowy Zheng <[email protected]>
2c15c4c
to
32707cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Icenowy Thank you for getting those changes in! I forgot to mention wrapping these checks in if(NOT DEFINED GGML_VULKAN_COOPMAT_...)
, but I just went ahead and made that basic change myself—since everything else looks good. Without this, the check would run every time regardless of caching or not.
Thank you for the contribution. |
Great news that this was merged! Any timeline for ggml sync @ggerganov (so this change to be propagated in whisper.cpp as well) ? |
else() | ||
message(STATUS "GL_NV_cooperative_matrix2 supported by glslc") | ||
add_compile_definitions(GGML_VULKAN_COOPMAT2_GLSLC_SUPPORT) | ||
if(NOT DEFINED GGML_VULKAN_COOPMAT2_GLSLC_SUPPORT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the goal of these if
tests? I found that when I switch to an older glslc, it continues to use the cached value and won't reevaluate the support. This will also be a problem for people updating to newer SDKs - they won't get updated support unless they manually delete the cmake cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. My assumption on using the cache variable in general was not based on switching glslc versions around. Is this a typical behavior or something mainly Vulkan devs experience?
We could fix this by checking the glslc version and "invalidate the cache" if it changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new Vulkan SDK is released every couple months, so we should expect everybody's glslc version to change periodically. This bug would prevent them from picking up new features. glslc doesn't have a clear versioning system, and I don't see what's gained by caching these values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several months, however, is aeons with the code velocity—and in this case we're typically not expecting folks to downgrade OR upgrade their Vulkan SDK without rebuilding the entire thing anyway, right?
This bug would prevent them from picking up new features.
This is not actually the case—as this particular caching only affects coopmat checks. Other new feature detection doesn't need to be cached, but my gut is telling me it should be.
I am happy to to submit a PR to remove the caching—but I would say if the check is redundant 99% of the time, it will perhaps be good to consider keeping caching. Another possibility we can move it from INTERNAL caching to BOOL to allow the cache variable to be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and in this case we're typically not expecting folks to downgrade OR upgrade their Vulkan SDK without rebuilding the entire thing anyway, right?
I would wager that most people will not expect to need to delete their build folder (or cmake cache files) when they update the Vulkan SDK. It's certainly not part of any build instructions.
This is not actually the case—as this particular caching only affects coopmat checks. Other new feature detection doesn't need to be cached, but my gut is telling me it should be.
Why does any of this need to be cached? What's actually gained? Is it to save a fraction of a second during the cmake generate step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you put it that way—I suppose we don't gain anything haha (nervous laugh).
Probably a premature optimization on my part. I will submit a PR to remove it. 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created PR #12719 to address the issue.
Previously the status of coopmat{,2} support isn't passed to the vulkan-shaders-gen project building on the host, which leads to build failure because of the cross-compiling code expecting coopmat{,2} shaders that didn't get generated.
Fix this by passing the coopmat{,2} support status to vulkan-shaders subproject.
Tested by building with a glslc w/ coopmat and w/o coopmat2 both natively and cross-compiling (to RISC-V).