Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSL: divergence in generated code for MSL/GLSL when using precise (DecorationNoContraction) #2408

Closed
mabaro opened this issue Nov 7, 2024 · 7 comments

Comments

@mabaro
Copy link
Contributor

mabaro commented Nov 7, 2024

Hello!
Context:
Using spirv-cross to generate code from HLSL to target both android(vulkan)/ios, we're facing divergence problems between two passes.
Vertex output position is marked as invariant and, in general, this seems to work fine and the vulkan spirv works without issues on android (so far).
On the other hand, the generated MSL counterpart exhibits glitches.
Checking the generated GLSL/MSL code, there are clear divergences as the MSL compiler codepath doesn't handle DecorationNoContraction.

Some investigation insights:
It seems the precise tag (DecorationNoContraction) is not being taken into account, at least on some bits, on the MSL side. spirv_msl::emit_instruction() does some rewritings (i.e., for OpMatrixTimesVector) which ends up on emit_op(..., forward) where forward ignores the DecorationNoContraction. This happens in particular when invariant_float_math is enabled.
The general case is redirected through the macro MSL_BFOP, which ends in emit_binary_func_op(), which doesn't handle DecorationNoContraction.

My first impression is that in order to make GLSL/MSL to have the "same" behavior we'd be missing handling DecorationNoContraction in emit_binary_func_op(), but this would introduce some changes also for GLSL, as it's currently only taken into account in emit_binary_op().

I'm wondering whether adding the same behavior in emit_binary_func_op/emit_binary_op() could be right/wrong regarding the specs.
Just for the record, for this particular test (see attached simple.hlsl.txt) adding the check for NoContraction in emit_binary_func_op/emit_binary_op and in spirv_msl::emit_instruction() for the case OpMatrixTimesVector, results in equivalent code for MSL/GLSL.

On top of that, I wonder wheter propagating the precise behavior to all operands involved and cascading it (i.e., bottom-up) could be necessary in order to avoid this kind of divergences. Specially since DecorationInvariant seems to not be enough in its current state.

Cheers!

Shader-playground:

FILES:
simple.hlsl used as source for the tests
simple.spv generated from DXC
simple.glsl_cross generated GLSL
simple.msl_cross generated MSL

Originally posted by @mabaro in #1665 (comment)

@HansKristian-Work
Copy link
Contributor

On top of that, I wonder wheter propagating the precise behavior to all operands involved and cascading it

NoContraction is a local thing in SPIR-V. It only means that specific instruction cannot be refactored, but if the inputs are not no-contraction, those inputs can be reordered.

Precise doesn't seem to propagate in the SPIR-V here because some math happens in a function. Not sure what HLSL rules are here.

@HansKristian-Work
Copy link
Contributor

Which SPIRV-Cross build are you using? That MSL output shouldn't be possible since

commit fb3defc9ef4c22bba596c245c1cbc3a5fd3b891d
Author: Bill Hollings <[email protected]>
Date:   Thu Sep 23 14:37:08 2021 -0400

    MSL: Honor DecorationNoContraction when compiling using fast-math.
    
    Add [[clang::optnone]] attribute to spvF*() functions used for handling
    floating point operations decorated with DecorationNoContraction.
    
    Just using precise::fma() did not work.
    
    Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.

@mabaro
Copy link
Contributor Author

mabaro commented Nov 13, 2024

Hi, thanks for the quick response!
We do have some local changes replacing (undoing) the aforementioned change from optnone -> precise as it didn't work properly for us. Unluckily, the change doesn't seem to work in all cases neither.
image

I am exploring the option of propagating the "NoContraction" decorator to all operands involved in operations with variables decorated with invariant/noContraction in order to try and enforce same behaviour on different shaders/passes.

I would appreciate any hints or suggestion on that matter.

@HansKristian-Work
Copy link
Contributor

HansKristian-Work commented Nov 13, 2024

Well, you'll just have to experiment, I don't know what I can do here. I have no way to test any of this. Bills original PR said that precise::fma broke in weird ways, and I have no idea why and I'm reluctant to open that can of worms without a clear explanation of exactly what goes wrong.

@mabaro
Copy link
Contributor Author

mabaro commented Nov 13, 2024

If you wanted to give it a try, this simple vertex shader in shader playground shows:

  • optnone and precise give the very same output (3 lines with contracted operations)
  • invariant position producing what I think would be the wanted output, with all involved operations being separated letting Metal compiler do the work. Though it doesn't seem to extend to more complex shaders, wich is the case i am trying to fix with precise propagation

image

I'll be posting any relevant findings if any.
Cheers.

@HansKristian-Work
Copy link
Contributor

This should be resolved now. Thanks for driving the fix.

@mabaro
Copy link
Contributor Author

mabaro commented Dec 12, 2024

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants