Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spirv_new frem/fmod tests may require excessive precision #1548

Open
karolherbst opened this issue Oct 14, 2022 · 12 comments · Fixed by #1614
Open

spirv_new frem/fmod tests may require excessive precision #1548

karolherbst opened this issue Oct 14, 2022 · 12 comments · Fixed by #1614

Comments

@karolherbst
Copy link
Contributor

karolherbst commented Oct 14, 2022

I am not sure if this is an error from my understanding of the Env spec or if the assumption made in the test is actually wrong, but what the test is doing is to assume that OpFRem is equal to the OpenCL C builtin fmod which I don't think is a valid assumption.

It should compare the fmod OpenCL ExtInstr against a OpenCL C kernel using fmod instead.

But maybe we have to assume that spir-v alu instructions have the same precision requirements as the OpenCL builtins? That would make layering OpenCL on top of Vulkan much harder though.

Thoughts on this?

@karolherbst karolherbst changed the title spirv_new tests inconsistencies spirv_new frem/fmod tests might be wrong Oct 14, 2022
@bashbaug
Copy link
Contributor

bashbaug commented Nov 7, 2022

Hi, it looks like something odd is going on here.

To make sure I'm looking in the right place, I believe this is the relevant code:

kernelStream << "#define spirv_frem(a, b) fmod(a, b) \n";

This is essentially create an OpenCL C kernel to compare against. The OpenCL C kernel will have the form:

#define spirv_fadd(a, b) (a) + (b)               
#define spirv_fsub(a, b) (a) - (b)               
#define spirv_fmul(a, b) (a) * (b)               
#define spirv_fdiv(a, b) (a) / (b)               
#define spirv_frem(a, b) fmod(a, b)              
#define spirv_fmod(a, b) copysign(fmod(a,b),b)   
#define T float
#define FUNC spirv_frem
__kernel void fmath_cl(__global T *out,          
const __global T *lhs, const __global T *rhs)    
{                                                
    int id = get_global_id(0);                   
    out[id] = FUNC(lhs[id], rhs[id]);            
}                                                

The define for spirv_frem does look a little strange, but because there is no frem OpenCL C built-in function to map to we need to do something special to handle this case.

Note that the SPIR-V description for OpFRem is:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpFRem

The floating-point remainder whose sign matches the sign of Operand 1.

Does this mean that we have the spirv_frem and spirv_fmod cases backwards? In other words, should the kernel instead have the form:

...
#define spirv_fmod(a, b) fmod(a, b)              
#define spirv_frem(a, b) copysign(fmod(a,b),b)   

This seems to work fine for our implementation, although we didn't have an issue with the defines the other way around either so I'm not sure how much this tells us.

@StuartDBrady
Copy link
Contributor

StuartDBrady commented Nov 7, 2022

This rings a bell!

See KhronosGroup/SPIRV-LLVM-Translator#296.

It is correct to implement OpFRem as per the OpenCL C fmod builtin. The name of this builtin originally comes from SVr4 / 4.3BSD – I'm not sure of its original semantics, but in C90/C99, it is described as a remainder operation returning the sign of the first operand, i.e. the dividend – this means that fmod(x, y) matches trunc(x / y), i.e. abs(x - (fmod(x, y) + trunc(x / y))) < ε, and in the OpenCL C specification it is described, equivalently, as returning x - y * trunc(x / y).

OpFMod is quite different to the OpenCL C fmod builtin. I presume the name for this comes from OpenGL, but it is described as a remainder operation returning the sign of the second operand, i.e. the divisor. There is an implementation of this in terms of OpFRem in the SPIR-V/LLVM Translator, added in the above-mentioned PR.

The remainder builtin is yet another thing, with semantics such that essentially remainder(x, y) matches round(x / y), i.e. abs(x - (remainder(x, y) + round(x / y))) < ε. This is quite different to OpFMod and it is also quite different to OpFRem.

copysign(fmod(x, y), y) is quite simply wrong.

The testing passes because the input values it uses are generated with genrand_real1, which produces values in the interval [0, 1], i.e. it does not produce negative test values for either the dividend or the divisor.

@karolherbst
Copy link
Contributor Author

My point is we can't use OpenCL C builtins at all, because SPIR-V instruction don't provide the necessary precision. So even if the formula is correct, implementations are free to implement OpFRem and OpFMod with lower precision.

@karolherbst
Copy link
Contributor Author

And yeah.. it passes with some random seeds and does not with some special ones. I got it to pass just running the subtests, but running all tests makes them fail.

@bashbaug
Copy link
Contributor

bashbaug commented Nov 7, 2022

FWIW, I think there are several SPIR-V instructions that are relevant to this issue:

  1. OpFRem. This is what gets tested by the spirv_new test op_frem_float_regular.
  2. OpFMod. This is what gets tested by the spirv_new test op_fmod_float_regular.
  3. The fmod instruction in the OpenCL extended instruction set.
  4. Maybe the remainder instruction in the OpenCL extended instruction set?

Of these instructions, only (3) and (4) have defined accuracy in the OpenCL SPIR-V Environment Spec. That seems like a problem, and we should define the expected accuracy for (1) and (2) as well. @karolherbst is this what you mean by "we can't use OpenCL C builtins at all, because SPIR-V instruction don't provide the necessary precision"?

Note that the required accuracy for the OpenCL C fmod and remainder built-in functions is 0 ULP so if we do define the accuracy of the SPIR-V OpFMod and OpFRem based on the OpenCL C fmod then they will need to be very accurate as well.

@b-sumner
Copy link
Contributor

b-sumner commented Nov 7, 2022

Another question is, what is the expectation when the integer part of the quotient cannot be represented by the floating point type. For example 2^100 and 2^-100 are both 32-bit floating point numbers, but the integer part of their quotient is not. Which of these SPIR-V instructions guarantees the correct result for these inputs?

@karolherbst
Copy link
Contributor Author

FWIW, I think there are several SPIR-V instructions that are relevant to this issue:

  1. OpFRem. This is what gets tested by the spirv_new test op_frem_float_regular.
  2. OpFMod. This is what gets tested by the spirv_new test op_fmod_float_regular.
  3. The fmod instruction in the OpenCL extended instruction set.
  4. Maybe the remainder instruction in the OpenCL extended instruction set?

Of these instructions, only (3) and (4) have defined accuracy in the OpenCL SPIR-V Environment Spec. That seems like a problem, and we should define the expected accuracy for (1) and (2) as well. @karolherbst is this what you mean by "we can't use OpenCL C builtins at all, because SPIR-V instruction don't provide the necessary precision"?

Note that the required accuracy for the OpenCL C fmod and remainder built-in functions is 0 ULP so if we do define the accuracy of the SPIR-V OpFMod and OpFRem based on the OpenCL C fmod then they will need to be very accurate as well.

Yes, and please don't. Project layering OpenCL on Vulkan will have a hard time dealing with that if the precision requirements diverge. One is free to use the OpenCL ExtInst inside SPIR-V if one need that precision. The normal SPIR-V instruction should probably be left alone. At least I don't see the point if using the OpenCL ExtInst is a way out.

@bashbaug
Copy link
Contributor

bashbaug commented Nov 8, 2022

Yes, and please don't. Project layering OpenCL on Vulkan will have a hard time dealing with that if the precision requirements diverge.

Does Vulkan define accuracy requirements for OpFMod and OpFRem? In other words, if we wanted to align the accuracy requirements, what would we require?

@karolherbst
Copy link
Contributor Author

Yes, and please don't. Project layering OpenCL on Vulkan will have a hard time dealing with that if the precision requirements diverge.

Does Vulkan define accuracy requirements for OpFMod and OpFRem? In other words, if we wanted to align the accuracy requirements, what would we require?

from https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#spirvenv-op-prec

The OpFRem and OpFMod instructions use cheap approximations of remainder, and the error can be large due to the discontinuity in trunc() and floor(). This can produce mathematically unexpected results in some cases, such as FMod(x,x) computing x rather than 0, and can also cause the result to have a different sign than the infinitely precise result.

And the OpenCL SPIR-V env spec doesn't define anything here (only for the ExtInsts, and a couple of SPIR-V instructions ), so I think we are good as it is. Maybe some clarification might be helpful.

https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Env.html#_ulp_values_for_math_instructions_full_profile

@bashbaug
Copy link
Contributor

bashbaug commented Nov 8, 2022

Thanks, for whatever reason my search for "vulkan spir-v environment" found the Vulkan 1.1 spec before the Vulkan 1.3 spec and it doesn't have accuracy requirements for these instructions.

It sounds like if we want to align with Vulkan the actions that are required are:

  1. We need to update the OpenCL SPIR-V environment spec to describe that the accuracy of OpFMod and OpFRem are derived from trunc and floor and hence may have a very large error.
    • We should check if any other instructions are missing, similarly.
  2. We should update the spirv_new tests so they do not test for equivalence with the OpenCL C built-ins. I'm not sure exactly what values we'd test against, but at the very least we should ensure kernels with these instructions continue to compile.

Shall I open an OpenCL-Docs issue for (1) to discuss the specification updates?

@bashbaug
Copy link
Contributor

Spec issue filed: KhronosGroup/OpenCL-Docs#859

karolherbst added a commit to karolherbst/OpenCL-CTS that referenced this issue Jan 4, 2023
The OpenCL SPIR-V Environment Specification does not require the SPIR-V
opcodes `OpFMod` and `OpFRem` to match any OpenCL C semantics, so
implementations implementing those two instructions according to Vulkan
and/or OpenGL semantics will fail those tests without actually violating
any OpenCL specification.

Instead `OpExtInst fmod` should be checked against OpenCL C `fmod` as this
is actually specified to match in precision by the SPIR-V Environment
Specification and this is what the SPIRV-LLVM-Translator ends up using for
fmod anyway.

This also allows implementations to relax their implementations of
`OpFMod` and `OpFRem` to trade performance for precision, but also to
allow for more consistent results between OpenCL and Vulkan using those
SPIR-V instructions.

Closes KhronosGroup#1548

Signed-off-by: Karol Herbst <[email protected]>
@StuartDBrady StuartDBrady changed the title spirv_new frem/fmod tests might be wrong spirv_new frem/fmod tests may require excessive precision Jan 10, 2023
karolherbst added a commit to karolherbst/OpenCL-CTS that referenced this issue Jan 11, 2023
The OpenCL SPIR-V Environment Specification does not require the SPIR-V
opcodes `OpFMod` and `OpFRem` to match any OpenCL C semantics, so
implementations implementing those two instructions according to Vulkan
and/or OpenGL semantics will fail those tests without actually violating
any OpenCL specification.

Instead `OpExtInst fmod` should be used if the application wants to use
the OpenCL C `fmod` builting as this is actually specified to match in
precision by the SPIR-V Environment Specification and this is what the
SPIRV-LLVM-Translator ends up using for fmod anyway.

This also allows implementations to relax their implementations of
`OpFMod` and `OpFRem` to trade performance for precision, but also to
allow for more consistent results between OpenCL and Vulkan using those
SPIR-V instructions.

We should keep those tests disabled until we figure out what the actual
precision requierement should be and update the test accordingly.

Closes KhronosGroup#1548

Signed-off-by: Karol Herbst <[email protected]>
karolherbst added a commit to karolherbst/OpenCL-CTS that referenced this issue Jan 11, 2023
The OpenCL SPIR-V Environment Specification does not require the SPIR-V
opcodes `OpFMod` and `OpFRem` to match any OpenCL C semantics, so
implementations implementing those two instructions according to Vulkan
and/or OpenGL semantics will fail those tests without actually violating
any OpenCL specification.

Instead `OpExtInst fmod` should be used if the application wants to use
the OpenCL C `fmod` builting as this is actually specified to match in
precision by the SPIR-V Environment Specification and this is what the
SPIRV-LLVM-Translator ends up using for fmod anyway.

This also allows implementations to relax their implementations of
`OpFMod` and `OpFRem` to trade performance for precision, but also to
allow for more consistent results between OpenCL and Vulkan using those
SPIR-V instructions.

We should keep those tests disabled until we figure out what the actual
precision requierement should be and update the test accordingly.

Closes KhronosGroup#1548

Signed-off-by: Karol Herbst <[email protected]>
karolherbst added a commit to karolherbst/OpenCL-CTS that referenced this issue Jan 24, 2023
The OpenCL SPIR-V Environment Specification does not require the SPIR-V
opcodes `OpFMod` and `OpFRem` to match any OpenCL C semantics, so
implementations implementing those two instructions according to Vulkan
and/or OpenGL semantics will fail those tests without actually violating
any OpenCL specification.

Instead `OpExtInst fmod` should be used if the application wants to use
the OpenCL C `fmod` builting as this is actually specified to match in
precision by the SPIR-V Environment Specification and this is what the
SPIRV-LLVM-Translator ends up using for fmod anyway.

This also allows implementations to relax their implementations of
`OpFMod` and `OpFRem` to trade performance for precision, but also to
allow for more consistent results between OpenCL and Vulkan using those
SPIR-V instructions.

We should keep those tests disabled until we figure out what the actual
precision requierement should be and update the test accordingly.

Closes KhronosGroup#1548

Signed-off-by: Karol Herbst <[email protected]>
@bashbaug
Copy link
Contributor

bashbaug commented Feb 7, 2023

Reopening - these tests are disabled for now, but we still need to decide what should be tested for OpFMod and OpFRem.

@bashbaug bashbaug reopened this Feb 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants