-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPIR-V Environment Spec does not describe required accuracy for OpFMod and OpFRem #859
Comments
For somebody like me trying to implement OpenCL on native drivers but also on top of Vulkan, I think it might be worth saying that if nothing is specified otherwise it follows Vulkan rules. dealing with I think it's a good idea that with |
Could you help explain a little about the layering with Vulkan, please? Naïvely, I had previously assumed the translation of OpFRem instructions in SPIR-V using the Kernel execution into SPIR-V using the GLCompute execution model would be similar to the translation of uses of OpExtInst OpenCL fmod, and that translation of OpFMod would be implemented in terms of translation of OpFRem. Does layering of OpenCL on Vulkan not work the way that I would have assumed? Note in particular that in the SPIR-V/LLVM Translator, LLVM's frem instruction is translated into OpFRem. The LangRef states "This is the same output as a libm 'fmod' function". Perhaps this is a bug in the SPIR-V/LLVM Translator, and we should be translating to fmod in the OpenCL extended instruction set, instead? |
That would make things just difficult. If one wants to have the precision of OpenCL C builtins, the
yeah.. from a gut feeling I'd say it shall translate to the |
I was against introducing OpFRem, and still believe that it should be removed rather than producing completely wrong answers in certain cases. Why can't those that want an approximation that only works sometimes just generate the sequence of SPIR-V ops that produce it? |
We need to support OpFRem and OpFMod because these are standard SPIR-V instructions, so removing them is not an option. We do have options how they should behave, though. Option 1: Derive behavior from
|
I'm a bit torn. On the one hand, I want consistency. On the other hand, this is the libclc implementation of From a driver compiler point of view, I really want OpFMod and OpFRem to really be the same for Vulkan and OpenCL. Mesa currently tries very hard to not have special cases in our SPIR-V parser for the target API. I'd rather those mean the same thing regardless of whether you're targeting Vulkan or OpenCL and have a separate thing for if you want the super-precise version for which we probably have to invoke libclc. Yeah, ok, I think I convinced myself that these should match Vulkan. 😁 |
Yeah, our implementation of the precise fmod is similar (we don't have a dedicated HW instruction). In some ways the only reason we're having this discussion is because we currently have two different ways to perform similar operations - in this case a core SPIR-V instruction and an extended instruction - therefore we can (possibly) assign different behavior (accuracy requirements) to each operation. Two interesting philosophical questions are:
If we want to match Vulkan I think we'll need at least some cases where there are two ways of performing a similar operation, because of the differences in behavior (specifically, accuracy requirements). |
The consistency argument cuts both ways, unfortunately. Do we want the SPIR-V opcode to behave consistently across APIs or do we want the two things with the same name in OpenCL to behave the same in OpenCL? |
I'm a bit skeptical of treating SPIR-V opcodes and OpenCL C builtins the same. For the builtins we have Not having a good example here, but e.g. to satisfy OpenCL's requirement on Granted, that's mostly a Mesa issue as our internal program IR is SPIR-V, which we also use for linking, so other runtimes might not care about that as much as we do, but at least for us, everything goes through SPIR-V one way or the other. |
Discussed in the November 29th teleconference. We're trending towards matching Vulkan for the SPIR-V |
As noted in today's WG call, FPFastMathMode decorations are always valid for OpFRem, but they are only valid for fmod in the OpenCL Extended Instruction Set when using a SPIR-V environment that explicitly allows it. As we don't yet have a SPIR-V environment for OpenCL for SPIR-V 1.3 or above, we can (and IMO, we should) make sure that this is included in any such environment. @bashbaug, any thoughts as to how we should best track this? (Also, we should update the environment to allow ZeroExtend and SignExtend for images with SPIR-V 1.4 - and probably do a more general search for "missing before version 1.x".) |
On the call on 2023-01-17 there was a concern that SPIR-V LLVM translator could be less accurate than the Vulkan requirements and thus not meet the changes in #871. That would need further investigation. |
I've been asked to clarify my concerns with defining the required accuracy of OpFRem and OpFMod in terms of sequences using trunc and floor. I don't have any specific cases for which this would cause a problem. My concern is whether all existing implementations already provide the required accuracy for all possible inputs, depending on rounding behavior. This is a concern for both OpFRem and OpFMod, regardless of the behavior of the SPIR-V/LLVM Translator. Even if we assume that OpFRem is always implemented along the lines of x − y × trunc(x ÷ y), as per Vulkan (of which I do not know), I am unsure whether we can be sure that OpFMod would have accuracy as good as x − y × floor(x ÷ y) for all possible inputs when using the SPIR-V/LLVM Translator, given the algorithm used in the translator and how it may be affected by rounding. The SPIR-V/LLVM Translator implements OpFMod in terms of LLVM's frem op with a correction applied to it. frem is stated by the specification to match the "libm 'fmod' function" – for OpenCL, there is no libm, but it seems reasonable (but not necessarily guaranteed) that this would be taken as OpenCL's fmod builtin function, which is defined in terms of x − y × trunc(x ÷ y). The correction is a final step that adds or subtracts y to produce the correct answer. I agree that implementations using Vulkan's accuracy for OpFRem and OpFMod should be considered accurate enough. I was previously concerned that there may be content that uses OpFRem for the OpenCL fmod function, but if the view is that such SPIR-V content is invalid and should be regenerated to use the OpenCL Extended Instruction Set, then there accuracy of OpFRem seems a lot less important. In any case, fmod is defined as x − y × trunc(x ÷ y). The only question that remains is whether any existing implementations of OpFMod are less accurate than this, though. |
See related discussion: KhronosGroup/OpenCL-CTS#1548
In short, the OpenCL SPIR-V environment spec currently does not describe the required accuracy for the OpFMod and OpFRem instructions (although it does describe the required accuracy for the
fmod
andremainder
extended instructions).OpenCL SPIR-V environment spec reference:
https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Env.html#_ulp_values_for_math_instructions_full_profile
The spirv_new CTS tests are currently testing that OpFMod and OpFRem matches the behavior of these OpenCL C instructions, which may or may not be correct:
For Vulkan, the accuracy of OpFMod and OpFRem is derived from
trunc
andfloor
, which makes them cheap but potentially highly inaccurate:https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#spirvenv-op-prec
What do we want to do for OpenCL?
The text was updated successfully, but these errors were encountered: