Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarify that sqrt must be correctly rounded in accordance with IEEE 754 #882

Merged
merged 5 commits into from
Feb 6, 2025

Conversation

kgryte
Copy link
Contributor

@kgryte kgryte commented Jan 9, 2025

This PR:

  • closes Specify correct rounding for sqrt #826 and closes Minor clarification on allowed rounding mode #830
  • clarifies that sqrt should follow IEEE 754 and always return correctly rounded result. This was implied (i.e., conforming implementations should be IEEE 754 compliant), but never made explicit.
  • clarifies that accuracy requirements apply to real-valued floating-point operands and not complex-valued floating-point operands.
  • adds missing functions to list of functions not covered by accuracy requirements.
  • specifies that subnormals may or may not be supported.

@kgryte kgryte added this to the v2024 milestone Jan 9, 2025
@rgommers
Copy link
Member

rgommers commented Jan 9, 2025

Given #826 (comment) says that the default for single-precision sqrt in CUDA isn't to round with this precision, I'm not sure how useful/achievable this will be. @leofang any comments on that?

@hpkfft
Copy link
Contributor

hpkfft commented Jan 9, 2025

The default for single-precision divide in CUDA isn't to round with this precision either.
Nevertheless, this spec did the right thing by requiring CUDA libraries to be compiled with the flag that enables correct rounding.
It should be equally useful/achievable for sqrt.

@hpkfft
Copy link
Contributor

hpkfft commented Jan 9, 2025

I want to comment that correctly rounded is defined by IEEE 754-2019 as follows:

correct rounding: This standard’s method of converting an infinitely precise result to a floating-point
number, as determined by the applicable rounding direction. A floating-point number so obtained is said to
be correctly rounded.

@leofang
Copy link
Contributor

leofang commented Jan 9, 2025

Given #826 (comment) says that the default for single-precision sqrt in CUDA isn't to round with this precision, I'm not sure how useful/achievable this will be. @leofang any comments on that?

The default is to do correct rounding, as @hpkfft noted. However, it involves to not set -ftz=true which is a requirement we've discussed to avoid, ex:

.. note::
IEEE 754-2019 requires support for subnormal (a.k.a., denormal) numbers, which are useful for supporting gradual underflow. However, hardware support for subnormal numbers is not universal, and many platforms (e.g., accelerators) and compilers support toggling denormals-are-zero (DAZ) and/or flush-to-zero (FTZ) behavior to increase performance and to guard against timing attacks.
Accordingly, conforming implementations may vary in their support for subnormal numbers.

Therefore, I do not think the "correctly rounded" requirement is achievable. Certainly it is not CuPy's default.

Ref: https://docs.nvidia.com/cuda/floating-point/index.html#compiler-flags

Copy link
Contributor

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(see above)

@hpkfft
Copy link
Contributor

hpkfft commented Jan 10, 2025

Denormal result support (as opposed to flush to zero (ftz)) is orthogonal. It's not specific to square root, but rather applies to everything: addition, subtraction, ....
Yes, flushing to zero gives zero, which is neither the correctly rounded result nor the nearest representable value.
I think the note makes it clear that this is a global exception to requiring IEEE 754-2019 conformance. (Of course, suggestions to improve/clarify the wording are welcome if you feel that would be helpful.)
Otherwise, when the result is normal, I think it would be strange for this spec to deviate from the IEEE 754-2019 requirement of correct rounding for add, sub, mul, div, and sqrt. As @kgryte pointed out, conformance was already implied by this spec, and this PR merely makes it explicit.

@hpkfft
Copy link
Contributor

hpkfft commented Jan 10, 2025

Oh, I think the note you referenced is not in the spec itself?
I agree that the spec should explicitly mention that denormal support is not a requirement, i.e., ftz and/or daz are acceptable. [The former applies to denormal results and the latter applies to denormal inputs.]

But, I would suggest that your observation ought to be considered a separate bug/PR and not block this PR.
It might be the case that denormal support requires discussion.

@leofang
Copy link
Contributor

leofang commented Jan 21, 2025

Hi @hpkfft @kgryte, sorry for my late reply. I reached out to our math team (at NVIDIA) asking for clarification, and I believe what @hpkfft stated above is generally correct. Denormals can be flushed while still keeping the rounding behavior correct, at least for sqrt this is documented in the PTX instruction:
https://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-sqrt
(both sqrt.rnd.f32 and sqrt.rnd.ftz.f32 are considered IEEE compliant). With this understanding, @kgryte could you please kindly apply @hpkfft's suggestions, then we can approve/merge?

@kgryte
Copy link
Contributor Author

kgryte commented Feb 6, 2025

@leofang and @hpkfft I've updated the PR accordingly. If you can give it a once over, that would be appreciated!

Copy link
Contributor

@hpkfft hpkfft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you.

@kgryte kgryte removed the request for review from leofang February 6, 2025 05:46
@kgryte
Copy link
Contributor Author

kgryte commented Feb 6, 2025

Thanks for the review @hpkfft! As the changes implement the suggestions discussed above, I'll go ahead and merge. Thanks all!

@kgryte kgryte merged commit 4dccde5 into data-apis:main Feb 6, 2025
3 checks passed
@kgryte kgryte deleted the feat/sqrt-accuracy branch February 6, 2025 05:48
@leofang
Copy link
Contributor

leofang commented Feb 6, 2025

Thanks to you both 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Minor clarification on allowed rounding mode Specify correct rounding for sqrt
4 participants