Skip to content

Different NaN behavior with various float functions on beta-vs-stable #139487

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alexcrichton opened this issue Apr 7, 2025 · 10 comments
Closed
Labels
A-floating-point Area: Floating point numbers and arithmetic C-discussion Category: Discussion or questions that doesn't represent real issues.

Comments

@alexcrichton
Copy link
Member

I've run across an upcoming difference in float-related functions such as floor, trunc, ceil, etc, using a beta compiler on x86_64-unknown-linux-gnu relative to the stable compiler. Bisection points to rust-lang/compiler-builtins#763 as the regression point so while it seems that changing sources for the function implementations I wanted to confirm that it's expected that a behavior change was desired here as well.

One example change is specifically this program:

fn main() {
    println!("{:#x}", f32::from_bits(0xffa00000).floor().to_bits());
}

currently prints 0xffe00000 while the beta compiler prints 0xffa00000.

This was discovered through running the WebAssembly test suite in Wasmtime using a beta/nightly compiler instead of a stable compiler, and previously some functions were "just" the_float.floor() where now we need NaN checks around them as well. Not a huge issue for us and we're happy to work around it, but I wanted to flag this behavior nonetheless to ensure it's intended.

If this is indeed an intended behavior change (which personally I think is totally fine), I'm happy to close this.

cc @tgross35 for the relevance to rust-lang/compiler-builtins#763

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 7, 2025
alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Apr 7, 2025
This commit implements a workaround for rust-lang/rust#139487 which is
necessary to have these implementations continue to work on more recent
compilers due to the underlying implementation changing.
github-merge-queue bot pushed a commit to bytecodealliance/wasmtime that referenced this issue Apr 7, 2025
This commit implements a workaround for rust-lang/rust#139487 which is
necessary to have these implementations continue to work on more recent
compilers due to the underlying implementation changing.
@m-ou-se
Copy link
Member

m-ou-se commented Apr 7, 2025

Related: #139483

@RalfJung
Copy link
Member

RalfJung commented Apr 7, 2025

Is 0xffa00000 a NaN? If so then yes indeed we guarantee close to nothing about the result of your code except that it will be a NaN. See https://doc.rust-lang.org/nightly/std/primitive.f32.html#nan-bit-patterns for details. Though I can't read binary float representations so I can't tell which part of your input ends up determining the NaN payload; the exact payload would matter for these rules.

@RalfJung
Copy link
Member

RalfJung commented Apr 7, 2025

If I understand https://float.exposed/0xffa00000 correctly, your float has a payload with the quiet bit not set (i.e., a signaling NaN), and a non-zero payload? So that would make it "not a preferred NaN" and we guarantee nothing about which NaN you get as a result. AFAIK that matches the wasm spec.

If you input 0xffc00000 instead, you should be guaranteed to get 0xffc00000 or 0x7fc00000 as return value.

@primoly
Copy link
Contributor

primoly commented Apr 7, 2025

The culprit seems to be https://github.com/rust-lang/libm/blob/cdbf42389e40e15350745cc29e86cd5d93cb2a76/src/math/floorf.rs#L7, specifically https://github.com/rust-lang/libm/blob/cdbf42389e40e15350745cc29e86cd5d93cb2a76/src/math/floorf.rs#L19-L21 The code was refactored in a later version that compiler-builtins is using (rust-lang/libm#437), but I haven’t checked if it still behaves the same.

Note how the new behaviour only occurs in debug builds, release is the same as before.

I think @RalfJung is right that this is allowed to be nondeterministic anyways.

@alexcrichton
Copy link
Member Author

One failing test is this one, notably:

(assert_return (invoke "floor" (f32.const -nan:0x200000)) (f32.const nan:arithmetic))

The input 0xffa00000 is the normal nan 0x7fc00000, negated to produce 0xffc00000, then or'd with the nan payload 0x200000 to produce 0xffa00000. WebAssembly nan propagation in this case notably says that the result is non-deterministic in all bits except (a) the exponent bits are all 1 and (b) the most significant significand bit is set to 1.

Notably this means that the wasm spec disagrees with your Rust's definition in this case. To me that's fine, I don't mind having an extra branch here, but I nevertheless wanted to confirm that it was intended. If the intention is to match the wasm spec on other platforms then this might be a bug, but if the intention is to provide fewer guarantees about nan then this definitely isn't a bug.

Also, FWIW, I don't know what's going on with the link you posted. If I click it it seems to redirect to https://float.exposed/0x7fc00000 and doesn't actually show any information about 0xffa00000, only about 0x7fc00000

@primoly
Copy link
Contributor

primoly commented Apr 7, 2025

@alexcrichton you’re right, I forgot about the whole arithmetic NaN thing.

@RalfJung
Copy link
Member

RalfJung commented Apr 7, 2025

Ah, so wasm says that signaling NaNs must be quieted? Seems I forgot about that.

Rust indeed does not guarantee this. This is crucial to support e.g. optimizing x * 1.0 to x.

Also, FWIW, I don't know what's going on with the link you posted. If I click it it seems to redirect to https://float.exposed/0x7fc00000 and doesn't actually show any information about 0xffa00000, only about 0x7fc00000

That's not the link I posted, but oddly the link I posted seems to redirect to 0x7fc00000. Strange.

@alexcrichton
Copy link
Member Author

Ok sounds like this is intended behavior, so I'll go ahead and close.

And yeah I don't mean to say you pasted the wrong link, I meant that when I clicked the link you posted it sent me to the link that I pasted. I didn't mean to imply that you pasted the link I pasted, I'll try to reword that better next time.

@tgross35
Copy link
Contributor

tgross35 commented Apr 7, 2025

The test will already fail built against musl :) https://rust.godbolt.org/z/EGqdq816K (musl disregards sNaNs, https://wiki.musl-libc.org/mathematical-library).

Glibc is doing the IEEE754-specified behavior of quieting signaling NaNs and raising FE_INVALID, but we don't require this. The change wasn't intentional per se since it's a side effect of using our softfloat libraries more. However, it's probably better to propagate as-is anyway; the quieting behavior doesn't really make sense when we don't distinguish sNaN vs. qNaN and don't have the FP env.

To meet the wasm spec, intercepting at the next level up is probably the better option anyway given quieting also isn't done by all C libm implementations.

@RalfJung
Copy link
Member

RalfJung commented Apr 8, 2025

And yeah I don't mean to say you pasted the wrong link, I meant that when I clicked the link you posted it sent me to the link that I pasted. I didn't mean to imply that you pasted the link I pasted, I'll try to reword that better next time.

No worries. :)

@jieyouxu jieyouxu added A-floating-point Area: Floating point numbers and arithmetic C-discussion Category: Discussion or questions that doesn't represent real issues. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic C-discussion Category: Discussion or questions that doesn't represent real issues.
Projects
None yet
Development

No branches or pull requests

7 participants