From 69710af200a0254262c92ddfb96f8bbf4fb6636d Mon Sep 17 00:00:00 2001
From: div72
Date: Sat, 2 Mar 2024 12:14:54 +0300
Subject: [PATCH] build: enforce SSE2 on x86 targets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This commit targets the floating point errors caused by x87 floating
point calculations breaking floating point equality in the test suite.

Take the disassembly of the `project.m_rac == 123.45` check at
src/test/gridcoin/researcher_tests.cpp#L272 as an example:

```
│ 0x5911c7c5 fldl -0xa4(%ebp)
...
│ 0x5911c7e0 fldt -0x6fa844(%esi)
...
│ 0x5911c81d fucompp
│ 0x5911c81f fnstsw %ax
```

The fldl instruction loads a double into the floating point registers,
while the fldt instruction loads a long double (80 bits). Since ASLR is
enabled, the load with the massive offset is probably the 123.45 literal,
while the first instruction loads project.m_rac.

Before the fucompp instruction (which compares two floating point values),
the floating point registers look like this:

```
st0 123.449999999999999997 (raw 0x4005f6e6666666666666)
st1 123.450000000000002842 (raw 0x4005f6e6666666666800)
```

The x87 floating point registers work as a stack: the first fldl
instruction loads the 123.450...2842 value and the second fldt instruction
pushes the 123.449...97 value on top of it. In the raw values, the first
8 bytes are identical, but the last two bytes differ slightly. Those bytes
hold the low bits of the 64-bit extended-precision significand, which the
53-bit significand of a double cannot represent, and the difference makes
fucompp report the two values as unequal.

This is normally not a problem, as floating point equality comparisons
are not expected to be stable. The problem here, however, is that it is
the **literal 123.45** whose value is off.
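
As a rough sketch of the mismatch above (a hypothetical illustration, not
code from the test suite; the helper names are made up here), the C
snippet below stores 123.45 as a double, widens it, and compares it with
the same literal kept at long double precision. It assumes nothing about
the target beyond standard C, so the second helper reports whether long
double is wider than double at all:

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the Gridcoin sources.
 * Returns true when a double holding 123.45 no longer compares equal to
 * the literal once the literal is kept at long double precision, which is
 * what the x87 comparison in the disassembly effectively does. */
bool literal_mismatch_after_widening(void)
{
    /* volatile forces the value to be rounded and stored as a double. */
    volatile double stored = 123.45;

    /* Widening is exact, so the double's rounding error is preserved. */
    long double widened = (long double)stored;

    /* 123.45L is rounded only once, directly to long double precision. */
    return widened != 123.45L;
}

/* Hypothetical helper: true when long double has more significand bits
 * than double (64 vs 53 bits for the x87 extended format). */
bool long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

On a target where long double is the x87 extended format, both helpers
should return true; where long double is the same as double, both should
return false.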

When a basic C program is compiled (with the -m32 option or in a 32-bit
environment), the loaded value for 123.45 is identical to the computed
value in the st1 register, which means something is going wrong either
during the runtime loading of the value or during the compile-time storing
of it. Oddly, GDB is unable to read that memory address, so I'm unable to
pinpoint which.

Because the issue lies in the extra bits of the FP register, I assumed
that -ffloat-store (which should have ensured that registers do not carry
more precision than a double), -fexcess-precision=standard (an option
which should be a superset of the former) or -mpc64 (an option which
rounds the significand of the results of FP ops to 53 bits) would have
fixed the issue, but they didn't, and I will not bother to re-examine the
disassembly to figure out why.

Instead, this commit enforces SSE2 instructions for FP operations, which
are already used on x86_64 hosts and operate in double precision instead
of long double. This should be fine compatibility-wise, as SSE2 is
supported on the Pentium 4 and later CPUs, and I assume older CPUs don't
have enough computing power to run the client in the first place.
---
 configure.ac | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/configure.ac b/configure.ac
index 8fa47af3b0..d0b56fb0e3 100755
--- a/configure.ac
+++ b/configure.ac
@@ -278,6 +278,16 @@ if test "$CXXFLAGS_overridden" = "no"; then
   AX_CHECK_COMPILE_FLAG([-Wdeprecated-copy],[CXXFLAGS="$CXXFLAGS -Wno-deprecated-copy"],,[[$CXXFLAG_WERROR]])
 fi
 
+
+dnl x87 FP operations on x86 hosts can break assumptions made about the floating point values.
+dnl See the commit message which introduced this change for more details.
+case $host in
+  i?86-*|x86_64-*)
+    AX_CHECK_COMPILE_FLAG([-msse2],[CXXFLAGS="$CXXFLAGS -msse2 -mfpmath=sse"],[AC_MSG_ERROR([SSE2 support is required on x86 targets.])],[[$CXXFLAG_WERROR]])
+  ;;
+  *) ;;
+esac
+
 enable_sse42=no
 enable_sse41=no
 enable_avx2=no