-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propertly Truncate Paired Single Move Instructions #13059
base: master
Are you sure you want to change the base?
Conversation
Can you add a unit test matching the tests you ran on hardware? That would help show what has been tested and prevent any future regressions. Also, the second commit should be squashed into the first one. |
I can probably commit to the hwtests repo if I can figure out how to get it set up ^^; |
While it would definitely be good to have that in the hwtests repo, I was primarily thinking about here which gets run on every PR. |
I'm mildly afraid to try that bc this is interpreter-only and would require an entire thing somehow setting that up :,D |
I would very much like a hwtest. And if we have that, I don't think there's any need to require a unit test on top. Yes, being able to automatically test stuff is nice, but:
|
I've confirmed with your hwtest (dolphin-emu/hwtests#56) plus isync that the ps_merge*/ps_mr/ps_neg/ps_abs/ps_nabs changes match hardware. If you move the changes for those opcodes (and the shared code) into a separate PR, we can get that merged now and leave ps_sel/ps_rsqrte/ps_sum0/ps_sum1 verification for later. |
Oh right I forgot about sum0 and sum1, ty for reminding me ^^ -- I was planning try to figure out what in the world is failing the 3 specific tests with ps_rsqrte first though :O |
Based on some limited testing, I think ps_sel doesn't have the NaN/infinity special case. |
I did get the tests fully working, but the changes might be larger and more complicated than initially expected ^^; I'm going to abstract as much away as possible though |
ARM tests are actually failing which I think is really funny ^^; I would really like more information to see what exactly is failing and can try to fix the JIT (or my own code!!) to resolve the issue need be!! |
You can see it in the build log for the failing build:
|
I can check on hardware quickly, but at least from the |
Thinking about it, actually I do need to modify the JIT if it ends up calling this function!! x86-64 definitely does this at least... |
Jit64 (but not JitArm64) calls ApproximateReciprocal and ApproximateReciprocalSquareRoot for edge cases, but what's also worth noting is that both JITs read from fres_expected and BaseAndDec. Note that while the unit test is set up to check that the JIT gives the same results as the FloatUtils functions, that's just the way this test happens to be written - we have no absolute policy that the behavior of the JIT and interpreter must match. But we do try to make the JIT and interpreter match when it's feasible performance-wise. On the other hand, we do have a requirement that the behavior of Jit64 matches the behavior of JitArm64, so we don't break cross-architecture netplay. I'd be happy to help out with updating the JITs to match your changes to the interpreter, but if you'd like me to take the lead on it, you may have to wait a bit before I'll have time. |
I'm going to try my best as of now for x86-64 at the very least because the outputs of that function seem to match those failing in ARM64 on my machine :o It's so weird too... |
Although the build should pass, note that this PR can't really be submitted yet until it's decided for JITs in the |
Matches hardware in rounding paired singles after move operations!! Move operations consist of operations which only transfer direct bits This also accounts for ps_rsqrte, because it has a similar quirk Specifically in hardware they're rounded in accordance to their slot: - PS0 rounds *only the mantissa* in accordance to the set rounding mode - PS1 truncates the mantissa ps_rsqrte also truncates for PS0 for some reason ^^; This has all been tested on hardware, along with a few edge case tests Co-Authored-By: JosJuice <[email protected]>
5161a38
to
c512ae1
Compare
Also changes >= 0 to > 0.0 This technically leads to fewer branches taken ^^; More importantly it looks/feels nicer to me Fixes the approximate reciprocal function - Currently not optimized - Considering rewrite for cleanliness Moves PS rounding to FloatUtils - Done because it's used in more places now Changes TruncateMantissa to occur on read - This is to account for reciprocal cases Adds PS1 getting function for reciprocals Fixes ps_sum1 edge case with rounding TODO: Test what ops can set PS1 edge case - ps_merge is known to be able to
Assume NI Set For Unit Tests This does *not* match x86-64, which properly handles any weird values using a function call This should hopefully pass tests though, which is important before fixing that issue I had forgotten that the JITs would use the same modified base and pair tables ^^; Also fixes call for complex inputs in x86 This saves an instruction on both x86-64 and ARM64!! TODO: Due to fixes with interpreter, ARM64 JIT likely doesn't match x86 JIT which calls a fallback on weird inputs
Matches hardware in rounding paired singles after move operations!!
Move operations consist of operations which only transfer direct bits (including absolute values and merges)
This also accounts for
ps_rsqrte
because it has a similar quirkSpecifically in hardware they're rounded in accordance to their slot:
ps_rsqrte
also only truncates for PS0 for some reason ^^;This has all been tested on hardware, along with a few edge case tests to ensure this implementation works