VR4300 JITTER #1116
Comments
If somebody wants to try it (Linux only for now), you can build it by passing VR4300_JITTER=1 as an argument to make.
Hey, cool to see someone else bothering with writing a new JIT as well. Now I just need to find some people to help me wrap up my AOT recompiler lol
To test performance, I modified the graphics plugin to render normally for ~60 seconds and then render only every 3600th frame, so that I'm not GPU bottlenecked. I build an unmodified version of the emulator, run it with this plugin and some game, and take measurements. Then I build my version of the emulator, run it with the same plugin and game, and take the same measurements. I recently pushed a fix which also improves games that modify the FR status bit a lot, plus some other minor fixes.
Note: I set ACCURATE_FPU=0 for both new_dynarec and vr4300_jitter. new_dynarec does not seem to implement this feature, so to be fair I turned it off (please correct me if I'm wrong).
I'm also comparing the pure interpreter and my JIT using the core compare feature (though not while doing performance tests). Some games still show issues, but in others the two are equal (the other recompilers aren't, afaict, unless I'm doing something wrong). I have run some games for multiple hours with the compare feature to see if any errors pop up. I haven't done that recently though, so it might have regressed.
Tbh new_dynarec and vr4300_jitter have comparable speeds. That said, as an example, Mario 64 (with the above 60 seconds method) runs at roughly 4650 VI/s after the intro cutscene with new_dynarec and 5400 VI/s with vr4300_jitter on my machine, and it's similar in other situations. It may very well be an outlier.
There are probably still lots of instructions with untested cases. At some point I hope to implement some sort of unit testing which would compare the instructions (maybe with some randomization) against what the pure interpreter does, so that I can find all the cases and fix them.
Ofc there are still glaring issues with vr4300_jitter: no arm64 support at the moment, no support outside of Linux (as of yet), and a lack of widespread testing. However, all these things can be addressed.
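The randomized comparison against the pure interpreter mentioned above could look roughly like the following. This is only a minimal sketch: `ref_addu` and `jit_addu` are hypothetical stand-ins for "what the interpreter does" and "what the JIT emits" for a single instruction (ADDU), and none of the names come from mupen64plus.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference: stands in for the pure interpreter's ADDU,
 * which adds the low 32 bits and sign-extends the result to 64 bits. */
static int64_t ref_addu(int64_t rs, int64_t rt)
{
    uint32_t sum = (uint32_t)rs + (uint32_t)rt;
    return (int64_t)(int32_t)sum;
}

/* Hypothetical implementation under test: stands in for the JIT's output. */
static int64_t jit_addu(int64_t rs, int64_t rt)
{
    return (int64_t)(int32_t)((uint32_t)rs + (uint32_t)rt);
}

/* xorshift64: a small deterministic PRNG, so a failing seed can be replayed. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Runs fixed edge cases plus `iterations` random operand pairs and
 * returns the number of mismatches between the two implementations. */
static unsigned compare_addu(uint64_t seed, unsigned iterations)
{
    static const int64_t edges[] = { 0, 1, -1, INT32_MAX, INT32_MIN, INT64_MAX, INT64_MIN };
    const unsigned n_edges = sizeof edges / sizeof edges[0];
    unsigned mismatches = 0;

    assert(seed != 0); /* xorshift never leaves the all-zero state */

    for (unsigned i = 0; i < n_edges; i++)
        for (unsigned j = 0; j < n_edges; j++)
            if (ref_addu(edges[i], edges[j]) != jit_addu(edges[i], edges[j]))
                mismatches++;

    for (unsigned i = 0; i < iterations; i++) {
        int64_t rs = (int64_t)next_random(&seed);
        int64_t rt = (int64_t)next_random(&seed);
        if (ref_addu(rs, rt) != jit_addu(rs, rt))
            mismatches++;
    }
    return mismatches;
}
```

In a real harness the second function would presumably execute a freshly recompiled block and the comparison would cover the whole register file, not a single return value.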
Also, vr4300_jitter does not have some of the optimizations that new_dynarec has, such as treating certain blocks differently depending on whether they use only 32-bit or only 64-bit registers. I haven't fully understood this optimization yet (as you can probably tell from my ramblings), so it's not implemented. Still, performance is comparable and sometimes better. I can do other performance comparisons if anybody is interested or has a more rigorous testing method for me.
And thank you for your reply, I was worried nobody would even look at it. If there is a more direct way of communication, let me know as well.
You might want to test with stop_after_jal = 1 (see new_dynarec.c). This makes the blocks shorter, and recompilation times shorter too. Usually that reduces perf, but in your test it might improve it. In SM64, things get loaded, invalidated and recompiled all the time because of the overlays, and block size, block linking etc. have a considerable impact on those measurements.
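To illustrate why a flag like this shortens blocks, here is a rough sketch of a block scanner. It is an assumption about the general idea, not the actual logic in new_dynarec.c: `block_length` and its helpers are hypothetical, and only J, JAL and JR are treated as block terminators (JALR, branches and exceptions are ignored for brevity).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIPS primary opcodes live in bits 31..26, SPECIAL functs in bits 5..0. */
#define OP_SPECIAL 0x00u
#define OP_J       0x02u
#define OP_JAL     0x03u
#define FUNCT_JR   0x08u

static bool is_j(uint32_t insn)   { return (insn >> 26) == OP_J; }
static bool is_jal(uint32_t insn) { return (insn >> 26) == OP_JAL; }
static bool is_jr(uint32_t insn)
{
    return (insn >> 26) == OP_SPECIAL && (insn & 0x3fu) == FUNCT_JR;
}

/* Returns how many instructions a block starting at code[0] would contain.
 * The block ends after an unconditional jump plus its delay slot. When
 * stop_after_jal is set, a JAL (a call that normally returns to the next
 * instruction) also ends the block, which yields shorter blocks. */
static size_t block_length(const uint32_t *code, size_t count, bool stop_after_jal)
{
    assert(code != NULL);
    for (size_t i = 0; i < count; i++) {
        bool ends = is_j(code[i]) || is_jr(code[i]) ||
                    (stop_after_jal && is_jal(code[i]));
        if (ends) {
            size_t len = i + 2; /* the jump itself plus its delay slot */
            return len < count ? len : count;
        }
    }
    return count;
}
```

For a sequence such as `addiu; jal; nop; addiu; jr ra; nop`, this scanner yields one block of six instructions without the flag and a block of three with it, so less code has to be recompiled each time an overlay invalidates it.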
@krnlyng Don't worry about comparing it to the new dynarec. That code is an unmaintainable mess, and most people involved with mupen64plus would be happy with a better replacement. If you want to discuss, most of the people involved with mupen64plus are in this Discord server (#emu_mupen channel):
Like this? It runs at around 4300VI/s then. |
Yeah, like that. I thought that would reduce the overhead for this kind of measurement, but I guess it doesn't.
What's your implementation of the counter reg? Did you base it on new dynarec?
I tried to keep it matching the pure interpreter. The core compare feature wouldn't work if it didn't match the pure interpreter. It can and should be improved for sure.
iirc the ones that desync with new_dynarec had some interpret_ flag, but I can't remember whether @Gillou68310 made it fully match back then. We started a bit of an effort at the time but didn't complete it.
One thing you could test is how both JITs perform at varying count per op levels.
Oh, I didn't see that you did a fastmem implementation. That would explain it. We actually started an implementation for new dynarec before too, but ran into some issues.
Hi, I don't know if anybody is interested in this at all, but I've been working on a dynamic recompiler for mupen64plus.
It's very much WIP and currently x86_64 only, and some code is based on dolphin. I thought it made sense to share a snapshot, although it still needs lots of improvements. That said, some games do run faster with it.
Here is the code:
https://github.com/krnlyng/mupen64plus-core/tree/vr4300_jitter_snapshot
And some details:
WIP VR4300_JITTER.
It's a dynamic recompiler (currently x86_64 only) which is based on ideas
from the dolphin emulator, also reusing some of its code.
It's very much WIP but i thought i'd share a snapshot in case somebody is
interested.
Features:
feature).
interpreter, all while running at similar speeds and sometimes (up to
10%-15%) faster than the other recompilers in mupen (when running in
unlimited mode and with a graphics plugin modified to only draw every
3600 frames or so, to avoid GPU bottlenecks).
matches that of the pure interpreter of mupen. But i hope to do some tests
sometime in the future and improve the accuracy.
implementation).
table (similar to dolphin).
tweaks for vr4300).
change in the future.
addresses or virtual addresses are handled via faults (SIGSEGV on Linux).
this currently performs slower than without so it's disabled until further
investigations have been done.
register cache) are optimized and do not need fault handling.
generated from code at runtime.
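The fault-based memory handling mentioned in the list above (SIGSEGV on Linux) can be sketched roughly as follows. This is a simplified illustration and not the code in the snapshot: all names are hypothetical, and the handler simply makes the faulting page accessible, whereas a real JIT would more likely redirect the faulting access to a slow path. Calling `mprotect` from a signal handler is not guaranteed async-signal-safe by POSIX, although it commonly works on Linux.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *guest_base;               /* start of the reserved guest window */
static size_t guest_size;                 /* size of the window in bytes */
static size_t page_size;                  /* cached so the handler need not query it */
static volatile sig_atomic_t fault_count; /* how many faults were handled */

/* SIGSEGV handler: if the faulting address lies inside the guest window,
 * unprotect its page and return so the faulting instruction is retried. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uint8_t *addr = (uint8_t *)info->si_addr;
    (void)sig;
    (void)ctx;

    if (addr < guest_base || addr >= guest_base + guest_size) {
        /* Not a guest access: restore the default action so the retried
         * instruction crashes the process as usual. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    fault_count++;
    uint8_t *page = (uint8_t *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
}

/* Reserves an inaccessible window and installs the fault handler. */
static int fastmem_init(size_t size)
{
    struct sigaction sa;
    void *p = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    guest_base = p;
    guest_size = size;
    page_size = (size_t)sysconf(_SC_PAGESIZE);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* "Fast path" read: a plain load with no checks on the hot path. */
static uint32_t guest_read32(size_t offset)
{
    assert(offset + sizeof(uint32_t) <= guest_size);
    return *(volatile uint32_t *)(guest_base + offset);
}
```

The appeal of this approach is that accesses to ordinary memory cost a single load or store, and only the unusual accesses pay for a fault.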
Un-features:
up other platforms.
mappings, which in itself is not a problem but on platforms where this can't
be supported there is currently no fallback.
There is probably more i can't think of right now.