AP_Scripting: fixed hard fault on handling of lua exception #27056
I have done some testing on a CubeOrangePlus. Both with the example script in this PR and the Master: This PR: Master with both the stash and pop removed: Master with stash, pop and O0 optimize removed from lua: Master with stash, pop and O0 optimize removed from lua and from the scripting thread: So, from my testing, it's part of the last fix that causes this issue.
this exercises rapid fault handling
Force-pushed from 1d60829 to c831b98
@IamPete1 as discussed, I think the save of the extra fp registers is needed. A proper test suite for setjmp/longjmp would be nice to have!
Registers s0-s15 are caller-saved, so theoretically
Can you point me at a reference for that?
this function with O0 opt:
with optimisation:
the register save must happen before the setjmp() call, which means outside of the LUAI_TRY() macro. We also should be saving all 32 floating point registers
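The ordering described in this commit message can be sketched in portable C. This is only an illustration, not the code from this PR: `fp_regs_t`, `fake_fpu`, `save_fp_regs()`, `restore_fp_regs()`, `faulting_call()` and `run_protected()` are hypothetical names, and the 32 floating point registers are simulated with a plain array, since the real save/restore would be target-specific assembly. The point it shows is that the snapshot is taken before `setjmp()` is called, so it is already complete when `longjmp()` returns control to the handler.

```c
#include <setjmp.h>

/* Hypothetical stand-in for the 32 single-precision FP registers.
 * On real hardware the save/restore would be done in assembly. */
typedef struct {
    float s[32];
} fp_regs_t;

static fp_regs_t fake_fpu;   /* simulated register file (assumption) */
static jmp_buf handler;      /* recovery point for the protected call */

static void save_fp_regs(fp_regs_t *out)
{
    *out = fake_fpu;
}

static void restore_fp_regs(const fp_regs_t *in)
{
    fake_fpu = *in;
}

/* Simulates a call that clobbers every FP register and then raises an
 * error by jumping back to the handler. It never returns normally. */
static void faulting_call(void)
{
    for (int i = 0; i < 32; i++) {
        fake_fpu.s[i] = -1.0f;
    }
    longjmp(handler, 1);
}

/* Returns 0 if the call completed, 1 if an error was caught. */
int run_protected(void)
{
    fp_regs_t saved;

    /* Save BEFORE setjmp(): 'saved' is not modified after this point,
     * so its contents are still valid after longjmp(). */
    save_fp_regs(&saved);

    if (setjmp(handler) == 0) {
        faulting_call();
        return 0;
    }

    /* Error path: put the registers back as they were before the call. */
    restore_fp_regs(&saved);
    return 1;
}
```

Keeping the snapshot in a local that is written only before `setjmp()` also avoids the C rule that non-volatile locals modified between `setjmp()` and `longjmp()` have indeterminate values afterwards.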
Force-pushed from c831b98 to ea5418e
I test flew this today on a CubeOrange quadplane with 4 lua scripts.
CI is being very slow, but this PR has passed CI in my personal repo:
This PR adds a test that exercises rapid fault handling, and fixes the hard fault by moving the save of the extra registers to before the setjmp call and adding save/restore of the first 16 fp registers.
The test was derived from the networking web server. The web server triggered this issue because it continues running after a fault and continually calls async functions (server side lua calls) that may fault. Most lua scripts stop when they fault, so they only get one chance to trigger the bug. The web server gets lots of chances per second to trigger it.
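The "keeps running after a fault" pattern can be sketched in C as a loop that re-arms a recovery point for every request. This is a hedged illustration rather than the test script or the web server code: `serve_requests()`, `maybe_faulting_handler()` and the rule that every third request faults are assumptions made up for the example.

```c
#include <setjmp.h>

static jmp_buf recover_point;     /* re-armed for every request */
static volatile int call_count;   /* handler invocations in the last run */

/* Hypothetical handler: every third request raises an error. */
static void maybe_faulting_handler(int request)
{
    call_count++;
    if (request % 3 == 0) {
        longjmp(recover_point, 1);
    }
}

/* Serves requests 1..n, carrying on after each fault.
 * Returns the number of faults that were caught. */
int serve_requests(int n)
{
    /* volatile so the values are well defined after longjmp() */
    volatile int faults = 0;
    volatile int i;

    call_count = 0;
    for (i = 1; i <= n; i++) {
        if (setjmp(recover_point) == 0) {
            maybe_faulting_handler(i);
        } else {
            faults++;   /* fault caught, keep serving */
        }
    }
    return faults;
}
```

A script that stops on its first error goes through the recovery path once. A loop like this goes through it on every faulting request, so any state that is restored incorrectly after `longjmp()` has many more opportunities to cause a crash.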
Without the fix, this lua script hard faults in a few seconds.
This bug has been confirmed to happen with 4.3.x and 4.4.x as well as current stable 4.5.x.