-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AP_Scripting: reduce Lua state memory usage #27426
Conversation
Saves ~100B of statically allocated RAM.
They are never accessed from globals. Only their metatables are accessed, using luaL_getmetatable. Saves ~2.9K of Lua heap.
The __call metamethod was set to the metatable itself. With __call not present, Lua will try to call the metatable (and fail), which is the same behavior as with the __call metamethod set to the metatable. Saves ~2K Lua heap.
The global table is then used as the __index metamethod of each state's environment table. Avoids the overhead of loading binding objects into each state. The binding objects are immutable from Lua so sandboxing is not violated. Does have the slight downside that a script can no longer know all the binding names by enumerating _ENV. Saves ~700B of memory per loaded script.
Nice. Looks like this applies the same optimization we did for singletons to globals so we can keep them in flash rather than memory. I think the only thing we need to check is how fast it is before vs after. I did do a little testing when we moved the singletons here: In that case it did approximately double the time taken to call a method. My concern here would be that we pay that on every call. Looks like ahrs is at the end of the list, so it will be the worst case. Its not clear to me if we have to do the search for each call or just the first? There is a What happens if we overload a singleton? EG |
Like the commit explains, the lookup result is stored in the global table so there should only be an additional cost the first time a particular singleton is accessed. I will look into timing it to see if I can notice a difference. There might be a small new cost each call since the name has to fail to be found in the script environment before Lua looks in the global table. If this turns out to be unacceptable we can store the result in the environment at the cost of a bit more RAM usage. If a script overrides a singleton name, then it will be successful, but will only affect that script as each script has its own environment separate from the global table. If the script deletes the override then it can access the singleton properly again. |
Did some timing with the following script: function update()
local start_t = micros()
local lt = 0
for i = 0, 2000 do
-- Location userdata creation function at the bottom of the table serach, so
-- should be a worst case test for the lookup
-- lt = Location -- Reference
-- lt = Location() -- Instantiate
end
local end_t = micros()
gcs:send_text(0, "took: " .. tostring(end_t-start_t) .. "us val:"
.. tostring(lt))
return update, 1000
end
return update() And got some interesting results (times in microseconds):
Turns out that extra lookup costs more than I thought. It is probably faster overall in both cases in the instantiate case because the garbage collector has to do less work, so this should be a decent speed improvement overall. Based on these results I am changing this PR to put the looked up entries into the script environment table instead of the global table. There is going to be a slight downside in that scripts now won't share userdatas, so if two scripts use the same singleton it will be ~32 more bytes than before. But that concern is probably overthinking it. |
9847b85
to
4f13c81
Compare
Only create the binding object (singleton metatable/userdata or C function reference) once the user first references a particular singleton or userdata creation function. Once created, the object is stored into the script's environment so it doesn't get recreated on the next reference and there isn't any further overhead. The userdatas are no longer shared between scripts which imposes a slight memory penalty for multiple scripts using the same singleton but this avoids an additional lookup time cost. Userdata and ap_objects aren't eligible for this optimization as the C++ code might want a particular metatable at any time. Saves ~9.3K Lua heap.
4f13c81
to
b414c6b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything works expected in my testing. This saves both run time and memory. I guess the more userdata items a script uses the closer it will approach the current code. This simply loads into memory on demand rather than all of them up front.
We could probably use this method of loading for the userdata level methods too, we already save all the memory there but we could trade some of that back for better runtime.
This does make the logged per run memory usage a little more confusing because if you use a userdata for the first time the memory of loading that will be included, previously that was accounted for in the initial load memory.
This does also mean that a script that may previously have run out of memory immediately when it was loaded is now likely to load and then run out at runtime.
A few re-engineerings to make the state setup more efficient and on-demand. Please see the commits for details.
There are no visible changes to scripts except for ones that play games reading
_ENV
, which none in the repo do. There might be a slight downside of increased risk of fragmentation and out of memory problems at inopportune moments, but I think the savings are worth it.Tested some scripts including my REPL in SITL and on Cube Orange.
Total savings on Cube Orange: