Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move dlopenflags handling into new library loader #247

Merged
merged 7 commits into from
Dec 10, 2023

Conversation

riconnon
Copy link
Contributor

@riconnon riconnon commented Oct 8, 2023

The C library is no longer dynamically loaded directly when _lupa is imported, so we need to move the special handling of dlopenflags into the _import_newest_lib() function in order to ensure they are set when the lupa C extension is dlopened.

Closes #246

@scoder
Copy link
Owner

scoder commented Oct 8, 2023 via email

@riconnon
Copy link
Contributor Author

riconnon commented Oct 8, 2023

By doing it in __getattr__ are you suggesting trying to solve for people having transitive C module dependencies within their C module? eg. their Lua depends on library A which also exports symbols that library B expects to find?

I, personally, don't use __getattr__ but am just doing require in my Lua code, which, with this change, works fine for a single C module dependency (cjson). Specifically, the reproduction steps in the linked issue are fixed.

I think the idea of moving it to a context processor for users with special needs seems decent.

@riconnon
Copy link
Contributor Author

riconnon commented Oct 8, 2023

Something I'd really like to do is have a test case for this behaviour.
I can't think of an easy way to do this without making the tests a lot fiddlier to run, though.

@riconnon riconnon force-pushed the fix-dlopenflags branch 2 times, most recently from 1beff57 to 74da7d3 Compare October 8, 2023 08:50
@riconnon
Copy link
Contributor Author

riconnon commented Oct 8, 2023

By doing it in __getattr__ are you suggesting trying to solve for people having transitive C module dependencies within their C module? eg. their Lua depends on library A which also exports symbols that library B expects to find?

I, personally, don't use __getattr__ but am just doing require in my Lua code, which, with this change, works fine for a single C module dependency (cjson). Specifically, the reproduction steps in the linked issue are fixed.

I think the idea of moving it to a context processor for users with special needs seems decent.

OH. I misunderstood...
You mean to handle, eg. from lupa import lua51

@riconnon
Copy link
Contributor Author

riconnon commented Oct 8, 2023

So I was trying to produce a local test case for dynamic linking with multiple lua versions available via from lupa import X.
It seems like that method only becomes available when using bundled Lua though and in that setup lua doesn't seem to allow dynamic linking at all:

python
Python 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import lupa
>>> lupa.LuaRuntime().execute("require('cjson')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lupa/lua54.pyx", line 412, in lupa.lua54.LuaRuntime.execute
  File "lupa/lua54.pyx", line 1736, in lupa.lua54.run_lua
  File "lupa/lua54.pyx", line 1750, in lupa.lua54.call_lua
  File "lupa/lua54.pyx", line 1776, in lupa.lua54.execute_lua_call
  File "lupa/lua54.pyx", line 1665, in lupa.lua54.raise_lua_error
lupa.lua54.LuaError: error loading module 'cjson' from file './cjson.so':
        dynamic libraries not enabled; check your Lua installation
stack traceback:
        [string "<python>"]:1: in main chunk
        [C]: in function 'require'
        [C]: in ?
>>>

@scoder
Copy link
Owner

scoder commented Oct 10, 2023

I updated the implementation in the master branch to make an explicit import in __getattr__. Now we can use the context manager there, too. Could you please rebase your PR branch?

@riconnon
Copy link
Contributor Author

I updated the implementation in the master branch to make an explicit import in __getattr__. Now we can use the context manager there, too. Could you please rebase your PR branch?

I've done this now. It's worth noting that if I understand the import mechanism correctly this won't work if someone does:

import lupa.lua54
lupa.lua54.LuaRuntime()

I suspect that in that case they'd have to use the context manager themselves...

lupa/__init__.py Outdated Show resolved Hide resolved
@riconnon
Copy link
Contributor Author

@scoder what's going on with the CI runs?
I'm not sure, but I think the failures aren't related to my changes. I also note that master branch commits aren't successful and that the 2.1 release didn't make it to PyPI

@scoder
Copy link
Owner

scoder commented Oct 10, 2023

setup lua doesn't seem to allow dynamic linking at all

I think we have to enable -DLUA_USE_DLOPEN in the CFLAGS. I'll add an option for that and enable it in the wheel builds.

@scoder
Copy link
Owner

scoder commented Oct 10, 2023

what's going on with the CI runs?

The master branch built just fine half an hour before.

the 2.1 release didn't make it to PyPI

It's not released yet. I'll try to get this PR in first.

@riconnon
Copy link
Contributor Author

@scoder I suspect that we can't eagerly globally load lots of different lua runtimes because they will have conflicting symbol names.

lupa/__init__.py Outdated Show resolved Hide resolved
lupa/__init__.py Outdated Show resolved Hide resolved
@scoder
Copy link
Owner

scoder commented Oct 11, 2023

I suspect that we can't eagerly globally load lots of different lua runtimes because they will have conflicting symbol names.

Hmm. That sounds reasonable – and annoying. Then … maybe we need to leave the usage of the context manager really to the users? Let them apply it where they need it, and leave it out everywhere else? The need to load a Lua binary module probably isn't very widespread.

We could still consider keeping it for the default module loading, i.e. when you just do import lupa and use the LuaRuntime from that. Reasoning being that existing code probably expects that to work and it's unlikely that users would then go and import other more specific version modules afterwards.

@scoder
Copy link
Owner

scoder commented Oct 11, 2023

dlopen() is not available on Windows. I fixed that by disabling the flag setting there.

macOS works fine, apparently, but the tests crash on Linux with Python 3.x (2.7 is ok as well, it seems). I couldn't reproduce this locally yet.

@riconnon
Copy link
Contributor Author

riconnon commented Oct 11, 2023 via email

@scoder
Copy link
Owner

scoder commented Oct 12, 2023

Triggering a fresh CI run against latest master.

@scoder scoder closed this Oct 12, 2023
@scoder scoder reopened this Oct 12, 2023
@scoder
Copy link
Owner

scoder commented Oct 12, 2023

You're right that symbols seem to get mixed up. Here's a local session, now that I got DLL loading working on my machine.

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lupa import lua53
>>> from lupa import lua54
>>> from lupa import lua52
>>> lua54.LuaRuntime().execute("require('cjson')")

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff70cae5f in luaH_getshortstr (t=0x555555d0abb0, key=0x0) at third-party/lua53/ltable.c:540
540       Node *n = hashstr(t, key);
(gdb) bt 20
#0  0x00007ffff70cae5f in luaH_getshortstr (t=0x555555d0abb0, key=0x0) at third-party/lua53/ltable.c:540
#1  0x00007ffff70cc20f in luaT_gettm (events=0x555555d0abb0, event=TM_GC, ename=<optimized out>) at third-party/lua53/ltm.c:60
#2  0x00007ffff70bd9ec in luaC_checkfinalizer (L=0x555555d03668, o=0x555555d0a620, mt=<optimized out>) at third-party/lua53/lgc.c:905
#3  0x00007ffff70b342e in lua_setmetatable (L=0x555555d03668, objindex=<optimized out>) at third-party/lua53/lapi.c:871
#4  0x00007ffff7f9cb4a in lua_cjson_new () from ./cjson.so
#5  0x00007ffff7f9f0ed in luaopen_cjson () from ./cjson.so
#6  0x00007ffff6fd4853 in precallC (f=0x7ffff7f9f0e0 <luaopen_cjson>, nresults=1, func=0x555555c60b30, L=0x555555d03668) at third-party/lua54/ldo.c:506
#7  luaD_precall (L=L@entry=0x555555d03668, func=<optimized out>, func@entry=0x555555c60b30, nresults=nresults@entry=1) at third-party/lua54/ldo.c:572
#8  0x00007ffff6fd4af2 in ccall (inc=65537, nResults=1, func=0x555555c60b30, L=0x555555d03668) at third-party/lua54/ldo.c:607
#9  luaD_callnoyield (L=L@entry=0x555555d03668, func=0x555555c60b30, nResults=nResults@entry=1) at third-party/lua54/ldo.c:627
#10 0x00007ffff6fca712 in lua_callk (L=0x555555d03668, nargs=<optimized out>, nresults=1, ctx=<optimized out>, k=<optimized out>) at third-party/lua54/lapi.c:1023
#11 0x00007ffff6fdb685 in ll_require (L=0x555555d03668) at third-party/lua54/loadlib.c:671
#12 0x00007ffff6fd4769 in precallC (f=0x7ffff6fdb5b7 <ll_require>, nresults=0, func=0x555555c60ad0, L=0x555555d03668) at third-party/lua54/ldo.c:506

Note that require() is correctly being executed in the Lua 5.4 runtime, but then cjson seems to get dynlinked against Lua 5.3 and uses that, which leads to the crash.

@scoder
Copy link
Owner

scoder commented Oct 12, 2023

So, back to my proposal in #247 (comment)

@riconnon
Copy link
Contributor Author

riconnon commented Oct 12, 2023 via email

lupa/__init__.py Outdated Show resolved Hide resolved
riconnon and others added 2 commits November 1, 2023 18:33
The C library is no longer dynamically loaded directly when `_lupa` is
imported, so we need to move the special handling of dlopenflags into
the `_import_newest_lib()` function in order to ensure they are set when
the lupa C extension is `dlopen`'d.
We also expose a context manager `eager_global_linking()` for users with
special needs to get the same behaviour.
scoder and others added 2 commits November 1, 2023 18:33
This was previously handled by a plain try-except at module level.
@riconnon riconnon force-pushed the fix-dlopenflags branch 2 times, most recently from 6d439b4 to 5a64885 Compare November 1, 2023 18:40
Enabling module loading by default doesn't work when there are multiple
Lua modules because the symbols collide with each other when loaded with
RTLD_GLOBAL.
Make the default use of the dlopenflags conditional on there only being
one available Lua module.
@riconnon
Copy link
Contributor Author

riconnon commented Nov 1, 2023

My only comment is that at present the only people who can load dynamically linked modules at all are those using nonbundled Lua and those people will not have multiple runtimes available. Can we detect that scenario and autoconfigure if its the case? -------- Original Message --------

On 12 Oct 2023, 10:52, scoder wrote: So, back to my proposal in [#247 (comment)](#247 (comment)) — Reply to this email directly, [view it on GitHub](#247 (comment)), or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

@scoder I've pushed up an attempt to do this. I don't know if it's going to pass CI and it seems that it's blocked behind your approval to execute. Certainly this commit works in the CI for my application, but that's very narrow in terms of platform/setup in that it runs a single Python version with a single version of system Lua

lupa/__init__.py Outdated Show resolved Hide resolved
@riconnon
Copy link
Contributor Author

riconnon commented Nov 2, 2023

@scoder I've just confirmed that b082b23 works correctly for my application.
Keen to keep this moving and get it merged/released. Let me know if there's anything I can do to help.

@scoder
Copy link
Owner

scoder commented Dec 10, 2023

Ok, then let's keep it like this.

@riconnon, could you write up some documentation for this in the readme? It seems worth explaining, also more generally why this is an issue that users need to take control of.

lupa/__init__.py Outdated Show resolved Hide resolved
lupa/__init__.py Outdated Show resolved Hide resolved
lupa/__init__.py Outdated Show resolved Hide resolved
@scoder scoder merged commit 042267d into scoder:master Dec 10, 2023
87 of 88 checks passed
@scoder scoder added this to the 2.1 milestone Dec 10, 2023
@riconnon riconnon deleted the fix-dlopenflags branch May 17, 2024 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error loading C modules in 2.x
2 participants