Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__embind_initialize_bindings is stripped from export table in Release + pthreads build #22968

Closed
laminowany opened this issue Nov 20, 2024 · 20 comments · Fixed by #23159
Closed

Comments

@laminowany
Copy link
Contributor

laminowany commented Nov 20, 2024

This is a regression in 3.1.70, on previously tested version 3.1.56 it worked correctly. My application gets undefined reference to __embind_initialize_bindings during runtime.

There is a code which adds __embind_initialize_bindings to exported symbols:

emscripten/tools/link.py

Lines 508 to 509 in 119a427

if settings.EMBIND:
settings.REQUIRED_EXPORTS.append('_embind_initialize_bindings')

but this seems to be stripped during call to binaryen:

out = run_binaryen_command('wasm-metadce',
wasm_file,
wasm_file,
args,
debug=debug_info,
stdout=PIPE)

The binaryen takes an DCEGraph as an input, and when I inspect the json of graph it has following entry:

{
    "name": "emcc$export$__embind_initialize_bindings",
    "export": "_embind_initialize_bindings",
    "reaches": []
 },

I can fix this issue by manually editing this intermediate file and changing this entry to:

   {
    "name": "emcc$export$__embind_initialize_bindings",
    "export": "_embind_initialize_bindings",
    "reaches": [],
    "root": true
  }

My guess: _embind_initialize_bindings should be marked with root: true but for some reason is not.

Another workaround is manually exporting this function via:
EXPORTED_FUNCTIONS=_main,__embind_initialize_bindings"
but I believe it should not be needed.

This issue is similiar to: #21844

@kripken
Copy link
Member

kripken commented Nov 20, 2024

Metadce can strip things that are not used, so somehow it is not seeing a use of that function, is my best guess. Do you have a testcase showing the issue that we can investigate?

cc @brendandahl

@sbc100
Copy link
Collaborator

sbc100 commented Nov 20, 2024

@laminowany can you share more details about your use case? What link flags are you using? I assume you are using pthreads since that is the only place _embind_initialize_bindings is called.

@laminowany
Copy link
Contributor Author

laminowany commented Nov 21, 2024

I don't really have minimal reproducible case yet, I was trying to build https://github.com/mitchcurtis/slate with Qt dev built from source.

Linker flags are following:
em++.py -O3 -DNDEBUG -s MODULARIZE=1 -s EXPORT_NAME=app_entry -s EXPORTED_RUNTIME_METHODS=UTF16ToString,stringToUTF16,JSEvents,specialHTMLTargets,FS,callMain -s PTHREAD_POOL_SIZE=4 -s INITIAL_MEMORY=50MB -s MAXIMUM_MEMORY=4GB -s MAX_WEBGL_VERSION=2 -s FETCH=1 -s WASM_BIGINT=1 -s STACK_SIZE=5MB -pthread -s ALLOW_MEMORY_GROWTH -sASYNCIFY_IMPORTS=qt_asyncify_suspend_js,qt_asyncify_resume_js -s ERROR_ON_UNDEFINED_SYMBOLS=1 -Wno-pthreads-mem-growth

@sbc100
Copy link
Collaborator

sbc100 commented Nov 21, 2024

I don't see either -lembind or --bind on the command line there, which mean that program doesn't look like its using embind at all.

The resulting code should therefor not contains a call to _embind_initialize_bindings in the pthread startup code:

#if EMBIND
#if PTHREADS_DEBUG
dbg(`worker: Pthread 0x${_pthread_self().toString(16)} initializing embind.`);
#endif
// Embind must initialize itself on all threads, as it generates support JS.
// We only do this once per worker since they get reused
__embind_initialize_bindings();
#endif // EMBIND
.

Can you confirm where the call from _embind_initialize_bindings is coming from? Is it the pthread startup code, or somewhere else?

If possible perhaps you can bisect to figure out which change caused the regression: https://emscripten.org/docs/contributing/developers_guide.html#bisecting

@laminowany
Copy link
Contributor Author

laminowany commented Nov 28, 2024

Sorry for late reply, but I needed to double check that the flags I got are right. I missed the -lembind flag because it was hidden between list of *.a files on linker argument list.

So the total list of flags looks like this:

-O3 -DNDEBUG -s MODULARIZE=1 -s EXPORT_NAME=app_entry -s EXPORTED_RUNTIME_METHODS=UTF16ToString,stringToUTF16,JSEvents,specialHTMLTargets,FS,callMain -s PTHREAD_POOL_SIZE=4 -s INITIAL_MEMORY=50MB -s MAXIMUM_MEMORY=4GB -s MAX_WEBGL_VERSION=2 -s FETCH=1 -s WASM_BIGINT=1 -s STACK_SIZE=5MB -pthread -s ALLOW_MEMORY_GROWTH -sASYNCIFY_IMPORTS=qt_asyncify_suspend_js,qt_asyncify_resume_js -s ERROR_ON_UNDEFINED_SYMBOLS=1 -Wno-pthreads-mem-growth -lembind -o app/app.js

The application depends on Qt framework which uses embind internally, and all Qt projects will get this flag set by default.
This is done by:

https://github.com/qt/qtbase/blob/a13eb28329b08c53085bc2902c4a99d0758f972c/src/corelib/Qt6WasmMacros.cmake#L187

initialize_bindings is being called from pthread startup code:
image

@sbc100
Copy link
Collaborator

sbc100 commented Dec 2, 2024

Can you link with EMCC_DEBUG=1 and attach the output?

@laminowany
Copy link
Contributor Author

Output looks following: https://pastebin.com/A9aVVTx7

Relevant part is here I guess:

building:DEBUG: unused_exports: ['_embind_initialize_bindings', '_emscripten_thread_crashed', 'emscripten_main_runtime_thread_id', 'emscripten_main_thread_process_queued_calls']

@sbc100
Copy link
Collaborator

sbc100 commented Dec 3, 2024

Can you attach a copy of /tmp/emscripten_temp/emcc-02-original.js after running the build? (or maybe the entire /tmp/emscripten_temp directory)

@laminowany
Copy link
Contributor Author

Here is the entire /tmp/emscripten_temp/ dir: https://drive.google.com/file/d/1CmDsPpUk2srMFzeuFSYIcha0RkTF2qsI/view?usp=sharing

@laminowany
Copy link
Contributor Author

From what I tested the breakage was somewhere between 3.1.56 and 3.1.58 version.

@laminowany
Copy link
Contributor Author

This seems to be a culprit: 37e1e68

@sbc100
Copy link
Collaborator

sbc100 commented Dec 13, 2024

There is some kind of bug in DCE here though since _embind_initialize_bindings should never be detected as part of unused_exports since it is clearly used in the output right? So metadce must be somehow confused about that symbol.

@kripken
Copy link
Member

kripken commented Dec 13, 2024

@sbc100 that does sound likely. Looking at the emcc-NN-* temp files, emcc-09-AJSDCE.js looks ok (there is a use and a definition of that function), while the next JS file, emcc-11-applyDCEGraphRemovals.js looks bad (there is only a use, no def). I'm looking into it.

@laminowany
Copy link
Contributor Author

laminowany commented Dec 13, 2024

This is fix for me: https://github.com/emscripten-core/emscripten/compare/main...laminowany:emscripten:22968?expand=1

But I dont know if its a "proper" fix.

The problems is that here it gets filtered out and not marked as a root:

required_symbols = user_requested_exports.union(set(settings.SIDE_MODULE_IMPORTS))

While it was rooted in previous emscripten

Emscripten 3.1.56 was doing something similiar:
image

@kripken
Copy link
Member

kripken commented Dec 13, 2024

@laminowany That works around it, but I think there is something else going on here.

I am pretty sure the issue is that emitDCEGraph has a bug where functions with the same name, but in different scopes, are mixed up. Specifically, in this JS, there are two handleMessage functions, one for pthreads - which includes the call to this embind method - and one for SOCKFS. The SOCKFS one tramples the former, so the DCE graph no longer has anything that reaches that embind method.

This diff renames the SOCKFS one:

diff --git a/src/library_sockfs.js b/src/library_sockfs.js
index e5c1b8674..5d33996ab 100644
--- a/src/library_sockfs.js
+++ b/src/library_sockfs.js
@@ -287,7 +287,7 @@ addToLibrary({
           }
         };
 
-        function handleMessage(data) {
+        function handleSocketMessage(data) {
           if (typeof data == 'string') {
             var encoder = new TextEncoder(); // should be utf-8
             data = encoder.encode(data); // make a typed array from the string
@@ -331,7 +331,7 @@ addToLibrary({
             if (!isBinary) {
               return;
             }
-            handleMessage((new Uint8Array(data)).buffer); // copy from node Buffer -> ArrayBuffer
+            handleSocketMessage((new Uint8Array(data)).buffer); // copy from node Buffer -> ArrayBuffer
           });
           peer.socket.on('close', function() {
             SOCKFS.emit('close', sock.stream.fd);
@@ -351,7 +351,7 @@ addToLibrary({
             SOCKFS.emit('close', sock.stream.fd);
           };
           peer.socket.onmessage = function peer_socket_onmessage(event) {
-            handleMessage(event.data);
+            handleSocketMessage(event.data);
           };
           peer.socket.onerror = function(error) {
             // The WebSocket spec only allows a 'simple event' to be thrown on error,

@laminowany can you see if that fixes it? It does for me locally, but I am just running the specific tools not the full emcc command.

Assuming that fixes it, the proper fix is to fix emitDCEGraph's scoping.

@sbc100
Copy link
Collaborator

sbc100 commented Dec 13, 2024

Nice work find that alon! What a horrible bug

@laminowany
Copy link
Contributor Author

I tested your patch @kripken and it does solve a problem on my machine, thanks for looking into it!

@kripken
Copy link
Member

kripken commented Dec 13, 2024

@laminowany Ok, #23159 should fix this properly. If it's not too much trouble, can you try that PR on your code?

@laminowany
Copy link
Contributor Author

I tested with my code and #23159 does fix it for me!

@kripken
Copy link
Member

kripken commented Dec 13, 2024

Perfect, thanks @laminowany ! I'll land it as soon as tests pass.

hedwigz pushed a commit to hedwigz/emscripten that referenced this issue Dec 18, 2024
…en-core#23159)

Other than simple defined functions (function foo() {}), JS also has
arrow functions and object methods. We need to be aware of those
in metadce.

This fixes emscripten-core#22968 which was an issue with an object method that
enclosed a function of the same name as another toplevel function.
To fix this, this PR makes us aware of such scopes, and we do not
optimize functions in them. That is, we only ever optimize defined
functions at the toplevel scope, and anything enclosed is considered
fixed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants