JIT compiled frame symbols are missing #226
Comments
Hi, I’d be happy to support JIT code. I briefly looked into this once before when someone else asked about it but I didn’t have time to dive very far into it. I’d be happy to accept a PR if you’re interested or I might be able to take a look next week. |
As a side note, I’m a huge fan of Jank :) |
My man! Not much overlap of C++ and Clojure folks. :)
If this is something you're open to tackling, I'd rather leave this one to you. I'm happy to help test and I can also provide some guidance for how you can test this with some actual JIT compiled code. If you ultimately don't have the resources, I can look into getting someone from the jank community to help out. |
I'm definitely happy to tackle this and try to get it working for Jank. It would probably be easiest for me to test this workflow with a local build / instrumentation of Jank so I don't have to set up a simple LLVM JIT myself. If you have any pointers for a good way to test this I'd be interested, otherwise I can explore :) |
You bet. I have merged cpptrace into main now. So, if you can compile jank from source, you should be able to then iterate on the cpptrace submodule. The build instructions are here: https://github.com/jank-lang/jank/blob/main/compiler+runtime/doc/build.md
In short, please make sure you're on either Linux or macOS x86_64. LLVM has an issue with exception unwinding across JIT frames on aarch64 right now. 😦 After installing your build deps, just run this (in
./bin/configure -GNinja -DCMAKE_BUILD_TYPE=Debug
./bin/compile
Then, to replicate the missing symbol issue, create a
(ns test)
(defn foo []
  (throw "meow"))
(defn -main [& args]
  (foo))
(-main)
And run it with
You can verify that
The cpptrace submodule is in
cd third-party/cpptrace
git remote add upstream https://github.com/jeremy-rifkin/cpptrace
git checkout -b jit-symbols
# Make your changes...
git push upstream jit-symbols
Whenever you make changes to cpptrace, you can run
Happy to help if you have any issues, since you're doing me a solid here. |
Thanks! I'll give this a try once able! |
I gave this a shot locally but I seem to be running into errors.
Here's what I did:
I picked
Have you seen this before? |
Yep, I've seen this one. It generally happens when the user doesn't have a proper clang/llvm setup. Where did you get that clang/llvm 19? |
I did this, but it's definitely possible I have something messed up on the machine I'm testing with
Logs did indicate the right clang appeared to be used |
Is this on an old distro, like Ubuntu 22.04 or something? |
Yes indeed, I'm on ubuntu 22.04. Is 24 needed? (I'd be surprised if that were the case) |
I think that's the issue. Likely due to the gcc libs installed from 22.04, namely libstdc++, since that's what clang will be using. Try this (right from your jank directory -- distrobox is amazing):
distrobox create jank-ubuntu --image ubuntu:24.10
distrobox enter jank-ubuntu
sudo apt-get install -y curl git git-lfs zip build-essential entr libssl-dev libdouble-conversion-dev pkg-config ninja-build cmake zlib1g-dev libffi-dev clang libclang-dev llvm llvm-dev libzip-dev libbz2-dev doctest-dev gcc g++ libgc-dev
export CC=clang; export CXX=clang++
./bin/configure -GNinja -DCMAKE_BUILD_TYPE=Debug
./bin/compile
I just ran these exact commands on my machine, and everything compiled cleanly.
Clang is just the compiler. The standard lib package (which comes from gcc) you're using is 3 years old, back when C++20 features were still very new. So it's understandable when we run into C++20 issues on such an old distro. |
Thanks! A different libstdc++ makes sense. I was able to get it to build with an ubuntu 24.10 container like you suggested. I was also able to get the example to run and see the expected missing frames. Will start diving into the JIT interface stuff next. |
Niiice! Great job. Thanks again for taking the time for this, Jeremy. 🙂 I owe you one. |
My pleasure, thanks for the interest in the library! :) I didn't have a ton of time to look into this today but exploring things a bit: I instrumented cpptrace to walk the
Typing up some notes, mainly for myself, but if you have any thoughts I'd be happy to hear them:
Looking at the missing frame 5, which is a JIT function
This frame has raw address
This seems to correspond to the .text section of the second object file to be registered via the JIT interface
So this falls within the
Path forward
If jank isn't emitting DWARF symbols here, what I plan on doing is just setting up a preliminary interface for cpptrace and basic support for scanning in-memory symbol tables; DWARF support can then hopefully be an easy addition later. TL;DR: I think this may be easier than I feared and I'm hoping to have something this weekend |
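To illustrate what "scanning in-memory symbol tables" boils down to once the table has been read, here is a minimal sketch of resolving a raw frame address to a symbol name. It is not cpptrace's implementation: `symbol_entry`, `symbol_index`, and `lookup` are hypothetical names, and the sketch assumes the symbol addresses, sizes, and names have already been extracted from the object file and relocated to their load addresses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, already-parsed symbol table entry (not a cpptrace type).
struct symbol_entry {
    std::uint64_t address; // load address of the first byte of the symbol
    std::uint64_t size;    // size of the symbol in bytes
    std::string name;
};

// Hypothetical index over the symbols of one in-memory object file.
class symbol_index {
    std::vector<symbol_entry> symbols;
public:
    explicit symbol_index(std::vector<symbol_entry> entries) : symbols(std::move(entries)) {
        // Sort by start address so lookups can use binary search.
        std::sort(symbols.begin(), symbols.end(),
                  [](const symbol_entry& a, const symbol_entry& b) { return a.address < b.address; });
    }

    // Returns the symbol whose [address, address + size) range contains pc,
    // or nullptr if no symbol covers it.
    const symbol_entry* lookup(std::uint64_t pc) const {
        // First symbol that starts strictly after pc.
        auto it = std::upper_bound(symbols.begin(), symbols.end(), pc,
                                   [](std::uint64_t value, const symbol_entry& s) { return value < s.address; });
        if(it == symbols.begin()) {
            return nullptr; // pc is below every symbol
        }
        --it; // last symbol starting at or before pc
        return (pc - it->address < it->size) ? &*it : nullptr;
    }
};
```

A frame whose raw address falls inside a JIT object's .text range would be passed to something like `lookup` to recover at least a function name, even when no DWARF info is available.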
Woah, great work! That was a quick first pass. You're right that jank isn't generating debug info alongside the LLVM IR yet. We should have that done in a month or two. jank-lang/jank#242 I'm totally cool with just symbol support for now. Would really appreciate getting the debug info in there, once jank supports it, though. Those frames are going to be the ones which matter most, since it'll be the user's jank code, while the rest are mainly part of the jank compiler/runtime. In terms of the way forward, ultimately I have little input to give, so long as it works for jank. If you'd like, though, I can either connect you with Lang Hames, the main LLVM JIT guy, or try to get his input on this ticket. For example, that might allow us to answer the "I don't know if there's an easy way for you to get the in-memory object file handle" bit, among others. If you'd like to reach out to him, this is the LLVM Discord: https://discord.gg/xS7Z362 There's a |
Thanks for confirming! The debug info will definitely be important. I can maybe try to wire support through but I won't have an easy way to test that it's really working. I might be able to throw something together with the kaleidoscope jit example. I'll ask on #jit about in-memory object file access! |
I've gotten the basic foundation working on Linux. Over the next couple of days I'll work on robustness, various cleanup, and macOS support |
Currently I’m not relying on any special LLVM interfaces, just the pseudo-standard gdb jit interface. I expect this to work for LLVM 20, 19, older, and later, but I haven’t tested :) |
I've merged preliminary support for Linux and macOS to the dev branch. I haven't been able to test on macOS yet, but Linux works. The API I settled on has two parts. The core interface that allows registering / unregistering in-memory object files is:
namespace cpptrace {
    void register_jit_object(const char*, std::size_t);
    void unregister_jit_object(const char*);
    void clear_all_jit_objects();
}
Then there's a helper in
namespace cpptrace {
    void register_jit_objects_from_gdb_jit_interface();
}
This will need to be called once the JIT object from LLVM is fully prepared for execution and has been added to
I hope this works for you, please let me know if there's anything else I can do on my end that would be helpful! |
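For context on what a helper built on the gdb jit interface has to do, here is a minimal sketch of walking that interface's linked list of in-memory object files. The two struct layouts follow the GDB manual's description of the JIT interface. Everything else is an assumption for illustration: `in_memory_object` and `collect_jit_objects` are hypothetical names, and the descriptor is passed in as a parameter so the sketch stays self-contained, whereas a real process would read the global `__jit_debug_descriptor` defined by the JIT. This is not cpptrace's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One node per in-memory object file registered by the JIT (layout per the GDB manual).
struct jit_code_entry {
    jit_code_entry* next_entry;
    jit_code_entry* prev_entry;
    const char* symfile_addr;   // start of the in-memory object file
    std::uint64_t symfile_size; // size of the object file in bytes
};

// Head of the list (layout per the GDB manual).
struct jit_descriptor {
    std::uint32_t version;          // 1 is the only version defined so far
    std::uint32_t action_flag;      // last action: none / register / unregister
    jit_code_entry* relevant_entry; // entry affected by the last action
    jit_code_entry* first_entry;    // head of the linked list
};

// Hypothetical result type: the (pointer, size) pair a registration call would take.
struct in_memory_object {
    const char* data;
    std::size_t size;
};

// Hypothetical helper: collect every object file currently on the list.
std::vector<in_memory_object> collect_jit_objects(const jit_descriptor& descriptor) {
    assert(descriptor.version == 1);
    std::vector<in_memory_object> objects;
    for(const jit_code_entry* entry = descriptor.first_entry; entry != nullptr; entry = entry->next_entry) {
        objects.push_back({entry->symfile_addr, static_cast<std::size_t>(entry->symfile_size)});
    }
    return objects;
}
```

Each collected pair has the same shape as the arguments of `register_jit_object(const char*, std::size_t)` above. Note that the walk is linear in the number of registered objects each time it runs.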
Thanks, Jeremy! This is exciting. So, when we're loading modules, using REPL-based interactive sessions, etc, we're adding new objects to LLVM all of the time. Is the intended behavior to call once Just to be clear, the |
Gotcha, that sort of REPL behavior will throw a wrench into things. I take it LLVM objects are only added in REPL mode, not removed? While cpptrace's bookkeeping here should be fast, this would indeed have concerning time complexity implications. I had in mind Jank using the gdb jit utility as opposed to I will give this some more thought. I'm thinking of some janky (😄) approaches but I'd like to find something more robust. Could you point me to a place in Jank where LLVM passes are registered? |
You're right that we're not currently removing any LLVM modules. We may try to in the future, but it's not something we should worry about for now. When we create an LLVM module, we register some optimization passes. I suspect that's where we could hook in some other passes, though I'm not very familiar with the APIs (and it's riddled with TODOs as a result). That's here: https://github.com/jank-lang/jank/blob/main/compiler%2Bruntime/src/cpp/jank/codegen/llvm_processor.cpp#L65-L68 Once we codegen the whole module, we then give it to LLVM. LLVM's There's another case, which is interesting. jank loads object files from disk, when loading modules which have already been precompiled. For that, we use another In both of these cases, we'll want the symbols registered. The two fns we're calling are defined within LLVM here, in case that's helpful: https://github.com/llvm/llvm-project/blob/main/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L920-L928 I don't want you to feel left out in the rain to solve this on your own, so if I can help in any way, let me know. Happy to chat more in real-time, too, if that can help. |
Jeremy! This project is great. I'm trying to add it to jank right now, as the fallback exception handler.
I have a fun twist for you. jank is a native Clojure dialect on LLVM, so I'm generating LLVM IR and then giving it to LLVM to JIT compile. It looks like cpptrace is unable to find the symbols for any JIT compiled functions.
However, given the exact same executable and flags, gdb can see the symbols. I've asked an LLVM developer why, and I've provided the info below.
Details
CMake
OS
Linux x86_64
Executable
What gdb/lldb are doing
I asked Lang Hames (the main LLVM JIT guy) on the LLVM Discord about this and here's what he said.
What do you think? I'm hoping to get this working for both Linux and macOS (x86_64 and aarch64).