Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JITModule: rework/fix __udivdi3 handling #8389

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -560,6 +560,34 @@ if (WITH_SERIALIZATION_JIT_ROUNDTRIP_TESTING)
target_compile_definitions(Halide PRIVATE WITH_SERIALIZATION_JIT_ROUNDTRIP_TESTING)
endif ()

if (NOT DEFINED Halide_COMPILER_BUILTIN_LIBRARY)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, rebases aren't fun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to blow up on Windows since MSVC isn't going to understand the --print-libgcc-file-name flag. Under which circumstances exactly do we need to include the __udivdi3 symbol?

Similarly, there's an issue on macOS with it finding arguably the wrong builtins.

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/lib/darwin/libclang_rt.builtins-x86_64.a

Can this all be guarded by if (CMAKE_SIZEOF_VOID_P EQUAL 4 AND ... something ...)? LINUX AND CMAKE_CXX_COMPILER_ID STREQUAL "GNU", maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under which circumstances exactly do we need to include the __udivdi3 symbol?

Whenever the JIT might produce it but libHalide happens not to be linked
to the library that provides it. I'm hoping/wondering if we can just ask the clang
that we've found and are using to compile some runtimes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this kind-of works, and finds the right builtins on osx now:
https://buildbot.halide-lang.org/master/#/builders/171/builds/117

ninja: Entering directory `/Users/halidenightly/build_bot/worker/halide-testbranch-main-llvm20-arm-64-osx-cmake/halide-build'
ninja: error: '/Users/halidenightly/build_bot/worker/llvm-20-arm-64-osx/llvm-install/lib/clang/20/lib/darwin/libclang_rt.osx.a', needed by 'src/libHalide.19.0.0.dylib', missing and no known rule to make it

Similarly for windows:
https://buildbot.halide-lang.org/master/#/builders/157/builds/117

ninja: Entering directory `C:/build_bot/worker/halide-testbranch-main-llvm20-x86-64-windows-cmake/halide-build'
ninja: error: 'libgcc.a', needed by 'bin/Halide.dll', missing and no known rule to make it

I have absolutely no clue about win/osx, but that looks like the compiler-rt subproject is not enabled,
or, in osx case, the clang should be configured to use "system" libgcc from xcode?

# FIXME: `execute_process()` can not use `$<TARGET_FILE:clang>`, workarounding that.
get_property(clang_LOCATION TARGET clang PROPERTY LOCATION)
execute_process(
COMMAND ${CMAKE_CROSSCOMPILING_EMULATOR} ${CMAKE_C_COMPILER_LAUNCHER} ${clang_LOCATION} "--print-libgcc-file-name"
OUTPUT_VARIABLE Halide_COMPILER_BUILTIN_LIBRARY
OUTPUT_STRIP_TRAILING_WHITESPACE
ERROR_VARIABLE Halide_COMPILER_BUILTIN_LIBRARY_stderr
ERROR_STRIP_TRAILING_WHITESPACE
RESULT_VARIABLE Halide_COMPILER_BUILTIN_LIBRARY_exit_code
)
if (NOT "${Halide_COMPILER_BUILTIN_LIBRARY_exit_code}" STREQUAL "0")
message(FATAL_ERROR "Unable to invoke clang --print-libgcc-file-name: ${Halide_COMPILER_BUILTIN_LIBRARY_stderr}")
endif()
endif ()

set(Halide_COMPILER_BUILTIN_LIBRARY "${Halide_COMPILER_BUILTIN_LIBRARY}"
CACHE FILEPATH "Library containing __udivdi3 symbol, either “libgcc.a” or “libclang_rt.builtins.*.a”.")
LebedevRI marked this conversation as resolved.
Show resolved Hide resolved

add_library(Halide::__udivdi3 UNKNOWN IMPORTED)
set_target_properties(
Halide::__udivdi3
PROPERTIES
IMPORTED_LOCATION "${Halide_COMPILER_BUILTIN_LIBRARY}"
)

target_link_libraries(Halide PRIVATE Halide::__udivdi3)

# Note that we (deliberately) redeclare these versions here, even though the macros
# with identical versions are expected to be defined in source; this allows us to
# ensure that the versions defined between all build systems are identical.
Expand Down
47 changes: 24 additions & 23 deletions src/JITModule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,15 +23,13 @@
#include "Pipeline.h"
#include "WasmExecutor.h"

extern "C" unsigned long __udivdi3(unsigned long a, unsigned long b);

namespace Halide {
namespace Internal {

using std::string;

#if defined(__GNUC__) && defined(__i386__)
extern "C" unsigned long __udivdi3(unsigned long a, unsigned long b);
#endif

#ifdef _WIN32
void *get_symbol_address(const char *s) {
return (void *)GetProcAddress(GetModuleHandle(nullptr), s);
Expand Down Expand Up @@ -135,6 +133,26 @@ void load_webgpu() {
<< "(Try setting the env var HL_WEBGPU_NATIVE_LIB to an explicit path to fix this.)\n";
}

llvm::orc::SymbolMap GetListOfAdditionalSymbols(const llvm::orc::LLJIT &Jit) {
// Inject a number of symbols that may be in libgcc.a where they are
// not found automatically. See also the upstream issue:
// https://github.com/llvm/llvm-project/issues/61289.

static const std::pair<const char *, const void *> NamePtrList[] = {
// NOTE: at least libgcc does not provide this symbol on 64-bit x86_64, only on 32-bit i386.
{"__udivdi3", (((CHAR_BIT * sizeof(void *)) == 32) ? ((void *)&__udivdi3) : nullptr)},
};

llvm::orc::SymbolMap AdditionalSymbols;
for (const auto &NamePtr : NamePtrList) {
auto Addr = static_cast<llvm::orc::ExecutorAddr>(
reinterpret_cast<uintptr_t>(NamePtr.second));
AdditionalSymbols[Jit.mangleAndIntern(NamePtr.first)] =
llvm::orc::ExecutorSymbolDef(Addr, llvm::JITSymbolFlags::Exported);
}
return AdditionalSymbols;
}

} // namespace

using namespace llvm;
Expand Down Expand Up @@ -216,25 +234,6 @@ class HalideJITMemoryManager : public SectionMemoryManager {
}
}
uint64_t result = SectionMemoryManager::getSymbolAddress(name);
#if defined(__GNUC__) && defined(__i386__)
// This is a workaround for an odd corner case (cross-compiling + testing
// Python bindings x86-32 on an x86-64 system): __udivdi3 is a helper function
// that GCC uses to do u64/u64 division on 32-bit systems; it's usually included
// by the linker on these systems as needed. When we JIT, LLVM will include references
// to this call; MCJIT fixes up these references by doing (roughly) dlopen(NULL)
// to look up the symbol. For normal JIT tests, this works fine, as dlopen(NULL)
// finds the test executable, which has the right lookups to locate it inside libHalide.so.
// If, however, we are running a JIT-via-Python test, dlopen(NULL) returns the
// CPython executable... which apparently *doesn't* include this as an exported
// function, so the lookup fails and crashiness ensues. So our workaround here is
// a bit icky, but expedient: check for this name if we can't find it elsewhere,
// and if so, return the one we know should be present. (Obviously, if other runtime
// helper functions of this sort crop up in the future, this should be expanded
// into a "builtins map".)
if (result == 0 && name == "__udivdi3") {
result = (uint64_t)&__udivdi3;
}
#endif
internal_assert(result != 0)
<< "HalideJITMemoryManager: unable to find address for " << name << "\n";
return result;
Expand Down Expand Up @@ -350,6 +349,8 @@ void JITModule::compile_module(std::unique_ptr<llvm::Module> m, const string &fu
internal_assert(gen) << llvm::toString(gen.takeError()) << "\n";
JIT->getMainJITDylib().addGenerator(std::move(gen.get()));

cantFail(JIT->getMainJITDylib().define(absoluteSymbols(GetListOfAdditionalSymbols(*JIT))));

llvm::orc::ThreadSafeModule tsm(std::move(m), std::move(jit_module->context));
auto err = JIT->addIRModule(std::move(tsm));
internal_assert(!err) << llvm::toString(std::move(err)) << "\n";
Expand Down
Loading