Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LDC crashes deep in LLVM #4759

Open
yanok opened this issue Sep 25, 2024 · 7 comments
Open

LDC crashes deep in LLVM #4759

yanok opened this issue Sep 25, 2024 · 7 comments

Comments

@yanok
Copy link
Contributor

yanok commented Sep 25, 2024

I see a crash trying to compile this code (already minified by dustmite):

import std;
struct notrace {}
struct string_tracetail {
    string value;
    alias value this;
}
class ReactorFiber{
string_tracetail funcName;
    @notrace setCallback(alias F)() {
        funcName = fullyQualifiedName!F;
    }

}

struct FiberHandle {
}
struct Reactor {
ReactorFiber allocFiber() {
    return new ReactorFiber;
    }

FiberHandle spawnFiber(alias F)() {
        auto fib = allocFiber;
        fib.setCallback!F;
        return FiberHandle();
    }

}


Reactor theReactor;
void closure(void delegate() ) {
}

unittest
{
    closure({
        static f() {}
        theReactor.spawnFiber!f;
    });
}

Compile command: ldc2 -c crash.d --unittest.

Stack trace:

0  libLLVM-17.so.1 0x00007f6ba4711bbf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 63
1  libLLVM-17.so.1 0x00007f6ba470febb llvm::sys::RunSignalHandlers() + 43
2  libLLVM-17.so.1 0x00007f6ba47122c0
3  libc.so.6       0x00007f6ba34b7320
4  libLLVM-17.so.1 0x00007f6ba4f3a4aa llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue, llvm::SDNodeFlags) + 74
5  libLLVM-17.so.1 0x00007f6ba4e5737b
6  libLLVM-17.so.1 0x00007f6ba4e5717b
7  libLLVM-17.so.1 0x00007f6ba4e73c96
8  libLLVM-17.so.1 0x00007f6ba4e7759d llvm::SelectionDAG::LegalizeTypes() + 1309
9  libLLVM-17.so.1 0x00007f6ba4f6f143 llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 323
10 libLLVM-17.so.1 0x00007f6ba4f6e4d3 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 4227
11 libLLVM-17.so.1 0x00007f6ba4f6c846 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 1862
12 libLLVM-17.so.1 0x00007f6ba7293016
13 libLLVM-17.so.1 0x00007f6ba4ad83ce llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 622
14 libLLVM-17.so.1 0x00007f6ba485b4c2 llvm::FPPassManager::runOnFunction(llvm::Function&) + 738
15 libLLVM-17.so.1 0x00007f6ba48613f4 llvm::FPPassManager::runOnModule(llvm::Module&) + 68
16 libLLVM-17.so.1 0x00007f6ba485bbae llvm::legacy::PassManagerImpl::run(llvm::Module&) + 1006
17 ldc2            0x00005647a7392f9b
18 ldc2            0x00005647a7393a88 writeModule(llvm::Module*, char const*) + 2328
19 ldc2            0x00005647a7390025 ldc::CodeGenerator::writeAndFreeLLModule(char const*) + 1253
20 ldc2            0x00005647a7390471 ldc::CodeGenerator::emit(Module*) + 177
21 ldc2            0x00005647a73681c0 codegenModules(Array<Module*>&) + 640
22 ldc2            0x00005647a71265d1 mars_mainBody(Param&, Array<char const*>&, Array<char const*>&) + 6289
23 ldc2            0x00005647a73668da cppmain() + 7082
24 ldc2            0x00005647a74f792d _D2rt6dmain212_d_run_main2UAAamPUQgZiZ6runAllMFZv + 77
25 ldc2            0x00005647a74f7747 _d_run_main2 + 407
26 ldc2            0x00005647a74f759d _d_run_main + 141
27 ldc2            0x00005647a6ff18ac main + 540
28 libc.so.6       0x00007f6ba349c1ca
29 libc.so.6       0x00007f6ba349c28b __libc_start_main + 139
30 ldc2            0x00005647a6ff3e45 _start + 37

The problem goes away if I move static f() {} definition out of the function body.

Checked on Ubuntu with

$ ldc2 --version
LDC - the LLVM D compiler (1.36.0):
  based on DMD v2.106.1 and LLVM 17.0.6
  built with LDC - the LLVM D compiler (1.36.0)
  Default target: x86_64-pc-linux-gnu
  Host CPU: skylake

and on M3 Mac with

LDC - the LLVM D compiler (1.39.0):
  based on DMD v2.109.1 and LLVM 18.1.8
  built with LDC - the LLVM D compiler (1.39.0)
  Default target: arm64-apple-darwin23.6.0
  Host CPU: apple-m1

Though on Mac I have to pass --mtripple=x86_64-pc-linux-gnu to trigger the crash.

@JohanEngelen
Copy link
Member

Reduced testcase:

auto setCallback(alias F)() {
    return __traits(identifier, __traits(parent, F));
}

void foo()
{
    static void closure(void delegate() ) {}

    closure({
        static f() {}
        setCallback!f;
    });
}
> bin/ldc2 -c gh4759.d --unittest --mtriple=x86_64-pc-linux-gnu
Assertion failed: (!isaStruct(t)), function DtoBitCast, file tollvm.cpp, line 603.

assert(!isaStruct(t));

@JohanEngelen
Copy link
Member

Hint towards the issue:

auto setCallback(alias F)() {
    auto a = __traits(parent, F);
}

void closure(void delegate() ) {}
void foo()
{
    closure({
        static f() {}
        setCallback!f;
    });
}
❯ bin/ldc2 -c gh4759.d --unittest --mtriple=x86_64-pc-linux-gnu
gh4759.d(2): Error: forward reference to inferred return type of function call `__lambda1()`
gh4759.d(10): Error: template instance `gh4759.setCallback!(f)` error instantiating

@JohanEngelen
Copy link
Member

Reduced testcase:

void closure(void delegate() ) {}
void foo()
{
    closure({
        static f() {}
        auto a = __traits(parent, f); // Error: forward reference to inferred return type of function call `__lambda1()`
        auto a = __traits(identifier, __traits(parent, f)); // --> crash
    });
}

@kinke
Copy link
Member

kinke commented Sep 25, 2024

Yeah thanks for the report, we're hitting assertions earlier, related to the dark corners around DFuncValue (doesn't handle Tdelegate, DtoRVal() yields the function pointer only). I'll try to fix that.

@ljmf00-wekaio
Copy link

My reduced test case. Thanks @kinke

void func(void delegate())
{
    enum ident(alias F) = __traits(identifier, __traits(parent, F));
    // uncaught forward reference
    func({ cast(void)ident!({}); });
}

@yanok
Copy link
Contributor Author

yanok commented Sep 26, 2024

Thanks for the reduced examples!

I'm really new to D, so I have some noob questions. Do I understand it correctly that:

  1. __traits(parent, f) for f defined inside lambda body will never work? Because lambda's return type is still to be inferred? But parent symbol has to reference it?
  2. The problem here is __traits(identifier, ...) somehow masks the problem of getting parent from above? So the desired behavior here is a compiler error?

BTW, I believe dmd is also affected: it doesn't crash, but the emitted code does (the same with compiling for Apple silicon on Mac).

@ljmf00-wekaio
Copy link

  1. __traits(parent, f) is to get the parent symbol attached to the given symbol (see https://dlang.org/spec/traits.html#parent ). So, e.g. a inner function, gets the outer function, a function in the global scope of the module, gets the module symbol, etc, etc... From what I can understand, it shouldn't require to resolve the return type (unless I'm missing something), actually, just resolve which parent is even if its a forward reference, but, as far as I know, the compiler only seem to resolve it after the semantic3 (the last semantic pass), which implicitly requires the return type to be resolved, hence the error on __traits(parent, f) alone.
  2. I would say, at minimum, a compiler error should be shown, never an ICE, but the forward reference seem to be possible to resolve. Forward references of inter-dependent symbols usually can get pretty complex to resolve. Take the example of .sizeof done inside an inner struct of the outer struct, there's many situations like this, specially when mixin's are in the equation. Imo, the DMD frontend is not capable of solving these inter-dependencies in a correct way, they naively try their best and some times just give up and give erroneous information to the rest of the code. There's checks LDC add to warn the user about it, e.g. one I know of is mangling mismatch (different AST symbols generating the same mangling). You can take another example, attribute inference, which is something we suffer a lot by having missing symbols at link-time, because one compilation unit inferred one attribute and another, other attributes, yielding a different mangle, and therefore possibly undefined references (at best, duplicate unused symbols).

Last time I talked with compiler frontend people, at Dconf, I suggested an optional glue code coherency checker of the AST, which is something I have planned to add as a compiler plugin, so we can detect possible incoherent state like uncaught forward references, attributes inference mismatch, alongside other checks that might be expensive to run by default, but worth running in a compiler debugging session. It would be a complete AST traversal with additional checks, after semantics, pretty much.


BTW, I believe dmd is also affected: it doesn't crash, but the emitted code does (the same with compiling for Apple silicon on Mac).

Yeah, DMD backend is a toy to me, I wouldn't trust the codegen of DMD backend for anything in production. There's many situations where DMD goes happy and the user goes mad.

kinke added a commit to kinke/ldc that referenced this issue Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants