GTIRB frontend produces GoTos between procedures #271

Open
ailrst opened this issue Nov 13, 2024 · 6 comments

Comments

@ailrst
Contributor

ailrst commented Nov 13, 2024

In the attached example, lifting with ddisasm and gtirb_semantics produces a GoTo in close_file that targets a block in __stdio_seek.

gate_server.tar.gz

[ERROR]   4246544$2: GoTo($__stdio_seek_4245436$__0__$_ka7cqh7QzOFuSiPbkNnsQ) has target outside parent procedure close_file_4246448 [[email protected]:64]

This property is checked with the following code: https://gist.github.com/ailrst/847a9aecc909a47b05c8634c5aa8070a. This is checked in to the simplification-pass branch/pr.
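For readers who don't want to open the gist, the property can be sketched roughly as below. This is a minimal Python illustration, not the code from the gist or the simplification-pass branch; the `Block` and `Procedure` classes, the function name and the block labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    label: str
    # Labels of the blocks this block's GoTo may jump to (hypothetical model).
    goto_targets: list = field(default_factory=list)


@dataclass
class Procedure:
    name: str
    blocks: list


def gotos_outside_parent(procedures):
    """Return (procedure, block, target) triples for GoTos that leave their procedure."""
    # Map every block label to the procedure that owns it.
    owner = {b.label: p.name for p in procedures for b in p.blocks}
    violations = []
    for p in procedures:
        for b in p.blocks:
            for target in b.goto_targets:
                if owner.get(target) != p.name:
                    violations.append((p.name, b.label, target))
    return violations


# Toy reproduction of the reported shape: a block in close_file jumps into __stdio_seek.
close_file = Procedure("close_file", [Block("close_file_entry", ["seek_body"])])
stdio_seek = Procedure("__stdio_seek", [Block("seek_body")])
violations = gotos_outside_parent([close_file, stdio_seek])
print(violations)
```

Any non-empty result corresponds to the kind of error shown in the log line above.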

@l-kent
Contributor

l-kent commented Nov 14, 2024

I can't even get this binary to lift with gtirb-semantics, even after updating my installations of it and ddisasm. What version of each are you using?

I get the following from gtirb-semantics:

error during aslp disassembly (opcode 0xd4000001, bytes 01 00 00 D4):

Fatal error: exception Failure("Casting unhandled value type to expression: {.NS = 1'0', .exceptype = Exception_SupervisorCall, .ipaddress = 52'0000000000000000000000000000000000000000000000000000', .ipavalid = FALSE, .syndrome = 25'0000000000000000000000000', .vaddress = 64'0000000000000000000000000000000000000000000000000000000000000000'}")
Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
Called from LibASL_stage1__Dis.dis_lexpr_chain in file "libASL/dis.ml", line 1158, characters 52-62
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let+ in file "libASL/dis.ml", line 504, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let+ in file "libASL/dis.ml", line 504, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 508, characters 16-23
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 508, characters 16-23
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let+ in file "libASL/dis.ml", line 504, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 508, characters 16-23
Called from LibASL_stage1__Dis.(>>) in file "libASL/dis.ml", line 509, characters 18-25
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Rws.RWSBase.locally_ in file "libASL/rws.ml", line 109, characters 21-26
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 499, characters 16-23
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.let@ in file "libASL/dis.ml", line 500, characters 18-29
Called from LibASL_stage1__Dis.dis_core in file "libASL/dis.ml", line 1566, characters 28-71
Called from LibASL_stage1__Dis.dis_decode_entry in file "libASL/dis.ml" (inlined), line 1608, characters 9-64
Called from LibASL_stage1__Dis.retrieveDisassembly in file "libASL/dis.ml", line 1653, characters 4-58
Called from Dune__exe__Main.do_module.to_asli.do_dis in file "bin/main.ml", line 135, characters 13-79

ddisasm gives these warnings too, but it produces a .gtirb file successfully:

Building the initial gtirb representation
WARNING: resurrectSymbols: STRTAB not found.[  10ms]
Processing module: gate_server
    disassembly              load [  55ms]    compute [ 800ms]  transform WARNING: Moving symbol to first block of section: __preinit_array_start
WARNING: Moving symbol to first block of section: __preinit_array_end
[  29ms]
    SCC analysis                              compute [   2ms]  transform [   0ms]
    no return analysis       load [   3ms]    compute [  16ms]  transform [   0ms]
    function inference       load [   4ms]    compute [   5ms]  transform [   2ms]

@l-kent
Contributor

l-kent commented Nov 14, 2024

Without having the input to examine it's hard to say what the issue is, but it could very well be a bug in ddisasm's function bounds identification.

Can you share the .json output just for the close_file subroutine? That should be enough to figure out what's going on.

@ailrst
Contributor Author

ailrst commented Nov 14, 2024 via email

@l-kent
Contributor

l-kent commented Nov 14, 2024

If you're on leave then you don't need to respond to this until you're back, it's fine.

Looking at the assembly, there's an indirect call in close_file that uses br rather than blr, which would be why ddisasm treats it as a within-function jump. It's actually a non-returning tail call, which isn't something we currently account for, but it should be easy to recognise that case and handle it better. The control flow won't really be working properly until we have non-returning calls implemented properly though.

The indirect call appears to be to a function pointer contained within a struct that's passed by reference to close_file, so it's not easy to tell where it can point. If ddisasm has managed to resolve it to a single location, that's somewhat surprising.

@ailrst
Contributor Author

ailrst commented Nov 28, 2024

The json for that procedure is here: https://gist.github.com/ailrst/0178f78739fe18aed99e445f1b16b2a4#file-close_file-txt-L235

> The control flow won't really be working properly until we have non-returning calls implemented properly though.

I'm not sure what you mean by 'implemented properly'; purely recognising them in the loader?

The absence of a return edge should indicate that it is a non-returning call. If the jump targets a procedure header (which it should?), then we can just convert it to DirectCall(__stdio_seek); Unreachable;. In general, we can assume 'unreachable' follows calls unless we have an edge indicating otherwise. The main issue is distinguishing whether it's a GoTo or a call based on the target rather than edge flags?
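To make the proposed conversion concrete, here is a rough Python sketch of the rule: keep the jump as a GoTo when every target stays inside the parent procedure, and rewrite it to a call followed by Unreachable when the single target is another procedure's entry block. The tuple encoding of statements, the `lower_jump` name and the two lookup tables are assumptions made for illustration and are not the loader's real data structures.

```python
def lower_jump(parent, targets, block_owner, entry_of):
    """Decide how a jump found in procedure `parent` should be represented.

    parent:      name of the procedure containing the jump
    targets:     labels of the blocks the jump may reach
    block_owner: dict mapping block label -> owning procedure name
    entry_of:    dict mapping entry-block label -> procedure name
    """
    # Ordinary intraprocedural jump: keep it as a GoTo.
    if all(block_owner.get(t) == parent for t in targets):
        return [("GoTo", tuple(targets))]
    # Single target that is another procedure's entry block: treat as a tail call.
    if len(targets) == 1:
        callee = entry_of.get(targets[0])
        if callee is not None and callee != parent:
            return [("DirectCall", callee), ("Unreachable",)]
    # Anything else (e.g. a jump into the middle of another procedure) is not covered here.
    raise ValueError(f"cannot classify jump from {parent} to {targets}")


block_owner = {
    "close_file_entry": "close_file",
    "close_file_exit": "close_file",
    "seek_entry": "__stdio_seek",
}
entry_of = {"close_file_entry": "close_file", "seek_entry": "__stdio_seek"}

tail = lower_jump("close_file", ["seek_entry"], block_owner, entry_of)
local = lower_jump("close_file", ["close_file_exit"], block_owner, entry_of)
print(tail)
print(local)
```

This classifies purely on the target, which is the approach the question above is asking about; it does not consult edge flags at all.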

@l-kent
Contributor

l-kent commented Nov 28, 2024

Non-returning control flow is not really handled properly at present because non-returning calls do usually return somewhere, just indirectly, and we do not currently account for this. This happens in this case (though it's a relatively complex one).

Here, __stdio_seek calls __syscall_ret at its end - this is a direct, non-returning call. There is a ret at the end of __syscall_ret so it returns, but since R30's value has been maintained since close_file was called, __syscall_ret returns all the way back to close_file's call site (or even earlier; sometimes close_file appears to be called without R30 being set). We do not currently account for this sort of case at all.
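A toy model may help show why the ret lands back at close_file's call site. The Python sketch below is a deliberate simplification and not real AArch64 semantics: it has a single link register, a `bl` that sets it, a `b` that leaves it untouched (standing in for both the `br` in close_file and the direct jump in __stdio_seek), and a `ret` that jumps to whatever it holds. The `caller` procedure and the instruction encoding are hypothetical.

```python
def run(program, entry):
    """Execute a toy program and return the trace of executed instructions.

    program: dict mapping a label to a list of (op, arg) instructions.
    Only the link register (R30) is modelled.
    """
    r30 = None          # link register; None means it was never set
    pc = (entry, 0)     # (label, index of instruction within that label)
    trace = []
    while pc is not None:
        label, index = pc
        op, arg = program[label][index]
        trace.append(f"{label}:{op}")
        if op == "bl":      # call: record the return address in R30
            r30 = (label, index + 1)
            pc = (arg, 0)
        elif op == "b":     # plain jump: R30 is left untouched
            pc = (arg, 0)
        elif op == "ret":   # return to wherever R30 points
            pc = r30
        else:               # "halt"
            pc = None
    return trace


program = {
    "caller": [("bl", "close_file"), ("halt", None)],
    "close_file": [("b", "__stdio_seek")],       # tail call, no link
    "__stdio_seek": [("b", "__syscall_ret")],    # non-returning jump
    "__syscall_ret": [("ret", None)],
}
trace = run(program, "caller")
print(trace)
```

In this model the ret in __syscall_ret resumes in `caller`, two procedures up from where it executed, because nothing between the original `bl` and the `ret` overwrote R30.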
