Call as statement #230

ailrst · 2024-08-08T07:50:08Z

This moves direct and indirect calls from block terminators into the statement list. It also adds Unreachable (which has the semantics of assume false) and Return Jumps, so a block terminator is now just Unreachable | Return | GoTo.

We currently still have the invariant of

1 call per block
calls can only appear as the last statement in the block

The IndirectCall(R30) -> return and return -> goto returnblock transforms are applied to this IR

Dataflow analyses seem mostly unaffected since the framework handles control flow, possibly need to look at the IDE solver more closely. ~~IndirectCall resolution works, but the Live vars analysis using the IDE solver is broken for a couple of the test cases for unclear reasons.~~ LiveVars tests works at least now, since nothing uses the forwards IDE analysis it might have superficial bugs but is probably fine.

Motivation

the existing block terminator is very weird and makes modifying block jumps very complicated to ensure it remains well-formed, and to handle the case of when it is a call and potentially has a fallthrough.
Calls are now structurally ensured to have a successor command (rather than it being possibly None). This successor is Unreachable in the case of a non-returning call.
This change removes the obligation for call statements to know about their fall-through edge (hence meaning they describe exactly one control-flow edge in each direction). Control always passes to the next statement, and the next statement describes where control flows after the call.
It separates the inter-procedural control-flow from the block-level intraprocedural control flow; A call edges exist across call statements and goto edges exist between blocks (See first point), so you don't have to handle both in a context where you only care about one.
Facilitates a higher-level IR in the future and later transformation stages in the tool:
- Allow later transforms to place multiple calls in a single block, making it more readable and expressive for higher level call structures
- Calls which take local arguments and assign their return values to locals

l-kent · 2024-08-21T03:42:56Z

The procedure summaries (which are now in main) and the DSA (not merged in yet) both use the forward IDE solver.

l-kent · 2024-08-21T04:18:08Z

Exceptions get thrown when running each of the following tests with --analyse:

correct/indirect_call/gcc_O2:BAP
correct/malloc_with_local3/gcc_O2:BAP
correct/syscall/gcc_O2:BAP

ailrst · 2024-08-26T00:36:04Z

correct/syscall/gcc_O2:BAP

main is a stub of fork() which makes the externalremover remove the entry/return blocks after they were added, and as a result IDEsolver falls over getting the entry and exit of mainProcedure. Its also strange that the bap loader for this proc gets two blocks with the label lfork, which seems to break the assumptions of BAPtoIR.

ie with

--- a/src/main/scala/translating/BAPToIR.scala
+++ b/src/main/scala/translating/BAPToIR.scala
@@ -44,4 +44,5 @@ class BAPToIR(var program: BAPProgram, mainAddress: Int) {
     for (s <- program.subroutines) {
       val procedure = nameToProcedure(s.name)
+      println(s"blocks ${procedure.name}:\n${s.blocks.map(b => s"block ${b.toString}").mkString(" ")}")
       for (b <- s.blocks) {
         val block = labelToBlock(b.label)

blocks fork
block lfork Some(1600)
block lfork Some(1552)
R16 := 65536bv64;
R17 := mem[BAPBinOp(PLUS,R16,4024bv64)];
R16 := BAPBinOp(PLUS,R16,4024bv64);

from adt

  Sub(
   Tid(1_499, "@fork"),
   Attrs([
    Attr("c.proto", "signed (*)(void)"),
    Attr("returns-twice", "()"),
    Attr("address", "0x640"),
    Attr("stub", "()")]),
   "fork",
   Args([
    Arg(
     Tid(1_543, "%00000607"),
     Attrs([
      Attr("c.layout", "[signed : 32]"),
      Attr("c.data", "Top:u32"),
      Attr("c.type", "signed")]),
     Var("fork_result", Imm(32)),
     LOW(32, Var("R0", Imm(64))),
     Out())]),
   Blks([
    Blk(
     Tid(424, "@fork"),
     Attrs([
      Attr("address", "0x640")]),
     Phis([]),
     Defs([]),
     Jmps([
      Goto(
       Tid(427, "%000001ab"),
       Attrs([
        Attr("address", "0x640"),
        Attr("insn", "b #-0x30")]),
       Int(1, 1),
       Direct(Tid(425, "@fork")))])),

I'm not really familiar with the frontend and not sure what is happening here

ailrst · 2024-08-26T00:37:03Z

The procedure summaries (which are now in main) and the DSA (not merged in yet) both use the forward IDE solver.

I still have to merge in main and check the forward solver works, good to know

l-kent · 2024-08-26T00:52:22Z

For correct/syscall/gcc_O2:BAP, that seems to be because BAP produces multiple blocks with the same name in that test case (it's a case that BAP doesn't really lift particularly well for various reasons), and we've just been lucky and never had that cause problems before somehow. I'll make an issue for it and tweak the BAP loader to account for that.

l-kent · 2024-08-26T01:39:00Z

Your fix makes it so that the stub procedures which had bodies removed now have bodies created for them, which I don't think we really want?

An example of what is produced (test/correct/function1/clang):

implementation printf()
{
  printf_basil_entry:
    assume {:captureState "printf_basil_entry"} true;
    goto printf_basil_return;
  printf_basil_return:
    assume {:captureState "printf_basil_return"} true;
    return;
}

ailrst · 2024-08-26T02:00:39Z

Yeah, I was a bit frustrated by this on Friday. I agree this is not ideal as it conveys a procedure that does nothing rather than a procedure with missing/external behaviour. Calling IDESolver on an external function could be reasonably considered to be violating its contract, so I'm ok with an exception in this case, maybe we just pick a different procedure as its starting point or disable it in this case?

for this reason the behaviour of addReturnNodes(p, true) is wrong and I think the IDESolver has been properly fixed so isCall ignores stubs so that it doesn't need entry and exits for all procedures anymore, which was my initial mistake.

l-kent · 2024-08-26T03:55:31Z

I don't think we need an exception as long as we make it so that IDESolver can just ignore procedure stubs properly?

ailrst · 2024-08-29T23:37:09Z

That's true in general, but when the main procedure is a stub the IDESolver constructor wants to initialise it as the entrypoint, I guess we could pick another procedure arbitrarily as the entry point in that case.

ailrst · 2024-08-30T00:39:48Z

I've fixed the external external functions issue so externalremover works propertly, and made it so the IDESolver analyses are disabled (printing a warning) when the main procedure is external.

l-kent · 2024-08-30T01:24:53Z

That's true in general, but when the main procedure is a stub the IDESolver constructor wants to initialise it as the entrypoint, I guess we could pick another procedure arbitrarily as the entry point in that case.

If the main procedure is a stub then there isn't really anything for the IDESolver to do - picking another procedure arbitrarily isn't a good solution. A better solution would be to just to not do any analysis at all.

l-kent · 2024-08-30T01:29:34Z

Oh you came to the same conclusion already, I'm glad

l-kent · 2024-08-30T01:54:14Z

There's an error for correct/syscall/clang_O2:BAP with the analysis on - it doesn't generate a return block for main, so then when the BackwardIDESolver is initialised for the InterLiveVarsAnalysis, IRWalk.lastInProc returns None and NoSuchElementException gets thrown.

ailrst · 2024-08-30T04:30:24Z

That's because main is nonreturning in that case, should probably also disable the IDE solver in this case.

l-kent · 2024-08-30T04:58:13Z

It looked like some other cases where main is non-returning don't have the same issue though, just that one. It is a weird case that BAP doesn't lift very well.

ailrst force-pushed the call-statement branch 2 times, most recently from b875191 to 37ee553 Compare August 9, 2024 07:37

ailrst force-pushed the call-statement branch 2 times, most recently from 1001446 to fd56c9a Compare August 19, 2024 01:57

ailrst marked this pull request as ready for review August 19, 2024 02:15

ailrst added 5 commits August 30, 2024 09:40

refactor call to a statement & add unreachable and return jumps

1a139d2

move transforms out of RunUtils.scala

7e94536

remove old cfg

aae0064

update docs

e2c0cd4

fix

a3adee3

ailrst force-pushed the call-statement branch from 0023da7 to a3adee3 Compare August 30, 2024 00:05

ailrst added 2 commits August 30, 2024 10:31

fix externals

4a92c99

disable IDE analyses if mainproc is external

b994c99

disable ide in case of no return block

99d7ca4

l-kent approved these changes Sep 9, 2024

View reviewed changes

ailrst merged commit c9ad83d into main Sep 9, 2024
2 checks passed

l-kent mentioned this pull request Sep 16, 2024

IRTest broken by Call as Statement #244

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Call as statement #230

Call as statement #230

ailrst commented Aug 8, 2024 •

edited

Loading

l-kent commented Aug 21, 2024

l-kent commented Aug 21, 2024

ailrst commented Aug 26, 2024

ailrst commented Aug 26, 2024

l-kent commented Aug 26, 2024

l-kent commented Aug 26, 2024

ailrst commented Aug 26, 2024 •

edited

Loading

l-kent commented Aug 26, 2024

ailrst commented Aug 29, 2024

ailrst commented Aug 30, 2024

l-kent commented Aug 30, 2024

l-kent commented Aug 30, 2024

l-kent commented Aug 30, 2024 •

edited

Loading

ailrst commented Aug 30, 2024

l-kent commented Aug 30, 2024

Call as statement #230

Call as statement #230

Conversation

ailrst commented Aug 8, 2024 • edited Loading

l-kent commented Aug 21, 2024

l-kent commented Aug 21, 2024

ailrst commented Aug 26, 2024

ailrst commented Aug 26, 2024

l-kent commented Aug 26, 2024

l-kent commented Aug 26, 2024

ailrst commented Aug 26, 2024 • edited Loading

l-kent commented Aug 26, 2024

ailrst commented Aug 29, 2024

ailrst commented Aug 30, 2024

l-kent commented Aug 30, 2024

l-kent commented Aug 30, 2024

l-kent commented Aug 30, 2024 • edited Loading

ailrst commented Aug 30, 2024

l-kent commented Aug 30, 2024

ailrst commented Aug 8, 2024 •

edited

Loading

ailrst commented Aug 26, 2024 •

edited

Loading

l-kent commented Aug 30, 2024 •

edited

Loading