Skip to content

UQ-PAC/BASIL

Repository files navigation

BIL-to-Boogie Translator

About

The BIL-to-Boogie Translator generates semantically equivalent Boogie source files (.bpl) from AArch64/ARM64 binaries that have been lifted to the BAP (Binary Analysis Platform) intermediate ADT format.

This repository contains a program which takes BAP adt files and readelf output as input, and produces a boogie program.

Example

$ mill run
[53/53] run 
Missing arguments: -a --adt <str> -r --relf <str>
Expected Signature: BASIL
  -a --adt <str>                BAP ADT file name.
  --analyse                     Run static analysis pass.
  --analysis-results <str>      Log analysis results in files at specified path.
  --analysis-results-dot <str>  Log analysis results in .dot form at specified path.
  --boogie-use-lambda-stores    Use lambda representation of store operations.
  --dump-il <str>               Dump the Intermediate Language to text.
  -h --help                     Show this help message.
  --interpret                   Run BASIL IL interpreter.
  -o --output <str>             Boogie output destination file.
  -r --relf <str>               Name of the file containing the output of 'readelf -s -r -W'.
  -s --spec <str>               BASIL specification file.
  -v --verbose                  Show extra debugging logs.

$ mill run --adt src/test/correct/secret_write/clang/secret_write.adt --relf src/test/correct/secret_write/clang/secret_write.relf --spec src/test/correct/secret_write/secret_write.spec --output boogie_out.bpl
[53/53] run 
[INFO]   [!] Writing file... [run@RunUtils.scala:65]
[INFO]   [!] Removing external function calls [loadAndTranslate@RunUtils.scala:85]
[INFO]   [!] Translating to Boogie [loadAndTranslate@RunUtils.scala:114]
[INFO]   [!] Done! Exiting... [loadAndTranslate@RunUtils.scala:116]
$ tail boogie_out.bpl 
    mem, Gamma_mem := memory_store32_le(mem, bvadd64(R9, 52bv64), R8[32:0]), gamma_store32(Gamma_mem, bvadd64(R9, 52bv64), Gamma_R8);
    assert ((bvadd64(R9, 52bv64) == $z_addr) ==> (L(mem, $x_addr) ==> Gamma_x_old));
    assert bvsge32(memory_load32_le(mem, $z_addr), z_old);
    main_argv_out, Gamma_main_argv_out := R1, Gamma_R1;
    main_result, Gamma_main_result := R0[32:0], Gamma_R0;
    FP_out, Gamma_FP_out := R29, Gamma_R29;
    LR_out, Gamma_LR_out := R30, Gamma_R30;
    SP_out, Gamma_SP_out := R31, Gamma_R31;
    return;
}
$ boogie boogie_out.bpl 

Boogie program verifier finished with 4 verified, 0 errors

Installation and use

This is a Scala 3 project using the mill build system, it can be run with mill run.

Local Development

The tool is a Scala project, and is therefore OS-independent, but is only tested under linux.

Furthermore, lifting input files from a given AArch64 binary is Linux-specific, and all commands given are for Linux. On Windows, WSL2 may be used to run any Linux-specific tasks, but it is less thoroughly tested.

Installing mill, sbt and JDK >= 17 is required.

This can be done via coursier.

IDE Support

Mill supports the Metals language server and IntelliJ through Build Server Protocol (BSP), which is the recommended use.

SBT has an intellij plugin in addition to BSP.

Note about using sbt with IntelliJ

See Also

The project uses sbt to build the project. For the most part this should just work, and there are IntelliJ run files (located in the .run folder) which automatically compile and run the project. However, to prevent some issues in IntelliJ, it is necessary to unmark target/scala-3.1.0/src_managed/ as a sources root folder, in the File > Project Structure > Modules dialog.


The tool takes as inputs a BAP ADT file (here denoted with .adt) and a file containing the output of readelf (here denoted with .relf), both created from the same AArch64/ARM64 binary, and outputs a semantically equivalent .bpl Boogie-language source file. The default output file is boogie_out.bpl, but the output location can be specified.

To build and run the tool using sbt, use the following command:

sbt "run --adt file.adt --relf file.relf [--spec file.spec] [--output output.bpl] [--analyse] [--interpret]" where the output filename is optional and specification filenames are optional. The specification filename must end in .spec.

or mill:

sbt run --adt file.adt --relf file.relf [--spec file.spec] [--output output.bpl] [--analyse] [--interpret].

Usage

The --analyse flag is optional and enables the static analysis functionality which resolves indirect calls where possible.

Expected Signature: BASIL
  -a --adt <str>                BAP ADT file name.
  --analyse                     Run static analysis pass.
  --analysis-results <str>      Log analysis results in files at specified path.
  --analysis-results-dot <str>  Log analysis results in .dot form at specified path.
  --boogie-use-lambda-stores    Use lambda representation of store operations.
  --dump-il <str>               Dump the Intermediate Language to text.
  -h --help                     Show this help message.
  --interpret                   Run BASIL IL interpreter.
  -o --output <str>             Boogie output destination file.
  -r --relf <str>               Name of the file containing the output of 'readelf -s -r -W'.
  -s --spec <str>               BASIL specification file.
  -v --verbose                  Show extra debugging logs.

The sbt shell can also be used for multiple tasks with less overhead by executing sbt and then the relevant sbt commands.

To build a standalone .jar file, use the following command:

mill assembly

From sbt the resulting .jar is located at target/scala-3.1.0/wptool-boogie-assembly-0.0.1.jar and from mill this is out/assembly.dest/out.jar.

To compile the source without running it - this helps IntelliJ highlight things properly:

mill compile

Running Tests

The test suites use ScalaTest, they can be run via.

$ mill test

To run a single test suite, for example only the system tests (requires boogie):

$ mill.test.testOnly SystemTests

To run single tests in from the test suite:

$ mill test.testOnly SystemTests -- -z basic_arrays_read -z basic_arrays_write

To update the expected basil output files from the test results run

$ mill updateExpected

This is used to keep track of which tests have passed previously, as well as changes to the basil output.

Generating inputs (Lifting)

Many lifted examples are already profied in the tests directory: src/test/correct, these instructions are for if you want to lift new compiled binaries.

The tool takes a .adt and a .relf file as inputs, which are produced by BAP and readelf, respectively.

Requirements

  • bap with the ASLp plugin
  • readelf
  • cross-compiliation toolchains, e.g.:
    • gcc-13-aarch64-linux-gnu
    • clang

This can be installed using the nix packages.

The dev podman container also contains these tools and can be used by mounting current directory into the container as described at docker/readme.md.

BAP can be installed by following the instructions in the link given.

Given a AArch64/ARM64 binary file (*.out), the .adt file can be produced by running

bap *.out -d adt:*.adt

and the .relf file can be produced by running

aarch64-linux-gnu-readelf -s -r -W *.out > *.relf.

To cross-compile a C source file to a AArch64 binary, gcc-aarch64-linux-gnu must be installed. This can be done with the following commands on Ubuntu:

sudo apt-get update

sudo apt-get install gcc-aarch64-linux-gnu

The binary (i.e *.out) can then be generated from a C source file using:

aarch64-linux-gnu-gcc *.c -o *.out

See src/test/correct/liftone.sh for more examples for flag combinations that work.

Verifying the generated Boogie file

Boogie can be installed by following the instructions in the given link.

Boogie can be run on the output *.bpl file with the command boogie \useArrayAxioms *.bpl.

It is recommended to use Boogie version 3.0.4 and Z3 version 4.8.8 (which is recommended by Boogie). Other versions and combinations may not have been tested.

The \useArrayAxioms flag is necessary for Boogie versions 2.16.8 and greater; for earlier versions it can be removed.

Installing boogie

Boogie can be installed through dotnet and requires dotnet 6.

sudo apt-get install dotnet6
dotnet install boogie

Other useful commands

To compile a C file without the global offset table being used in address calculation, use the following command:

aarch64-linux-gnu-gcc -fno-plt -fno-pic *.c -o *.out

To produce assembly from the binary, use either of the following commands:

aarch64-linux-gnu-objdump -d *.out

bap *.out -d asm

To view the hex dump of the data section of the binary:

readelf -x .data *.out

To produce a BIR (BAP Intermediate Representation) file (which contains similar information to the BAP ADT file but is more human-readable):

bap *.out -d:*.bir

To compile a C file with the stack guard turned off:

aarch64-linux-gnu-gcc -fno-stack-protector -fno-plt -fno-pic *.c -o *.out

To produce a translation to BIL (BAP Intermediate Language) for one instruction at a time:

bap objdump *.out --show-{memory,bil,insn}

Open Source License

Copyright 2022 The University of Queensland

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.