Welcome to the Fuzzification. We will guide users about how to apply the technique on target application. Since we are working on LLVM-IR level instrumentation, we don't provide fully automated approach. Instead, user should understand compilation process of target application. This tutorial will introduce how to prepare prerequisites and how to apply Fuzzification on existing project.
We use binutils for our example, especially v2.23 which contains lots of bugs and reveals many crashes during the fuzzing.
src/randomcode.py
will generate delay primitives for SpeedBump component. (another components for LLVM-IR instrumentation and BranchTrap/AntiHybrid should be compiled at this stage. If user don't finish this, please visit Installation first)- There are several options for the generation. Important parameters are:
DELAYS
: array for length of delay. If it is set as[1,2,3]
, we will try to generate random code which will delay 1, 2, and 3 ms.OFFSET
: allowable error on the length of delay. If use0.2
, we will allow 20% of error.
- To generate delays, just run the script.
$ cd {repo}/src
$ python randomcode.py
- Upon completion, it will generate multiple codes for delay. For example, we will be able to see the three files, according to the configuration (
DELAYS = [1,2,3]
).- We recommend the user to generate various delays.
- e.g.,
[10,20,30,40,50,60,70,80,90, 100, 200, 300, 400, 500]
$ {repo}/src/.csmith/delay_1.c
$ {repo}/src/.csmith/delay_2.c
$ {repo}/src/.csmith/delay_3.c
- Finally, user should check the generated delays to
{repo}/src/llvm_pass/bump/delaysrc
directory where Fuzzification actually refers and compile the file again using the script. (Don't forget to compile the generated code)
$ cd {repo}/src
$ ./compile_pass.sh
- Actually, Fuzzification is made for the developer who knows about the source code and has many samples (or test-cases) for testing. Therefore, Fuzzification requires some (very few) knowledge about the code.
- In this tutorial, we will introduce where to modify in the code for the Fuzzification. Before that, let's compile the target binaries (readelf, nm-new, objdump, and objdump) using script that we provided.
$ cd {repo}/antifuzz-tutorial
$ python compile.py
- Important directories
antifuzz-tutorial/archive # store original source code
antifuzz-tutorial/temp # temporary directory for the compilation
antifuzz-tutorial/test/binaries # has copy of original binary for overhead measurement
antifuzz-tutorial/test/patches # contain code patch information (e.g., where to inject)
antifuzz-tutorial/test/samples # contain correct/incorrect sample for target
antifuzz-tutorial/test/output # store fuzzified binary here
- Most importantly, user can check the generated output from
antifuzz-tutorial/test/output
directory.
-
Fuzzification configuration: user can adjust options. Plesae read our paper about the detail of these options.
-
antifuzz_all.py
DELAYS = [30,50,70,100,200,300] # try these delays (ms) to the binary and find best one
OVERHEAD = [3] # user can check different overhead budget
NON_VISIT_BUMP = 5 # ratio of non-visited-BBs for SpeedBump
NON_VISIT_COV = 5 # ratio of non-visited-BBs for BranchTrap
NON_VISIT_ANTI = 5 # ratio of non-visited-BBs for AntiHybrid
NORMAL_COV = 1 # ratio of non-visited-BBs for BranchTrap (for all fuzzifications)
NORMAL_ANTI = 5 # ratio of non-visited-BBs for AntiHybrid (for all fuzzifications)
INCLUDE_NON_VISIT = True # should we include non-visited-BBs for the instrumentation?
PROCESS_INCORRECT = True # should we instrument error-handling routine?
conf.py
SAMPLE = "readelf" # fuzzifying binary name
ARGS = " -a " # argument for measuring overhead
COMP_DIR = os.path.join(COMPILE_ROOT, "binutils") # compilation path
- After choose one desirable setup, user can run the script for the fuzzificaction.
cd {repo}/src
python antifuzz_all.py
- You can check the fortified binary under
{repo}/antifuzz-tutorial/test/output
- If user generated all
readelf
binaries without any problem, user will be able to seereadelf_ori
,readelf_all
,readelf_bump
,readelf_coverage
, andreadelf_anti
under this directory.
Ok, now you see how the fuzzification generate target binary. You may notice that user of fuzzification should tell where to modify at the source code. However, we hope you do not worry about that too much. Our example will help the user understand necessary code modifications.
Let's use another binary from the binutils project. This time we will provide detail instruction for fortifying objdump
binary.
The overall process is as follows:
- Patch the original code (usually the file with
main()
function) - Put samples (correct, incorrect) and patched code into working directory
- Run normal compilation and store the compilation commands
- Provide the part of compilation command to Fuzzification framework with notation
- Provide the actual command for executing target binary
We need to modify the code to define global variables and to guide Fuzzification about where to patch. To help the patch, we provide three code injector and they are under {repo}/src/code_injector/
-
There are three code injectors
- simple_injector: inject one function to wrap the argument of filename (i.e., input file's path)
- jtable_injector: inject global variables for storing jump-table of ROP-based BranchTrap
- anti-taint: transform explicit data-flow into implicit data-flow
-
simple_injector.py
- arg#1: target source code filename
- arg#2: line to inject the wrapper
- we recommend the line number as begining of the main function
- arg#3: position of input file
- we assume
objdump -d inpurfile
as fuzzing command. In this case, inputfile name is at the second argument of the command; thus user can use2
for this argument.
- we assume
- So user can inject the code by using the following command.
$ python simple_injector.py objdump.c 3418 2
- jtable_injector.py
- arg#1: target source code filename
- arg#2: line to inject the reference to jump-table
- we also recommend this line number be the beginning of the main function. So user can just use the same line number as above.
$ python jtable_inject.py objdump.c 3418
- anti-taint.py
- arg#1: target source code filename
- arg#2: modified code's filename
- arg#3: ratio of data-flow transformation
- NOTE that the injection should be on top of the abobe two commands and user should separately store the result by adding
_anti
postfix.
python anti-taint.py objdump.c objdump_anti.c 30
We can keep use {repo}/antifuzz-tutorial
as our working directory, if you want to change the working directory, you can change EVAL_REPO
from {repo}/src/conf.py
file. User should put currect files under this directory, specipically under {repo}/antifuzz-tutorial/test
.
-
samples directory: user should provide correct and incorrect samples to fuzzification. Since we are forifying
objdump
, user should create directoryobjdump/cor
andobjdump/incor
under samples directory, then put samples for the profiling.- We already put samples for
objdump
to the directory. Please check the directory if you want to make sure about the correect setup.
- We already put samples for
-
patches directory: user should provide patched source code. We will introduce how to patch the code at the next section. Necessary files are:
objdump.c
: changed to contain necessary global variables as well as input-sensitive functions. We use this code for all compilation. If user choose not to use specific component, we will replace the called function with dummy function.objdump_anti.c
: changed to some character arrays to pass implicit data-flow to hinder dynamic taint analysis.- We also placed the necessary files under the correct directory, please check the directory.
User should extract original compilation script from the project. For the binutils package,
./configure --disable-werror --disable-silent-rules
make > output
will store the actual commands used for the compilation. Among many commands, we only need two commands to generate objdump
binary and objdump.o
file. User can easily identify these commands by searching for -o objdump
and -o objdump.o
strings.
# command for objdump.o
'afl-clang-fast -DHAVE_CONFIG_H -I. -I. -I. -I../bfd -I./../bfd -I./../include -DLOCALEDIR="\\"/usr/local/share/locale\\"" -Dbin_dummy_emulation=bin_vanilla_emulation -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -O0 -flto -std=c11 -lpthread -MT objdump.o -MD -MP -MF .deps/objdump.Tpo -c -o objdump.o -DOBJDUMP_PRIVATE_VECTORS="" ./objdump.c'
# command for objdump
"/bin/bash ./libtool --tag=CC --mode=link afl-clang-fast -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -O0 -flto -std=c11 -lpthread -flto -o objdump objdump.o dwarf.o prdbg.o rddbg.o debug.o stabs.o ieee.o rdcoff.o bucomm.o version.o filemode.o elfcomm.o ../opcodes/libopcodes.la ../bfd/libbfd.la ../libiberty/libiberty.a -lz"
Now user can slightly modify the above command. Let's see the output first. The output is under {repo}/antifuzz-tutorial/build_template.py
# command for objdump.o
COMMAND["objdump.o"] = 'afl-clang-fast -DHAVE_CONFIG_H -I. -I. -I. -I../bfd -I./../bfd -I./../include -DLOCALEDIR="\\"/usr/local/share/locale\\"" -Dbin_dummy_emulation=bin_vanilla_emulation -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -O0 -flto -std=c11 -lpthread -MT objdump.o -MD -MP -MF .deps/objdump.Tpo -c -o objdump.o -DOBJDUMP_PRIVATE_VECTORS="" ./{SRC} 1> /dev/null 2> /tmp/makeout'
# command for objdump
COMMAND["objdump"] = "/bin/bash ./libtool --tag=CC --mode=link afl-clang-fast -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -O0 -flto -std=c11 -lpthread -flto -o $1 {object} [[objdump.o dwarf.o prdbg.o rddbg.o debug.o stabs.o ieee.o rdcoff.o bucomm.o version.o filemode.o elfcomm.o]] ../opcodes/libopcodes.la ../bfd/libbfd.la ../libiberty/libiberty.a -lz 1> /dev/null 2> /dev/null"
Can you see the notations we applied? (and that you need to do for another Fuzzification)
-
objdump.o
- actual source code filename
objdump.c
is changed to{SRC}
- actual source code filename
-
objdump
- actual output name
objdump
is changed to{$1}
- list of object files are wrapped with
[[
and]]
{object}
keyword is inserted in front of[[
- actual output name
Finally, user should tell which command to use for target binary execution. In this example, we assume that user will use objdump -d filename
.
To tell this, user should modify conf.py
(under {repo}/src/conf.py
).
SAMPLE = "objdump"
ARGS = " -d "
That's all procedures that the user should prepare for applying Fuzzification. After all preparation, we can run the fuzzification script and check the output.
# Fuzzification
$ cd {repo}/src
$ python antifuzz_all.py
# Check the output
$ cd {repo}/antifuzz-tutorial/test/output