-
Notifications
You must be signed in to change notification settings - Fork 9
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
greg
committed
Dec 4, 2023
1 parent
81e8b9d
commit ac3b448
Showing
1 changed file
with
100 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
--- | ||
layout: post | ||
title: "Precondition inference: Release 0.1.0 of PreCA is out" | ||
categories: releases preca | ||
date: 2023-12-01 | ||
--- | ||
|
||
|
||
The PreCA precondition inference framework is now released. Check out its features and how you can use it for research and code understanding. | ||
|
||
Do you often need to reverse some code or simply to use not-so-well-documented functions? Then you surely asked yourself many times on which input a function must be called. | ||
Answering this question is called precondition inference and PreCA is a new framework to tackle this problem. | ||
PreCA is completely black-box and, thus, does not rely on static analysis. | ||
As such, it will not be impacted by code translations like optimizations or obfuscation. Moreover, if the source code is not available, it makes no difference for PreCA, | ||
you only need to be able to run the code. | ||
|
||
So why not trying it? | ||
|
||
|
||
## But first, what is precondition inference exactly? | ||
|
||
Given a function `F` it may be useful to know on which input we can run `F` to get the desired output. This is exactly what precondition inference does. | ||
It takes the function `F` under analysis and a predicate `Q` called a postcondition, stating what is the desired output. | ||
From `F` and `Q`, it infers a predicate `P`, s.t., the execution of `F` on any input agreeing with `P` terminates, does not crash, and the output agrees with `Q`. | ||
Such predicate `P` is called a precondition. The best precondition, i.e., the less restrictive one, is then called the weakest precondition. | ||
|
||
Enough with theory, now to practice! | ||
|
||
## Example: understanding strcat | ||
|
||
Here is a basic implementation of the standard `strcat` function which appends the second string to the first one. | ||
|
||
```c | ||
char * strcat(char *s, char* append) | ||
{ | ||
char *save = s; | ||
|
||
for (; *s; ++s); | ||
while (*s++ = *append++); | ||
return(save); | ||
} | ||
``` | ||
Take time to understand it and try to find on which input the execution terminates and does not crash i.e., the precondition when `Q = true`. | ||
At first glance, we just have two pointers that are dereferenced. As such, it seams natural to say that `P = valid(s) and valid(append)`. | ||
Unfortunately, this is not enough. Indeed, if the two pointers are valid, no crash will occur. However, if the strings | ||
overlap, `append` will rewrite itself and execution will never terminate. | ||
As such, it seems reasonable to state that `P = valid(s) and valid(append) and not strOverlap(s, append)`. This is indeed what we can | ||
understand from the manual. However, this precondition is too restrictive. Indeed, if the `append` string is empty, the two strings can overlap. | ||
In the end, the weakest precondition is: `P = valid(s) and valid(append) and [ strlen(append) > 0 => not strOverlap(s, append) ]`. | ||
If we run PreCA over this function, this is indeed what we get. | ||
``` | ||
$ java -ea -jar preca.jar -file examples/strcat_conacq.txt | ||
|
||
... | ||
|
||
*************Learned Network CL example ****** | ||
network var=4 cst=3 | ||
------------------------- | ||
CONSTRAINTS: | ||
Valid(var0) | ||
Valid(var1) | ||
NotOverlap(var0, var1)_or_StrlenEq0(var1)[0, 1, 2, 3] | ||
------------------------- | ||
|
||
*************Learned Network CL example (SMTLIB) ****** | ||
(and (valid v0) (valid v1) (or (not (overlap v0 v1)) (strleneq v1 #x00000000))) | ||
``` | ||
## PreCA: Precondition Constraint Acquisition | ||
**What is it?** | ||
PreCA it a tool that aims to infer the precondition of a function at binary level. | ||
It is completely black-box and, as such, is not impacted by syntactic complexity introduced by optimizations or obfuscation. | ||
But how does it works? PreCA relies on Constraint Acquisition, a symbolic machine learning framework, to infer the precondition from | ||
observed executions. PreCA generates itself the needed testcases (active learning), which enables it to enjoy good guarantees. | ||
In our IJCAI'22 [paper](/nutshells/ijcai-22.md), we show that PreCA is more precise and faster that the state-of-the-art. | ||
**What it ships** | ||
PreCA 0.1.0 enables the following features: | ||
* Proposes to use different Constraint Acquisition algorithms (Conacq, DCA) | ||
* Exhibit a precondition language over pointers (overlap, aliasing ...) and integers | ||
* Gives all scripts to replicate experiments from the papers. | ||
Documentation explaining how to run PreCA is available [here](https://github.com/binsec/preca/blob/main/README.md). | ||
## Cannot help but to try it? | ||
Do not hesitate to get the [docker image](https://github.com/binsec/preca/releases/download/0.1.0/preca_docker.tar.gz) and the [source](https://github.com/binsec/preca) to try PreCA out. | ||
For more information about PreCA check the [README](https://github.com/binsec/preca/blob/main/README.md), | ||
the [PreCA paper](/assets/publications/papers/2022-ijcai.pdf), and the [DCA paper](/assets/publications/papers/2023-kr.pdf). | ||