From ac3b448681e671807a3a54f106d0bb831e7e549f Mon Sep 17 00:00:00 2001 From: greg Date: Mon, 4 Dec 2023 11:25:26 +0100 Subject: [PATCH] post for PreCA release 0.1.0 --- _posts/2023-12-04-preca-0.1.0.md | 100 +++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) create mode 100644 _posts/2023-12-04-preca-0.1.0.md diff --git a/_posts/2023-12-04-preca-0.1.0.md b/_posts/2023-12-04-preca-0.1.0.md new file mode 100644 index 0000000..df6b7f6 --- /dev/null +++ b/_posts/2023-12-04-preca-0.1.0.md @@ -0,0 +1,100 @@ +--- +layout: post +title: "Precondition inference: Release 0.1.0 of PreCA is out" +categories: releases preca +date: 2023-12-01 +--- + + +The PreCA precondition inference framework is now released. Check out its features and how you can use it for research and code understanding. + +Do you often need to reverse some code or simply to use not-so-well-documented functions? Then you surely asked yourself many times on which input a function must be called. +Answering this question is called precondition inference and PreCA is a new framework to tackle this problem. +PreCA is completely black-box and, thus, does not rely on static analysis. +As such, it will not be impacted by code translations like optimizations or obfuscation. Moreover, if the source code is not available, it makes no difference for PreCA, +you only need to be able to run the code. + +So why not trying it? + + +## But first, what is precondition inference exactly? + +Given a function `F` it may be useful to know on which input we can run `F` to get the desired output. This is exactly what precondition inference does. +It takes the function `F` under analysis and a predicate `Q` called a postcondition, stating what is the desired output. +From `F` and `Q`, it infers a predicate `P`, s.t., the execution of `F` on any input agreeing with `P` terminates, does not crash, and the output agrees with `Q`. +Such predicate `P` is called a precondition. The best precondition, i.e., the less restrictive one, is then called the weakest precondition. + +Enough with theory, now to practice! + +## Example: understanding strcat + +Here is a basic implementation of the standard `strcat` function which appends the second string to the first one. + +```c +char * strcat(char *s, char* append) +{ + char *save = s; + + for (; *s; ++s); + while (*s++ = *append++); + return(save); +} +``` + +Take time to understand it and try to find on which input the execution terminates and does not crash i.e., the precondition when `Q = true`. + +At first glance, we just have two pointers that are dereferenced. As such, it seams natural to say that `P = valid(s) and valid(append)`. +Unfortunately, this is not enough. Indeed, if the two pointers are valid, no crash will occur. However, if the strings +overlap, `append` will rewrite itself and execution will never terminate. + +As such, it seems reasonable to state that `P = valid(s) and valid(append) and not strOverlap(s, append)`. This is indeed what we can +understand from the manual. However, this precondition is too restrictive. Indeed, if the `append` string is empty, the two strings can overlap. + +In the end, the weakest precondition is: `P = valid(s) and valid(append) and [ strlen(append) > 0 => not strOverlap(s, append) ]`. + +If we run PreCA over this function, this is indeed what we get. +``` +$ java -ea -jar preca.jar -file examples/strcat_conacq.txt + +... + +*************Learned Network CL example ****** +network var=4 cst=3 +------------------------- +CONSTRAINTS: +Valid(var0) +Valid(var1) +NotOverlap(var0, var1)_or_StrlenEq0(var1)[0, 1, 2, 3] +------------------------- + +*************Learned Network CL example (SMTLIB) ****** +(and (valid v0) (valid v1) (or (not (overlap v0 v1)) (strleneq v1 #x00000000))) +``` + + +## PreCA: Precondition Constraint Acquisition + +**What is it?** + +PreCA it a tool that aims to infer the precondition of a function at binary level. +It is completely black-box and, as such, is not impacted by syntactic complexity introduced by optimizations or obfuscation. +But how does it works? PreCA relies on Constraint Acquisition, a symbolic machine learning framework, to infer the precondition from +observed executions. PreCA generates itself the needed testcases (active learning), which enables it to enjoy good guarantees. + +In our IJCAI'22 [paper](/nutshells/ijcai-22.md), we show that PreCA is more precise and faster that the state-of-the-art. + +**What it ships** + +PreCA 0.1.0 enables the following features: +* Proposes to use different Constraint Acquisition algorithms (Conacq, DCA) +* Exhibit a precondition language over pointers (overlap, aliasing ...) and integers +* Gives all scripts to replicate experiments from the papers. + +Documentation explaining how to run PreCA is available [here](https://github.com/binsec/preca/blob/main/README.md). + +## Cannot help but to try it? + +Do not hesitate to get the [docker image](https://github.com/binsec/preca/releases/download/0.1.0/preca_docker.tar.gz) and the [source](https://github.com/binsec/preca) to try PreCA out. +For more information about PreCA check the [README](https://github.com/binsec/preca/blob/main/README.md), +the [PreCA paper](/assets/publications/papers/2022-ijcai.pdf), and the [DCA paper](/assets/publications/papers/2023-kr.pdf). +