Skip to content

Commit

Permalink
Merge pull request #8 from dlesbre/codex-nutshells
Browse files Browse the repository at this point in the history
Codex nutshells
  • Loading branch information
mlemerre authored Jul 25, 2024
2 parents c1d63c7 + 04e017f commit 01e6959
Show file tree
Hide file tree
Showing 8 changed files with 84 additions and 22 deletions.
10 changes: 10 additions & 0 deletions _data/publications.yml
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,11 @@
easy-summary: nutshells/pldi-24.html
pdf: /assets/publications/papers/2024-pldi.pdf
bibtex: /assets/publications/bibtexs/2024-pldi.bib
artifact-available: true
artifact-evaluated-reusable: true
talk-video: "https://www.youtube.com/watch?v=2Btkn9AvM8o"
talk-slides: "/assets/publications/slides/2024-pldi.pdf"
comment: "Also presented at AFADL 2024"
- title: "Inference of Robust Reachability Constraints"
authors:
- name: Yanis Sellami
Expand Down Expand Up @@ -146,6 +151,11 @@
award: "Distinguished Paper Award"
pdf: "/assets/publications/papers/2023-popl-full-with-appendices.pdf"
talk-slides: "/assets/publications/slides/2023-popl.pdf"
talk-video: "https://www.youtube.com/watch?v=wkIfcN3Ipd4"
easy-summary: nutshells/popl-23.html
artefact: https://dl.acm.org/do/10.1145/3554341/full/
artifact-evaluated-reusable: true
artifact-available: true
- year: 2022
publications:
- title: "Automated Program Analysis: Revisiting Precondition Inference through Constraint Acquisition"
Expand Down
11 changes: 4 additions & 7 deletions _layouts/publications.html
Original file line number Diff line number Diff line change
Expand Up @@ -33,17 +33,14 @@ <h3 class="subtitle is-3">{{ y.year }}</h3>
<img src="{{ p.artifact-custom-image }}" width="{{ p.artifact-custom-image-width }}" height="{{ p.artifact-custom-image-height }}">
&nbsp;
{% endif %}
{% if p.artifact-evaluated-functional %}
{% if p.artifact-available %}
<img src="/assets/img/artifacts_available.png" alt="ACM Badge Artifact Available" width="70" height="70">
{% endif %}
{% if p.artifact-evaluated-functional %}
<img src="/assets/img/artifacts_evaluated_functional.png" alt="ACM Badge Artifact Functional" width="70" height="70">
&nbsp;
{% endif %}
{% if p.artifact-evaluated-reusable %}
<img src="/assets/img/artifacts_evaluated_reusable.png" alt="ACM Badge Artifact Reusable" width="70" height="70">
&nbsp;
{% endif %}
{% if p.artifact-available %}
<img src="/assets/img/artifacts_available.png" alt="ACM Badge Artifact Available" width="70" height="70">
&nbsp;
{% endif %}
{% if p.results-replicated %}
<img src="/assets/img/results_replicated.png" alt="ACM Badge Results Replicated" width="70" height="70">
Expand Down
Binary file modified assets/publications/papers/2024-pldi.pdf
Binary file not shown.
Binary file added assets/publications/slides/2024-pldi.pdf
Binary file not shown.
8 changes: 4 additions & 4 deletions nutshells/pldi-24.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,8 @@ Our paper shows the following novel results:
- A standard abstract interpretation framework can be turned into a
compiler: create a domain that is a **free-algebra** of the domain signature (i.e.
a domain where each domain operation is a constructor creating an expression), then the analysis
result **can be used to construct a new program**.
- **Functors can mimic compiler passes**.
result **can be used to construct a new program**. Different abstract domain signatures correspond to different languages: the classical domain signature correspond to imperative programs expressed as a control-flow graph, and we provide a SSA domain signature corresponding to programs in SSA form.
- **Functors can implement compiler passes**.
A functor is just a function that builds a new abstract domain on top of abstract
domains received as arguments. Functors are modular, they can be proved independently
and then combined in a full compilation chain. Functor soundness and completeness
Expand Down Expand Up @@ -104,6 +104,6 @@ Our paper shows the following novel results:
## Further information

- Read the [**paper**](/assets/publications/papers/2024-pldi.pdf)
- Download the [software artifact](https://doi.org/10.5281/zenodo.10895582) from
- Presented at the [Programming Language Design and Implementation (PLDI) 2024 conference](https://pldi24.sigplan.org/). Watch the [**talk video**](https://www.youtube.com/watch?v=2Btkn9AvM8o) or look at the [**slides**](/assets/publications/slides/2024-pldi.pdf)
- Download the [**software artifact**](https://doi.org/10.5281/zenodo.10895582) from
Zenodo to try out our example analyzer and explore the code.
- To appear at the [Programming Language Design and Implementation (PLDI) 2024 conference](https://pldi24.sigplan.org/)
53 changes: 53 additions & 0 deletions nutshells/popl-23.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
---
layout: post
title: "POPL'23: research paper"
categories: new publication
paper-title: "SSA Translation Is an Abstract Interpretation"
topic: "Static single assignement (SSA); Abstract interpretation; Cyclic term graph"
pdf: "/assets/publications/papers/2023-popl-full-with-appendices.pdf"
date: 2023-01-01
redirect_from: /new/publication/1970/01/01/nutshell-popl-23.html
---


## Summary

Conversion to Static Single Assignment (SSA) form is usually viewed as
a syntactic transformation algorithm that gives unique names to
program variables, and reconciles these names using "phi" functions
based on a notion of domination. We instead propose a semantic
approach, where SSA translation is performed using a simple dataflow
analysis. Based on a new technique to use cyclic terms in abstract
domains, we propose a Symbolic Expression abstract domain that
performs a Global Value Numbering analysis, upon which we build our
SSA translation. This implies a shift in perspective, as global value
numbering becomes a prerequisite of SSA translation, instead of
depending on SSA.

One application to performing SSA Translation by Abstract
Interpretation is that SSA optimizations passes can be implemented as
a combination of abstract domains, allowing to perform several
optimizations simultaneously to solve the usual phase ordering problem
and avoiding tedious maintenance of SSA invariants.

Our main motivation for this research is the design of Codex, an
analyzer for machine code which uses SSA as its main intermediate
representation. Machine code is too low-level to allow SSA translation
without a prior semantic analysis, while SSA is an intermediate
representation that makes static analysis easier than direct analysis
of machine code. Viewing SSA translation as a semantic analysis solves
this chicken-and-egg problem, allowing to simultaneously decompile
machine code to SSA and use the SSA representation to perform the
other semantic analyses (value analysis, memory analysis, and
control-flow analysis). Our [RTAS 2021
paper](/nutshells/rtas-21.html) illustrate the use
of such an analysis, combined with a novel type system for machine
code where type checking is also done by abstract interpretation, on
an embedded OS kernel where we prove security properties directly from
its executable.

## Further information

- Read the **paper**: either [published version](https://doi.acm.org?doi=3656392) or the [version with appendices](/assets/publications/papers/2023-popl-full-with-appendices.pdf)
- Look at the [slides](/assets/publications/slides/2023-popl.pdf) or watch the [video of the talk](https://www.youtube.com/watch?v=wkIfcN3Ipd4)
- Download the [software artifact](https://zenodo.org/records/10895582) which can be used as a reference implementation of our technique.
17 changes: 9 additions & 8 deletions nutshells/rtas-21.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ redirect_from: /new/publication/1970/01/01/nutshell-rtas-21.html
---


## Context

# Context
<img src="/assets/publications/pictures/rtas21.png" width="250" height="300"
Expand All @@ -32,27 +33,27 @@ contain such loops.
Another difficulty of analyzing embedded kernels is that they are usually
*parameterized*: values such as task priorities, or task memory region begin and
end addresses, may be constant throughout execution of the system, but are not
hardcoded in the kernel; instead, they are written in a special zone of memory
hard-coded in the kernel; instead, they are written in a special zone of memory
called the *interface* between kernel and tasks. The memory structure of this
zone is known but the precise values and addresses are not, which forces the
code analysis to step up in terms of complexity.

## Contributions


# Contributions
- A new method for verifying absence of privilege escalation and absence
of crash in an embedded kernel, based on abstract interpretation (handles
loops and verifies kernels independently from the application).
loops and verifies kernels independently of the application).
- A new abstraction of memory based on types, handling parameterization and
improving scalability precision of the analysis.
- We applied the technique to two embedded kernels (including an industrial
one), demonstrating that our method can verify kernels with less than 58 lines
of manual annotations (in some cases, requiring 0 lines of annotations, i.e.
the kernel is verified without human intervervention).

# Further information
the kernel is verified without human intervention).

## Further information

- Read the
[**paper**](/assets/publications/papers/2021-rtas.pdf) and [extended technical report](/assets/publications/papers/2021-rtas-technical-report-analysis.pdf).
[**paper**](/assets/publications/papers/2021-rtas.pdf) and [extended technical report](/assets/publications/papers/2021-rtas-technical-report-analysis.pdf).
- To appear at the [Real-Time and Embedded Technology and Applications Symposium
(RTAS'21)](http://2021.rtas.org/)
- Download the [tool and benchmark](https://github.com/binsec/rtas2021_artifact).
7 changes: 4 additions & 3 deletions nutshells/vmcai-22.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ date: 2021-11-24
redirect_from: /new/publication/1970/01/01/nutshell-vmcai-22.html
---

## Context

Memory errors are the source of the most pervasive and critical
security vulnerabilities. Programs written in low-level systems
languages, like C/C++ and assembly, perform low-level pointer and
Expand Down Expand Up @@ -70,7 +72,6 @@ fragment of the memory often entails no information can be recovered
about that region. Another limitation is that such analyses are difficult to apply to low-level
code, like low-level C or binary code. On the contrary, our technique scales less due to being mostly a storeless abstraction, and was made to handle the low-level type-punning code patterns typical of low-level systems programs.
## Contributions
To achieve this, we propose:
Expand All @@ -85,10 +86,10 @@ To achieve this, we propose:
the combination naturally deals with both C and binary code manipulating
dynamic data structures.
## Further information
# Further information
- Read the
[**paper**](/assets/publications/papers/2022-vmcai.pdf).
[**paper**](/assets/publications/papers/2022-vmcai.pdf).
- To appear at the [23rd International Conference on Verification, Model Checking, and Abstract Interpretation
(VMCAI'22)](https://popl22.sigplan.org/home/VMCAI-2022#About)
- Download the [tool and benchmark](https://doi.org/10.5281/zenodo.5589489).

0 comments on commit 01e6959

Please sign in to comment.