Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aslt changes + grammar + testing #53

Merged
merged 21 commits into from
Mar 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,11 @@ jobs:
- run: echo 'preparing nix shell environment'

- run: dune build --profile release
- run: ./coverage.sh test
- run: dune runtest -j4
- run: dune build '@coverage' -j4
- run: |
echo "OUTPUT=$(pwd)/_build/default/tests/coverage" >> $GITHUB_OUTPUT
rm -rf _build/default/tests/coverage/encodings
id: coverage

- name: Upload new coverage results
Expand Down
65 changes: 54 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -167,23 +167,22 @@ statements and expressions.

### Using the ASL partial evaluator

Arm's specification files are shipped with the ASLp fork in the mra_tools/ subfolder.
Arm's specification files are shipped with the ASLp fork in the mra\_tools/ subfolder.
We also define a small number of overrides in tests/override.asl and tests/override.prj.
These provide alternative but equivalent definitions of some pre-defined functions
which are more suitable for partial evaluation.
(See [alastairreid/asl-interpreter](https://github.com/alastairreid/asl-interpreter#using-asl-interpreter-with-arms-public-specifications) for more details.)

There is a script `./asli` which will load the required files.
These specification files will be automatically loaded when running ASLp
through dune.
For example, to print the partially-evaluated semantics of the `add x1, x2, x3, LSL 4` instruction:
```bash
$ ./asli
$ dune exec asli
ASLi> :sem A64 0x8b031041
Decoding instruction A64 8b031041
__array _R [ 1 ] = add_bits.0 {{ 64 }} (
__array _R [ 2 ] [ 0 +: 64 ],
append_bits.0 {{ 60,4 }} ( __array _R [ 3 ] [ 0 +: 60 ],'0000' )
) [ 0 +: 64 ] ;
ASLi>
__array _R [ 1 ] = add_bits.0 {{ 64 }} ( __array _R [ 2 ],append_bits.0 {{ 60,4 }} ( __array _R [ 3 ] [ 0 +: 60 ],'0000' ) ) ;
ASLi>

```
LLVM can be used to obtain the bytecode for a particular instruction mnemonic:
```bash
Expand All @@ -194,6 +193,21 @@ $ echo 'add x1, x2, x3, LSL 4' |
0: 8b031041 add x1, x2, x3, lsl #4
```

To obtain the semantics in a more machine-readable format, called "aslt", you can use the `:ast` command
(the output below has been manually wrapped for clarity).
```bash
ASLi> :ast A64 0x8b031041
Stmt_Assign(
LExpr_Array(LExpr_Var("_R"),1),
Expr_TApply("add_bits.0",[64],[
Expr_Array(Expr_Var("_R"),2);
Expr_TApply("append_bits.0",[60;4],[
Expr_Slices(Expr_Array(Expr_Var("_R"),3),[Slice_LoWd(0,60)]);'0000'])]))
```
The aslt format is also produced by the aslp-server program which serves a HTTP endpoint to request
disassembly of opcodes.
We provide an ANTLR grammar for this format, available at [libASL/Semantics.g4](libASL/Semantics.g4).

The interpreter's evaluation (inherited from the original ASLi) can be tested with these commands:
```
:init globals
Expand All @@ -212,7 +226,38 @@ Program exited by writing ^D to TUBE
Exception taken
```

Finally, the `:coverage` command is used to test equivalence of the partial evaluator and the interpreter.
Enjoy!

### Testing

The project's tests are defined within the tests/ directory and you can invoke them through dune.
This command will run the regular test suite, taking about 30 seconds.
Included in this suite, we have basic tests of the disassembling opcodes and parsing the aslt output with ANTLR
(the ANTLR tests require Java on your PATH and will fail without them).
```bash
dune runtest
```
To run a more thorough test, taking at least 5 minutes, use this command (see next section for more details).
```bash
dune build @coverage
```

Most of the tests are [cram tests](https://dune.readthedocs.io/en/latest/reference/cram.html) which describe shell
commands to run alongside their expected output.
This means they can be useful to see examples of ASLp's output formats.
[tests/aslt/test_dis.t](tests/aslt/test_dis.t) processes a short list of common instructions to obtain their their pretty-printed and aslt semantics.
[tests/aslt/test_cntlm.t](tests/aslt/test_cntlm.t) runs a list of instructions extracted from a binary program, but the output is from the ANTLR test harness
so it is less readable.

For the coverage tests, the tests are simple diff tests against an expected output file.

In both cases, dune will report test failures as a difference against the expected output.
[`dune promote`](https://dune.readthedocs.io/en/latest/concepts/promotion.html)
can be used to update the expected files with the output from the latest run.

#### Differential testing

The `:coverage` command is used to test equivalence of the partial evaluator and the interpreter.
It takes a regular expression of an instruction group, then generates and evaluates the partially evaluated
ASL as well as the original ASL and compares the final states.
Instruction groups can be found in the [mra_tools/arch/arch_instrs.asl](mra_tools/arch/arch_instrs.asl) file.
Expand All @@ -235,8 +280,6 @@ as we would expect if the partial evaluation is correct.
UNDEFINED means that particular bytecode is an undefined case in the architecture.
If an exception occurs somewhere else in the process, that will be reported as well.

Enjoy!

## Publication

- Lam, K., & Coughlin, N. (2023).
Expand Down
64 changes: 0 additions & 64 deletions coverage.sh

This file was deleted.

4 changes: 4 additions & 0 deletions dune
Original file line number Diff line number Diff line change
Expand Up @@ -21,3 +21,7 @@
)
(section (site (asli aslfiles)))
)

(alias
(name default)
(deps (package asli)))
1 change: 1 addition & 0 deletions dune-project
Original file line number Diff line number Diff line change
Expand Up @@ -42,3 +42,4 @@
; (documentation ...)

(generate_opam_files true)
(cram enable)
105 changes: 105 additions & 0 deletions libASL/Semantics.g4
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
grammar Semantics;

// this is the reference grammar for aslp's "aslt" format, i.e. the format produced by aslp-server and the :ast command.

// see also:
// - aslp/libASL/asl.ott for the original ASL grammar, including pp-raw rules which generate this output.

stmt: assignment_stmt | call_stmt | conditional_stmt;
stmts: OPEN_BRACKET (stmt (SCOLON stmt)*)? CLOSE_BRACKET;

stmt_lines: stmt* EOF; // a convenient entry point for space-separated stmts, unused within this gramamr

assignment_stmt:
'Stmt_Assign' OPEN_PAREN lexpr COMMA expr CLOSE_PAREN # Assign
| 'Stmt_ConstDecl' OPEN_PAREN type COMMA ident COMMA expr CLOSE_PAREN # ConstDecl
| 'Stmt_VarDecl' OPEN_PAREN type COMMA ident COMMA expr CLOSE_PAREN # VarDecl
| 'Stmt_VarDeclsNoInit' OPEN_PAREN type COMMA OPEN_BRACKET (ident (COMMA ident)*)? CLOSE_BRACKET CLOSE_PAREN # VarDeclsNoInit
| 'Stmt_Assert' OPEN_PAREN expr CLOSE_PAREN # Assert
| 'Stmt_Throw' OPEN_PAREN message=ident+ CLOSE_PAREN # Throw // untested
;

call_stmt:
'Stmt_TCall' OPEN_PAREN
ident COMMA

OPEN_BRACKET (targs (SCOLON targs)*)? CLOSE_BRACKET COMMA

OPEN_BRACKET (expr (SCOLON expr)*)? CLOSE_BRACKET

CLOSE_PAREN
# CallStmt
;

conditional_stmt:
'Stmt_If' OPEN_PAREN expr COMMA
tcase=stmts COMMA
OPEN_BRACKET CLOSE_BRACKET COMMA // elseif chains are transformed away by aslp
fcase=stmts CLOSE_PAREN
# ConditionalStmt
;

type_register_slices:
(COMMA OPEN_PAREN OPEN_BRACKET 'Slice_HiLo' OPEN_PAREN expr COMMA expr CLOSE_PAREN CLOSE_BRACKET COMMA ident CLOSE_PAREN)*
// untested
;

type:
'Type_Bits' OPEN_PAREN expr CLOSE_PAREN # TypeBits
| 'Type_Constructor(boolean)' # TypeBoolean
| 'Type_Constructor(' ident ')' # TypeConstructor
| 'Type_Register' OPEN_PAREN QUOTE width=integer QUOTE type_register_slices CLOSE_PAREN # TypeRegister
;

lexpr:
'LExpr_Var' OPEN_PAREN ident CLOSE_PAREN # LExprVar
| 'LExpr_Field' OPEN_PAREN lexpr COMMA ident CLOSE_PAREN # LExprField
| 'LExpr_Array' OPEN_PAREN lexpr COMMA expr CLOSE_PAREN # LExprArray
;

expr:
'Expr_Var' OPEN_PAREN ident CLOSE_PAREN # ExprVar
| 'Expr_TApply' OPEN_PAREN ident COMMA
OPEN_BRACKET (targs (SCOLON targs)*)? CLOSE_BRACKET COMMA
OPEN_BRACKET (expr (SCOLON expr)*)? CLOSE_BRACKET
CLOSE_PAREN # ExprTApply
| 'Expr_Slices' OPEN_PAREN expr COMMA
OPEN_BRACKET slice CLOSE_BRACKET
CLOSE_PAREN # ExprSlices
| 'Expr_Field' OPEN_PAREN expr COMMA ident CLOSE_PAREN # ExprField
| 'Expr_Array' OPEN_PAREN base=expr COMMA index=expr CLOSE_PAREN # ExprArray
| integer # ExprLitInt
| bits # ExprLitBits
| OPEN_PAREN expr CLOSE_PAREN # ExprParen
// | 'Expr_LitHex' OPEN_PAREN QUOTE HEXDIGIT+ QUOTE CLOSE_PAREN # ExprLitHex
// | 'Expr_LitMask' OPEN_PAREN QUOTE BINARY QUOTE CLOSE_PAREN # ExprLitMask
// | 'Expr_LitString' OPEN_PAREN QUOTE ident QUOTE CLOSE_PAREN # ExprLitString
;

ident: QUOTE ID QUOTE;

integer: DECIMAL;
bits: BINARY;

targs: expr;

slice: 'Slice_LoWd' OPEN_PAREN expr COMMA expr CLOSE_PAREN;

BINARY: SQUOTE [0-1]+ SQUOTE;
DECIMAL: [0-9]+;
ID: [a-zA-Z_][a-zA-Z0-9_.]*;

// Delimiters
OPEN_PAREN: '(';
CLOSE_PAREN: ')';
COMMA: ',';
OPEN_BRACKET: '[';
CLOSE_BRACKET: ']';
SQUOTE: '\'';
QUOTE: '"';
SCOLON: ';';

// Ignored
NEWLINE: ('\r\n' | '\n') -> skip;
WHITESPACE: ' '+ -> skip;
COMMENT: '//' ~[\r\n]* -> skip;
Loading
Loading