Formal equivalence checking with Sail

Here's a high-level overview of what this commit does: - Compiles Sail into SystemVerilog including patchin compiler bugs - Create a TCL file that tells JasperGold what to prove and assume - Check memory operations modelling the LSU Most of these properties now prove without time-bound on the response from memory due to alternative LSUs - Check memory even with Smepmp errors: Continues on top of riscv/sail-riscv#196 - CSR verification - Checks for instruction types such as B-Type, I-Type, R-Type - Check illegal instructions and WFI instructions - Using psgen language for proof generation - Documentation on how to use the setup - Wrap around proof that proves instructions executed in a row still match the specification. - Liveness proof to guarantee instructions will retire within a upper bound of cycles. All of these proofs make heavy use of the concept of k-induction. All the different properties and steps are necessary to help the tool get the useful properties it needs to prove the next step. The instruction correctness, wrap-around and liveness all give us increased confidence that Ibex is trace-equivalent to Sail. Throughout this process an issue was found in Ibex where the pipeline was not flushing properly on changing PMP registers using clear: lowRISC#2193 Alternative LSUs: This makes all top level memory properties prove quickly and at a low proof effort (1 or 2-induction). Three 'alternative LSUs' representing three stages of memory instructions: 1. Before the first response is received, in the EX stage 2. After the first response is received, but not the second grant, also in the EX stage 3. Before the last response is received in the WB stage. In each case we ask 'if the response came now, would the result be correct?'. Similar is applied for CSRs/PC though less directly. This is particularly interesting (read: ugly) in the case of a PMP error wbexc_exists makes Wrap properties fast to prove. The bottleneck becomes SpecPastNoWbexcPC, which fails only due to a bug. See the comment in riscv.proof. Co-authored-by: Marno van der Maas <[email protected]>
marnovandermaas · Jan 13, 2025 · 437e114 · 437e114
1 parent 5da1679
commit 437e114
Show file tree

Hide file tree

Showing 27 changed files with 3,546 additions and 0 deletions.
diff --git a/CREDITS.md b/CREDITS.md
@@ -6,6 +6,10 @@ University of Bologna under the name Zero-riscy. In December 2018, Ibex has
 been contributed to lowRISC who is maintaining and advancing the design since
 then.
 
+Similarly, the Ibex `dv/formal` work was originally developed by the University of
+Oxford in the summer of 2023. In the summer of 2024 this work was extended
+for and contributed to lowRISC who is now maintaining and advancing the design.
+
 Throughout the years, Ibex has seen contributions from many people and we at
 lowRISC are very thankful for all of them. This file lists the many people who
 contributed to what is called Ibex today. If you made a contribution to Ibex
@@ -38,6 +42,7 @@ please feel free to open a pull request to get your name added to this file.
 - Ivan Ribeiro
 - Karol Gugala
 - Leon Woestenberg
+- Louis-Emile Ploix
 - Luís Marques
 - Marek Pikuła
 - Markus Wegmann

diff --git a/dv/formal/.gitignore b/dv/formal/.gitignore
@@ -0,0 +1,2 @@
+jgproject
+jgproofs
diff --git a/dv/formal/Makefile b/dv/formal/Makefile
@@ -0,0 +1,38 @@
+SAIL=sail
+
+SAIL_RISCV_MODEL_DIR=../sail-riscv/model
+
+include Sources.mk
+
+PSGEN_SRCS=thm/btype.proof thm/ibex.proof thm/mem.proof thm/riscv.proof
+PSGEN_FLAGS=-root riscv -task
+
+SAIL_EXTRA_SRCS=../spec/main.sail
+
+SAIL_SV_FLAGS=-sv -sv-comb -sv-inregs -sv-outregs -sv-nostrings -sv-nopacked -sv-nomem -Oconstant_fold -memo-z3 \
+	-sv-unreachable translate -sv-unreachable lookup_TLB -sv-unreachable translate_tlb_miss -sv-unreachable translate_tlb_hit -sv-unreachable pt_walk \
+	-sv-fun2wires 2:read_ram \
+	-sv-fun2wires 2:write_ram \
+	-sv-fun2wires wX \
+	-sv-fun2wires 2:rX \
+
+.PHONY: sv
+sv:
+	mkdir -p build
+	python3 buildspec.py header > build/ibexspec_instrs.sv
+	cd build && $(SAIL) $(SAIL_SRCS) $(SAIL_EXTRA_SRCS)  $(SAIL_SV_FLAGS) `cd .. && python3 buildspec.py unreachables` -o ibexspec
+	python3 spec/fix_pmp_bug.py
+	python3 buildspec.py unreachable_loc_hack
+
+.PHONY: reset-sub
+reset-sub:
+	cd sail-riscv && git reset --hard HEAD
+
+.PHONY: update-sub
+update-sub: reset-sub
+	cd sail-riscv && git pull
+
+.PHONY: psgen
+psgen:
+	mkdir -p build
+	psgen $(addprefix -path ,$(PSGEN_SRCS)) $(PSGEN_FLAGS) -sv-out build/psgen.sv -tcl-out build/psgen.tcl
diff --git a/dv/formal/README.md b/dv/formal/README.md
@@ -0,0 +1,142 @@
+# End to End Formal Verification against the Sail
+
+Prerequisities (in your PATH):
+- [The lowRISC fork of the Sail compiler](https://github.com/lowRISC/sail/tree/lowrisc)
+- [psgen](https://github.com/mndstrmr/psgen)
+
+Build instructions:
+- `git submodule init`
+- `make psgen` to build the SV for the proofs given in `thm/`
+- `make sv` to build the SV translation of the Sail compiler. Will invoke `buildspec.py`, which can be configured to adjust which instructions are defined. By default all of them are, this is correct but slow.
+- Make the changes to Ibex described in the RTL changes.
+- `SAIL_DIR=<path to sail compiler source> jg verify.tcl`
+
+## Conclusivity
+All properties are currently known to be conclusive, with the exception of M-Types. The later stages (Memory, Top, RegMatch and Wrap and some Liveness) generally take longer.
+
+This means that (up to base case) Ibex is trace equivalent to the Sail (i.e. where the Sail main function is run in a loop). This means:
+- If one takes a sequence of instructions and ran Ibex on them, the same memory operations in the same order would be produced by running the Sail main function in a loop.
+- We do not prove that the instruction executed with a given PC was loaded with that PC.
+- We haven't proved that Ibex resets correctly, if it doesn't there is no trace equivalence.
+
+## RTL Changes
+- `ResetAll = 1` is required for now (instead of `ResetAll = Lockstep`)
+- Fix for [#2913](https://github.com/lowRISC/ibex/issues/2193): Include `|| csr_op_o == CSR_OP_CLEAR` in the cases for `csr_pipeline_flushes` in `ibex_id_stage.sv`.
+
+## Code Tour
+### Top Level Goals
+The top level objective is to get proofs for `Wrap`, `Live`, `Load`, `Store` and `NoMem`. These properties are themselves enough to prove trace equivalence (see below).
+They are intended to be simple and human interpretable, you should be able to convince yourself that those properties together imply specification conformance of Ibex (igoring the holes listed below).
+- `Wrap` proves that on each invocation of the specification (each time `spec_en` is high) the new inputs to the spec are consistent with the previous outputs of the spec.
+- `Live` proves that you will always eventually get a `spec_en` event, (i.e. it's not possible to avoid comparing with the spec). This is done by finding a very conservative but finite bound on the amount of time between spec `spec_en`.
+- `Load` and `Store` prove that the requests made to memory are correct, given a common spec/Ibex starting state. `Wrap`+`Live` proves that that starting state is actually the same. They are written only in terms of top level signals, and `instr_will_progress`, which directly implies `spec_en`.
+- `NoMem` checks that non memory instructions don't make memory requests.
+
+Everything else (about 1,000 properties) exist to help `Wrap` converge.
+
+Immediately below `Wrap` we have `Top`, which asserts that when any instruction retires it does so correctly. This is proved with a big case split, where we check the same for each individual instruction. Most of the work goes into proving those properties with many many invariants.
+
+### Helpers
+Most Ibex specific helpers can be found in `thm/ibex.proof`, it's a mess of invariants that help the model checker significantly.
+Each one was at some point obtained by analysing some kind of k-induction trace, or just by thinking about why a property can't be easily proved in its current state.
+
+The standout part of that file is `Memory`. It is the central graph induction used to prove that memory operations are well behaved.
+The definitions of the nodes are fairly clear, though their invariants can be complex. They are however fairly intuitive with some thought.
+
+Arguably the most important helpers are those in `thm/riscv.proof` that individually check each instruction or type of instruction.
+They pretty much all check the same thing: The RF addr/WE is correct, the RF data is correct, the CSRs are correct, and the PC is correct.
+Most also have `use structure_fast`, which is a great help.
+
+That file also contains checks for non-instructions like IRQs, illegal instructions and fetch errors. They are checked in largely similar ways.
+
+Towards the end of `riscv.proof` we have Top, RegMatch and Wrap. Wrap is the goal. Top is the statement: 'every instruction is correct', and RegMatch is to help Wrap
+with the otherwise difficult RegA and RegB properties.
+
+The other helpers are in `btype.proof` and `mem.proof`, which mostly just contain more graph inductions for properties that can otherwise be difficult to prove.
+
+### Liveness
+The final property of `Wrap` is `Live`, which essentially proves that there will be infinite points of comparison between Ibex and the spec. It is this signal which means you don't have to trust that `spec_en` means anything. `Live` is difficult to prove. Ideally it would say `always (s_eventually spec_en)`, but in practice it has to enforce a strict bound, since proving `s_eventually` properties is quite difficult: `always (##[0:N] spec_en)`. You can find this value for `N` in `Wrap`, at the time of writing it's 212, essentially, meaning that an instruction must move to writeback at least once in 212 cycles. This is obviously an extremely weak bound, but that's fine since any finite bound is sufficient to prove liveness.
+
+The proof of `Live` is fairly neat, and can be found in `ibex.proof`. It composes many properties about smaller time bounds into one large property, which conservatively just adds them all up. The major difficulty in particular is proving that if an instruction is killed, then the next one will not be. There are also issues with long running instructions and events (divide, memory, WFI).
+
+By default live (and the liveness lemmas) are commented out, since it currently only applies when `TIME_LIMIT = 1`.
+
+## Guide to Bug Hunting
+When you are looking to check you haven't introduced a bug you should be careful not to accidentically assume the absence of any bugs.
+
+**Either prove each task in order, or run a bounded check on the property you are interested in, but do so outside of the proof structure, i.e. in the `<embedded>` task.**
+
+It is enough to check `Wrap`, `Load*` and `Store*`. That said it might be more convenient to check a lower level property.
+- If you've changed the behaviour of an instruction (either its GPRs, CSRs, PC or whatever else) you should probably be checking the properties for that instruction or class of instructions. Alternatively you could check the `Top` properties.
+- If you've changed the behaviour of interrupts check the `IRQ` properties.
+- If you've changed PMP, or other memory related things check `FetchErr` as well as the data memory properties.
+- Pipeline control issues are only directly checked by `Wrap`, though will likely be picked up by something else.
+
+## Proof Holes
+1. Some CSRs are not checked. See `ex_is_checkable_csr` in `top.sv` for the full list. They are not checked either because they aren't specified, can't be specified or because I haven't got around to adding them yet. Some should be easy to check. When a CSR instruction touches one of those CSRs the next round of checks is essentially just disabled, anything can happen and verification will assume at the next `spec_en`.
+2. Bus errors are assumed not to occur. They are not specified, though the bigger issue is that they would be hard to prove correctness of.
+3. There are no NMIs or fast IRQs, since there are not specified.
+4. We forbid attempting to enter debug mode, as this is also not specified.
+5. We assume `ResetAll` and no clock gating. Both of these could be fixed with probably fairly little effort.
+6. I don't check that Ibex ever handles interrupts. If it does it handle an interrupt it does so correctly.
+7. I haven't proved security, I have proved correctness, or more specificially specification conformance. If there is a bug in the spec, or a side channel somewhere, I will not find it.
+8. There is currently no proof of correctness of instruction fetch, i.e. when IF returns an instruction with a given PC, it has not been proved that that instruction was actually loaded from memory with that PC. This was proved for CHERIoT-Ibex and could probably be easily proved here too.
+9. Ibex has not been proved to reset to the correct state, there's no reason it couldn't be.
+
+The precise configuration of Ibex can be found in `top.sv`, but is mostly the default with the following exceptions:
+- 4 PMP Regions, enabled
+- `BranchTargetALU = 1`
+- `ResetAll = 1` (but `SecureIbex = 0`)
+In particular, this means we have
+- `RV32MFast`
+- `RV32BNone` (Hence we are missing some instructions. This might be a good place for future work, it should be not terribly difficult to extend this)
+- `RegFileFF`
+
+The following are not holes per se, but are limitations, needed to obtain convergence of liveness or others:
+1. Memory grants/responses are bounded to take at most `TIME_LIMIT` cycles in `protocol/mem.sv`. Removing this restriction would be hard. That number is 10 at time of writing, but `Wrap` and `Live` have only been tested with 1. See below for more on that.
+2. WFI instructions may not currently be enterred if they couldn't ever be exitted in `protocol/irqs.sv`. This probably should be legal behaviour, but without it liveness is false. There could probably instead be a specific carveout for this case, since it's also legal behaviour.
+3. If a WFI is enterred it must be interrupted within `WFI_BOUND` cycles, in `protocol/irqs.sv`. An `s_eventually` equivalent would be nice but in practice is very difficult to prove properties with.
+4. When an interrupt is fired, that interrupt must remain fired until a memory request is granted (i.e. as if it was some MMIO operation), in `protocol/irqs.sv`. This might be removable.
+
+## Wrap Around
+The formal work includes `Wrap` properties. They check that each time the specification is 'used' the inputs to the specification equal its outputs the last time it was read.
+
+Wrap properties have currently only been tested with `TIME_LIMIT = 1`, which is way lower than is ideal. Higher numbers do probably work, but I have no idea how long they will take. If JG is doing its job properly it shouldn't matter at all. The liveness properties will fail for higher time limits (this should be easy but fiddly to fix! They would ideally have limits written in terms of `TIME_LIMIT` and `WFI_BOUND`), so they are commented out by default. 
+
+While the implications of such a property are intuitive, we can be satisfying formal too. Define five functions:
+1. The specification function $\text{Spec} : S \times I \to S$ maps an architectural state and input to the next architectural state. It does not matter what is meant by 'next' here, but in our case $\text{Spec}$ steps through either an illegal instruction, an interrupt or a regular instruction.
+2. $\text{SpecOut} : S \times I \to O$ maps an architectural state and input to an output (i.e. memory outputs).
+3. $\text{Ibex} : A \times I \to A$ maps one micro-architectural state and input to the next micro-architectural state, where 'next' is equivalent to 'next' for $\text{Spec}$ and might span multiple clock cycles.
+4. $\text{IbexOut} : S \times I \to O$ maps a micro-architectural state and input to an output.
+5. $\text{abs} : A \to S$ maps a micro-architectural state to an architectural state.
+
+The memory assertions (`Load`, `Store`, `NoMem`) prove the following statement, for any micro-architectural state $a$ and input $i$:
+```math
+\text{SpecOut}(\text{abs}(a), i) = \text{IbexOut}(a, i)
+```
+
+The assertions with prefix `Wrap_` (using essentially every other property we have as lemmas) prove the following statement:
+```math
+\text{Spec}(\text{abs}(a), i) = \text{abs}(\text{Ibex}(a, i))
+```
+If $a_0$ is the reset micro-architectural state we have, by a straightforward induction, that for a sequence of $n$ inputs $I$,
+```math
+\text{Spec}^n(\text{abs}(a_0), I) = \text{abs}(\text{Ibex}^n(a_0, I))
+```
+Where I here denote $f^{n + 1}(a, I)$ to be the the left fold of `f` over `I` using `a` initially, i.e. in Haskell syntax `foldl f a I`. Combined we derive that
+```math
+\text{SpecOut}(\text{abs}(\text{Ibex}^n (a_0, I)), i) = \text{SpecOut}(\text{Spec}^n (\text{abs}(a_0), I), i) = \text{IbexOut}(\text{Ibex}^n (a_0, I), i)
+```
+
+Finally, we will (but have not yet attempted to) prove that $\text{abs}(a_0) = s_0$, where $s_0$ is the initial state of the specification:
+```math
+\text{SpecOut}(\text{Spec}^n (s_0, I), i) = \text{IbexOut}(\text{Ibex}^n (a_0, I), i)
+```
+
+With that we can conclude that for any sequence of $n$ steps the outputs will remain the same between the specification and Ibex, proving their complete equivalence, so long as there are infinite such steps, which the `Live` property proves.
+
+Note a few things:
+1. We do not implicitly rely on (or define) 'correctness' for $\text{abs}$. Satisfying the above propositions is sufficient. In practice it will need to be 'correct' in some sense for $\text{Spec}$ to make use of it.
+2. We do not need to prove that the internal signals of Ibex mean anything, e.g. we have no explicit interpretation of the register file, the pipeline or anything else. To be convinced that I am correct you need to trust only that `Wrap`, `Live`, `Load`, `Store`, `NoMem` say what I claim they do above. If I am wrong about my interpretation about internal signals, it does not matter.
+3. In practice there are steps we cannot prove equivalence across (i.e. accessing CSRs we don't have a specification for). In such cases we just disable the next check and resume after that.
+
diff --git a/dv/formal/Sources.mk b/dv/formal/Sources.mk
@@ -0,0 +1,27 @@
+SAIL_XLEN := riscv_xlen32.sail
+
+SAIL_CHECK_SRCS = riscv_addr_checks_common.sail riscv_addr_checks.sail riscv_misa_ext.sail
+SAIL_DEFAULT_INST = riscv_insts_base.sail riscv_insts_cext.sail riscv_insts_mext.sail riscv_insts_zicsr.sail
+
+SAIL_SEQ_INST  = $(SAIL_DEFAULT_INST) riscv_jalr_seq.sail
+
+SAIL_SEQ_INST_SRCS  = riscv_insts_begin.sail $(SAIL_SEQ_INST) riscv_insts_end.sail
+
+SAIL_SYS_SRCS =  riscv_csr_map.sail
+SAIL_SYS_SRCS += riscv_sys_exceptions.sail
+SAIL_SYS_SRCS += riscv_sync_exception.sail
+SAIL_SYS_SRCS += riscv_csr_ext.sail
+SAIL_SYS_SRCS += riscv_sys_control.sail
+
+PRELUDE = prelude.sail $(SAIL_XLEN) prelude_mem_metadata.sail prelude_mem.sail
+
+SAIL_REGS_SRCS = riscv_reg_type.sail riscv_regs.sail riscv_pc_access.sail riscv_sys_regs.sail
+SAIL_REGS_SRCS += riscv_pmp_regs.sail riscv_pmp_control.sail
+SAIL_REGS_SRCS += riscv_ext_regs.sail $(SAIL_CHECK_SRCS)
+
+SAIL_ARCH_SRCS = $(PRELUDE)
+SAIL_ARCH_SRCS += riscv_types_common.sail riscv_types_ext.sail riscv_types.sail
+SAIL_ARCH_SRCS += riscv_vmem_types.sail $(SAIL_REGS_SRCS) $(SAIL_SYS_SRCS) riscv_platform.sail
+SAIL_ARCH_SRCS += riscv_mem.sail
+
+SAIL_SRCS      = $(addprefix $(SAIL_RISCV_MODEL_DIR)/,$(SAIL_ARCH_SRCS) $(SAIL_SEQ_INST_SRCS))
diff --git a/dv/formal/buildspec.py b/dv/formal/buildspec.py
@@ -0,0 +1,113 @@
+# Copyright lowRISC contributors.
+# Copyright 2024 University of Oxford, see also CREDITS.md.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# Original author: Louis-Emile Ploix
+# SPDX-License-Identifier: Apache-2.0
+
+# This script is called from the Makefile.
+# Essentially it is a way to enable and disable instructions from the Sail specification to make proofs easier.
+
+import json
+import re
+import sys
+
+with open("instrs.json", "r") as f:
+    INSTRS = set(json.load(f))
+
+class SpecConfig:
+    def __init__(self):
+        self.instrs = set()
+
+    def add(self, *instrs):
+        for instr in instrs:
+            assert instr in INSTRS
+            self.instrs.add(instr)
+
+    def add_all(self):
+        self.instrs = INSTRS
+
+    def add_noncompressed(self):
+        self.instrs = {x for x in INSTRS if not x.startswith("C_")}
+
+    def remove(self, instr):
+        assert instr in INSTRS
+        self.instrs.discard(instr)
+
+    def make_sail_unreachables(self):
+        return " ".join(f"-sv_unreachable execute_{instr}" for instr in INSTRS.difference(self.instrs))
+
+    def make_sv_header(self):
+        return "\n".join([
+            "`ifndef SPEC_INSTRS",
+            "`define SPEC_INSTRS",
+            "",
+            *[f"`define SPEC_{instr.upper()} {int(instr in self.instrs)}" for instr in INSTRS],
+            "",
+            "`endif"
+        ])
+
+    def add_jmps(self):
+        self.add("RISCV_JAL", "RISCV_JALR")
+
+    def add_itype(self):
+        self.add("ITYPE")
+        self.add("SHIFTIOP")
+
+    def add_rtype(self):
+        self.add("RTYPE")
+
+    def add_fences(self):
+        self.add("FENCE", "FENCEI")
+
+    def add_loads(self):
+        self.add("LOAD")
+
+    def add_stores(self):
+        self.add("STORE")
+
+    def add_mtypes(self):
+        self.add("MUL", "DIV", "REM")
+
+    def add_illegal(self, extra=True):
+        self.add("ILLEGAL", "C_ILLEGAL")
+        if extra:
+            self.add("CSR")
+            self.add("MRET", "WFI")
+
+    def add_system(self):
+        self.add("ECALL", "EBREAK", "MRET", "WFI")
+
+    def add_csr(self):
+        self.add("CSR")
+
+    def add_utypes(self):
+        self.add("UTYPE")
+
+    def add_btypes(self):
+        self.add("BTYPE")
+
+conf = SpecConfig()
+conf.add_noncompressed()
+
+if len(sys.argv) > 1:
+    if sys.argv[1] == "unreachables":
+        print(conf.make_sail_unreachables())
+    elif sys.argv[1] == "header":
+        print(conf.make_sv_header())
+    elif sys.argv[1] == "unreachable_loc_hack":
+        with open("build/ibexspec.sv", "r") as f:
+            c = f.read()
+        c = c.replace(
+            "sail_reached_unreachable = 1;",
+            "begin sail_reached_unreachable = 1; sail_reached_unreachable_loc = `__LINE__; end"
+        ).replace(
+            "sail_reached_unreachable = 0;",
+            "begin sail_reached_unreachable = 0; sail_reached_unreachable_loc = -1; end"
+        ).replace(
+            "bit sail_reached_unreachable;",
+            "bit sail_reached_unreachable;\n\tlogic [31:0] sail_reached_unreachable_loc;"
+        )
+        with open("build/ibexspec.sv", "w") as f:
+            f.write(c)
+else:
+    print("Intended usage is to run make sv")