Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match emulator to circuits: use a Harvard Architecture #688

Merged
merged 6 commits into from
Dec 5, 2024

Conversation

matthiasgoergens
Copy link
Collaborator

@matthiasgoergens matthiasgoergens commented Dec 4, 2024

Our circuits use a modified Harvard architecture where the memory space for instructions and data are completely separated, even though they are both addressed with 32 bits. That is a Good Thing.

That means writing to the data cell with address x does not change the value of the instruction cell with address x.

In contrast, our emulator uses a von Neumann architecture at its core, but goes through quite a few gymnastics to hide that fact.

In a von Neumann architecture, data memory and stored program memory are intermixed. Modern computers have moved away from this architecture for security and performance reasons.

In this PR, we turn our emulator into a Harvard Architecture machine as well to match what our circuits are doing. That means we change the emulator to explicitly load instructions not from the data RAM, but from the program ROM, which we are already storing separately anyway.

Instead of thinking about separate address spaces for data and code, you can also imagine that we have an instruction cache and a data cache, and that we load the instructions into their cache once at the start of the program, and then never invalidate that cache.

This works towards #630

@matthiasgoergens matthiasgoergens changed the title Synchronise circuits and emulator to both use a Harvard Architecture Match emulator to circuits: use a Harvard Architecture Dec 4, 2024
@icemelon
Copy link
Member

icemelon commented Dec 4, 2024

Since you change the emulator to Harvard Architecture, should you also update the memory address space here?

@icemelon
Copy link
Member

icemelon commented Dec 4, 2024

Is this PR ready for review? I see there're many todos left

@matthiasgoergens
Copy link
Collaborator Author

Since you change the emulator to Harvard Architecture, should you also update the memory address space here?

Well, we could just totally remove the whole ROM section, because there's no problem with executing code anywhere, now that we separated the address spaces. But I want to keep this PR as small as possible, and there's also no immediate problem with keeping the ROM section contained, either. Or am I missing something?

Is this PR ready for review? I see there're many todos left

The TODOs are for future work, I want to keep this PR small. (I can convert them into issues, too, if you think that's better, but for these small organisational items I prefer to have them discoverable in the code.)

@matthiasgoergens
Copy link
Collaborator Author

@icemelon Let me make the follow up PR quickly that addresses the TODOs. Think of this PR as pre-emptively extracted from the big PR the TODOs hint at. Smaller PRs are easier to review and get consensus on.

Copy link
Collaborator

@lispc lispc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@icemelon
Copy link
Member

icemelon commented Dec 5, 2024

there's also no immediate problem with keeping the ROM section contained, either.

@matthiasgoergens Yes, you're right. I was thinking of having different memory space for ROM and RAM. But it's not necessary.

The TODOs are for future work, I want to keep this PR small.

I don't mind making PR small. But why not finish the implementation of the Hardvard architecture emulator in this PR? I don't think it will make the pr too large. But I'll leave this up to @lispc

@lispc
Copy link
Collaborator

lispc commented Dec 5, 2024

writing to the data cell with address x does not change the value of the instruction cell with address x

I think this pr finishes what it claims to do "writing to the data cell with address x does not change the value of the instruction cell with address x, so make emulator be consistent with prover".

the TODOs are good refactor, but not directly part of this purpose. So i think it is ok now.

@matthiasgoergens
Copy link
Collaborator Author

I don't mind making PR small. But why not finish the implementation of the Hardvard architecture emulator in this PR? I don't think it will make the pr too large. But I'll leave this up to @lispc

This PR finishes the Harvard architecture. The TODOs are for cleanup enabled by the switch.

@matthiasgoergens matthiasgoergens merged commit 43a9617 into master Dec 5, 2024
6 checks passed
@matthiasgoergens matthiasgoergens deleted the matthias/harvard-architecture branch December 5, 2024 07:11
matthiasgoergens added a commit that referenced this pull request Dec 13, 2024
In the [Harvard Architecture
PR](#688) we left a few TODOs.
Here we make good on them.

There's a few things happening in this PR:

- We remove the fake instruction `EANY`. `EANY` was only ever an
artifact of Risc0's incredible 'interesting' approach to decoding. We
use the real instructions `ECALL` and `EBREAK` instead.

- Remove `AUIPC` and `LUI`. Both of them are now implemented as
pseudo-instructions, that translate to `ADDI` during decoding.

- Use the same library as SP1 for RiscV decoding, instead of
copy-and-pasting-and-editing Risc0's 'interesting' decoder. That
simplifies our code, and comes with a lot more tests than we ever had.
Both because of explicit tests in the library, and because of the usage
in SP1 and other projects. This gets rid of much error prone bit
manipulating code.

- Use `struct Instruction` throughout the code when handling and testing
instructions, instead of `u32`. That makes specifying tests a lot
simpler and more readable. No more
`0b_000000001010_00000_000_00001_0010011, // addi x1, x0, 10` in the
code.

- Remove the notion of executable vs non-executable ROM. This is only
necessary for a von-Neumann architecture: everything that's in our
instruction-cache is meant to be executable already. (We can
re-implement this restriction later by controlling what is allowed to
make it into the instruction cache when we eg decode the ELF. But it's
unnecessary: we already honour the executable flag for memory sections
in the ELF.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants