Home
This processor is designed for small size, high clock speed, and simplicity. There are 8 general-purpose registers, each of which can contain a 16-bit value. There are also 4 condition code flags (zero, negative, carry, and overflow), which are set by arithmetic operations and can be used for conditional branches.
The pipeline bypasses intermediate arithmetic results, but it does not have any form of interlocking. As a result, it has two "branch delay slots," which is a fancy way of saying it will execute the next two instructions after a branch because those instructions are already in the pipeline. NOPs can be inserted to avoid side effects, or the code can be structured to take advantage of this. Likewise, loads have 2 cycles of latency. If either of the two instructions following a load reads the load's destination register, it will not see the loaded value.
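The effect of the two delay slots can be illustrated with a toy model. This is only a sketch in Python, not the actual pipeline: the tuple-based instruction format and the `run` function are hypothetical, and branches placed inside delay slots are not modelled.

```python
def run(program, max_steps=100):
    """Run a toy program with two branch delay slots; return the registers."""
    regs = {}
    pc = 0
    pending = None  # (delay slots left, branch target) or None
    steps = 0
    while pc < len(program) and steps < max_steps:
        instr = program[pc]
        next_pc = pc + 1
        if pending is not None:
            remaining, target = pending
            remaining -= 1
            if remaining == 0:
                # This is the second delay slot: redirect after it executes.
                next_pc = target
                pending = None
            else:
                pending = (remaining, target)
        if instr[0] == "set":
            regs[instr[1]] = instr[2]
        elif instr[0] == "jump":
            # The next two instructions still execute before the redirect.
            pending = (2, instr[1])
        # ("nop",) falls through and does nothing.
        pc = next_pc
        steps += 1
    return regs


program = [
    ("jump", 4),
    ("set", "a", 1),  # delay slot 1: executed
    ("set", "b", 2),  # delay slot 2: executed
    ("set", "c", 3),  # skipped
    ("set", "d", 4),  # branch target
]
print(run(program))
```

Replacing the two `set` instructions in the delay slots with `("nop",)` gives the "insert NOPs" variant, in which the branch has no side effects before it lands.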
Format | 15 | 14 | 13 | Bits 12-0 (fields listed from MSB to LSB) |
---|---|---|---|---|
Arithmetic | 0 | 0 | 0 | operation, opb, opa, dest |
Load | 0 | 0 | 1 | offset, ptr, dest |
Store | 0 | 1 | 0 | offset h, src, ptr, offset l |
Addi | 0 | 1 | 1 | immediate, opa, dest |
Lui | 1 | 0 | 0 | immediate, dest |
Conditional branch | 1 | 0 | 1 | cond, offset |
Unconditional branch | 1 | 1 | 0 | link, offset |
Jump to reg | 1 | 1 | 1 | link, unused, target, unused |
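As an illustration of the Arithmetic format, here is a rough Python sketch of an encoder. The field widths are assumptions: register fields are taken to be 3 bits each (there are 8 registers), which leaves 4 bits for `operation`. Mapping `srcb` to `opb` and `srca` to `opa` is also an assumption, and `encode_arithmetic` is a hypothetical helper, not part of any existing tool. The operation numbers are the ones listed in the instruction table.

```python
OPCODE_ARITHMETIC = 0b000  # bits 15-13 of the Arithmetic format

# Operation numbers as listed in the instruction table.
OPERATIONS = {
    "and": 0, "or": 1, "shl": 2, "shr": 3, "add": 4, "sub": 5,
    "xor": 6, "not": 7, "adc": 8, "sbc": 9, "rol": 10, "ror": 11,
}


def encode_arithmetic(name, dest, srca, srcb=0):
    """Pack an arithmetic instruction into a 16-bit word.

    Assumed layout: opcode[15:13] operation[12:9] opb[8:6] opa[5:3] dest[2:0].
    """
    for reg in (dest, srca, srcb):
        if not 0 <= reg <= 7:
            raise ValueError(f"register number out of range: {reg}")
    op = OPERATIONS[name]
    return (OPCODE_ARITHMETIC << 13) | (op << 9) | (srcb << 6) | (srca << 3) | dest


print(hex(encode_arithmetic("add", dest=1, srca=2, srcb=3)))
```

Under these assumptions, `and r0, r0, r0` (the `nop` pseudo-instruction) encodes to an all-zero word.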
Name | Params | Instruction Format | Flags Affected | operation/cond/link | Description |
---|---|---|---|---|---|
and | dest, srca, srcb | Arithmetic | NZ | 0 | Bitwise logical and |
or | dest, srca, srcb | Arithmetic | NZ | 1 | Bitwise logical or |
shl | dest, srca | Arithmetic | CNZ | 2 | Logical shift left one position |
shr | dest, srca | Arithmetic | CNZ | 3 | Logical shift right one position |
add | dest, srca, srcb | Arithmetic | CNZ | 4 | Add without carry |
sub | dest, srca, srcb | Arithmetic | CNZ | 5 | Subtract without carry |
xor | dest, srca, srcb | Arithmetic | NZ | 6 | Bitwise logical exclusive or |
not | dest, srca | Arithmetic | NZ | 7 | Bitwise logical not |
adc | dest, srca, srcb | Arithmetic | CNZ | 8 | Add with carry in |
sbc | dest, srca, srcb | Arithmetic | CNZ | 9 | Subtract with carry[borrow] in |
rol | dest, srca | Arithmetic | NZ | 10 | Rotate left (carry bit loaded into LSB) |
ror | dest, srca | Arithmetic | NZ | 11 | Rotate right (carry bit loaded into MSB) |
load | dest, [offset](ptr) | Load | | | Load word |
store | src, [offset](ptr) | Store | | | Store word |
addi | dest, srca, immediate | Addi | CZNO (see note below) | | Add signed immediate value -31 to 31 |
lui | dest, immediate | Lui | | | Load upper immediate. Value is loaded into top 10 bits of dest. Low 6 bits are cleared. |
jump | label | Unconditional branch | | link=0 | Jump to label |
call | label | Unconditional branch | | link=1 | Call to label (return address saved in r7) |
jumpr | reg | Jump to reg | | link=0 | Jump to address in register |
callr | reg | Jump to reg | | link=1 | Call to address in register (return address saved in r7) |
bcc | label | Conditional branch | | 6 | Branch if carry flag clear |
bcs | label | Conditional branch | | 2 | Branch if carry flag set |
bzc | label | Conditional branch | | 4 | Branch if zero flag clear |
bzs | label | Conditional branch | | 0 | Branch if zero flag set |
bnc | label | Conditional branch | | 5 | Branch if negative flag clear |
bns | label | Conditional branch | | 1 | Branch if negative flag set |
boc | label | Conditional branch | | 7 | Branch if overflow flag clear |
bos | label | Conditional branch | | 3 | Branch if overflow flag set |
- Note that C and O bits are currently not implemented. I guess that's technically a bug.
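The `cond` values in the table can be read as a lookup from condition number to a flag and the state it must be in. The Python sketch below is illustrative only: `nz_flags` and `branch_taken` are hypothetical helpers, and the definition of the negative flag as bit 15 of the result is an assumption (the usual convention for a 16-bit value). Carry and overflow are left false, in line with the note above.

```python
# cond number -> (flag name, required flag state), taken from the table above.
CONDITIONS = {
    0: ("zero", True),      # bzs
    1: ("negative", True),  # bns
    2: ("carry", True),     # bcs
    3: ("overflow", True),  # bos
    4: ("zero", False),     # bzc
    5: ("negative", False), # bnc
    6: ("carry", False),    # bcc
    7: ("overflow", False), # boc
}


def nz_flags(result):
    """Derive the zero and negative flags from a 16-bit result (assumed rule)."""
    result &= 0xFFFF
    return {
        "zero": result == 0,
        "negative": bool(result & 0x8000),
        "carry": False,     # not modelled here
        "overflow": False,  # not modelled here
    }


def branch_taken(cond, flags):
    """Return True if a conditional branch with this cond value is taken."""
    flag_name, wanted = CONDITIONS[cond]
    return flags[flag_name] == wanted


flags = nz_flags(5 - 5)
print(branch_taken(0, flags), branch_taken(4, flags))
```

One pattern worth noticing in the table: the low two bits of `cond` select the flag and bit 2 selects "clear" rather than "set".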
Name | Params | Description |
---|---|---|
ldi | immediate | Load 16-bit immediate into register (expands to a lui/addi pair) |
nop | | No-operation (and r0, r0, r0) |
lea | label | Load effective address of label into register |
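To make the `ldi` expansion concrete, here is a hedged Python sketch of how a 16-bit constant could be split into a `lui` value and an `addi` value. It assumes the pair is `lui` followed by `addi`, uses the documented behaviour of `lui` (top 10 bits loaded, low 6 bits cleared) and the documented `addi` range of -31 to 31, and is not taken from the actual assembler. `split_ldi` and `apply_pair` are hypothetical names. Under those stated ranges, a constant whose low six bits equal 32 cannot be reached with a single pair, so the sketch raises an error for it.

```python
def split_ldi(value):
    """Split a 16-bit constant into (lui immediate, addi immediate)."""
    value &= 0xFFFF
    upper = value >> 6   # candidate 10-bit lui immediate
    low = value & 0x3F   # remaining low six bits, 0..63
    if low <= 31:
        delta = low
    elif low >= 33:
        # Round the upper part up and subtract the difference instead.
        upper = (upper + 1) & 0x3FF
        delta = low - 64  # -31..-1
    else:
        raise ValueError("low six bits equal 32: outside the -31..31 addi range")
    return upper, delta


def apply_pair(upper, delta):
    """Compute what lui followed by addi would leave in the register."""
    return ((upper << 6) + delta) & 0xFFFF


upper, delta = split_ldi(0x1234)
print(upper, delta, hex(apply_pair(upper, delta)))
```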

    +------------------+ 0000
    | Boot ROM         |
    +------------------+ 0010
    | Local memory     |
    +------------------+ 4000
    | Global memory    |
    +------------------+ (global memory size)
    |                  |
    /                  /
    |                  |
    +------------------+ FC00
    | Device Registers |
    +------------------+ FFFF

Note that internally the processor is a Harvard architecture: instructions and data are fetched on independent buses. For the most part, however, the two address spaces access the same memory. One exception is that the instruction bus cannot access global memory or device registers.
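The memory map and the instruction-bus restriction can be summarised as a small address classifier. This Python sketch is an illustration only: the boundaries are read off the diagram as hexadecimal addresses, `global_end` is a hypothetical parameter standing in for the "(global memory size)" boundary, and treating the gap below the device registers as unmapped is an assumption.

```python
LOCAL_START = 0x0010   # end of boot ROM, start of local memory
GLOBAL_START = 0x4000  # start of global memory
DEVICE_START = 0xFC00  # start of device registers
ADDRESS_MAX = 0xFFFF


def region(addr, global_end):
    """Name the region an address falls in on the data bus."""
    if not 0 <= addr <= ADDRESS_MAX:
        raise ValueError(f"address out of range: {addr:#x}")
    if not GLOBAL_START <= global_end <= DEVICE_START:
        raise ValueError(f"global_end out of range: {global_end:#x}")
    if addr < LOCAL_START:
        return "boot_rom"
    if addr < GLOBAL_START:
        return "local"
    if addr < global_end:
        return "global"
    if addr < DEVICE_START:
        return "unmapped"  # assumption: nothing lives in the gap
    return "device"


def instruction_bus_can_access(addr, global_end):
    """The instruction bus is limited to boot ROM and local memory."""
    return region(addr, global_end) in ("boot_rom", "local")


print(region(0x4000, 0x8000), instruction_bus_can_access(0x4000, 0x8000))
```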
When each core comes out of reset, it begins executing code at address 0 in its local address space. A small chunk of ROM at that location contains code that copies the program from global memory into local memory at address 16, then branches to it.
(Currently, the ROM is emulated using $readmemh into each local memory)
Writing a one to a hardware mutex location will attempt to acquire it and writing a zero will release it if it is held by the owning core. A core may read the location to determine if it has acquired the mutex: it will return one if so, zero if not.
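The mutex behaviour described above can be modelled in a few lines. This Python sketch is a behavioural illustration, not the hardware implementation: the `HardwareMutex` class and the integer core IDs are hypothetical, and it assumes that an acquire attempt on a mutex held by another core is simply ignored.

```python
class HardwareMutex:
    """Behavioural model of one hardware mutex location."""

    def __init__(self):
        self.owner = None  # ID of the core holding the mutex, or None

    def write(self, core_id, value):
        if value == 1:
            # Attempt to acquire: succeeds only if the mutex is free.
            if self.owner is None:
                self.owner = core_id
        elif value == 0:
            # Release: only takes effect for the owning core.
            if self.owner == core_id:
                self.owner = None

    def read(self, core_id):
        """Return 1 if the reading core holds the mutex, 0 otherwise."""
        return 1 if self.owner == core_id else 0


mutex = HardwareMutex()
mutex.write(0, 1)     # core 0 acquires
mutex.write(1, 1)     # core 1 tries and fails
print(mutex.read(0), mutex.read(1))
```

In this model a core would typically write a one and then read the location back, repeating until the read returns one.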