NoC experiments #9
Draft
petervdonovan wants to merge 34 commits into main from noc-experiments
+1,109
−15
Commits
fdf8d34 Tentative start on NoC benchmarks. (petervdonovan)
1cff46f Measure 11 cycles of latency by cheating. (petervdonovan)
841e500 store->nop->load -> wrong WB read. Bug? (petervdonovan)
f9afb86 Actually 35 cycles of latency it seems. (petervdonovan)
ec82902 Adjust and comment on noc_latency_aligned. (petervdonovan)
6abf115 Experiment with the NoC interface. (petervdonovan)
828d100 More tinkering. (petervdonovan)
93e1631 Get a basic test working in simulation. (petervdonovan)
7e59e25 Failed attempt at synchronization. (petervdonovan)
d77fd4d Successful attempt at synchronization. (petervdonovan)
d65c22f Factor more assembly out into macros. (petervdonovan)
8eefda3 First draft of the sender side of the protocol. (petervdonovan)
89b8d65 Initial attempt at a batch communication. (petervdonovan)
085bcd8 Refactor the assembly a bit. (petervdonovan)
5ecc536 More assembly refactoring. (petervdonovan)
614a56f First draft of receive words macro. (petervdonovan)
8a49d64 Receive a sequence of words correctly. (petervdonovan)
b9ee3ad Send packets of length up to 64. (petervdonovan)
132d4c7 Add C API for read_n_words_and_print. (petervdonovan)
a8d065d Add C API for broadcast_count. (petervdonovan)
fa062b2 Get broadcast to work from each core in turn. (petervdonovan)
5fea13b Start extending the protocol. (petervdonovan)
7ba4017 Make small modifications. (petervdonovan)
8b863f9 Get the extended protocol to work properly. (petervdonovan)
309dad6 Update programs/benchmarks/noc/latency_aligned/noc_latency_aligned.c (petervdonovan)
b8fdadb This sends 1023 words in 5867 cycles. (petervdonovan)
8e70867 Optimize out a SYNC5. (petervdonovan)
d5983a3 Bugfix; move header-only lib to flexpret. (petervdonovan)
4c64694 Start creating a BroadcastMemory program. (petervdonovan)
b18fd08 Assembly generation "hello world". (petervdonovan)
23d3659 Start porting assembly to rvg. (petervdonovan)
6827fcc Top-level definitions parse for BroadcastCount. (petervdonovan)
9bf0027 BroadcastCount assembly is generated. (petervdonovan)
f066f42 Struggle to get assembly to work. (petervdonovan)
Submodule flexpret updated from d51bbb to 45eba5
@@ -34,4 +34,4 @@ int main2() {

 int main3() {
     _fp_print(43);
-}
+}
@@ -0,0 +1,10 @@
build:
	riscv_compile.sh ispm noc_latency_aligned.c

clean:
	riscv_clean.sh


rebuild: clean build

.PHONY: build clean rebuild
@@ -0,0 +1,34 @@
#define WAIT_FOR_NEXT_ZERO_MOD_1024(id) \
    "li t0, 1014\n\t" \
    "li a0, 1\n\t" \
    "li a1, 2\n\t" \
    "li a2, 3\n\t" \
    "li a3, 4\n\t" \
    "li a4, 5\n\t" \
    "li a5, 6\n\t" \
    "li t6, 7\n\t" \
    "rdcycle t1\n\t" \
    "andi t1, t1, 7\n\t" \
    "beq t1, t6, LOOP" #id "\n\t" \
    "beq t1, a5, LOOP" #id "\n\t" \
    "beq t1, a4, LOOP" #id "\n\t" \
    "beq t1, a3, LOOP" #id "\n\t" \
    "beq t1, a2, LOOP" #id "\n\t" \
    "beq t1, a1, LOOP" #id "\n\t" \
    "beq t1, a0, LOOP" #id "\n\t" \
    "beq t1, x0, LOOP" #id "\n\t" \
    /* This entire loop is 8 cycles long, so the value of t1 upon exiting is t0 plus a */ \
    /* number in the range [0, 7] */ \
    "LOOP" #id ":\n\t" \
    "nop\n\t" /* Delay so that loop length is a power of 2 */ \
    "nop\n\t" \
    "nop\n\t" \
    "rdcycle t1\n\t" \
    "andi t1, t1, 1023\n\t" \
    "blt t1, t0, LOOP" #id "\n\t" /* Cost of 3 cycles when taken, 1 otherwise; see page 37 https://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-181.pdf */ \
    "nop\n\t" \
    "nop\n\t" \
    "nop\n\t" \
    "nop\n\t" \
    "nop\n\t" \
    "nop\n\t"
programs/benchmarks/noc/latency_aligned/noc_latency_aligned.c (80 changes: 80 additions & 0 deletions)
@@ -0,0 +1,80 @@
/**
 * This program explores the absolute minimum amount of time that it can take to send one word and
 * write it into a register on another core, when under the most favorable circumstances,
 * and when controlling relative timing and optimizing the assembly.
 */
#include <stdint.h>
#include <flexpret_io.h>
#include <flexpret_noc.h>
#include <stdlib.h>

#include "align.h"

#define N 100

static int main_of(uint32_t core);

static int send_main(uint32_t receiver);
static int receive_main(uint32_t sender);

int main() {
    unsigned long coreid = read_csr(CSR_COREID);
    srand(coreid);
    if (coreid == 0) for (int i = 0; i < 10; i++) send_main(1);
    if (coreid == 1) for (int i = 0; i < 10; i++) receive_main(0);
}

static int send_main(uint32_t receiver) {
    asm volatile (
        "li t4, 0x40000000\n\t"
        WAIT_FOR_NEXT_ZERO_MOD_1024(send) // clobber "a" registers, as well as t0, t1, t6
        // like noc_send, but without blocking
        "li t5, 0x1\n\t" // noc destination
        "sw t5, 8(t4)\n\t"
        "li t5, 0x08\n\t"
        "sw t5, 4(t4)\n\t"
        "nop\n\t"
        "nop\n\t"
        "li t5, 42\n\t" // Set noc data to 42
        "sw t5, 8(t4)\n\t" // NOTE: Data must be written first. This is by design.
        "li t5, 0x04\n\t"
        "sw t5, 4(t4)\n\t"
    : : : "t0", "t1", "t4", "t5", "t6", "a0", "a1", "a2", "a3", "a4", "a5", "memory"); // declare the clobbered registers to the compiler
}

static int receive_main(uint32_t sender) {
    asm volatile (
        WAIT_FOR_NEXT_ZERO_MOD_1024(receive)
        // "nop\n\t" // The 9-cycle read loop is aligned optimally when the number of nops here is zero mod 9
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        // "nop\n\t"
        "li t4, 0x40000000\n\t" // wishbone base address
        // FIXME: Why does this loop have to go through one iteration extra the first time around, compared to the number of iterations that it makes thereafter?
        "CHECK_IF_RECEIVED_YET:\n\t"
        // Sadly, this whole sequence -- store, wait, read, mask, beq -- must be in the loop. In particular, if the store is factored out, the read doesn't work, even though we are storing the same thing each time.
        "sw x0, 0(t4)\n\t" // Write the address of NoC CSR to Wishbone read address
        "nop\n\t"
        "nop\n\t"
        "lw t5, 12(t4)\n\t" // Read NoC CSR
        "andi t5, t5, 2\n\t"
        "beq x0, t5, CHECK_IF_RECEIVED_YET\n\t"
        "li t5, 4\n\t" // Write the address of NoC data to Wishbone read address
        "sw t5, 0(t4)\n\t"
        "nop\n\t"
        "nop\n\t"
        "lw t5, 12(t4)\n\t" // Read NoC data
        "rdcycle t3\n\t"
        "andi t3, t3, 1023\n\t"
        "li t0, 0xbaaabaaa\n\t"
        "csrw 0x51e, t0\n\t"
        "csrw 0x51e, t3\n\t"
        "csrw 0x51e, t0\n\t"
        "csrw 0x51e, t5\n\t"
    : : : "t0", "t1", "t3", "t4", "t5", "t6", "a0", "a1", "a2", "a3", "a4", "a5", "memory"); // declare the clobbered registers to the compiler
}
programs/benchmarks/noc/latency_random_sparse_send/Makefile (10 changes: 10 additions & 0 deletions)
@@ -0,0 +1,10 @@
build:
	riscv_compile.sh ispm noc_latency_random_sparse_send.c

clean:
	riscv_clean.sh


rebuild: clean build

.PHONY: build clean rebuild
programs/benchmarks/noc/latency_random_sparse_send/noc_latency_random_sparse_send.c (50 changes: 50 additions & 0 deletions)
@@ -0,0 +1,50 @@
#include <stdint.h>
#include <flexpret_io.h>
#include <flexpret_noc.h>
#include <stdlib.h>

#define N 100
// 1 << LOG2_OF_A_LONG_TIME should be much greater than the number of cycles required to run
// one iteration of the benchmark. I think it takes less than 512 cycles to run one iteration
// of the benchmark.
#define LOG2_OF_A_LONG_TIME 11

static int main_of(uint32_t core);

int main() {
    unsigned long coreid = read_csr(CSR_COREID);
    srand(coreid);
    main_of(coreid);
}

static int send_main(uint32_t receiver) {
    for (uint32_t i = 0; i < N; i++) {
        uint32_t min_delay = 1 << LOG2_OF_A_LONG_TIME;
        uint32_t additional_delay = rand() & ((1 << LOG2_OF_A_LONG_TIME) - 1);
        unsigned long end_time = rdcycle() + min_delay + additional_delay;
        while (rdcycle() < end_time) {}
        unsigned long t0 = rdcycle(); // benchmark start
        noc_send(receiver, t0);
    }
}

static int receive_main(uint32_t sender) {
    for (uint32_t i = 0; i < N; i++) {
        uint32_t t0 = noc_receive();
        uint32_t t1 = rdcycle(); // benchmark end
        _fp_print((sender + 1) * 1000000 + t1 - t0);
    }
}

static int send_receive(uint32_t partner, int first) {
    first ? send_main(partner) : receive_main(partner);
    !first ? send_main(partner) : receive_main(partner);
}

static int main_of(uint32_t core) {
    int big = core & 2;
    int odd = core & 1;
    send_receive((core + 1) & 3, !odd);
    send_receive((core + 2) & 3, !big);
    send_receive((core + 3) & 3, !odd);
}
@@ -0,0 +1,10 @@
build:
	riscv_compile.sh ispm broadcast_count_noc.c

clean:
	riscv_clean.sh


rebuild: clean build

.PHONY: build clean rebuild
@@ -0,0 +1,55 @@
#include <stdint.h>
#include <flexpret_io.h>
#include <noc/flexpret_noc_c_api.h>

/* | NE | _ | N | E | _ | */

/***********************
 * core 0 / N \ core 1 *
 *       W  +  E       *
 * core 2 \ S / core 3 *
 ***********************/

int main0();
int main1();
int main2();
int main3();

int main() {
    int core_id = read_csr(CSR_COREID);
    switch(core_id) {
        case 0: main0(); break;
        case 1: main1(); break;
        case 2: main2(); break;
        case 3: main3(); break;
        default: _fp_print(666); //ERROR
    }
}

int main0() {
    broadcast_count(0, 125);
    read_n_words_and_print(1, EAST_INT);
    read_n_words_and_print(2, NORTH_INT);
    read_n_words_and_print(3, NORTHEAST_INT);
}

int main1() {
    read_n_words_and_print(0, EAST_INT);
    broadcast_count(1, 17);
    read_n_words_and_print(2, NORTHEAST_INT);
    read_n_words_and_print(3, NORTH_INT);
}

int main2() {
    read_n_words_and_print(0, NORTH_INT);
    read_n_words_and_print(1, NORTHEAST_INT);
    broadcast_count(2, 42);
    read_n_words_and_print(3, EAST_INT);
}

int main3() {
    read_n_words_and_print(0, NORTHEAST_INT);
    read_n_words_and_print(1, NORTH_INT);
    read_n_words_and_print(2, EAST_INT);
    broadcast_count(3, 3);
}
@@ -0,0 +1,24 @@
build: gen compile

gen:
	mkdir -p asm-gen

	rvg asm/stdlib.rvg asm/ctrl.rvg asm/flexpret.rvg asm/flexpret-noc-low-level-interface.rvg asm/flexpret-noc-c-api.rvg "flexpret-noc-c-api.rvg=[flexpret-noc-c-api [mu [] [print-flexpret-noc-c-api-h]]]" > asm-gen/flexpret-noc-c-api.h

	rvg asm/stdlib.rvg asm/ctrl.rvg asm/flexpret.rvg asm/flexpret-noc-low-level-interface.rvg asm/flexpret-noc-c-api.rvg "flexpret-noc-c-api.rvg=[flexpret-noc-c-api [mu [] [print-read-n-words-and-print]]]" > asm-gen/read-n-words-and-print.s

	rvg asm/stdlib.rvg asm/ctrl.rvg asm/flexpret.rvg asm/flexpret-noc-low-level-interface.rvg asm/flexpret-noc-c-api.rvg "flexpret-noc-c-api.rvg=[flexpret-noc-c-api [mu [] [print-broadcast-count]]]" > asm-gen/broadcast-count.s

	rvg asm/stdlib.rvg asm/ctrl.rvg asm/hello.rvg > asm-gen/hello.s
	rvg asm/stdlib.rvg asm/ctrl.rvg asm/hello_h.rvg > asm-gen/hello.h

compile:
	riscv_compile.sh ispm asm-gen/hello.s asm-gen/read-n-words-and-print.s asm-gen/broadcast-count.s broadcast_memory_noc.c

clean:
	rm -r asm-gen
	riscv_clean.sh

rebuild: clean build

.PHONY: build clean rebuild
Idem