SAXON_CPU_COUNT >4 woes #64

Open
soundnut opened this issue Apr 15, 2021 · 9 comments

Comments

@soundnut

Hi,

Great update to the readme - thanks.

Out of curiosity, I've been playing with the latest version to find the maximum I can fit on an 85k ULX3S board.
Mainly I wanted to see how many harts I can fit.
SAXON_CPU_COUNT=6 seems to produce a bitstream with 6 cores (at least I think it does: the build log references the additional cores).

Linux is a different story.
I added two more cpu definition blocks to ./buildroot-spinal-saxon/boards/common/dts/linux_cpu.dts.
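
For reference, a minimal sketch of what per-hart cpu nodes generally look like under the generic Linux RISC-V devicetree binding. The actual contents of linux_cpu.dts are not shown in this thread, so the property values, the labels (L0, L1, ...) and the helper name `cpu_nodes` are assumptions for illustration only.

```python
# Hypothetical generator for RISC-V cpu nodes. Property names follow the
# generic Linux RISC-V devicetree binding; the real linux_cpu.dts may differ.
CPU_NODE_TEMPLATE = """\
cpu@{hart} {{
    device_type = "cpu";
    compatible = "riscv";
    riscv,isa = "{isa}";
    mmu-type = "riscv,{mmu}";
    reg = <{hart}>;
    status = "okay";
    L{hart}: interrupt-controller {{
        #interrupt-cells = <1>;
        interrupt-controller;
        compatible = "riscv,cpu-intc";
    }};
}};
"""


def cpu_nodes(count, isa="rv32ima", mmu="sv32"):
    """Return devicetree source text with one cpu node per hart id 0..count-1."""
    return "\n".join(
        CPU_NODE_TEMPLATE.format(hart=hart, isa=isa, mmu=mmu)
        for hart in range(count)
    )


if __name__ == "__main__":
    # Six harts, matching SAXON_CPU_COUNT=6.
    print(cpu_nodes(6))
```

Generating the nodes keeps the `reg` value, the node address and the interrupt-controller label in step for every hart, which is easy to get wrong when copying blocks by hand.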

Linux boots but reports CPUs 4 and 5 as failed to start:
[ 0.117618] smp: Bringing up secondary CPUs ...
[ 0.194216] CPU4: failed to start
[ 0.212434] CPU5: failed to start
[ 0.214449] smp: Brought up 1 node, 4 CPUs

I'm unsure whether ./buildroot-spinal-saxon/boards/common/dts/linux_plic_link.dts needs extending too. Please advise.

Digging a little deeper, I found that U-Boot only reports 4 CPUs:
=> cpu list
0: cpu@0 rv32ima
1: cpu@1 rv32ima
2: cpu@2 rv32ima
3: cpu@3 rv32ima

I found uboot.dts (./buildroot-spinal-saxon/boards/spinal-saxon/ulx3s/u-boot/uboot.dts) and tried adding 2 more CPU definitions,
but there are still only 4 CPUs in Linux and U-Boot.

Poking around some more, I found this U-Boot config file (./build/uboot-smp-latest/configs/saxon_bsp_defconfig) with the default of 4 CPUs. Changing CONFIG_NR_CPUS from 4 to 6 doesn't stick, though: it is overwritten on every run of saxon_buildroot.
Running just saxon_buildroot_compile after the change prevents it from being overwritten, but still doesn't solve the problem.

Any idea what I'm missing?

Thanks

@soundnut changed the title from "SAXON_CPU_COUNT inconsistent" to "SAXON_CPU_COUNT >4 woes" on Apr 15, 2021
@soundnut
Author

You're the man!
Great stuff - yes, that made the difference

[ 0.117988] smp: Bringing up secondary CPUs ...
[ 0.218975] smp: Brought up 1 node, 6 CPUs

root@buildroot:~# cat /proc/cpuinfo
processor : 0
hart : 4
isa : rv32ima
mmu : sv32

processor : 1
hart : 0
isa : rv32ima
mmu : sv32

processor : 2
hart : 1
isa : rv32ima
mmu : sv32

processor : 3
hart : 2
isa : rv32ima
mmu : sv32

processor : 4
hart : 3
isa : rv32ima
mmu : sv32

processor : 5
hart : 5
isa : rv32ima
mmu : sv32
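
The listing above shows that Linux's logical processor numbers do not follow hart ids (processor 0 is hart 4, presumably because that hart booted first). A small sketch for extracting that mapping from /proc/cpuinfo text follows; the function name `processor_to_hart` is made up for this example, and the sample is the output above trimmed to the two relevant fields.

```python
def processor_to_hart(cpuinfo_text):
    """Map each logical 'processor' index to its 'hart' id."""
    mapping = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between records
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "hart" and current is not None:
            mapping[current] = int(value)
    return mapping


# Sample taken from the /proc/cpuinfo output in this thread.
SAMPLE = """\
processor : 0
hart : 4

processor : 1
hart : 0

processor : 2
hart : 1

processor : 3
hart : 2

processor : 4
hart : 3

processor : 5
hart : 5
"""

if __name__ == "__main__":
    print(processor_to_hart(SAMPLE))
    # prints {0: 4, 1: 0, 2: 1, 3: 2, 4: 3, 5: 5}
```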

Overall, 6 harts, or 4 harts plus FPU, seems to be about the maximum that's doable with this tiny board (with 32-bit cores).

With 6 harts I get 95% TRELLIS_SLICE utilization. There are still some LUTs left though, at 74%.

How hard are the clocks configured? Would a smaller setup, say 2 cores plus FPU, work with a higher clock rate? The FPGA should be able to handle higher rates according to the specs.

Thanks again for your help.
Cheers

@Dolu1990
Member

Cool ^^

> How hard are the clocks configured?

About 52 MHz. Are the timings passing with 6 cores?

> The FPGA should be able to handle higher rates according to the specs.

Which spec?

@soundnut
Author

Re timing - with 6 harts I get this:
Warning: Max frequency for clock '$glbnet$clocking_pll_clkout2': 39.93 MHz (FAIL at 52.08 MHz)
Info: Max frequency for clock '$glbnet$clocking_rmii_clk$TRELLIS_IO_IN': 73.10 MHz (PASS at 50.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout0': 170.53 MHz (PASS at 125.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout3': 57.25 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$debug_jtag_tck$TRELLIS_IO_IN': 93.79 MHz (PASS at 50.00 MHz)
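
To read a report like the one above at a glance, here is a minimal sketch that extracts each clock's achieved and requested frequency and computes how far a failing clock falls short. The regular expression is written against the line format shown in this thread only, and the names `FMAX_RE` and `timing_summary` are made up for the example.

```python
import re

# Matches lines of the form shown in the log above.
FMAX_RE = re.compile(
    r"Max frequency for clock '(?P<clock>[^']+)': "
    r"(?P<fmax>[\d.]+) MHz \((?P<status>PASS|FAIL) at (?P<target>[\d.]+) MHz\)"
)


def timing_summary(log_text):
    """Return a list of (clock, fmax_mhz, target_mhz, passed) tuples."""
    rows = []
    for match in FMAX_RE.finditer(log_text):
        fmax = float(match.group("fmax"))
        target = float(match.group("target"))
        rows.append((match.group("clock"), fmax, target, fmax >= target))
    return rows


# Two lines copied from the report in this thread.
LOG = """\
Warning: Max frequency for clock '$glbnet$clocking_pll_clkout2': 39.93 MHz (FAIL at 52.08 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout0': 170.53 MHz (PASS at 125.00 MHz)
"""

if __name__ == "__main__":
    for clock, fmax, target, passed in timing_summary(LOG):
        if not passed:
            shortfall = (target - fmax) / target * 100
            print(f"{clock}: {fmax} MHz is {shortfall:.1f}% below the {target} MHz target")
```

For the 6-hart build this reports the main CPU clock (clkout2) as roughly 23% short of its 52.08 MHz target, while the other clocks have margin.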

Re spec: see chapter 3.19 of https://www.latticesemi.com/view_document?document_id=50461
I'm not sure, though, how far you can stretch the internal clock with a 25 MHz input.

As far as I understand, it all starts with a 25 MHz input oscillator. The cores are currently configured to run at 52 MHz and memory at around 100 MHz. Where are these ratios configured? Could we try to double up, i.e. run a single core at 100 MHz and memory at around 200 MHz?
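
As a rough illustration of where an odd-looking figure like 52.08 MHz can come from, the sketch below uses a simplified single-VCO PLL model: the input is multiplied up to one VCO frequency and each output divides it by an integer. The multiplier of 25 and the divider values are assumptions chosen because they reproduce the 125.00, 52.08 and 25.00 MHz targets in the timing report; they are not taken from pll_linux.v, and the clkout1 value is only a guess consistent with "around 100 MHz" at twice the CPU clock.

```python
from fractions import Fraction


def pll_outputs(f_in_mhz, feedback_mult, dividers):
    """Return {name: MHz} for outputs derived from one VCO = f_in * feedback_mult."""
    vco = Fraction(f_in_mhz) * feedback_mult
    return {name: float(vco / div) for name, div in dividers.items()}


# Assumed values: 25 MHz oscillator, VCO = 25 MHz * 25 = 625 MHz.
outputs = pll_outputs(
    25,
    25,
    {"clkout0": 5, "clkout1": 6, "clkout2": 12, "clkout3": 25},
)

if __name__ == "__main__":
    for name, mhz in outputs.items():
        print(f"{name}: {mhz:.2f} MHz")
    # prints:
    # clkout0: 125.00 MHz
    # clkout1: 104.17 MHz
    # clkout2: 52.08 MHz
    # clkout3: 25.00 MHz
```

Under this model, raising the CPU and memory clocks means choosing new integer dividers (or a new VCO frequency) that keep clkout1 at twice clkout2 while leaving the 125 MHz and 25 MHz outputs reachable, so not every target frequency is available.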

@soundnut
Author

soundnut commented Apr 19, 2021

Can you tell me how the clock domains are being used?
ULX3S:
clkout0  125 MHz  HDMI
clkout1  100 MHz  Memory
clkout2   50 MHz  Main hart clock
clkout3   25 MHz  VGA
ArtyA7 and NexysA7 seem to have more clock domains (Arty 0-5 and Nexys 0-6).

From previous posts, I assume memory needs to run 2x faster than the main hart clock; clkout0 and clkout3 are presumably fixed at these levels.
So if I want to play with higher clock frequencies on the ULX3S, clkout1 and clkout2 are the ones to increase, keeping clkout1 at twice clkout2. Correct?

How does this setting in Ulx3sSmp.scala play into all of this?
frequency = FixedFrequency(52 MHz),

Thanks

@Dolu1990
Member

> Where are these ratios configured?

You will have to update https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/scala/saxon/board/radiona/ulx3s/Ulx3sSmp.scala#L227

And also https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/synthesis/radiona/ulx3s/smp/pll_linux.v
via https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/synthesis/radiona/ulx3s/smp/makefile#L57 I guess.

> Could we try to double up?

You can't overclock things forever. The synthesis tool is already unhappy right now: 39.93 MHz (FAIL at 52.08 MHz).

Going higher is asking for trouble XD

> I assume Memory needs to run 2x faster, clkout0 and 3 are presumably fixed at these levels. Correct?

Right

> How does this setting in Ulx3sSmp.scala play into all of this?
> frequency = FixedFrequency(52 MHz),

Yes, you only need to update that one.

@soundnut
Author

Update:
5 harts plus FPU fit on the 85k ULX3S. Routing took over 29 hrs to complete. (For comparison, the 6-hart config took slightly more than 1 hr to place.) So this seems to be the maximum number of cores possible: 5 cores plus FPU, or 6 cores without FPU.

Next I'm going to try playing with the frequency. I will start small and build up until things start to break. FUN stuff!

@Dolu1990
Member

> FUN stuff!

Freedoom ^^
This kind of situation scares me as hell, because then I'm afraid the CPU design is buggy and I would have to spend weeeeeeks finding the bug XD

@soundnut
Author

This I can understand.
My intention is not to poke holes in this, just to learn the concepts and how this is all pieced together, and maybe contribute small bits and pieces here and there. While I appreciate fixes and solutions, I'm also content if you point me in the right direction so that I can try to solve the puzzle myself. Like the input regarding frequency: knowing where to look is really helpful and I'm happy with that.
