From db6a4b77684e41bbfb2eb1224c413b32c2822ec7 Mon Sep 17 00:00:00 2001 From: stnolting Date: Fri, 10 Jan 2025 12:03:30 +0100 Subject: [PATCH] [docs] smp updates and cleanups --- docs/datasheet/cpu_dual_core.adoc | 65 ++++++++++++++++++------------- docs/datasheet/software.adoc | 3 +- 2 files changed, 40 insertions(+), 28 deletions(-) diff --git a/docs/datasheet/cpu_dual_core.adoc b/docs/datasheet/cpu_dual_core.adoc index a70e0ed0f..fec414fc5 100644 --- a/docs/datasheet/cpu_dual_core.adoc +++ b/docs/datasheet/cpu_dual_core.adoc @@ -1,9 +1,9 @@ :sectnums: === Dual-Core Configuration -.Dual-Core Example +.Dual-Core Example Programs [TIP] -A simple dual-core example program can be found in `sw/example/demo_dual_core`. +A set of rather simple dual-core example programs can be found in `sw/example/demo_dual_core*`. Optionally, the CPU core can be implemented as **symmetric multiprocessing (SMP) dual-core** system. This dual-core configuration is enabled by the `DUAL_CORE_EN` <<_processor_top_entity_generics, top generic>>. @@ -24,24 +24,30 @@ The following table summarizes the most important aspects when using the dual-co [cols="<2,<10"] [grid="rows"] |======================= +| **CPU configuration** | Both cores use the same cache, CPU and ISA configuration provided by the according top generics. | **Debugging** | A special SMP openOCD script (`sw/openocd/openocd_neorv32.dual_core.cfg`) is required to -debug both cores at one. SMP-debugging is fully supported by RISC-V gdb port. +debug both cores at one. SMP-debugging is fully supported by the RISC-V gdb port. | **Clock and reset** | Both cores use the same global processor clock and reset. If <<_cpu_clock_gating>> -is enabled the clock of each core can be individually halted by putting it into <<_sleep_mode>>. -| **Address space** | Both cores have access to the same <<_address_space>>. +is enabled, the clock of each core can be individually halted by putting the core into <<_sleep_mode>>. +| **Address space** | Both cores have full access to the same physical <<_address_space>>. | **Interrupts** | All <<_processor_interrupts>> are routed to both cores. Hence, each core has access to all <<_neorv32_specific_fast_interrupt_requests>> (FIRQs). Additionally, the RISC-V machine-level _external interrupt_ (via the top `mext_irq_i` port) is also send to both cores. In contrast, the RISC-V machine level -_software_ and _timer_ interrupts are exclusive for each core (provided by the <<_core_local_interruptor_clint>>). -| **RTE** | The <<_neorv32_runtime_environment>> also supports the dual-core configuration. However, it needs -to be explicitly initialized on each core individually. The RTE trap handling provides a individual handler -tables for each core. +_software_ and _timer_ interrupts are core-exclusive (provided by the <<_core_local_interruptor_clint>>). +| **RTE** | The <<_neorv32_runtime_environment>> fully supports the dual-core configuration and provides +core-individual trap handler tables. However, the RTE needs to be explicitly initialized on each core +(executing `neorv32_rte_setup()`). | **Memory** | Each core has its own stack. The top of stack of core 0 is defined by the <<_linker_script>> while the top of stack of core 1 has to be explicitly defined by core 0 (see <<_dual_core_boot>>). Both -cores share the same heap, `.data` and `.bss` sections. -| **Constructors and destructors** | Constructors and destructors are executed on core 0 only. -(see ) -| **Core communication** | See section <<_inter_core_communication_icc>>. +cores share the same heap, `.data` and `.bss` sections. Hence, only core 0 setups the `.data` and `.bss` +sections at boot-up. +| **Constructors and destructors** | Constructors and destructors are executed by core 0 only +(see section <<_c_standard_library>>). +| **Cache coherency** | Be aware that there is no cache snooping available. If any level-1 cache is enabled +(<<_processor_internal_instruction_cache_icache>> and/or <<_processor_internal_data_cache_dcache>>) care +must be taken to prevent access to outdated data - either by using cache synchronization (`fence[.]` +instructions) or by using <<_atomic_memory_access>>. +| **Inter-core communication** | See section <<_inter_core_communication_icc>>. | **Bootloader** | Only core 0 will boot and execute the bootloader while core 1 is held in standby. | **Booting** | See section <<_dual_core_boot>>. |======================= @@ -55,22 +61,18 @@ shared-memory communication. Additionally, communication using these links is gu The inter-core communication (ICC) module is implemented as dedicated hardware module within each CPU core (VHDL file `rtl/core/neorv32_cpu_icc.vhd`). This module is automatically included if the dual-core option -is enabled. Each core provides a 32-bit wide and 4 entries deep FIFO for sending data to the other core. +is enabled. Each core provides a **32-bit wide** and **4 entries deep** FIFO for sending data to the other core. Hence, there are two FIFOs: one for sending data from core 0 to core 1 and another one for sending data the opposite way. -The ICC communication links are accessed via NEORV32-specific CSRs. Hence, those FIFOs are accessible only -by the CPU core itself and cannot be accessed by the DMA or any other CPU core. In total, three CSRs are -provided to handle communications: +The ICC communication links are accessed via two NEORV32-specific CSRs. Hence, those FIFOs are accessible only +by the CPU core itself and cannot be accessed by the DMA or any other CPU core. -The <<_mxiccsr>> is used to select the core with which to communicate. In the dual-core configuration core 1 -can only select core 0 and vice versa. The core selection in this register allows access to the according -message FIFOs via the two other CSRs. Additionally, the CSR provides status flags (TX FIFO data available; -RX FIFO free space) related to the selected communication link. - -The <<_mxiccrxd>> and <<_mxicctxd>> CSRs are used for the actual data read and write operations. Writing data -to <<<<_mxicctxd>>> will send to the message queue of the core selected by <<_mxiccsr>>. Conversely, reading -data from <<_mxiccrxd>> will return data received from the core selected by <<_mxiccsr>>. +The <<_mxiccsreg>> provides read-only status information about the core's ICC links: bit 0 becomes set if +there is RX data available for _this_ core (send from the the other core). Bit 1 is set as long there is +free space in _this_ core's TX data FIFO. The <<_mxiccdata>> CSR is used for actual data send/receive operations. +Writing this register will put the according data word into the TX link FIFO of _this_ core. Reading this CSR +will return a data word from the RX FIFO of _this_ core. The ICC FIFOs do not provide any interrupt capabilities. Software is expected to use the machine-software interrupt of the receiving core (provided by the <<_core_local_interruptor_clint>>) to inform it about @@ -94,11 +96,20 @@ To boot-up core 1, the primary core has to use a special library function provid .CPU Core 1 launch function prototype (note that this function can only be executed on core 0) [source,c] ---- -int neorv32_smp_launch(int hart_id, void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes); +int neorv32_smp_launch(int (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes); ---- -When executed, core 0 use the <<_inter_core_communication_icc>> to send launch data that includes the entry point +When executed, core 0 uses the <<_inter_core_communication_icc>> to send launch data that includes the entry point for core 1 (via `entry_point`) and the actual stack configuration (via `stack_memory` and `stack_size_bytes`). +Note that the main function for core 1 has to use a specific type (return `int`, no arguments): + +.CPU Core 1 Main Function +[source,c] +---- +int core1_main(void) { + return 0; // return to crt0 and go to sleep mode +} +---- .Core 1 Stack Memory [NOTE] diff --git a/docs/datasheet/software.adoc b/docs/datasheet/software.adoc index 944084796..61e643439 100644 --- a/docs/datasheet/software.adoc +++ b/docs/datasheet/software.adoc @@ -380,7 +380,8 @@ Note that `\n` (newline) is automatically converted to `\r\n` (carriage-return a .Constructors and Destructors [NOTE] Constructors and destructors for plain C code or for C++ applications are supported by the software framework. -See `sw/example/hello_cpp` for a minimal example. +See `sw/example/hello_cpp` for a minimal example. Note that constructor and destructors are only executed +by core 0 (primary core) in the SMP <<_dual_core_configuration>>. .Newlib Test/Demo Program [TIP]