Skip to content

Commit

Permalink
[docs] smp updates and cleanups
Browse files Browse the repository at this point in the history
  • Loading branch information
stnolting committed Jan 10, 2025
1 parent 650e1f0 commit db6a4b7
Show file tree
Hide file tree
Showing 2 changed files with 40 additions and 28 deletions.
65 changes: 38 additions & 27 deletions docs/datasheet/cpu_dual_core.adoc
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
:sectnums:
=== Dual-Core Configuration

.Dual-Core Example
.Dual-Core Example Programs
[TIP]
A simple dual-core example program can be found in `sw/example/demo_dual_core`.
A set of rather simple dual-core example programs can be found in `sw/example/demo_dual_core*`.

Optionally, the CPU core can be implemented as **symmetric multiprocessing (SMP) dual-core** system.
This dual-core configuration is enabled by the `DUAL_CORE_EN` <<_processor_top_entity_generics, top generic>>.
Expand All @@ -24,24 +24,30 @@ The following table summarizes the most important aspects when using the dual-co
[cols="<2,<10"]
[grid="rows"]
|=======================
| **CPU configuration** | Both cores use the same cache, CPU and ISA configuration provided by the according top generics.
| **Debugging** | A special SMP openOCD script (`sw/openocd/openocd_neorv32.dual_core.cfg`) is required to
debug both cores at one. SMP-debugging is fully supported by RISC-V gdb port.
debug both cores at one. SMP-debugging is fully supported by the RISC-V gdb port.
| **Clock and reset** | Both cores use the same global processor clock and reset. If <<_cpu_clock_gating>>
is enabled the clock of each core can be individually halted by putting it into <<_sleep_mode>>.
| **Address space** | Both cores have access to the same <<_address_space>>.
is enabled, the clock of each core can be individually halted by putting the core into <<_sleep_mode>>.
| **Address space** | Both cores have full access to the same physical <<_address_space>>.
| **Interrupts** | All <<_processor_interrupts>> are routed to both cores. Hence, each core has access to
all <<_neorv32_specific_fast_interrupt_requests>> (FIRQs). Additionally, the RISC-V machine-level _external
interrupt_ (via the top `mext_irq_i` port) is also send to both cores. In contrast, the RISC-V machine level
_software_ and _timer_ interrupts are exclusive for each core (provided by the <<_core_local_interruptor_clint>>).
| **RTE** | The <<_neorv32_runtime_environment>> also supports the dual-core configuration. However, it needs
to be explicitly initialized on each core individually. The RTE trap handling provides a individual handler
tables for each core.
_software_ and _timer_ interrupts are core-exclusive (provided by the <<_core_local_interruptor_clint>>).
| **RTE** | The <<_neorv32_runtime_environment>> fully supports the dual-core configuration and provides
core-individual trap handler tables. However, the RTE needs to be explicitly initialized on each core
(executing `neorv32_rte_setup()`).
| **Memory** | Each core has its own stack. The top of stack of core 0 is defined by the <<_linker_script>>
while the top of stack of core 1 has to be explicitly defined by core 0 (see <<_dual_core_boot>>). Both
cores share the same heap, `.data` and `.bss` sections.
| **Constructors and destructors** | Constructors and destructors are executed on core 0 only.
(see )
| **Core communication** | See section <<_inter_core_communication_icc>>.
cores share the same heap, `.data` and `.bss` sections. Hence, only core 0 setups the `.data` and `.bss`
sections at boot-up.
| **Constructors and destructors** | Constructors and destructors are executed by core 0 only
(see section <<_c_standard_library>>).
| **Cache coherency** | Be aware that there is no cache snooping available. If any level-1 cache is enabled
(<<_processor_internal_instruction_cache_icache>> and/or <<_processor_internal_data_cache_dcache>>) care
must be taken to prevent access to outdated data - either by using cache synchronization (`fence[.]`
instructions) or by using <<_atomic_memory_access>>.
| **Inter-core communication** | See section <<_inter_core_communication_icc>>.
| **Bootloader** | Only core 0 will boot and execute the bootloader while core 1 is held in standby.
| **Booting** | See section <<_dual_core_boot>>.
|=======================
Expand All @@ -55,22 +61,18 @@ shared-memory communication. Additionally, communication using these links is gu

The inter-core communication (ICC) module is implemented as dedicated hardware module within each CPU core
(VHDL file `rtl/core/neorv32_cpu_icc.vhd`). This module is automatically included if the dual-core option
is enabled. Each core provides a 32-bit wide and 4 entries deep FIFO for sending data to the other core.
is enabled. Each core provides a **32-bit wide** and **4 entries deep** FIFO for sending data to the other core.
Hence, there are two FIFOs: one for sending data from core 0 to core 1 and another one for sending data the
opposite way.

The ICC communication links are accessed via NEORV32-specific CSRs. Hence, those FIFOs are accessible only
by the CPU core itself and cannot be accessed by the DMA or any other CPU core. In total, three CSRs are
provided to handle communications:
The ICC communication links are accessed via two NEORV32-specific CSRs. Hence, those FIFOs are accessible only
by the CPU core itself and cannot be accessed by the DMA or any other CPU core.

The <<_mxiccsr>> is used to select the core with which to communicate. In the dual-core configuration core 1
can only select core 0 and vice versa. The core selection in this register allows access to the according
message FIFOs via the two other CSRs. Additionally, the CSR provides status flags (TX FIFO data available;
RX FIFO free space) related to the selected communication link.

The <<_mxiccrxd>> and <<_mxicctxd>> CSRs are used for the actual data read and write operations. Writing data
to <<<<_mxicctxd>>> will send to the message queue of the core selected by <<_mxiccsr>>. Conversely, reading
data from <<_mxiccrxd>> will return data received from the core selected by <<_mxiccsr>>.
The <<_mxiccsreg>> provides read-only status information about the core's ICC links: bit 0 becomes set if
there is RX data available for _this_ core (send from the the other core). Bit 1 is set as long there is
free space in _this_ core's TX data FIFO. The <<_mxiccdata>> CSR is used for actual data send/receive operations.
Writing this register will put the according data word into the TX link FIFO of _this_ core. Reading this CSR
will return a data word from the RX FIFO of _this_ core.

The ICC FIFOs do not provide any interrupt capabilities. Software is expected to use the machine-software
interrupt of the receiving core (provided by the <<_core_local_interruptor_clint>>) to inform it about
Expand All @@ -94,11 +96,20 @@ To boot-up core 1, the primary core has to use a special library function provid
.CPU Core 1 launch function prototype (note that this function can only be executed on core 0)
[source,c]
----
int neorv32_smp_launch(int hart_id, void (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes);
int neorv32_smp_launch(int (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes);
----

When executed, core 0 use the <<_inter_core_communication_icc>> to send launch data that includes the entry point
When executed, core 0 uses the <<_inter_core_communication_icc>> to send launch data that includes the entry point
for core 1 (via `entry_point`) and the actual stack configuration (via `stack_memory` and `stack_size_bytes`).
Note that the main function for core 1 has to use a specific type (return `int`, no arguments):

.CPU Core 1 Main Function
[source,c]
----
int core1_main(void) {
return 0; // return to crt0 and go to sleep mode
}
----

.Core 1 Stack Memory
[NOTE]
Expand Down
3 changes: 2 additions & 1 deletion docs/datasheet/software.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -380,7 +380,8 @@ Note that `\n` (newline) is automatically converted to `\r\n` (carriage-return a
.Constructors and Destructors
[NOTE]
Constructors and destructors for plain C code or for C++ applications are supported by the software framework.
See `sw/example/hello_cpp` for a minimal example.
See `sw/example/hello_cpp` for a minimal example. Note that constructor and destructors are only executed
by core 0 (primary core) in the SMP <<_dual_core_configuration>>.
.Newlib Test/Demo Program
[TIP]
Expand Down

0 comments on commit db6a4b7

Please sign in to comment.