diff --git a/images/highlevel-arch.png b/images/highlevel-arch.png index 91cc474..ebde742 100644 Binary files a/images/highlevel-arch.png and b/images/highlevel-arch.png differ diff --git a/src/message-protocol.adoc b/src/message-protocol.adoc index e3fbe48..ee5259a 100644 --- a/src/message-protocol.adoc +++ b/src/message-protocol.adoc @@ -24,7 +24,7 @@ Multiple related RPMI services are grouped logically into an *RPMI service group* such as Clock, Voltage, Performance, etc. Depending on the RPMI service, a RPMI request message may carry data required to perform the control and management task. An RPMI request message may have an associated response which -is send back as an *RPMI acknowledgement message* on the same RPMI transport +is sent back as an *RPMI acknowledgement message* on the same RPMI transport channel. The RPMI acknowledgement message carries the status and optional response data from an RPMI request after it has been processed. @@ -166,6 +166,34 @@ values as corresponding fields in the RPMI request message. The `DATALEN` field of the RPMI acknowledgement message must be set according to the data carried by this acknowledgement. +NOTE: The message token helps the application processors keep track of +the origin of a request when they receive a response. This is useful when +multiple application processors share the same queues. For example, two +different application processors may send the same type of request message with +the same `SERVICEGROUP_ID` and `SERVICE_ID`. When the response messages for both +requests are received from the platform microcontroller, the token helps +distinguish which response belongs to which request. + +NOTE: The RPMI specification recommends monotonically increasing token numbers. +The token number can be initialized to any value without any constraints. 
+ +When doorbell interrupts are supported and enabled, the application processor +can set the `flags[3]` bit to `1` in the request message header to tell the +platform microcontroller to ring the doorbell after sending the response back. +If the `flags[3]` bit is `0` in the request message +header, it means that the application processor is going to poll for the +response message in the queue and the platform microcontroller should not +ring the doorbell. + +NOTE: The `flags[3]` bit can be used for a particular message or for the entire +lifecycle of RPMI message communication. For example, if the application +processor and the platform microcontroller are capable of MSIs and the application +processor has configured the MSI details via the service defined in <>, +then the `flags[3]` bit can always be enabled so that the platform microcontroller will +always send the MSI for each response. Also, the application processor can +selectively disable it so that the platform microcontroller in that case does +not trigger a doorbell. + For an RPMI notification message, the platform microcontroller will set appropriate values for the `TOKEN`, `SERVICEGROUP_ID`, and `DATALEN` fields whereas the `SERVICE_ID` field must be always set to `0x0`. 
diff --git a/src/rpmi.bib b/src/rpmi.bib index 6ec2dbc..51ef209 100644 --- a/src/rpmi.bib +++ b/src/rpmi.bib @@ -18,3 +18,9 @@ @electronic{libRPMI title = {libRPMI}, url = {https://github.com/riscv-software-src/librpmi} } + +@electronic{priv_v1_12, + title = {The RISC-V Instruction Set Manual, Volume II: Privileged Architecture}, + url = {https://github.com/riscv/riscv-isa-manual/releases/tag/Priv-v1.12}, + year = {2021} +} \ No newline at end of file diff --git a/src/service-groups.adoc b/src/service-groups.adoc index 0900ad6..bbf0c91 100644 --- a/src/service-groups.adoc +++ b/src/service-groups.adoc @@ -29,8 +29,8 @@ All implemented RPMI service groups must satisfy the following requirements: level associated with the RPMI context which includes it. . All RPMI services of the RPMI service groups must be supported except the dedicated notification service (`SERVICE_ID = 0x00`) which is reserved -for RPMI notification messages. A RPMI service group may be partially -implement its RPMI services only if defines mechanism to discover supported +for RPMI notification messages. An RPMI service group may implement its RPMI +services partially only if it also defines a mechanism to discover supported RPMI services. . The RPMI service group must implement a dedicated RPMI service with `SERVICE_ID = 0x01` to subscribe for event notifications. @@ -42,9 +42,11 @@ should be invoked. This specification defines standard RPMI service groups and RPMI services with the provision to add more service groups as required in the future. -The platform vendors can provide implementation specific RPMI service groups. -The <> below list all standard RPMI service groups -defined by this specification. +The RPMI specification also provides an experimental service group ID space +for the development of a service group until a standard service group ID is +allocated. The platform vendors can provide implementation specific RPMI +service groups. 
The <> table below lists all standard +RPMI service groups defined by this specification. [#table_service_groups] .RPMI Service Groups @@ -115,11 +117,16 @@ defined by this specification. | REQUEST_FORWARD | M-mode, S-mode -| 0x000D - 0x7FFF +| 0x000D - 0x7BFF | | _Reserved for Future Use_ | +| 0x7C00 - 0x7FFF +| +| _Experimental Service Groups_ +| + | 0x8000 - 0xFFFF | | _Implementation Specific Service Groups_ diff --git a/src/srvgrp-base.adoc b/src/srvgrp-base.adoc index 9f91154..cd0b823 100644 --- a/src/srvgrp-base.adoc +++ b/src/srvgrp-base.adoc @@ -65,6 +65,13 @@ The following table lists the services in the BASE service group: |=== ==== RPMI Implementation IDs +The RPMI specification defines space for standard implementation IDs and for +experimental implementation IDs. The experimental implementation IDs can be used +by implementations until a standard implementation ID is assigned to them. + +The RPMI implementations that have been assigned a standard implementation ID +are listed in the table below. + [#table_base_rpmi_impl_id] .RPMI Implementation IDs [cols="2, 3a", width=100%, align="center", options="header"] @@ -72,8 +79,14 @@ The following table lists the services in the BASE service group: | Implementation ID | Name -| 0x0 +| 0x00000000 | libRPMI cite:[libRPMI] + +| 0x00000001 - 0x7FFFFFFF +| _Reserved for Future Use_ + +| 0x80000000 - 0xFFFFFFFF +| _Experimental Implementation IDs_ |=== [#base-notifications] @@ -525,6 +538,7 @@ service group. | _Reserved_, must be initialized to `0`. |=== +[#srvgrp_base_set_msi] ==== Service: BASE_SET_MSI (SERVICE_ID: 0x08) This service is used to configure the MSI address and data which the platform microcontroller can use as a doorbell to the application processor. The @@ -537,10 +551,7 @@ appropriate `STATUS` returned. The platform microcontroller will enable MSI only if support is present and this service configures MSI address and data successfully. 
-NOTE: If the platform supports PLIC, the platform need to provide a MMIO -register to inject an edge-triggered interrupt. - -NOTE: The platform microcontroller can use MSI for both sending the MSI +NOTE: The platform microcontroller can use MSI for sending the MSI directly or injecting wired interrupt in the application processor. If the MSI target address is IMSIC, then the application processor will take MSI whereas if the MSI target address is `setipnum` of the APLIC then the application diff --git a/src/transport.adoc b/src/transport.adoc index f03a4db..42a4a6c 100644 --- a/src/transport.adoc +++ b/src/transport.adoc @@ -38,6 +38,11 @@ the platform microcontroller can avoid implementing the P2A channel. The current RPMI specification only defines a shared memory based transport but other transport types can be added in the future. +NOTE: The shared memory for RPMI transport and fast-channels, whether allocated +in DRAM or in on-chip RAM, requires memory attribute configuration. These +memory attributes, also called PMAs (Physical Memory Attributes), are defined in +the RISC-V Privileged Specification cite:[priv_v1_12]. + [#transport_bidir_comm] .Bi-directional Communication image::transport-bidirectional.png[400,400, align="center"] @@ -69,26 +74,11 @@ processors must discover it using standard hardware description mechanisms such as device tree or ACPI. If the P2A doorbell is a MSI then the application processors must configure -the MSI on the platform microcontroller side using RPMI messages defined by +the MSI on the platform microcontroller side using the RPMI service defined by the `BASE` service group. -=== Fast-channels -Fast-channels are special shared memory-based channels used in scenarios -requiring lower latency and faster processing of requests from application -processors to the platform microcontroller. - -The layout and request format of fast-channels are service group specific -and only a few service groups may support fast-channels. 
A service group -that supports fast-channels: - -* May only enable some services to be used over fast-channels -* Must provide physical address and other attributes (such as optional - fast-channel doorbell) of the fast-channels via a services defined by - the service group - -NOTE: To avoid the caching side-effects, the platform can configure the -fast-channel shared memory as non-cacheable or IO memory for both the -application processors and the platform microcontroller. +NOTE: If the platform supports PLIC, the platform needs to provide an MMIO +register to inject an edge-triggered interrupt. === Shared Memory Transport The RPMI shared memory transport defines a mechanism to exchange messages via @@ -97,9 +87,17 @@ device memory. The RPMI shared memory transport does not specify where the shared memory resides in a platform, but it must be accessible from both the application processors and the platform microcontroller. -NOTE: To avoid the caching side-effects, the platform can configure the shared +The platform must set up the PMA for the shared memory used for RPMI transport. + +NOTE: It is possible that the application processor and the platform +microcontroller are not cache-coherent, and using the shared memory may lead to +caching side-effects such as data inconsistency between the platform +microcontroller and the application processor, write propagation delays, and +other issues which may lead to race conditions. To avoid the caching +side-effects, the platform can configure the memory attribute of the shared memory as non-cacheable or IO memory for both the application processor and the -platform microcontroller. +platform microcontroller. In addition, the implementation can perform manual +cache maintenance using cache flush and invalidate operations. All data sent or received through the RPMI shared memory transport must follow little-endian byte-order. @@ -166,26 +164,37 @@ must be a `power-of-2` and must be at least `64 bytes`. 
The slot size is same across all RPMI shared memory queues and the physical address of each slot must be aligned at slot size boundary. -NOTE: The slot size should match with the maximum cache line size used in a +NOTE: The slot size should match the maximum cache block size used in a platform. The requirement of `power-of-2` slot size with minimum value of -`64 bytes` is because usual CPU cache line size is `64 bytes` or some +`64 bytes` is because the usual CPU cache block size is `64 bytes` or some `power-of-2` value. The slots of the RPMI shared memory queue are assigned sequentially increasing indices starting with `0`. The slot at index `0` is referred to as the `head slot` and the slot at index `1` is referred to as the `tail slot`. The remaining `(M - 2)` slots of the RPMI shared memory queue are message slots. -The first `4 bytes` of the head slot is used as the `head` of the circular -queue which contains a `slot index - 2` value pointing to the message slot from +The first `4 bytes` of the head slot is used as the `head` of the circular +queue which contains a `(slot index - 2)` value pointing to the message slot from where the next message can be dequeued. The first `4 bytes` of the tail slot is -used as the `tail` of the circular queue which contains a `slot index - 2` value +used as the `tail` of the circular queue which contains a `(slot index - 2)` value pointing to the message slot from where the next message can be enqueued. The pictorial view of the RPMI shared memory queue internals is shown in the <> below. +NOTE: Of the total `M` slots, only `(M - 2)` slots are used as a queue +that stores RPMI messages as data. Because the queue shared memory also +includes the head and tail slots, the `head` and `tail` values do not store +slot indices directly. They store message slot indices, which are offset by +2 from the slot indices, that is, `(slot index - 2)`. 
+ +NOTE: The requirement of keeping `head` and `tail` in separate slots is -to prevent both `head` and `tail` using the same cache line so that cache -maintenance can be done separately for both `head` and `tail`. +to prevent both `head` and `tail` using the same cache block so that cache +maintenance, such as cache flush and invalidate operations, can be done +separately for both `head` and `tail`. + +NOTE: There are no explicit indicators to highlight the queue +wrapping condition. Implementations can use `head` == `tail` as the queue +empty condition and `\((tail + 1) % (M - 2)) == head` as the queue full condition. [#transport_shared_memory_qint] .Shared Memory Queue Internals @@ -211,7 +220,7 @@ into two parts where one part belongs to the A2P channel and other belongs to the P2A channel. The shared memory region sizes of the A2P and P2A channel can be different. For each channel (A2P or P2A), the corresponding REQ and ACK queues must be of the same size hence equal number of slots (or queue capacity). -The size of each RPMI shared shared queue must be a multiple of the slot size. +The size of each RPMI shared memory queue must be a multiple of the slot size. NOTE: A platform should provide sufficient shared memory for all RPMI shared memory queues so that the number of slots (queue capacity) does not become @@ -255,3 +264,41 @@ M = (X / slot-size) : Total slot count in a queue (M-2) : Message slot count (2 slots less for `HEAD` and `TAIL`) ``` ==== + +=== Shared Memory based Fast-channels +A fast-channel is a unidirectional shared memory channel with a dedicated RPMI +service type. Data transmitted over a fast-channel carries no message +header, and its layout is defined by the service dedicated to that +fast-channel. Unlike the normal RPMI transport, which can be shared by multiple +service groups and services, a fast-channel is exclusive to a service in a +service group, which allows faster exchange of data. 
A fast-channel can be +used in scenarios that require lower latency and faster processing of requests +between the application processors and the platform microcontroller. + +NOTE: Because of the fixed data format and type associated with a fast-channel, the +requests made over a fast-channel can be processed quickly, but the time required +by the platform microcontroller to complete the requests may not be less than +the time required for completion of requests made over the normal RPMI transport. +The request completion time depends on the platform implementation. + +A service group that supports fast-channels for services: + +* May only enable some services to be used over fast-channels. +* Must provide physical address and other attributes (such as optional + fast-channel doorbell) of the fast-channels via a service defined by + the service group. + +The layout and data format of a fast-channel are RPMI service specific in a +service group and defined in the respective service group sections. + +The platform must set up the PMA for the shared memory used for the fast-channels. + +NOTE: It is possible that the application processor and the platform +microcontroller are not cache-coherent, and using the shared memory may lead to +caching side-effects such as data inconsistency between the platform +microcontroller and the application processor, write propagation delays, and +other issues which may lead to race conditions. To avoid the caching +side-effects, the platform can configure the memory attribute of the shared +memory as non-cacheable or IO memory for both the application processor and the +platform microcontroller. In addition, the implementation can perform manual +cache maintenance using cache flush and invalidate operations. \ No newline at end of file
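
Review note on the `src/transport.adoc` hunk: the queue empty and full conditions, and the `(slot index - 2)` relationship between the `head`/`tail` values and the slot indices, can be sketched in C as below. This is a minimal sketch and not taken from libRPMI or any other implementation. All function names are hypothetical, `m` stands for the total slot count `M` of one queue, and `m > 2` is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers illustrating the conditions described in the NOTE.
 * `m` is the total slot count M of one queue, including the head slot
 * (index 0) and the tail slot (index 1). Assumes m > 2.
 */

/* Number of message slots in the queue: (M - 2). */
static inline uint32_t rpmi_msg_slot_count(uint32_t m)
{
    return m - 2;
}

/* Queue empty condition: head == tail. */
static inline bool rpmi_queue_is_empty(uint32_t head, uint32_t tail)
{
    return head == tail;
}

/* Queue full condition: ((tail + 1) % (M - 2)) == head. */
static inline bool rpmi_queue_is_full(uint32_t head, uint32_t tail, uint32_t m)
{
    return ((tail + 1) % rpmi_msg_slot_count(m)) == head;
}

/* Advance a head or tail value by one message slot, wrapping around. */
static inline uint32_t rpmi_queue_next(uint32_t idx, uint32_t m)
{
    return (idx + 1) % rpmi_msg_slot_count(m);
}

/* Convert a head or tail value back to a slot index in the queue memory. */
static inline uint32_t rpmi_slot_index(uint32_t msg_idx)
{
    return msg_idx + 2;
}
```

With these two conditions, one message slot always stays unused when the queue is full, so a queue with `M` slots holds at most `(M - 3)` messages at a time. For example, with `M = 6`, `head = 0` and `tail = 3` is reported as full.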