diff --git a/qos_capacity.adoc b/qos_capacity.adoc index eb3097d..4aa976f 100644 --- a/qos_capacity.adoc +++ b/qos_capacity.adoc @@ -7,40 +7,40 @@ interface. The capacity controller allocates capacity in fixed multiples of _capacity units_. A group of these _capacity units_ is referred to as a _capacity block_. -One or more _capacity blocks_ may be allocated to a workload. When a workload -requests capacity allocation, the capacity is allocated using _capacity units_ +One or more _capacity blocks_ can be allocated to a workload. When a workload +requests capacity allocation, the capacity is allocated by using _capacity units_ situated within the _capacity blocks_ assigned to the workload. Capacity blocks can also be shared among one or more workloads. Optionally, the capacity -controller may allow configuration of a limit on the maximum number of _capacity -units_ that can be occupied in the _capacity blocks_ allocated to a given +controller might allow configuration of a limit on the maximum number of _capacity +units_ that can be occupied in the _capacity blocks_ allocated to a specific workload. [NOTE] ==== -For example, a cache controller may allocate capacity in multiples of cache +For example, a cache controller allocates capacity in multiples of cache blocks. In this context, a cache block serves as a _capacity unit_, and a group of cache blocks forms a _capacity block_. A cache controller supporting capacity -allocation by ways might define a _capacity block_ to be the cache blocks in one +allocation might define a _capacity block_ to be the cache blocks in one way of the cache. ==== The capacity allocation affects the decision regarding which _capacity blocks_ -to use when a new _capacity unit_ is requested by a workload but usually does +to use when a new _capacity unit_ is requested by a workload, but usually does not affect other operations of the controller. 
[NOTE] ==== For example, when a request is made to a cache controller, the request involves scanning the entire cache to determine if the requested data is present. If the -data is located, then the request is fulfilled using this data, even if the +data is located, then the request is fulfilled with this data, even if the cache block containing the data was initially allocated for a different workload. The data continues to reside in the same cache block. Consequently, the cache lookup function remains unaffected by the capacity allocation -constraints set for the workload initiating the request. Conversely, if the data +constraints set for the workload that initiated the request. Conversely, if the data is not found, a new cache block must be allocated. This allocation is executed -using the _capacity blocks_ assigned to the workload that made the request. -Hence, a workload may only trigger evictions within _capacity blocks_ designated -to it but can access shared data in _capacity blocks_ allocated to other +by using the _capacity blocks_ assigned to the workload that made the request. +Hence, a workload might only trigger evictions within _capacity blocks_ designated +to it, but can access shared data in _capacity blocks_ allocated to other workloads. ==== @@ -159,10 +159,10 @@ The `cc_mon_ctl` register is used to control monitoring of capacity usage by a .... Capacity controllers that support capacity usage monitoring implement a usage -monitoring counter for each supported `MCID`. The usage monitoring counter may +monitoring counter for each supported `MCID`. The usage monitoring counter can be configured to count a monitoring event. When an event matching the event -configured for the `MCID` occurs then the monitoring counter is updated. The -event matching may optionally be filtered by the access-type. +configured for the `MCID` occurs, then the monitoring counter is updated. The +event matching might optionally be filtered by the access-type identifier. 
The `OP`, `AT`, `ATV`, `MCID`, and `EVT_ID` fields of the register are WARL fields. @@ -190,8 +190,8 @@ in <>. The `EVT_ID` field is used to program the identifier of the event to count in the monitoring counter selected by `MCID`. The `AT` field (See <>) is -used to program the access-type to count, and its validity is indicated by the -`ATV` field. When `ATV` is 0, the counter counts requests with all access-types, +used to program the access-type identifier to count, and its validity is indicated by the +`ATV` field. When `ATV` is 0, the counter counts requests with all access-type identifiers, and the `AT` value is ignored. <<< @@ -211,18 +211,18 @@ and the `AT` value is ignored. | -- | 128-256 | Designated for custom use. |=== -When the `EVT_ID` for a `MCID` is programmed with a non-zero and legal value +When the `EVT_ID` for an `MCID` is programmed with a non-zero and legal value by using the `CONFIG_EVENT` operation, the counter is reset to 0 and starts counting matching events for requests with the matching `MCID` and `AT` (if `ATV` is 1). However, if the `EVT_ID` is programmed to 0, the counter stops counting. -A controller that does not support monitoring by access-type can hardwire the +A controller that does not support monitoring by access-type identifier can hardwire the `ATV` and the `AT` fields to 0, indicating that the counter counts requests with -all access-types. +all access-type identifiers. -When the `cc_mon_ctl` register is written, the controller may need to perform -several actions that may not complete synchronously with the write. A write to -the `cc_mon_ctl` sets the read-only `BUSY` bit to 1 indicating the controller +When the `cc_mon_ctl` register is written, the controller might need to perform +several actions that might not complete synchronously with the write. A write to +the `cc_mon_ctl` sets the read-only `BUSY` bit to 1, indicating the controller is performing the requested operation. 
When the `BUSY` bit reads 0, the operation is complete, and the read-only `STATUS` field provides a status value (see <> for details). Written values to the `BUSY` and the `STATUS` @@ -249,8 +249,8 @@ the register. The `STATUS` field remains valid until a subsequent write to the |=== When the `BUSY` bit is set to 1, the behavior of writes to the `cc_mon_ctl` is -`UNSPECIFIED`. Some implementations may ignore the second write, while others -may perform the operation determined by the second write. To ensure proper +`UNSPECIFIED`. Some implementations might ignore the second write, while others +might perform the operation determined by the second write. To ensure proper operation, software must first verify that the `BUSY` bit is 0 before writing the `cc_mon_ctl` register. @@ -258,7 +258,7 @@ the `cc_mon_ctl` register. === Capacity Usage Monitoring Counter Value The `cc_mon_ctr_val` is a read-only register that holds a snapshot of the -counter selected by the `READ_COUNTER` operation. When the controller does not +counter that is selected by the `READ_COUNTER` operation. When the controller does not support capacity usage monitoring, the `cc_mon_ctr_val` register always reads as zero. @@ -271,27 +271,27 @@ zero. ], config:{lanes: 2, hspace:1024}} .... -The counter is valid if the `INV` field is 0. The counter may be marked `INV` if -the controller, for `UNSPECIFIED` reasons determine the count to be not valid. -The counters marked `INV` may become valid in future. +The counter is valid if the `INV` field is 0. The counter might be marked `INV` if +the controller determines the count to be not valid for `UNSPECIFIED` reasons. +The counters marked `INV` can become valid in the future. -The counter shall not decrement below zero. If an event should occur that would -otherwise result in a negative value, the counter will continue to hold a value +The counter shall not decrement below zero. 
If an event occurs that would +otherwise result in a negative value, the counter continues to hold a value of 0. [NOTE] ==== -Following a reset of the counter to zero, a capacity de-allocation may attempt -to drive its value below zero. This scenario may occur when the `MCID` is +Following a reset of the counter to zero, a capacity de-allocation might attempt +to drive its value below zero. This scenario can occur when the `MCID` is reassigned to a new workload, yet the capacity controller continues to hold -capacity initially allocated by the previous workload. In such cases, the +capacity that was initially allocated by the previous workload. In such cases, the counter shall not decrement below zero and shall remain at zero. After a brief period of execution for the new workload post-counter reset, the counter value is expected to stabilize to reflect the capacity usage of this new workload. -Some implementations may not store the `MCID` of the request that caused the +Some implementations might not store the `MCID` of the request that caused the capacity to be allocated with every unit of capacity in the controller to -optimize on the storage overheads. Such controllers may in turn rely on +optimize for storage overhead. Such controllers might, in turn, rely on statistical sampling to report the capacity usage by tagging only a subset of the capacity units. @@ -302,14 +302,14 @@ sets. By keeping track of the hits and misses in the monitored sets, it is possible to estimate the overall cache occupancy with a high degree of accuracy. The size of the subset needed to obtain accurate estimates depends on various factors, such as the size of the cache, the cache access patterns, and the -desired accuracy level. 
Research cite:[SSAMPLE] shows that set-sampling can provide statistically accurate estimates with a relatively small sample size, such as 10% or less, depending on the cache properties and sampling technique used. When the controller has not observed enough samples to provide an accurate -value in the monitoring counter, it may report the counter as being `INV` -until more accurate measurements are available. This helps to prevent inaccurate +value in the monitoring counter, it might report the counter as being `INV` +until more accurate measurements are available. This state helps to prevent inaccurate or misleading data from being used in capacity planning or other decision-making processes. ==== @@ -318,10 +318,10 @@ processes. === Capacity Allocation Control The `cc_alloc_ctl` register is used to configure allocation of capacity to an -`RCID` per access-type (`AT`). The `OP`, `RCID` and `AT` fields in this register -are WARL. If a controller does not support capacity allocation then this +`RCID` per access type (`AT`). The `OP`, `RCID` and `AT` fields in this register +are WARL. If a controller does not support capacity allocation, then this register is read-only zero. If the controller does not support capacity -allocation per access-type then the `AT` field is read-only zero. +allocation per access type, then the `AT` field is read-only zero. .Capacity Allocation Control Register (`cc_alloc_ctl`) [wavedrom, , ] @@ -345,7 +345,7 @@ targeted _capacity blocks_ are designated in the form of a bitmask in the unit_ limit to be defined in the `cc_cunits` register. To execute operations that require a capacity block mask and/or a capacity unit limit, software must first program the `cc_block_mask` and/or the `cc_cunits` register, followed by -initiating the operation via the `cc_alloc_ctl` register. +initiating the operation with the `cc_alloc_ctl` register. 
[[CC_ALLOC_OP]] .Capacity Allocation Operations (`OP`) @@ -355,7 +355,7 @@ initiating the operation via the `cc_alloc_ctl` register. |Operation | Encoding ^| Description |-- | 0 | Reserved for future standard use. |`CONFIG_LIMIT`| 1 | Configure a capacity allocation for requests by - `RCID` and of access-type `AT`. The _capacity + `RCID` and of access type `AT`. The _capacity blocks_ allocation is specified in the `cc_block_mask` register, and a limit on capacity units is specified in the `cc_cunits` register. @@ -384,37 +384,36 @@ initiating the operation via the `cc_alloc_ctl` register. Capacity controllers enumerate the allocatable _capacity blocks_ in the `NCBLKS` field of the `cc_capabilities` register. The `cc_block_mask` register is -programmed with a bit-mask where each bit represents a _capacity block_ for the -operation. A limit on the _capacity unit_, if configuration of such limits is -supported (i.e., `cc_capabilities.CUNIT=1`), that can be occupied in the -allocated _capacity blocks_ may be programmed in the `cc_cunits` register. If -configuration of a limit on the _capacity units_ is not supported, then the -controller allows the use of all _capacity units_ in the allocated _capacity +programmed with a bit-mask value, where each bit represents a _capacity block_ for the +operation. If configuring _capacity unit_ limits is supported (that is, +`cc_capabilities.CUNIT=1`), then a limit on the _capacity units_ that can be occupied in the allocated _capacity blocks_ can +be programmed in the `cc_cunits` register. If configuring such limits is not supported, +then the controller allows the use of all _capacity units_ in the allocated _capacity blocks_. A value of zero programmed into `cc_cunits` indicates that no limits -should be enforced on _capacity unit_ allocation. +shall be enforced on _capacity unit_ allocation. 
-A capacity allocation must be configured for each supported access-type by the +A capacity allocation must be configured for each supported access type by the controller. 
An implementation that does not support capacity allocation per -access-type may hardwire the `AT` field to 0 and associate the same capacity -allocation configuration for requests with all access-types. When capacity -allocation per access-type is supported, identical limits may be configured for -two or more access-types if different capacity allocation per access-type is not -required. If capacity is not allocated for each access-type supported by the +access type can hardwire the `AT` field to 0 and associate the same capacity +allocation configuration for requests with all access types. When capacity +allocation per access type is supported, identical limits can be configured for +two or more access types, if different capacity allocation per access type is not +required. If capacity is not allocated for each access type supported by the controller, the behavior is `UNSPECIFIED`. [NOTE] ==== A cache controller that supports capacity allocation indicates the number of allocatable _capacity blocks_ in `cc_capabilities.NCBLKS` field. For example, -let's consider a cache with `NCBLKS=8`. In this example, the `RCID=5` has been -allocated _capacity blocks_ numbered 0 and 1 for requests with access-type `AT=0`, -and has been allocated _capacity blocks_ numbered 2 for requests with access-type -`AT=1`. The `RCID=3` in this example has been allocated _capacity blocks_ -numbered 3 and 4 for both `AT=0` and `AT=1` access-types as separate capacity -allocation by access-type is not required for this workload. Further in this +consider a cache with `NCBLKS=8`. In this example, the `RCID=5` is +allocated _capacity blocks_ numbered 0 and 1 for requests with access type `AT=0`, +and _capacity blocks_ numbered 2 for requests with access type +`AT=1`. The `RCID=3` in this example is allocated _capacity blocks_ +numbered 3 and 4 for both `AT=0` and `AT=1` access types as separate capacity +allocation by access type is not required for this workload. 
Further in this example, the `RCID=6` has been configured with the same _capacity block_ -allocations as `RCID=3`. This implies that they share a common capacity -allocation in this cache but may have been associated with different `RCID` to +allocations as `RCID=3`. This configuration implies that they share a common capacity +allocation in this cache, but might be associated with different `RCID` to allow differentiated treatment in another capacity and/or bandwidth controller. [width=100%] @@ -440,36 +439,36 @@ blocks, respectively. <<< -The `FLUSH_RCID` operation may incur a long latency to complete. New requests to -the controller by the `RCID` being flushed are allowed. Additionally, the +The `FLUSH_RCID` operation can incur a long latency to complete. New requests to +the controller from the `RCID` being flushed are allowed. Additionally, the controller is allowed to deallocate capacity that was allocated after the operation was initiated. [NOTE] ==== -For cache controllers, the `FLUSH_RCID` operation may perfom an operation +For cache controllers, the `FLUSH_RCID` operation might perform an operation similar to that performed by the RISC-V `CBO.FLUSH` instruction on each cache block that is part of the allocation configured for the `RCID`. The `FLUSH_RCID` operation can be used as part of reclaiming a previously allocated `RCID` and associating it with a new workload. When such a -reallocation is performed, the capacity controllers may have capacity allocated -by the old workload and thus for a short warmup duration the capacity controller -may be enforcing capacity allocation limits that reflect the usage by the old -workload. 
Such warm-up durations are typically not statistically significant, but if that is not desired, then the `FLUSH_RCID` operation can be used to flush and evict capacity allocated by the old workload. ==== -When the `cc_alloc_ctl` register is written, the controller may need to perform -several actions that may not complete synchronously with the write. A write to -the `cc_alloc_ctl` sets the read-only `BUSY` bit to 1 indicating the controller +When the `cc_alloc_ctl` register is written, the controller might perform +several actions that might not complete synchronously with the write. A write to +the `cc_alloc_ctl` sets the read-only `BUSY` bit to 1, indicating the controller is performing the requested operation. When the `BUSY` bit reads 0, the operation is complete, and the read-only `STATUS` field provides a status value -(<>) of the requested operation. Values written to the `BUSY` and +(<>) of the requested operation. Values that are written to the `BUSY` and the `STATUS` fields are always ignored. An implementation that can complete the -operation synchronously with the write may hardwire the `BUSY` bit to 0. The -state of the `BUSY` bit, when not hardwired to 0, shall only change in response +operation synchronously with the write might hardwire the `BUSY` bit to 0. The +state of the `BUSY` bit, when not hardwired to 0, shall change only in response to a write to the register. The `STATUS` field remains valid until a subsequent write to the `cc_alloc_ctl` register. @@ -491,7 +490,7 @@ write to the `cc_alloc_ctl` register. When the `BUSY` bit is set to 1, the behavior of writes to the `cc_alloc_ctl` register, `cc_cunits` register, or to the `cc_block_mask` register is -`UNSPECIFIED`. Some implementations may ignore the second write and others may +`UNSPECIFIED`. Some implementations might ignore the second write and others might perform the operation determined by the second write. 
To ensure proper operation, software must verify that `BUSY` bit is 0 before writing any of these registers. @@ -499,12 +498,12 @@ software must verify that `BUSY` bit is 0 before writing any of these registers === Capacity Block Mask (`cc_block_mask`) The `cc_block_mask` is a WARL register. If the controller does not support -capacity allocation, i.e., `NCBLKS` is 0, then this register is read-only 0. +capacity allocation, that is, `NCBLKS` is 0, then this register is read-only 0. -The register has `NCBLKS` bits each corresponding to one allocatable -_capacity block_ in the controller. The width of this register is variable but +The register has `NCBLKS` bits, each corresponding to one allocatable +_capacity block_ in the controller. The width of this register is variable, but always a multiple of 64 bits. The bitmap width in bits (`BMW`) is determined by -the equation below. The division operation in this equation is an integer +the following equation. The division operation in this equation is an integer division. [latexmath#eq-2,reftext="equation ({counter:eqs})"] @@ -517,49 +516,49 @@ BMW = \lfloor{\frac{NCBLKS + 63}{64}}\rfloor \times 64 Bits `NCBLKS-1:0` are read-write, and the bits `BMW-1:NCBLKS` are read-only zero. The process of configuring capacity allocation for an `RCID` and `AT` begins by -programming the `cc_block_mask` register with a bit-mask that identifies the +programming the `cc_block_mask` register with a bit-mask value that identifies the _capacity blocks_ to be allocated and, if supported, by programming the -`cc_cunits` register with a limit on the capacity units that may be occupied in +`cc_cunits` register with a limit on the capacity units that can be occupied in those capacity blocks. Next, the `cc_alloc_ctl register` is written to request a -`CONFIG_LIMIT` operation for the `RCID` and `AT`. 
Once a capacity allocation -limit has been established, a request may be allocated capacity in the _capacity +`CONFIG_LIMIT` operation for the `RCID` and `AT`. After a capacity allocation +limit is established, a request can be allocated capacity in the _capacity blocks_ allocated to the `RCID` and `AT` associated with the request. It is -important to note that some implementations may require at least one _capacity -block_ to be allocated using `cc_block_mask` when allocating capacity; -otherwise, the operation will fail with `STATUS=5`. Overlapping _capacity block_ +important to note that some implementations might require at least one _capacity +block_ to be allocated by using `cc_block_mask` when allocating capacity; +otherwise, the operation fails with `STATUS=5`. Overlapping _capacity block_ masks among `RCID` and/or `AT` are allowed to be configured. [NOTE] ==== -A set-associative cache controller that supports capacity allocation by ways -can advertise `NCBLKS` as the number of ways per set in the cache. To Allocate +A set-associative cache controller that supports capacity allocation +can advertise `NCBLKS` as the number of ways per set in the cache. To allocate capacity in such a cache for an `RCID` and `AT`, a subset of ways must be -selected and mask of the selected ways must be programmed in `cc_block_mask` when -requesting the `CONFIG_LIMIT` operation. +selected and a mask of the selected ways must be programmed in the `cc_block_mask` register when +the `CONFIG_LIMIT` operation is requested. ==== To read the _capacity block_ allocation for an `RCID` and `AT`, the controller -provides the `READ_LIMIT` operation which can be requested by writing to the -`cc_alloc_ctl` register. Upon successful completion of the operation, the +provides the `READ_LIMIT` operation, which can be requested by writing to the +`cc_alloc_ctl` register. When the operation completes successfully, the `cc_block_mask` register holds the configured _capacity block_ allocation. 
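As a rough illustration of the bitmap width equation and of building a value for `cc_block_mask`, the following Python sketch computes `BMW` with integer division and sets one bit per selected _capacity block_. The helper names `bitmap_width` and `block_mask` are hypothetical and are not part of the specification.

```python
def bitmap_width(ncblks: int) -> int:
    """Return BMW = floor((NCBLKS + 63) / 64) * 64, using integer division."""
    return ((ncblks + 63) // 64) * 64


def block_mask(blocks, ncblks: int) -> int:
    """Build a mask with one bit set per selected capacity block.

    Only bits NCBLKS-1:0 are writable, so a block number outside that
    range is rejected here instead of being silently dropped.
    """
    mask = 0
    for block in blocks:
        if not 0 <= block < ncblks:
            raise ValueError(f"capacity block {block} is outside 0..{ncblks - 1}")
        mask |= 1 << block
    return mask
```

With `NCBLKS=8`, the register is 64 bits wide, and selecting _capacity blocks_ 3 and 4 gives the mask `0x18`.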
[[CC_CUNITS]] === Capacity Units The `cc_cunits` register is a read-write WARL register. If the controller does -not support capacity allocation (i.e., `NCBLKS` is set to 0), this register +not support capacity allocation (that is, `NCBLKS` is set to 0), this register shall be read-only zero. If the controller does not support configuring limits on _capacity units_ that -may be occupied in the allocated _capacity blocks_ (i.e., -`cc_capabilities.CUNITS=0`) then this register shall be read-only zero. In such -cases the controller will allow utilization of all available _capacity units_ by +can be occupied in the allocated _capacity blocks_ (that is, +`cc_capabilities.CUNITS=0`), then this register shall be read-only zero. In such +cases, the controller allows the utilization of all available _capacity units_ by an `RCID` within the _capacity blocks_ allocated to it. <<< -If the controller supports configuring limits on _capacity units_ that may be -occupied in the allocated _capacity blocks_ (i.e., `cc_capabilities.CUNITS=1`) +If the controller supports configuring limits on _capacity units_ that can be +occupied in the allocated _capacity blocks_ (that is, `cc_capabilities.CUNITS=1`), then this register sets an upper limit on the number of _capacity units_ that can be occupied by an `RCID` in the _capacity blocks_ allocated for an `AT`. A value of zero specified in the `cc_cunits` register indicates that no limit is @@ -572,18 +571,18 @@ allocation. [NOTE] ==== When multiple `RCID` instances share a _capacity block_ allocation, the -`cc_cunits` register may be employed to set an upper limit on the number of +`cc_cunits` register can be employed to set an upper limit on the number of _capacity units_ each `RCID` can occupy. For instance, consider a group of four `RCID` instances configured to share a set of _capacity blocks_, representing a total of 100 capacity units. 
Each -`RCID` could be configured with a limit of 30 capacity units, ensuring that no +`RCID` can be configured with a limit of 30 capacity units, ensuring that no individual `RCID` exceeds 30% of the total shared _capacity units_. -The capacity controller may enforce these limits through various techniques. +The capacity controller might enforce these limits through various techniques. Examples include: -. Refraining from allocating new capacity units to an `RCID` that has reached +. Refraining from allocating new capacity units to an `RCID` that has reached its limit. . Evicting previously allocated capacity units when a new allocation is required. @@ -592,12 +591,12 @@ These methods are not exhaustive and can be applied either individually or in combination to maintain _capacity unit_ limits. When the limit on the _capacity units_ is reached or is about to be reached, -the capacity controller may initiate additional operations. These could include -throttling certain activities (e.g., prefetches) of the corresponding workload +the capacity controller can initiate additional operations. These can include +throttling certain activities (for example, prefetches) of the corresponding workload requests. ==== To read the _capacity unit_ limit for an `RCID` and `AT`, the controller -provides the `READ_LIMIT` operation which can be requested by writing to the -`cc_alloc_ctl` register. Upon successful completion of the operation, the +provides the `READ_LIMIT` operation, which can be requested by writing to the +`cc_alloc_ctl` register. When the operation completes successfully, the `cc_cunits` register holds the configured _capacity unit_ allocation limit.
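
The programming sequence described in these sections (verify that `BUSY` is 0, program `cc_block_mask` and `cc_cunits`, then request the operation through `cc_alloc_ctl`) can be sketched against a toy software model. This is only a sketch under stated assumptions: `ToyCapacityController` and the helper functions are hypothetical names, the model completes every operation synchronously, and the `READ_LIMIT` encoding and the success status value are assumed for illustration because they do not appear in the text of this change. Only the `CONFIG_LIMIT` encoding of 1 and the `STATUS=5` failure for an empty mask come from the text.

```python
CONFIG_LIMIT = 1         # encoding listed in the operations table
READ_LIMIT = 2           # assumed encoding, for illustration only
STATUS_SUCCESS = 1       # assumed value, for illustration only
STATUS_EMPTY_MASK = 5    # failure status when no capacity block is selected


class ToyCapacityController:
    """Hypothetical model of the allocation registers; not a real driver."""

    def __init__(self, ncblks):
        self.ncblks = ncblks
        self.cc_block_mask = 0
        self.cc_cunits = 0
        self.busy = 0      # this toy completes synchronously, so BUSY stays 0
        self.status = 0
        self._limits = {}  # (rcid, at) -> (block mask, capacity unit limit)

    def write_alloc_ctl(self, op, rcid, at):
        if op == CONFIG_LIMIT:
            if self.cc_block_mask == 0:
                self.status = STATUS_EMPTY_MASK
                return
            self._limits[(rcid, at)] = (self.cc_block_mask, self.cc_cunits)
        elif op == READ_LIMIT:
            self.cc_block_mask, self.cc_cunits = self._limits.get((rcid, at), (0, 0))
        else:
            raise ValueError("operation not modeled in this sketch")
        self.status = STATUS_SUCCESS


def wait_idle(ctrl):
    # Software must see BUSY=0 before writing any allocation register.
    while ctrl.busy:
        pass


def config_limit(ctrl, rcid, at, mask, cunits=0):
    wait_idle(ctrl)
    # Bits at or above NCBLKS are read-only zero, so they are dropped here.
    ctrl.cc_block_mask = mask & ((1 << ctrl.ncblks) - 1)
    ctrl.cc_cunits = cunits  # 0 means no limit on capacity units
    ctrl.write_alloc_ctl(CONFIG_LIMIT, rcid, at)
    wait_idle(ctrl)
    return ctrl.status


def read_limit(ctrl, rcid, at):
    wait_idle(ctrl)
    ctrl.write_alloc_ctl(READ_LIMIT, rcid, at)
    wait_idle(ctrl)
    return ctrl.cc_block_mask, ctrl.cc_cunits
```

For example, `config_limit(ctrl, 5, 0, 0b11, 30)` models allocating _capacity blocks_ 0 and 1 to `RCID=5` for `AT=0` with a limit of 30 _capacity units_, and `read_limit(ctrl, 5, 0)` then returns that mask and limit.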