From 55e49247a7bcd2e9d2a43d13e731d656be5cfd27 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Wed, 25 Sep 2024 11:15:40 -0500 Subject: [PATCH 01/15] draft --- ...X-vm-consume-budget-for-percise-failure.md | 76 +++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md new file mode 100644 index 00000000..36a4235b --- /dev/null +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -0,0 +1,76 @@ +--- +simd: 'XXXX' +title: Conditional CU metering +authors: + - Tao Zhu (Anza) +category: Standard +type: Core +status: Draft +created: 2024-MM-DD +feature: +supersedes: +superseded-by: +extends: +--- + +## Summary + +Adjusting how CU consumption is measured based on the conditions of Basic Block execution: successful completion will charge actual CUs, or requested CU if exceptions during Basic Block execution. + +## Motivation + +### Background: + +In the Solana protocol, tracking transaction Compute Unit (CU) consumption is a critical aspect of maintaining consensus. Block costs are part of this consensus, meaning that all clients must agree on the execution cost of each transaction, including those that error out during execution. Ensuring consistency in CU tracking across clients is essential for maintaining protocol integrity. + +### Proposed Change: + +To improve performance, Solana programs are often compiled into Basic Blocks — linear sequences of BPF instructions with a single entry and exit point, and no loops or branches. Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU consumption for each individual BPF instruction. 
+ +When a Basic Block is executed successfully (i.e., it exits at the final BPF instruction in the block), the total CU consumption is deterministic and can be calculated before execution. This ensures that CU accounting for successful transactions is accurate and predictable, enabling all clients to agree on the transaction’s execution cost. + +However, when an exception is thrown during the execution of a Basic Block (e.g., a null memory dereference or other faults), determining the exact number of CUs consumed up to the point of failure requires additional effort. For instance, the Agave client implements a mechanism that tracks the Instruction Pointer (IP) or Program Counter (PC) to backtrack and estimate the CUs consumed when an exception occurs. More details on this mechanism can be found [here](https://github.com/solana-labs/rbpf/blob/57139e9e1fca4f01155f7d99bc55cdcc25b0bc04/src/jit.rs#L267). + +While this approach is effective, it introduces additional work and complexity. These mechanisms are often implementation-specific, and requiring all clients to track the exact number of executed BPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus. + +### Clarified Protocol Behavior: + +Instead of mandating implementation-specific work to handle exceptions, we propose the following clarification in the protocol: + +- For successful execution of a Basic Block (i.e., the block exits at the last BPF instruction), the deterministic CU cost of the block will be charged to the transaction’s CU meter. This ensures that CU consumption for successful transactions is accurately accounted for. +- In the event of an exception during Basic Block execution, where the block does not exit normally, the requested CUs for the transaction will be charged to the CU meter. 
This allows for a simple and efficient fallback mechanism that avoids the need for tracking the exact number of executed instructions up to the point of failure. + +By adopting this approach, the protocol avoids the overhead of requiring precise instruction-level CU tracking for transactions that fail. Instead, the requested CU limit of the transaction will be used, simplifying the handling of failed transactions while still maintaining consensus. + +### Conclusion: + +This proposal enhances performance and simplifies CU tracking by formalizing the use of Basic Blocks for efficient execution. It eliminates the need for costly, implementation-specific work to track CU consumption during execution failures, providing a clear and consistent approach to handling exceptions. This change allows clients to maintain consensus without sacrificing performance, ensuring that the protocol remains both efficient and robust. + +## Alternatives Considered + +None + +## New Terminology + +- [Basic Block](https://en.wikipedia.org/wiki/Basic_block):i In the context of JIT execution and BPF processing, a Basic Block is a sequence of BPF instructions that forms a single, linear flow of control with no loops or conditional branches except for the entry and exit points. It represents a segment of code where execution starts at the first instruction and proceeds sequentially through to the last instruction without deviation. The Basic Block is characterized by its predictable execution path, allowing for efficient budget checks and optimizations, as its Compute Unit (CU) cost can be determined before execution and verified at the end of the block. 
+ +## Detailed Design + +At banking stage [here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99) and replay stage [here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239) where Transaction's executed_units is checked, implement new logic: +``` +let execution_cu = match transaction.execution_results { + Ok(_) || Err(TransactionError::CustomError(_)) => committed_tx.executed_cu, + _ => transaction.requested_cu, +}; +... ... +``` + +## Impact + +None + +## Security Considerations + +One potential issue with using requested CUs in the case of failed transactions is the risk of transactions with grossly large CU requests consuming an excessive portion of the block's CU limit. This could effectively cause a denial-of-service effect by preventing legitimate transactions from being included in the block. To mitigate this risk, it is recommended that this proposal be implemented after SIMD-172 is deployed, which removes the possibility of accidentally requesting an excessively large number of CUs. + +By ensuring that CU requests are reasonable and controlled, the risk of failed transactions taking up disproportionate block space will be minimized, allowing the proposed solution to work effectively without compromising block utilization. 
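Note on the Detailed Design excerpt in this patch: the `match` there is pseudocode. A minimal compilable sketch of the same rule might look like the following, assuming simplified stand-in types. `ExecError`, `ExecutedTx`, and `charged_cu` are hypothetical names used only for illustration and are not Agave APIs. In compilable Rust, alternative patterns are joined with a single `|` rather than `||`.

```rust
// Hypothetical, simplified types for illustration only; not the Agave API.
#[derive(Debug, Clone, PartialEq)]
enum ExecError {
    /// The program returned its own error code, so every basic block exited normally.
    Custom(u32),
    /// The VM aborted in the middle of a basic block (e.g. an invalid memory access).
    VmFault,
}

struct ExecutedTx {
    result: Result<(), ExecError>,
    /// CUs actually metered during execution.
    executed_cu: u64,
    /// CU limit the transaction asked for.
    requested_cu: u64,
}

/// CUs to charge against the block: the metered amount when execution ended
/// at a basic-block boundary, the full request otherwise.
fn charged_cu(tx: &ExecutedTx) -> u64 {
    match &tx.result {
        Ok(()) | Err(ExecError::Custom(_)) => tx.executed_cu,
        Err(_) => tx.requested_cu,
    }
}

fn main() {
    let faulted = ExecutedTx {
        result: Err(ExecError::VmFault),
        executed_cu: 1_200,
        requested_cu: 200_000,
    };
    println!("charged CUs: {}", charged_cu(&faulted));
}
```

The two-arm split mirrors the proposal's intent: only failures that leave a basic block partway through fall into the second arm.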
From 989a88f9782de985e421a50e98b96184ca34dacb Mon Sep 17 00:00:00 2001 From: Tao Zhu <82401714+tao-stones@users.noreply.github.com> Date: Wed, 2 Oct 2024 18:37:26 -0500 Subject: [PATCH 02/15] Apply suggestions from code review Co-authored-by: Philip Taffet <123486962+ptaffet-jump@users.noreply.github.com> --- .../simd-XXXX-vm-consume-budget-for-percise-failure.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index 36a4235b..e3e78bc2 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -15,7 +15,7 @@ extends: ## Summary -Adjusting how CU consumption is measured based on the conditions of Basic Block execution: successful completion will charge actual CUs, or requested CU if exceptions during Basic Block execution. +Adjusting how CU consumption is measured based on the conditions of transaction execution: successful completion will consume actual CUs, but certain irregular failures will result in the transaction automatically consuming all requested CUs. ## Motivation @@ -25,13 +25,15 @@ In the Solana protocol, tracking transaction Compute Unit (CU) consumption is a ### Proposed Change: -To improve performance, Solana programs are often compiled into Basic Blocks — linear sequences of BPF instructions with a single entry and exit point, and no loops or branches. Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU consumption for each individual BPF instruction. +To improve performance, Solana programs are often compiled with a JIT that works at the level of Basic Blocks — linear sequences of sBPF instructions with a single entry and exit point, and no loops or branches. 
Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU consumption for each individual BPF instruction. -When a Basic Block is executed successfully (i.e., it exits at the final BPF instruction in the block), the total CU consumption is deterministic and can be calculated before execution. This ensures that CU accounting for successful transactions is accurate and predictable, enabling all clients to agree on the transaction’s execution cost. +Other than in rare, exceptional situations discussed below, the total CU consumption for a Basic Block is deterministic and, and CU accounting can be done once per basic block instead of at each instruction. +A transaction completing successfully or with most errors implies that execution exited each basic block at its single exit point, +and thus that the total CU consumption of the execution is equal to the sum of the CU cost of each Basic Block executed. However, when an exception is thrown during the execution of a Basic Block (e.g., a null memory dereference or other faults), determining the exact number of CUs consumed up to the point of failure requires additional effort. For instance, the Agave client implements a mechanism that tracks the Instruction Pointer (IP) or Program Counter (PC) to backtrack and estimate the CUs consumed when an exception occurs. More details on this mechanism can be found [here](https://github.com/solana-labs/rbpf/blob/57139e9e1fca4f01155f7d99bc55cdcc25b0bc04/src/jit.rs#L267). -While this approach is effective, it introduces additional work and complexity. These mechanisms are often implementation-specific, and requiring all clients to track the exact number of executed BPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus. +While this approach is effective, it introduces additional work and complexity. 
These mechanisms are often implementation-specific, and requiring all clients to track the exact number of executed BPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus, especially since these cases are rare. ### Clarified Protocol Behavior: From 6f6e049dd656fd00b883378e37809d0bf3cd8a35 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 09:40:33 -0500 Subject: [PATCH 03/15] set textwidth --- ...X-vm-consume-budget-for-percise-failure.md | 118 +++++++++++++----- 1 file changed, 88 insertions(+), 30 deletions(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index e3e78bc2..b84b2129 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -7,46 +7,87 @@ category: Standard type: Core status: Draft created: 2024-MM-DD -feature: -supersedes: +feature: +supersedes: superseded-by: extends: --- ## Summary -Adjusting how CU consumption is measured based on the conditions of transaction execution: successful completion will consume actual CUs, but certain irregular failures will result in the transaction automatically consuming all requested CUs. +Adjusting how CU consumption is measured based on the conditions of transaction +execution: successful completion will consume actual CUs, but certain irregular +failures will result in the transaction automatically consuming all requested +CUs. ## Motivation ### Background: -In the Solana protocol, tracking transaction Compute Unit (CU) consumption is a critical aspect of maintaining consensus. Block costs are part of this consensus, meaning that all clients must agree on the execution cost of each transaction, including those that error out during execution. Ensuring consistency in CU tracking across clients is essential for maintaining protocol integrity. 
+In the Solana protocol, tracking transaction Compute Unit (CU) consumption is a +critical aspect of maintaining consensus. Block costs are part of this +consensus, meaning that all clients must agree on the execution cost of each +transaction, including those that error out during execution. Ensuring +consistency in CU tracking across clients is essential for maintaining protocol +integrity. ### Proposed Change: -To improve performance, Solana programs are often compiled with a JIT that works at the level of Basic Blocks — linear sequences of sBPF instructions with a single entry and exit point, and no loops or branches. Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU consumption for each individual BPF instruction. - -Other than in rare, exceptional situations discussed below, the total CU consumption for a Basic Block is deterministic and, and CU accounting can be done once per basic block instead of at each instruction. -A transaction completing successfully or with most errors implies that execution exited each basic block at its single exit point, -and thus that the total CU consumption of the execution is equal to the sum of the CU cost of each Basic Block executed. - -However, when an exception is thrown during the execution of a Basic Block (e.g., a null memory dereference or other faults), determining the exact number of CUs consumed up to the point of failure requires additional effort. For instance, the Agave client implements a mechanism that tracks the Instruction Pointer (IP) or Program Counter (PC) to backtrack and estimate the CUs consumed when an exception occurs. More details on this mechanism can be found [here](https://github.com/solana-labs/rbpf/blob/57139e9e1fca4f01155f7d99bc55cdcc25b0bc04/src/jit.rs#L267). - -While this approach is effective, it introduces additional work and complexity. 
These mechanisms are often implementation-specific, and requiring all clients to track the exact number of executed BPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus, especially since these cases are rare. +To improve performance, Solana programs are often compiled with a JIT that works +at the level of Basic Blocks — linear sequences of sBPF instructions with a +single entry and exit point, and no loops or branches. Basic Blocks allow for +efficient execution by reducing the overhead associated with tracking CU +consumption for each individual BPF instruction. + +Other than in rare, exceptional situations discussed below, the total CU +consumption for a Basic Block is deterministic and, and CU accounting can be +done once per basic block instead of at each instruction. A transaction +completing successfully or with most errors implies that execution exited each +basic block at its single exit point, and thus that the total CU consumption of +the execution is equal to the sum of the CU cost of each Basic Block executed. + +However, when an exception is thrown during the execution of a Basic Block +(e.g., a null memory dereference or other faults), determining the exact number +of CUs consumed up to the point of failure requires additional effort. For +instance, the Agave client implements a mechanism that tracks the Instruction +Pointer (IP) or Program Counter (PC) to backtrack and estimate the CUs consumed +when an exception occurs. More details on this mechanism can be found +[here](https://github.com/solana-labs/rbpf/blob/57139e9e1fca4f01155f7d99bc55cdcc25b0bc04/src/jit.rs#L267). + +While this approach is effective, it introduces additional work and complexity. +These mechanisms are often implementation-specific, and requiring all clients to +track the exact number of executed BPF instructions for consensus is costly and +unnecessary. 
Such precision is not essential for protocol-level consensus, +especially since these cases are rare. ### Clarified Protocol Behavior: -Instead of mandating implementation-specific work to handle exceptions, we propose the following clarification in the protocol: +Instead of mandating implementation-specific work to handle exceptions, we +propose the following clarification in the protocol: -- For successful execution of a Basic Block (i.e., the block exits at the last BPF instruction), the deterministic CU cost of the block will be charged to the transaction’s CU meter. This ensures that CU consumption for successful transactions is accurately accounted for. -- In the event of an exception during Basic Block execution, where the block does not exit normally, the requested CUs for the transaction will be charged to the CU meter. This allows for a simple and efficient fallback mechanism that avoids the need for tracking the exact number of executed instructions up to the point of failure. +- For successful execution of a Basic Block (i.e., the block exits at the last + BPF instruction), the deterministic CU cost of the block will be charged to +the transaction’s CU meter. This ensures that CU consumption for successful +transactions is accurately accounted for. +- In the event of an exception during Basic Block execution, where the block + does not exit normally, the requested CUs for the transaction will be charged +to the CU meter. This allows for a simple and efficient fallback mechanism that +avoids the need for tracking the exact number of executed instructions up to the +point of failure. -By adopting this approach, the protocol avoids the overhead of requiring precise instruction-level CU tracking for transactions that fail. Instead, the requested CU limit of the transaction will be used, simplifying the handling of failed transactions while still maintaining consensus. 
+By adopting this approach, the protocol avoids the overhead of requiring precise +instruction-level CU tracking for transactions that fail. Instead, the requested +CU limit of the transaction will be used, simplifying the handling of failed +transactions while still maintaining consensus. ### Conclusion: -This proposal enhances performance and simplifies CU tracking by formalizing the use of Basic Blocks for efficient execution. It eliminates the need for costly, implementation-specific work to track CU consumption during execution failures, providing a clear and consistent approach to handling exceptions. This change allows clients to maintain consensus without sacrificing performance, ensuring that the protocol remains both efficient and robust. +This proposal enhances performance and simplifies CU tracking by formalizing the +use of Basic Blocks for efficient execution. It eliminates the need for costly, +implementation-specific work to track CU consumption during execution failures, +providing a clear and consistent approach to handling exceptions. This change +allows clients to maintain consensus without sacrificing performance, ensuring +that the protocol remains both efficient and robust. ## Alternatives Considered @@ -54,18 +95,26 @@ None ## New Terminology -- [Basic Block](https://en.wikipedia.org/wiki/Basic_block):i In the context of JIT execution and BPF processing, a Basic Block is a sequence of BPF instructions that forms a single, linear flow of control with no loops or conditional branches except for the entry and exit points. It represents a segment of code where execution starts at the first instruction and proceeds sequentially through to the last instruction without deviation. The Basic Block is characterized by its predictable execution path, allowing for efficient budget checks and optimizations, as its Compute Unit (CU) cost can be determined before execution and verified at the end of the block. 
+- [Basic Block](https://en.wikipedia.org/wiki/Basic_block):i In the context of + JIT execution and BPF processing, a Basic Block is a sequence of BPF +instructions that forms a single, linear flow of control with no loops or +conditional branches except for the entry and exit points. It represents a +segment of code where execution starts at the first instruction and proceeds +sequentially through to the last instruction without deviation. The Basic Block +is characterized by its predictable execution path, allowing for efficient +budget checks and optimizations, as its Compute Unit (CU) cost can be determined +before execution and verified at the end of the block. ## Detailed Design -At banking stage [here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99) and replay stage [here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239) where Transaction's executed_units is checked, implement new logic: -``` -let execution_cu = match transaction.execution_results { - Ok(_) || Err(TransactionError::CustomError(_)) => committed_tx.executed_cu, - _ => transaction.requested_cu, -}; -... ... -``` +At banking stage +[here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99) +and replay stage +[here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239) +where Transaction's executed_units is checked, implement new logic: ``` let +execution_cu = match transaction.execution_results { Ok(_) || +Err(TransactionError::CustomError(_)) => committed_tx.executed_cu, _ => +transaction.requested_cu, }; ... ... ``` ## Impact @@ -73,6 +122,15 @@ None ## Security Considerations -One potential issue with using requested CUs in the case of failed transactions is the risk of transactions with grossly large CU requests consuming an excessive portion of the block's CU limit. 
This could effectively cause a denial-of-service effect by preventing legitimate transactions from being included in the block. To mitigate this risk, it is recommended that this proposal be implemented after SIMD-172 is deployed, which removes the possibility of accidentally requesting an excessively large number of CUs. - -By ensuring that CU requests are reasonable and controlled, the risk of failed transactions taking up disproportionate block space will be minimized, allowing the proposed solution to work effectively without compromising block utilization. +One potential issue with using requested CUs in the case of failed transactions +is the risk of transactions with grossly large CU requests consuming an +excessive portion of the block's CU limit. This could effectively cause a +denial-of-service effect by preventing legitimate transactions from being +included in the block. To mitigate this risk, it is recommended that this +proposal be implemented after SIMD-172 is deployed, which removes the +possibility of accidentally requesting an excessively large number of CUs. + +By ensuring that CU requests are reasonable and controlled, the risk of failed +transactions taking up disproportionate block space will be minimized, allowing +the proposed solution to work effectively without compromising block +utilization. 
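Note on the Motivation text in this patch: the claim that CU accounting can be done once per Basic Block, with an irregular failure falling back to the requested CUs, can be illustrated with a toy model. This is a sketch under stated assumptions, not the rbpf or Firedancer implementation. `BasicBlock`, `Outcome`, and `run` are hypothetical names, and the handling of an exhausted budget is an assumption of the sketch rather than something this patch specifies.

```rust
// Hypothetical model for illustration; not the rbpf or Firedancer implementation.
struct BasicBlock {
    /// CU cost of the whole block, known before it executes.
    cost: u64,
    /// True when execution aborts partway through this block (an irregular failure).
    faults_midway: bool,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    BudgetExceeded,
    IrregularFailure,
}

/// Runs the blocks in order and returns the outcome plus the CUs consumed
/// out of `requested_cu`.
fn run(blocks: &[BasicBlock], requested_cu: u64) -> (Outcome, u64) {
    let mut remaining = requested_cu;
    for block in blocks {
        if block.cost > remaining {
            // Assumption of this sketch: running out of budget uses up the whole request.
            return (Outcome::BudgetExceeded, requested_cu);
        }
        // Charge once for the whole block instead of once per instruction.
        remaining -= block.cost;
        if block.faults_midway {
            // No attempt to work out how far into the block execution got:
            // the meter is depleted, so the full request is consumed.
            return (Outcome::IrregularFailure, requested_cu);
        }
    }
    (Outcome::Completed, requested_cu - remaining)
}

fn main() {
    let blocks = [
        BasicBlock { cost: 10, faults_midway: false },
        BasicBlock { cost: 25, faults_midway: false },
        BasicBlock { cost: 7, faults_midway: false },
    ];
    let (outcome, consumed) = run(&blocks, 100);
    println!("{:?}, consumed {} CUs", outcome, consumed);
}
```

When every block exits normally, the consumed total is the sum of the block costs. When a block faults midway, the result does not depend on where in the block the fault occurred, which is the property that lets clients agree without backtracking.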
From 19a48f1c38b975f1df4cd904812f28f7e3516b5f Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 09:51:28 -0500 Subject: [PATCH 04/15] add irregular failure term --- proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index b84b2129..4d22e3c6 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -95,7 +95,7 @@ None ## New Terminology -- [Basic Block](https://en.wikipedia.org/wiki/Basic_block):i In the context of +- [Basic Block](https://en.wikipedia.org/wiki/Basic_block): In the context of JIT execution and BPF processing, a Basic Block is a sequence of BPF instructions that forms a single, linear flow of control with no loops or conditional branches except for the entry and exit points. It represents a @@ -105,6 +105,9 @@ is characterized by its predictable execution path, allowing for efficient budget checks and optimizations, as its Compute Unit (CU) cost can be determined before execution and verified at the end of the block. +- Irregular transaction failure: A rare case that a Transaction execution aborts +in the middle of executing basic block, results in consuming all requested CUs. 
+ ## Detailed Design At banking stage From 08a9f5691daa71587b3e6f385ed599cbb57b3a9b Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 10:37:52 -0500 Subject: [PATCH 05/15] updated proposed design details --- ...X-vm-consume-budget-for-percise-failure.md | 56 ++++++++++++++++--- 1 file changed, 49 insertions(+), 7 deletions(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index 4d22e3c6..c46ecd07 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -110,14 +110,56 @@ in the middle of executing basic block, results in consuming all requested CUs. ## Detailed Design -At banking stage +Banking stage and Replay stage collect transaction execution results and CUs +consumed at [here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99) -and replay stage -[here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239) -where Transaction's executed_units is checked, implement new logic: ``` let -execution_cu = match transaction.execution_results { Ok(_) || -Err(TransactionError::CustomError(_)) => committed_tx.executed_cu, _ => -transaction.requested_cu, }; ... ... ``` +and +[here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239), +respectively. + +Where the `committed_tx.executed_units` are accumulated value of each +Instruction `compute_unit_consumed`, which essentially is the changes of +invoke_context's CU_meter. 
+ +We propose VM to deplete CU meter when irregular failure occurs during +execution, which is following errors in Agave: + +``` +EbpfError::DivideByZero +EbpfError::DivideOverflow +EbpfError::CallOutsideTextSegment +EbpfError::InvalidInstruction +EbpfError::InvalidVirtualAddress +``` + +And in Firedancer: + +``` +#define FD_VM_ERR_SIGSPLIT ( -9) /* split multiword instruction (e.g. jump into the middle of a multiword instruction) */ +#define FD_VM_ERR_SIGILL (-12) /* illegal instruction (e.g. opcode is not valid) */ +#define FD_VM_ERR_SIGSEGV (-13) /* illegal memory address (e.g. read/write to an address not backed by any memory) */ +#define FD_VM_ERR_SIGBUS (-14) /* misaligned memory address (e.g. read/write to an address with inappropriate alignment) */ +#define FD_VM_ERR_SIGRDONLY (-15) /* illegal write (e.g. write to a read only address) */ +#define FD_VM_ERR_SIGFPE (-18) /* divide by zero */ +``` + +In this way, detecting irregular failure is fully encapsulated within VMs, call +sites can continue work on Execution Results without change. + +### Alternatives: + +No changes to VM, instead at call sites, e.g. at Banking Stage and Replay Stage +to check execution results then determine how many CUs to consume, like this: + +``` +let execution_cu = match transaction.execution_results { + irregualr_execution_failure => transaction.requested_cu, + _ => committed_tx.executed_cu, +}; +``` + +This alternative requires mapping VM error to runtime InstructionError, and +pushing irregualr failure detection upstream to call sites. 
## Impact From 9a22dfcad2455b5eb21ec69fbdbfeb13d01056e1 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 10:49:46 -0500 Subject: [PATCH 06/15] /typo --- proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index c46ecd07..2c377ae1 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -40,7 +40,7 @@ efficient execution by reducing the overhead associated with tracking CU consumption for each individual BPF instruction. Other than in rare, exceptional situations discussed below, the total CU -consumption for a Basic Block is deterministic and, and CU accounting can be +consumption for a Basic Block is deterministic and CU accounting can be done once per basic block instead of at each instruction. A transaction completing successfully or with most errors implies that execution exited each basic block at its single exit point, and thus that the total CU consumption of From d307391f533cf22993abd56dab875d7c25bd97d9 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 15:32:19 -0500 Subject: [PATCH 07/15] updated detail section --- ...X-vm-consume-budget-for-percise-failure.md | 52 +------------------ 1 file changed, 2 insertions(+), 50 deletions(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index 2c377ae1..24e08aa5 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -110,56 +110,8 @@ in the middle of executing basic block, results in consuming all requested CUs. 
## Detailed Design -Banking stage and Replay stage collect transaction execution results and CUs -consumed at -[here](https://github.com/anza-xyz/agave/blob/master/core/src/banking_stage/committer.rs#L99) -and -[here](https://github.com/anza-xyz/agave/blob/master/ledger/src/blockstore_processor.rs#L239), -respectively. - -Where the `committed_tx.executed_units` are accumulated value of each -Instruction `compute_unit_consumed`, which essentially is the changes of -invoke_context's CU_meter. - -We propose VM to deplete CU meter when irregular failure occurs during -execution, which is following errors in Agave: - -``` -EbpfError::DivideByZero -EbpfError::DivideOverflow -EbpfError::CallOutsideTextSegment -EbpfError::InvalidInstruction -EbpfError::InvalidVirtualAddress -``` - -And in Firedancer: - -``` -#define FD_VM_ERR_SIGSPLIT ( -9) /* split multiword instruction (e.g. jump into the middle of a multiword instruction) */ -#define FD_VM_ERR_SIGILL (-12) /* illegal instruction (e.g. opcode is not valid) */ -#define FD_VM_ERR_SIGSEGV (-13) /* illegal memory address (e.g. read/write to an address not backed by any memory) */ -#define FD_VM_ERR_SIGBUS (-14) /* misaligned memory address (e.g. read/write to an address with inappropriate alignment) */ -#define FD_VM_ERR_SIGRDONLY (-15) /* illegal write (e.g. write to a read only address) */ -#define FD_VM_ERR_SIGFPE (-18) /* divide by zero */ -``` - -In this way, detecting irregular failure is fully encapsulated within VMs, call -sites can continue work on Execution Results without change. - -### Alternatives: - -No changes to VM, instead at call sites, e.g. 
at Banking Stage and Replay Stage -to check execution results then determine how many CUs to consume, like this: - -``` -let execution_cu = match transaction.execution_results { - irregualr_execution_failure => transaction.requested_cu, - _ => committed_tx.executed_cu, -}; -``` - -This alternative requires mapping VM error to runtime InstructionError, and -pushing irregualr failure detection upstream to call sites. +If VM execution returns any error except `SyscallError`, transaction's CU meter +should be depleted; otherwise the actual executed CUs shall be consumed. ## Impact From 30f088c4485e4372fdbf08d02245a8f78b0b1389 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 3 Oct 2024 17:39:41 -0500 Subject: [PATCH 08/15] clean up --- ...X-vm-consume-budget-for-percise-failure.md | 45 ++++++------------- 1 file changed, 13 insertions(+), 32 deletions(-) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md index 24e08aa5..34b15346 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md @@ -22,8 +22,6 @@ CUs. ## Motivation -### Background: - In the Solana protocol, tracking transaction Compute Unit (CU) consumption is a critical aspect of maintaining consensus. Block costs are part of this consensus, meaning that all clients must agree on the execution cost of each @@ -31,17 +29,15 @@ transaction, including those that error out during execution. Ensuring consistency in CU tracking across clients is essential for maintaining protocol integrity. -### Proposed Change: - To improve performance, Solana programs are often compiled with a JIT that works at the level of Basic Blocks — linear sequences of sBPF instructions with a single entry and exit point, and no loops or branches. 
Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU -consumption for each individual BPF instruction. +consumption for each individual sBPF instruction. Other than in rare, exceptional situations discussed below, the total CU consumption for a Basic Block is deterministic and CU accounting can be -done once per basic block instead of at each instruction. A transaction +done once per basic block instead of at each instruction. A transaction completing successfully or with most errors implies that execution exited each basic block at its single exit point, and thus that the total CU consumption of the execution is equal to the sum of the CU cost of each Basic Block executed. @@ -56,38 +52,27 @@ when an exception occurs. More details on this mechanism can be found While this approach is effective, it introduces additional work and complexity. These mechanisms are often implementation-specific, and requiring all clients to -track the exact number of executed BPF instructions for consensus is costly and +track the exact number of executed sBPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus, especially since these cases are rare. -### Clarified Protocol Behavior: - Instead of mandating implementation-specific work to handle exceptions, we propose the following clarification in the protocol: - For successful execution of a Basic Block (i.e., the block exits at the last - BPF instruction), the deterministic CU cost of the block will be charged to + sBPF instruction), the deterministic CU cost of the block will be charged to the transaction’s CU meter. This ensures that CU consumption for successful transactions is accurately accounted for. -- In the event of an exception during Basic Block execution, where the block - does not exit normally, the requested CUs for the transaction will be charged -to the CU meter. 
This allows for a simple and efficient fallback mechanism that -avoids the need for tracking the exact number of executed instructions up to the -point of failure. +- In the event of irregular failure, where execution aborts from the middle of +basic block, the requested CUs for the transaction will be charged to the CU +meter. This allows for a simple and efficient fallback mechanism that avoids the +need for tracking the exact number of executed instructions up to the point of +failure. By adopting this approach, the protocol avoids the overhead of requiring precise instruction-level CU tracking for transactions that fail. Instead, the requested -CU limit of the transaction will be used, simplifying the handling of failed -transactions while still maintaining consensus. - -### Conclusion: - -This proposal enhances performance and simplifies CU tracking by formalizing the -use of Basic Blocks for efficient execution. It eliminates the need for costly, -implementation-specific work to track CU consumption during execution failures, -providing a clear and consistent approach to handling exceptions. This change -allows clients to maintain consensus without sacrificing performance, ensuring -that the protocol remains both efficient and robust. +CU limit of the transaction will be used, simplifying the handling of +irregularly failed transactions while still maintaining consensus. ## Alternatives Considered @@ -111,7 +96,8 @@ in the middle of executing basic block, results in consuming all requested CUs. ## Detailed Design If VM execution returns any error except `SyscallError`, transaction's CU meter -should be depleted; otherwise the actual executed CUs shall be consumed. +should be fully depleted, in another words, all requested CUs are consumed; +otherwise consumes the actual executed CUs. ## Impact @@ -126,8 +112,3 @@ denial-of-service effect by preventing legitimate transactions from being included in the block. 
To mitigate this risk, it is recommended that this proposal be implemented after SIMD-172 is deployed, which removes the possibility of accidentally requesting an excessively large number of CUs. - -By ensuring that CU requests are reasonable and controlled, the risk of failed -transactions taking up disproportionate block space will be minimized, allowing -the proposed solution to work effectively without compromising block -utilization. From d439937ce50f9448a31814b774bb1e6f53de8321 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Fri, 4 Oct 2024 09:17:53 -0500 Subject: [PATCH 09/15] add simd id --- ...ercise-failure.md => simd-0182-conditional-cu-metering.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename proposals/{simd-XXXX-vm-consume-budget-for-percise-failure.md => simd-0182-conditional-cu-metering.md} (99%) diff --git a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md b/proposals/simd-0182-conditional-cu-metering.md similarity index 99% rename from proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md rename to proposals/simd-0182-conditional-cu-metering.md index 34b15346..a1e6aa42 100644 --- a/proposals/simd-XXXX-vm-consume-budget-for-percise-failure.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -1,12 +1,12 @@ --- -simd: 'XXXX' +simd: '0182' title: Conditional CU metering authors: - Tao Zhu (Anza) category: Standard type: Core status: Draft -created: 2024-MM-DD +created: 2024-10-03 feature: supersedes: superseded-by: From a69ec9317ae36ccd0d7c19e54bd9be391c4094b3 Mon Sep 17 00:00:00 2001 From: Tao Zhu <82401714+tao-stones@users.noreply.github.com> Date: Wed, 20 Nov 2024 16:01:58 -0600 Subject: [PATCH 10/15] Apply suggestions from code review Co-authored-by: Justin Starry --- proposals/simd-0182-conditional-cu-metering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index a1e6aa42..e053d953 100644 
--- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -17,7 +17,7 @@ extends: Adjusting how CU consumption is measured based on the conditions of transaction execution: successful completion will consume actual CUs, but certain irregular -failures will result in the transaction automatically consuming all requested +failures in the sBPF VM will result in the transaction automatically consuming all requested CUs. ## Motivation From 0aed128ee95f85edde18faf45c46601dcefc1d94 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Wed, 20 Nov 2024 16:02:56 -0600 Subject: [PATCH 11/15] update --- proposals/simd-0182-conditional-cu-metering.md | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index e053d953..223bb6ad 100644 --- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -1,6 +1,6 @@ --- simd: '0182' -title: Conditional CU metering +title: Consume reuested CUs for sBPF failures authors: - Tao Zhu (Anza) category: Standard @@ -105,10 +105,4 @@ None ## Security Considerations -One potential issue with using requested CUs in the case of failed transactions -is the risk of transactions with grossly large CU requests consuming an -excessive portion of the block's CU limit. This could effectively cause a -denial-of-service effect by preventing legitimate transactions from being -included in the block. To mitigate this risk, it is recommended that this -proposal be implemented after SIMD-172 is deployed, which removes the -possibility of accidentally requesting an excessively large number of CUs. 
+None From 0dfaa4d541425ae1b699d6dfd0cea28c5e1e36e5 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Wed, 20 Nov 2024 16:19:22 -0600 Subject: [PATCH 12/15] fmt --- proposals/simd-0182-conditional-cu-metering.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index 223bb6ad..024cca84 100644 --- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -17,8 +17,8 @@ extends: Adjusting how CU consumption is measured based on the conditions of transaction execution: successful completion will consume actual CUs, but certain irregular -failures in the sBPF VM will result in the transaction automatically consuming all requested -CUs. +failures in the sBPF VM will result in the transaction automatically consuming +all requested CUs. ## Motivation From 73507970314cdeba7a87c4b2f241f689f52dd88e Mon Sep 17 00:00:00 2001 From: Tao Zhu <82401714+tao-stones@users.noreply.github.com> Date: Thu, 21 Nov 2024 09:15:17 -0600 Subject: [PATCH 13/15] Update proposals/simd-0182-conditional-cu-metering.md Co-authored-by: Justin Starry --- proposals/simd-0182-conditional-cu-metering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index 024cca84..c43e2a27 100644 --- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -1,6 +1,6 @@ --- simd: '0182' -title: Consume reuested CUs for sBPF failures +title: Consume requested CUs for sBPF failures authors: - Tao Zhu (Anza) category: Standard From 51942278019896e83b4992f9aad9b9c247428e68 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 21 Nov 2024 09:21:29 -0600 Subject: [PATCH 14/15] lint --- proposals/simd-0182-conditional-cu-metering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git 
a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index c43e2a27..43d0a0cc 100644 --- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -5,7 +5,7 @@ authors: - Tao Zhu (Anza) category: Standard type: Core -status: Draft +status: Review created: 2024-10-03 feature: supersedes: From e8c74ac084a51587f405b07ea90df90b716f01f8 Mon Sep 17 00:00:00 2001 From: Tao Zhu Date: Thu, 21 Nov 2024 19:28:26 -0600 Subject: [PATCH 15/15] in anticipating furture changes such as direct mapping, describe such VM error as "less common" --- proposals/simd-0182-conditional-cu-metering.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/simd-0182-conditional-cu-metering.md b/proposals/simd-0182-conditional-cu-metering.md index 43d0a0cc..8398626b 100644 --- a/proposals/simd-0182-conditional-cu-metering.md +++ b/proposals/simd-0182-conditional-cu-metering.md @@ -35,7 +35,7 @@ single entry and exit point, and no loops or branches. Basic Blocks allow for efficient execution by reducing the overhead associated with tracking CU consumption for each individual sBPF instruction. -Other than in rare, exceptional situations discussed below, the total CU +Other than in less common situations discussed below, the total CU consumption for a Basic Block is deterministic and CU accounting can be done once per basic block instead of at each instruction. A transaction completing successfully or with most errors implies that execution exited each @@ -54,7 +54,7 @@ While this approach is effective, it introduces additional work and complexity. These mechanisms are often implementation-specific, and requiring all clients to track the exact number of executed sBPF instructions for consensus is costly and unnecessary. Such precision is not essential for protocol-level consensus, -especially since these cases are rare. +especially since these cases are infrequent. 
Instead of mandating implementation-specific work to handle exceptions, we propose the following clarification in the protocol:
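Reviewer-side illustration, not part of the patch series: the rule that the series leaves in the Detailed Design section (any VM error other than `SyscallError` fully depletes the CU meter; otherwise the actual executed CUs are consumed) could be sketched roughly as below. This is a minimal sketch under stated assumptions. `VmOutcome` and `cus_to_charge` are hypothetical names introduced only for illustration; they are not Agave, Firedancer, or rbpf APIs, and the three-way outcome is a simplification of the real error types.

```rust
/// Hypothetical, simplified outcome of running a program in the sBPF VM.
/// The real clients use richer error types; this enum only captures the
/// distinction the proposal's Detailed Design section relies on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmOutcome {
    /// Execution completed without error.
    Success,
    /// A syscall reported an error (the `SyscallError` case in the text).
    SyscallError,
    /// Any other VM error, e.g. a fault in the middle of a basic block.
    OtherVmError,
}

/// Returns how many CUs are charged to the transaction's CU meter.
///
/// `requested_cus` is the transaction's requested CU limit and
/// `executed_cus` is the amount actually metered during execution.
pub fn cus_to_charge(outcome: VmOutcome, requested_cus: u64, executed_cus: u64) -> u64 {
    match outcome {
        // Success and syscall errors: charge what was actually executed.
        VmOutcome::Success | VmOutcome::SyscallError => executed_cus,
        // Any other VM error: deplete the meter, i.e. charge everything requested.
        VmOutcome::OtherVmError => requested_cus,
    }
}
```

Keeping the decision in a single function that depends only on the outcome category mirrors the proposal's intent that call sites need no knowledge of where inside a basic block execution stopped.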