From e772874c3cd52d7e6874d8c44b3d1107b54c0b43 Mon Sep 17 00:00:00 2001
From: Garand Tyson
Date: Fri, 15 Nov 2024 00:23:23 -0800
Subject: [PATCH] Added multiple Archival Filters

---
 core/cap-0057.md | 199 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 174 insertions(+), 25 deletions(-)

diff --git a/core/cap-0057.md b/core/cap-0057.md
index ed769fcd4..ce525cc90 100644
--- a/core/cap-0057.md
+++ b/core/cap-0057.md
@@ -389,7 +389,7 @@ case LEDGER_ENTRY_RESTORE:
 #### Archival State Tree (AST)
 
 The Archive State Tree (AST) is a collection of immutable Merkle trees whose leaves
-are archived entries and `PERSISTENT` entry keys explicitly deleted via transaction
+are all archived entries and some `PERSISTENT` entry keys explicitly deleted via transaction
 execution. The AST is a collection of subtrees indexed `AST[0]`, `AST[1]`, …, `AST[N]`. The AST index number is called the Archival Epoch. We define the current Archival Epoch as N + 1, where N is the index of the most recently completed AST
@@ -406,7 +406,8 @@ In addition to these boundary leafs, `AST[k]` contains a leaf for
 1. Every `PERSISTENT` entry evicted (but not restored) during archival epoch `k`. These entries are stored in `AST[k]` as type `COLD_ARCHIVE_ARCHIVED_LEAF`.
-2. Every `PERSISTENT` entry explicitly deleted via transaction execution during epoch `k`.
+2. Every `PERSISTENT` entry explicitly deleted via transaction execution during epoch `k`,
+iff an ARCHIVED entry with given entry's key exists in some subtree AST[i] where i < k.
 These keys are stored in `AST[k]` as type `COLD_ARCHIVE_DELETED_LEAF`.
 Leaf nodes are sorted as follows:
@@ -616,7 +617,7 @@ lowerBound and upperBound proofs of existence are valid, then must check that
 the two entries provided are direct neighbors as follows:
 
 ```
-verifyProofOfNonexistence(p: ArchivalProof, r: ArchivalCOLD_ARCHIVE_HASH):
+verifyProofOfNonexistence(p: ArchivalProof, r: Hash):
   for i in range(0, len(p.keysToProve)):
     let k = p.keysToProve[i]
@@ -633,7 +634,8 @@ verifyProofOfNonexistence(p: ArchivalProof, r: ArchivalCOLD_ARCHIVE_HASH):
     if lowBound >= k or highBound <= k:
       return INVALID
 
-    // Note: verifyNonexistenceSubProof is functionally equivalent to verifyProofOfExistence
+    // Note: verifyNonexistenceSubProof is functionally equivalent to
+    // verifyProofOfExistence
     if not verifyNonexistenceSubProof(lowBound, r, p) or
        not verifyNonexistenceSubProof(highBound, r, p):
       return INVALID
@@ -658,14 +660,9 @@ set when it does not. However, the filter is guaranteed to never return a false
 If a key exists in the set, the filter always claims that the key exists. This means that if the binary fuse filter claims a key does not exist in the set, it is guaranteed to not exist. However if the filter claims a key exist in the set, it may or may not actually exist.
-
 These guarantees allow validators to check for AST subtree exclusion directly without
-the need of actual proofs of nonexistence in most cases. When a proof of nonexistence is
-required, the validator will first check the archival filter for the given subtree. If no
-false positive occurs and the key does not exist in the AST, the validator can conclude that
-the key does not exist without a proof of exclusion. However, occasionally a false positive
-might occur. Should the archival filter return a false positive, the transaction can include
-a full proof of exclusion for the given AST subtree to "override" the false positive.
+the need for actual proofs of nonexistence in most cases. In the case of a false positive,
+a full proof can be provided to "override" the filter, as detailed in the section below.
 
 The Archival Filter will be implemented as a 3 wise binary fuse filter with a bit width of 32 bits. This provides a 1 / 4 billion false positive rate with a storage overhead of 36 bits
@@ -674,6 +671,108 @@ approximately 211 MB of archival filter overhead. Additionally, there is only a
 a single false positive would have occurred throughout the entire history of the Stellar network.
 
+#### Proof Requirement for Key Creation
+
+When a key is being created, it has either never existed before, or has existed, but
+has since been deleted. These cases carry different proof requirements.
+
+If a key has never existed, it is necessary to prove that the key does not exist in
+any AST subtree.
+
+If a key previously existed and has been deleted, a proof of the deletion is required.
+If the entry was deleted in epoch i, a proof of existence for a DELETED node must be
+given for AST[i], and a proof of nonexistence must be given for every subtree AST[k] where k > i.
+
+This verification is as follows:
+
+```
+// proofs maps epoch -> ArchivalProof
+isCreationValid(key: LedgerKey, proofs: map(uint32, ArchivalProof), lastEpoch: uint32):
+  firstEpochForExclusion = 0
+
+  // If a proof of existence is provided, we must be proving recreation.
+  // The existence of a DELETED entry serves as the base of the proof, so we
+  // don't need proofs-of-exclusion older than the DELETED entry.
+  // If multiple DELETED events occur in different AST subtrees, only a proof
+  // for the most recent deletion is necessary.
+  if proofOfExistence in proofs:
+    existenceEpoch = proofOfExistenceEpoch
+    existenceProof = proofs[existenceEpoch]
+    if existenceProof.entryBeingProved.type != DELETED:
+      return INVALID
+
+    if verifyProofOfExistence(existenceProof, rootHashes[existenceEpoch])
+        == INVALID:
+      return INVALID
+
+    // Start checking for exclusion proofs after DELETED entry
+    firstEpochForExclusion = existenceEpoch + 1
+
+  for i in range(firstEpochForExclusion, lastEpoch):
+    filterForEpoch = filters[i]
+
+    if key in filterForEpoch:
+      // Possible filter miss occurred
+      if i not in proofs:
+        return INVALID
+
+      p = proofs[i]
+
+      // We should only ever need a single proof of existence
+      // for the latest DELETED entry, handled above this loop
+      if p.type == EXISTENCE:
+        return INVALID
+
+      if p.entryBeingProved != key:
+        return INVALID
+
+      if verifyProofOfNonexistence(p, rootHashes[i]) == INVALID:
+        return INVALID
+
+  return VALID
+```
+
+#### Proof Requirement for Entry Restoration
+
+When an entry is being restored, it is necessary to prove:
+
+1. The entry (with correct value) exists in some subtree AST[i]
+2. For every epoch k > i, no entry with the same key exists in AST[k]
+
+Intuitively, it must be shown the entry exists in the archive, and is the
+newest version of the entry in the archive. This verification is as follows:
+
+```
+// proofs maps epoch -> ArchivalProof
+isRestoreValid(key: LedgerKey, proofs: map(uint32, ArchivalProof), lastEpoch: uint32):
+  firstEpoch = getSmallestKey(proofs)
+  existenceProof = proofs[firstEpoch]
+
+  if existenceProof.type != EXISTENCE:
+    return INVALID
+
+  if verifyProofOfExistence(existenceProof, rootHashes[firstEpoch]) == INVALID:
+    return INVALID
+
+  for i in range(firstEpoch + 1, lastEpoch):
+    if key in filters[i]:
+      if i not in proofs:
+        return INVALID
+
+      // Entry must not be deleted, so require proofs of nonexistence
+      p = proofs[i]
+      if p.entryBeingProved != key:
+        return INVALID
+
+      if p.type != NONEXISTENCE:
+        return INVALID
+
+      if verifyProofOfNonexistence(p, rootHashes[i]) == INVALID:
+        return INVALID
+
+  return VALID
+```
+
 #### Generating the AST
 
 Each validator maintains the "Live State BucketList" (currently called the BucketList). This stores all live ledger state,
@@ -696,7 +795,8 @@ all empty.
 
 After some time, the Hot Archive `AST[0]` will become full and enter the Pending Cold Archive Queue. A new, empty Hot Archive is initialized for `AST[1]`. While in the Pending Cold Archive Queue, `AST[0]`
-will be converted into a single Cold Archive Bucket. In order to give validators time to perform this merge, the `AST[0]`
+will be converted into a single Cold Archive Bucket, and its Archival Filter will be prepared. In order to give
+validators time to perform this merge, `AST[0]`
 must stay in the queue for a minimum of `numLedgersToInitSnapshot` ledgers. In this example, `AST[1]` also becomes full before `numLedgersToInitSnapshot` ledgers occur. Thus, `AST[1]` is also added to this queue, and `AST[2]` is initialized as the current Hot Archive.
@@ -705,6 +805,7 @@ the current Hot Archive.
 After `numLedgersToInitSnapshot` ledgers have passed, `AST[0]` is now eligible to become the current Cold Archive.
 On the ledger that this occurs, `AST[0]` is removed from the Pending Cold Archive Queue and initialized as the current Cold Archive.
+At this time, the Archival Filter is also persisted to the live BucketList.
 Simultaneously, the single merged Cold Archive Bucket for `AST[0]` is published to history as the canonical Archival Snapshot for epoch 0.
@@ -740,7 +841,7 @@ It contains `HotArchiveBucketEntry` type entries and is constructed as follows:
 1. Whenever a `PERSISTENT` entry is evicted, the entry is deleted from the Live State BucketList and added to the Hot Archive as a `HOT_ARCHIVE_ARCHIVED` entry. The corresponding `TTLEntry` is deleted and not stored in the Hot Archive.
 2. Whenever a `PERSISTENT` entry is deleted as part of transaction execution (not deleted via eviction event), the key is stored in the Hot
-Archive as a `HOT_ARCHIVE_DELETED` entry.
+Archive as a `HOT_ARCHIVE_DELETED` entry iff an ARCHIVED entry with given entry's key exists in some subtree AST[i] where i < k.
 3. If an archived entry is restored and the entry currently exists in the Hot Archive, the `HOT_ARCHIVE_ARCHIVED` previously stored in the Hot Archive is overwritten by a `HOT_ARCHIVE_LIVE` entry.
 4. If a deleted key is recreated and the deleted key currently exists in the Hot Archive, the `HOT_ARCHIVE_DELETED` previously stored in the
@@ -815,6 +916,28 @@ The Cold Archive should only every have a single version for each key, such that
 Once the root Merkle node has been created, the root hash is persisted in the live state BucketList and the current Cold Archive is dropped.
 
+#### Writing Deleted Keys to the AST
+
+In order to prevent double restoration attacks, some deleted keys must be written to the AST to enforce proofs-of-nonexistence. However,
+writing deleted keys can be optimized, and not all deletion events require writing the deleted key.
+
+Suppose an entry is archived in epoch i, restored, then deleted. The goal is to prevent restoring this entry from epoch i again. If a
+DELETED entry for the given key is written in some subtree AST[k] for k > i, restorations will fail, as no proof-of-nonexistence for the
+key will exist for AST[k]. However, if an entry being deleted has not previously been archived, there is no reason to write the deletion
+event, as no double restores can occur. Thus, we only write a DELETED entry if some ARCHIVED entry exists for that key in the archive.
+
+This can be implemented by checking binary fuse filters. When a persistent entry is deleted, the validator will check the key against
+all Archival Filters for all subtrees, as well as check the Hot Archive and any pending Hot Archives for an ARCHIVED entry with
+the same key. If any binary fuse filter query indicates the existence of an ARCHIVED entry, or if an ARCHIVED entry exists in any Hot Archive,
+the deleted key must be written. If neither check finds one, no DELETED key needs to be written. While a filter false positive may
+occasionally require writing a DELETED key when no ARCHIVED entry actually exists, a DELETED entry will always be written whenever an ARCHIVED
+entry exists.
+
+Typically, PERSISTENT entries appear to be seldom deleted, and most deletions seem to occur shortly after creation. For example, a proposed
+DEX trading protocol creates temporary intermediary accounts to store liquidity when transacting between the classic DEX and Soroban-based
+DEXs. Since this intermediary account stores token balances, it is not appropriate for the TEMPORARY durability class. However, these PERSISTENT
+storage entries are usually quickly deleted after their construction.
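The deletion rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the validator implementation: `should_write_deleted_key`, `archival_filters`, and `hot_archives` are hypothetical names, and plain Python sets stand in for the binary fuse filters, so this sketch never produces the false positives a real filter can.

```python
from typing import Dict, Iterable, Set

# Hypothetical entry-type tags used only by this sketch.
ARCHIVED = "ARCHIVED"
DELETED = "DELETED"


def should_write_deleted_key(
    key: str,
    archival_filters: Iterable[Set[str]],
    hot_archives: Iterable[Dict[str, str]],
) -> bool:
    """Return True if deleting `key` must record a DELETED entry.

    archival_filters: one membership structure per completed AST subtree.
        A real validator would query binary fuse filters; plain sets are
        an assumption made to keep the sketch runnable.
    hot_archives: the current Hot Archive plus any pending ones, each
        modelled as a dict mapping key -> entry type.
    """
    # A filter hit means an ARCHIVED entry may exist in an older subtree.
    # With a real filter, a false positive only causes an extra write.
    if any(key in f for f in archival_filters):
        return True

    # Hot Archives are still stored on disk, so they are checked exactly.
    return any(archive.get(key) == ARCHIVED for archive in hot_archives)


if __name__ == "__main__":
    filters = [{"a"}, {"b"}]
    hot = [{"c": ARCHIVED, "d": DELETED}]
    for k in ("a", "c", "d", "z"):
        print(k, should_write_deleted_key(k, filters, hot))
```

A false positive from a real filter would only make this function return `True` more often, which costs an extra DELETED entry but never omits one that is needed.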
+
 #### Changes to History Archives
 
 The `HistoryArchiveState` will be extended as follows:
@@ -889,21 +1012,21 @@ for key in footprint.readWrite:
   if key does not exist in live BucketList:
     // Assume key is being created for the first time
 
-    // Incudes hot archive and every pending archival snapshot
+    // Includes hot archive, every pending archival snapshot, and cold archive
+    // Iteration is in order of most recent to oldest epoch
     for bucketList in allBucketListsStoredOnDisk:
-      if key exists in bucketlist:
+      if ARCHIVED(key) exists in bucketList:
         // Entry is archived
         failTx()
+      else if DELETED(key) exists in bucketList:
+        // Entry has been recently deleted, no need to check archive
+        continue
 
-    // oldest epoch is epoch of oldest pending archival snapshot,
+    // oldest epoch is epoch of cold archive (if one exists), the oldest pending archival snapshot,
     // or hot archive in there are no pending snapshots
-    for epoch in range(0, oldestEpochOnDisk):
-      filter = getArchivalFilter(epoch)
-      if key in filter:
-        // Check for proof in case of filter false positive
-        if validProof(key, epoch) not in tx.proofs:
-          // Entry is archived, or filter miss occurred without overriding proof
-          failTx()
+    if isCreationValid(key, tx.proofs, oldestEpochOnDisk) == INVALID:
+      // Entry is archived, or filter miss occurred without overriding proof
+      failTx()
 ```
 
 Following this new archival phase, `InvokeHostFunctionOp` will function identically to today.
@@ -916,7 +1039,32 @@ and a proof must be verified (see No Archival Fees for Entry Creation).
 is not currently stored by the validators. The operation can restore a mix of entries that are archived but currently stored by validators and those that are not stored by validators. Restoration proofs are included in the `proofs` struct inside `SorobanTransactionData` for
-the given transaction.
+the given transaction. Restore proof verification works as follows:
+
+```
+// readOnly must be empty
+for key in footprint.readWrite:
+  if key exists in live BucketList:
+    // Key is already live
+    continue
+
+  // Includes hot archive, every pending archival snapshot, and cold archive
+  // Iteration is in order of most recent to oldest epoch
+  for bucketList in allBucketListsStoredOnDisk:
+    if ARCHIVED(key) exists in bucketList:
+      // Entry is archived, restore it
+      restore(key)
+      continueToNextKey
+    else if DELETED(key) exists in bucketList:
+      // Entry was most recently deleted, do not allow outdated restore
+      failTx()
+
+  // oldest epoch is epoch of cold archive (if one exists), the oldest pending archival snapshot,
+  // or hot archive if there are no pending snapshots
+  if isRestoreValid(key, tx.proofs, oldestEpochOnDisk) == INVALID:
+    failTx()
+```
+
 `RestoreFootprintOp` will now require `instruction` resources, metered based on the complexity of verifying the given proofs of inclusion. There are no additional fees or resources
@@ -930,7 +1078,8 @@ persisted on validators via `lastArchivalEpochPersisted`. This will be useful to
 as it is the cutoff point at which a proof will need to be generated for `RestoreFootprintOp`.
 
 Whenever a `PERSISTENT` entry is evicted (i.e. removed from the Live State BucketList and added to the Hot Archive),
-the full entry will be emitted via `evictedPersistentLedgerEntries`.
+the entry key and its associated TTL key will be emitted via `evictedTemporaryLedgerKeys` (this field is unfortunately
+named for legacy reasons).
 
 Whenever an entry is restored via `RestoreFootprintOp`, the `LedgerEntry` being restored and its associated TTL will be emitted as a `LedgerEntryChange` of type `LEDGER_ENTRY_RESTORE`. Note that