TransactionView - sanitization checks #2757

apfitzge · 2024-08-27T16:34:53Z

Problem

Task in [Tracking] TransactionView #2255
Run and "encode" sanitization in the type

Summary of Changes

Add generic SANITIZED on the TransactionView to indicate if sanitization checks have been run
Add sanitize checks and tests

Since the structs are different, and we probably do not want some intermediate trait - sanitization checks are duplicated from sdk.

Fixes #

apfitzge · 2024-08-27T16:45:18Z

transaction-view/src/transaction_view.rs

-pub struct TransactionView<D: TransactionData> {
-    data: D,
-    meta: TransactionMeta,
+pub struct TransactionView<const SANITIZED: bool, D: TransactionData> {


@tao-stones @alessandrod

What are your thoughts on this approach wrt encoding whether we sanitized in the type itself?

Basically I'm weighing whether or not we should even expose an unsanitized type at all, or if that should be some internal that's not exposed.

I think because the sanitize checks can be relatively expensive, it makes sense to allow calling code to opt-out of doing them until necessary.

wrt the use of a const generic parameter, my first question is whether it's aiming for a type-level, compile-time guarantee, or whether a runtime is_sanitized flag or enum would suffice. My hesitation about a const generic parameter is the code complexity and, in many cases, the code bloat it adds. Just a quick thought, happy to discuss more.

I agree that exposing the unsanitized type provides flexibility; users should not be forced to sanitize a transaction before viewing it.

I think we should have some compile-time guarantee tbh; we want to be sure we aren't processing transactions that don't pass those checks.

In terms of complexity I'm not sure it adds too much, I probably should make these aliases pub:

// alias for convenience
pub type UnsanitizedTransactionView<D> = TransactionView<false, D>;
pub type SanitizedTransactionView<D> = TransactionView<true, D>;

What I attempted to achieve with the const generic is the type safety separation (good) of VersionedTransaction vs SanitizedVersionedTransaction, but without having the mess of different member functions or calling fns on fields that make using SanitizedVersionedTransaction a pain in the ass to work with.

Using const generic, both have exactly the same fns and data, it's just one has gone through necessary checks vs not.
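A minimal sketch of the idea, not the PR's actual code: the `TransactionData` trait shape, the `num_bytes` accessor, and the toy "non-empty" check are assumptions made for illustration. Only the struct signature, the two aliases, and the constructor names come from this thread.

```rust
/// Stand-in for the crate's `TransactionData` trait (assumed shape).
pub trait TransactionData {
    fn data(&self) -> &[u8];
}

impl TransactionData for Vec<u8> {
    fn data(&self) -> &[u8] {
        self.as_slice()
    }
}

/// One struct for both states; `SANITIZED` exists only in the type.
pub struct TransactionView<const SANITIZED: bool, D: TransactionData> {
    data: D,
}

pub type UnsanitizedTransactionView<D> = TransactionView<false, D>;
pub type SanitizedTransactionView<D> = TransactionView<true, D>;

// Shared by both states: the same accessors, called the same way.
impl<const SANITIZED: bool, D: TransactionData> TransactionView<SANITIZED, D> {
    /// Hypothetical accessor used only for this illustration.
    pub fn num_bytes(&self) -> usize {
        self.data.data().len()
    }
}

impl<D: TransactionData> TransactionView<false, D> {
    /// Parsing is elided here; the real constructor also builds metadata.
    pub fn try_new_unsanitized(data: D) -> Option<Self> {
        Some(Self { data })
    }
}

impl<D: TransactionData> TransactionView<true, D> {
    /// Toy check (non-empty payload) standing in for the real checks.
    pub fn try_new_sanitized(data: D) -> Option<Self> {
        let view = TransactionView::<false, D>::try_new_unsanitized(data)?;
        if view.num_bytes() == 0 {
            return None;
        }
        Some(Self { data: view.data })
    }
}
```

Because the flag is a const generic, it takes no space at runtime, and the shared `impl` block is written once for both states.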

The intended transitions through the stages of our pipeline are:

1. receive packet
2. parse -> UnsanitizedTransactionView
3. (possible filtering on priority, etc)
4. sanitize -> SanitizedTransactionView
5. static_meta -> RuntimeTransaction<SanitizedTransactionView<D>>
6. dynamic_meta -> RuntimeTransaction<Resolved(?)TransactionView<D>> // don't know about naming here yet, but basically wrapper of SanitizedTransactionView plus LoadedAddresses

So type-safety of SanitizedTransactionView will let us force sanitization when we go to the RuntimeTransaction to calculate static/dynamic meta.
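As a rough sketch of how the type could force sanitization before a `RuntimeTransaction` is built: everything below is hypothetical except the names `UnsanitizedTransactionView`, `SanitizedTransactionView`, and `RuntimeTransaction`. The `payload_len` field stands in for real static metadata, and the "non-empty" check stands in for the real sanitization checks.

```rust
pub struct TransactionView<const SANITIZED: bool> {
    bytes: Vec<u8>,
}

pub type UnsanitizedTransactionView = TransactionView<false>;
pub type SanitizedTransactionView = TransactionView<true>;

impl TransactionView<false> {
    /// Parsing is elided; this only wraps the packet bytes.
    pub fn parse(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Toy check standing in for the real sanitization checks.
    pub fn sanitize(self) -> Result<SanitizedTransactionView, &'static str> {
        if self.bytes.is_empty() {
            return Err("empty transaction");
        }
        Ok(TransactionView { bytes: self.bytes })
    }
}

/// Hypothetical wrapper holding a transaction plus static metadata.
pub struct RuntimeTransaction<T> {
    transaction: T,
    payload_len: usize,
}

// The only constructor takes a sanitized view, so an unsanitized view
// cannot reach this stage without a compile error.
impl RuntimeTransaction<SanitizedTransactionView> {
    pub fn from_sanitized(transaction: SanitizedTransactionView) -> Self {
        let payload_len = transaction.bytes.len();
        Self { transaction, payload_len }
    }

    pub fn transaction(&self) -> &SanitizedTransactionView {
        &self.transaction
    }

    pub fn payload_len(&self) -> usize {
        self.payload_len
    }
}

/// Parse, optionally filter cheaply, then sanitize and compute metadata.
pub fn process(
    packet: Vec<u8>,
    min_len: usize,
) -> Option<RuntimeTransaction<SanitizedTransactionView>> {
    let view = UnsanitizedTransactionView::parse(packet);
    // Cheap filter that runs before paying for sanitization.
    if view.bytes.len() < min_len {
        return None;
    }
    let sanitized = view.sanitize().ok()?;
    Some(RuntimeTransaction::from_sanitized(sanitized))
}
```

Passing `view` directly to `RuntimeTransaction::from_sanitized` would fail to compile, which is the guarantee a runtime `is_sanitized` flag cannot give.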

However, since I'm not sure we will actually do step 3 (the filtering before sanitization), it may just be better to always do the sanitize checks.
wdyt?

Using const generic, both have exactly the same fns and data, it's just one has gone through necessary checks vs not.

I don't think we actually want exactly the same functions and data for sanitized vs unsanitized do we? The current separation of Sanitized types and their unsanitized counterparts allowed us to add certain functions which we knew were only safe if the transaction was sanitized.

It looks like you've added some sanitization to TransactionMeta regardless of whether the sanitized or unsanitized view is used, which allows us to know a transaction has at least one signature, for example. This would allow us to add a signature method to both sanitized and unsanitized views, which is nice.

But what about methods like get_durable_nonce or program_instructions_iter which rely on sanitized inputs? Are you not planning to add methods on the SanitizedTransactionView for those? If you are, I don't think it's a big difference to use separate structs vs this const generic approach.

The const generic doesn't restrict us in this manner; I did not mean to imply they should have exactly the same functions, only that the Sanitized version should have access to at least the same functions as the Unsanitized, and that they should not be less convenient to call (which is my issue with SanitizedVersionedTransaction).

We can still add specific behavior (if we find it necessary) to the sanitized version like this:

// Additional methods that rely on sanitization checks for correctness.
impl<D: TransactionData> TransactionView<true, D> {
    pub fn program_instructions_iter<'a>(
        &'a self,
    ) -> impl Iterator<Item = (&'a Pubkey, SVMInstruction<'a>)> {
        self.instructions_iter().map(|ix| {
            (
                &self.static_account_keys()[usize::from(ix.program_id_index)],
                ix,
            )
        })
    }
}

You mentioned a few examples, so I'll just comment on those here as well.

get_durable_nonce shouldn't be a member fn imo. It will need to become generic over SVMTransaction or similar trait so we can abstract the processing pipeline. And it can be simply derived from the common methods of that trait.

program_instructions_iter could be useful as a member fn, and does in fact rely on sanitized input. My plan right now is to only add "extensions" if they are necessary or would make things convenient. So far I haven't needed this method (since there's no actual uses of TransactionView) so haven't added it yet.

apfitzge · 2024-08-27T16:46:10Z

transaction-view/src/sanitize.rs

+    // Most sanitization checks are based on the counts stored in this metadata
+    // rather than checking actual data. Using a mutable dummy object allows for
+    // far simpler testing.
+    fn dummy_metadata() -> TransactionMeta {


Made several fields pub(crate) so that this style of testing is much simpler.
I think the above justifies that change, and public-ness is limited to this crate itself.

I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

jstarry

Haven't reviewed the tests yet but the sanitization parts look solid to me

jstarry · 2024-08-28T02:33:32Z

transaction-view/src/transaction_view.rs

+    /// Creates a new `TransactionView`, running sanitization checks.
+    pub fn try_new_sanitized(data: D) -> Result<Self> {
+        let unsanitized_view = TransactionView::try_new_unsanitized(data)?;
+        sanitize(unsanitized_view)


Rather than making this sanitize function public can we add a sanitize method to UnsanitizedTransactionView? Seems more convenient for users to use a method than importing that function.

2eebb4d

That seems like a better interface - I kept the function defined in sanitize.rs so that all sanitization checks are in a single place, but made it a member function of UnsanitizedTransactionView<D>.

That somewhat contradicts my earlier comment that

the Sanitized version should have access to at least the same functions as the Unsanitized

but I think this is a situation where it makes sense: we shouldn't sanitize an already sanitized struct.
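A simplified sketch of that arrangement, under stated assumptions: the error enum, the two count fields, and the toy "enough static keys for the signatures" check are illustrative only and are not taken from the PR. The shape it shows is the one described above: a crate-private function that only borrows the view and runs the checks, and a consuming method on the unsanitized type that changes the type.

```rust
/// Hypothetical error type for this illustration.
#[derive(Debug, PartialEq)]
pub enum TransactionViewError {
    SanitizeError,
}

pub struct TransactionView<const SANITIZED: bool> {
    num_signatures: u8,
    num_static_account_keys: u8,
}

pub type UnsanitizedTransactionView = TransactionView<false>;
pub type SanitizedTransactionView = TransactionView<true>;

/// All checks live in one place and only borrow the view.
fn sanitize(view: &UnsanitizedTransactionView) -> Result<(), TransactionViewError> {
    // Toy check: every signature needs a static account key.
    if view.num_signatures > view.num_static_account_keys {
        return Err(TransactionViewError::SanitizeError);
    }
    Ok(())
}

// Available in both states.
impl<const SANITIZED: bool> TransactionView<SANITIZED> {
    pub fn num_signatures(&self) -> u8 {
        self.num_signatures
    }
}

// `sanitize` exists only on the unsanitized state, so calling it on an
// already sanitized view does not compile.
impl TransactionView<false> {
    pub fn new(num_signatures: u8, num_static_account_keys: u8) -> Self {
        Self { num_signatures, num_static_account_keys }
    }

    /// Runs the checks, then moves the fields into the sanitized type.
    pub fn sanitize(self) -> Result<SanitizedTransactionView, TransactionViewError> {
        sanitize(&self)?;
        Ok(TransactionView {
            num_signatures: self.num_signatures,
            num_static_account_keys: self.num_static_account_keys,
        })
    }
}
```

Callers then write `view.sanitize()?` without importing anything from the sanitize module.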

jstarry · 2024-08-28T03:12:22Z

transaction-view/src/transaction_view.rs

-pub struct TransactionView<D: TransactionData> {
-    data: D,
-    meta: TransactionMeta,
+pub struct TransactionView<const SANITIZED: bool, D: TransactionData> {


Using const generic, both have exactly the same fns and data, it's just one has gone through necessary checks vs not.

I don't think we actually want exactly the same functions and data for sanitized vs unsanitized do we? The current separation of Sanitized types and their unsanitized counterparts allowed us to add certain functions which we knew were only safe if the transaction was sanitized.

It looks like you've added some sanitization to TransactionMeta regardless of whether the sanitized or unsanitized view is used, which allows us to know a transaction has at least one signature, for example. This would allow us to add a signature method to both sanitized and unsanitized views, which is nice.

But what about methods like get_durable_nonce or program_instructions_iter which rely on sanitized inputs? Are you not planning to add methods on the SanitizedTransactionView for those? If you are, I don't think it's a big difference to use separate structs vs this const generic approach.

jstarry · 2024-08-28T13:45:48Z

transaction-view/src/lib.rs

 pub mod result;
-mod signature_meta;
+pub mod sanitize;


This no longer needs to be pub

jstarry · 2024-08-28T13:59:36Z

transaction-view/src/sanitize.rs

+    // Most sanitization checks are based on the counts stored in this metadata
+    // rather than checking actual data. Using a mutable dummy object allows for
+    // far simpler testing.
+    fn dummy_metadata() -> TransactionMeta {


I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

jstarry · 2024-08-28T14:00:56Z

transaction-view/src/transaction_view.rs

+    pub(crate) data: D,
+    pub(crate) meta: TransactionMeta,


I would like these to stay private since they are quite sensitive. I could be convinced that pub(crate) is safe and fine but it doesn't seem necessary.

jstarry · 2024-08-28T14:06:58Z

transaction-view/src/sanitize.rs

+    }
+
+    #[test]
+    fn test_sanitize_instructions() {


It would be nice to have a test that ensures that, even when there are non-static keys from ATLs, the program index still cannot be greater than or equal to the number of static keys. The tests you have now don't use ATLs at all.

Added: https://github.com/anza-xyz/agave/pull/2757/files#diff-a88f2adf2b6178065e29a7feb3fb1083b0a2282ea647ae4db1782acddb2787abR455

I still don't see this particular test. Basically a test like your signature sanitization test "Not enough static accounts - with look up accounts" (thanks for adding that btw!) but for program id indexes

oh it just has a slightly different comment, and the ATLs were defined above it since I clone them.

Look at the test with this comment: // Invalid account index with v0.

That test doesn't have anything to do with program indexes. It just checks that account indexes can't be greater than or equal to the number of static + ATL accounts. I would like a test that shows that a program id index cannot reference an ATL account and therefore cannot be greater than or equal to the number of static accounts.

Oh! I missed a word when reading. yeah, I get what you're saying now.

a1600ca
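The distinction in this thread can be sketched as follows. This is an illustration only: the struct and error names are hypothetical, and any further checks the real code performs on instructions are left out. It shows the two bounds being discussed: program id indexes are bounded by the static keys alone, while account indexes are bounded by static plus ATL accounts.

```rust
#[derive(Debug, PartialEq)]
pub enum SanitizeError {
    InvalidProgramIdIndex,
    InvalidAccountIndex,
}

/// Hypothetical counts taken from a parsed v0 message.
pub struct AccountCounts {
    pub num_static_account_keys: u8,
    pub num_lookup_accounts: u16,
}

/// Hypothetical parsed instruction.
pub struct Instruction {
    pub program_id_index: u8,
    pub account_indexes: Vec<u8>,
}

pub fn sanitize_instructions(
    counts: &AccountCounts,
    instructions: &[Instruction],
) -> Result<(), SanitizeError> {
    let num_static = usize::from(counts.num_static_account_keys);
    let num_total = num_static + usize::from(counts.num_lookup_accounts);
    for ix in instructions {
        // A program id can never come from an ATL, so its index must be
        // below the static key count no matter how many lookups exist.
        if usize::from(ix.program_id_index) >= num_static {
            return Err(SanitizeError::InvalidProgramIdIndex);
        }
        // Account indexes may reference static keys or ATL accounts.
        if ix.account_indexes.iter().any(|&i| usize::from(i) >= num_total) {
            return Err(SanitizeError::InvalidAccountIndex);
        }
    }
    Ok(())
}
```

With 2 static keys and 3 lookup accounts, account index 4 is accepted while program id index 2 is rejected, which is the case the requested test covers.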

apfitzge · 2024-08-28T15:26:38Z

transaction-view/src/sanitize.rs

+    transaction_view::UnsanitizedTransactionView,
+};
+
+pub(crate) fn sanitize(view: &UnsanitizedTransactionView<impl TransactionData>) -> Result<()> {


The actual sanitization checks are done here; the data movement for the type change is done in TransactionView.

apfitzge · 2024-08-28T15:29:08Z

@jstarry

I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

Can't respond inline to this because files changed.
I think your assessment is fair - I adjusted tests so I could more easily construct invalid (from perspective of sanitization) transactions. They are longer, but I think this is a more correct method of testing.

jstarry · 2024-08-29T01:51:29Z

transaction-view/src/sanitize.rs

+    }
+
+    #[test]
+    fn test_sanitize_instructions() {


I still don't see this particular test. Basically a test like your signature sanitization test "Not enough static accounts - with look up accounts" (thanks for adding that btw!) but for program id indexes

jstarry · 2024-08-29T01:53:17Z

transaction-view/src/transaction_meta.rs

+    pub(crate) signature: SignatureMeta,
    /// Message header metadata.
-    message_header: MessageHeaderMeta,
+    pub(crate) message_header: MessageHeaderMeta,
    /// Static account keys metadata.
-    static_account_keys: StaticAccountKeysMeta,
+    pub(crate) static_account_keys: StaticAccountKeysMeta,
    /// Recent blockhash offset.
-    recent_blockhash_offset: u16,
+    pub(crate) recent_blockhash_offset: u16,
    /// Instructions metadata.
-    instructions: InstructionsMeta,
+    pub(crate) instructions: InstructionsMeta,
    /// Address table lookup metadata.
-    address_table_lookup: AddressTableLookupMeta,
+    pub(crate) address_table_lookup: AddressTableLookupMeta,


these can all be private again

thanks! 6d02337

apfitzge added 5 commits August 27, 2024 11:32

Track number of lookup accounts

9e479b2

sanitization checks

99d147d

error-type split sanitize and parse

375e7d1

generic over sanitized

d292aff

sanitize tests

2eab4fd

apfitzge mentioned this pull request Aug 27, 2024

[Tracking] TransactionView #2255

Open

14 tasks

add bench for parse and sanitize

e8937cd

apfitzge commented Aug 27, 2024

apfitzge marked this pull request as ready for review August 27, 2024 19:27

apfitzge requested review from jstarry and tao-stones August 27, 2024 19:28

jstarry reviewed Aug 28, 2024

sanitize as member function of UnsanitizedTransactionView

2eebb4d

jstarry reviewed Aug 28, 2024

apfitzge added 3 commits August 28, 2024 10:04

no more dummy meta

c9c47a5

rework tests

11630b4

modules private where they can be

1fb746c

apfitzge commented Aug 28, 2024

jstarry reviewed Aug 29, 2024

apfitzge added 2 commits August 29, 2024 08:05

privatize TransactionMeta fields

6d02337

Add test for invalid program index with lookups

a1600ca

jstarry approved these changes Aug 29, 2024

apfitzge merged commit 2a59e61 into anza-xyz:master Aug 29, 2024
40 checks passed

apfitzge deleted the transaction-view-sanitization branch August 29, 2024 15:52

ray-kast pushed a commit to abklabs/agave that referenced this pull request Nov 27, 2024

TransactionView - sanitization checks (anza-xyz#2757)

f0c4cd1
