Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TransactionView - sanitization checks #2757

Merged
merged 12 commits into from
Aug 29, 2024

Conversation

apfitzge
Copy link

Problem

Summary of Changes

  • Add generic SANITIZED on the TransactionView to indicate if sanitization checks have been run
  • Add sanitize checks and tests

Since the structs are different, and we probably do not want some intermediate trait - sanitization checks are duplicated from sdk.

Fixes #

@apfitzge apfitzge mentioned this pull request Aug 27, 2024
14 tasks
pub struct TransactionView<D: TransactionData> {
data: D,
meta: TransactionMeta,
pub struct TransactionView<const SANITIZED: bool, D: TransactionData> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tao-stones @alessandrod

What are your thoughts on this approach wrt encoding whether we sanitzed in the type itself?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically I'm weighing whether or not we should even expose an unsanitized type at all, or if that should be some internal that's not exposed.

I think because the sanitize checks can be relatively expensive, it makes sense to allow calling code to opt-out of doing them until necessary.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrt the use of const generic parameter, my first thought is if it's aiming for having type-level compile-time guarantee, or if a runtime is_sanitized flag or enum would suffice? My hesitation towards const generic parameter is the code complexity, and in often cases, code bloat it adds. Just a quick thought, happy to discuss more.

I agree that exposing unsanitized type provides flexibility, user should not be forced to sanitize transaction before viewing it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some compile-time guarantee tbh; we want to be sure we aren't processing transactions that don't pass those checks.

In terms of complexity I'm not sure it adds too much, I probably should make these aliases pub:

// alias for convenience
pub type UnsanitizedTransactionView<D> = TransactionView<false, D>;
pub type SanitizedTransactionView<D> = TransactionView<true, D>;

What I attempted to achieve with the const generic is the type safety separation (good) of VersionedTransaction vs SanitizedVersionedTransaction, but without having the mess of different member functions or calling fns on fields that make using SanitizedVersionedTransaction a pain in the ass to work with.

Using const generic, both have exactly the same fns and data, it's just one has gone through neccessary checks vs not.

Copy link
Author

@apfitzge apfitzge Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of intended transitioning through the stages of our pipeline.

  1. receive packet
  2. parse -> UnsanitizedTransactionView
  3. (possible filtering on priority, etc)
  4. sanitize -> SanitizedTransactionView
  5. static_meta -> RuntimeTransaction<SanitizedTransactionView<D>>
  6. dynamic_meta -> RuntimeTransaction<Resolved(?)TransactionView<D>> // don't know about naming here yet, but basically wrapper of SanitizedTransactionView plus LoadedAddresses

So type-safety of SanitizedTransactionView will let us force sanitization when we go to the RuntimeTransaction to calculate static/dynamic meta.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, since I'm not sure we will actually do 3 - it may just be better to always do the sanitize checks.
wdyt?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using const generic, both have exactly the same fns and data, it's just one has gone through neccessary checks vs not.

I don't think we actually want exactly the same functions and data for sanitized vs unsanitized do we? The current separation of Sanitized types and their unsanitized counterparts allowed us to add certain functions which we knew were only safe if the transaction was sanitized.

It looks like you've added some sanitization to TransactionMeta regardless of whether the sanitized or unsanitized view is used which allows us to know a transaction has at least one signature, for example. This would allow us to add a signature method to both sanitized and unsanitized views, which is nice..

But what about methods like get_durable_nonce or program_instructions_iter which rely on sanitized inputs? Are you not planning to add methods on the SanitizedTransactionView for those? If you are, I don't think it's a big difference to use separate structs vs this const generic approach.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The const generic doesn't restrict us in this manner; I did not mean to imply they should have exactly the same functions - but that the Sanitized version should have access to at least the same functions as the Unsanitized. and that they should not be less convenient to call (which is my issue with SanitizedVersionedTransaction).

We can still add specific behavior (if we find it necessary) to the sanitized version like this:

// Additional methods that rely on sanitization checks for correctness.
impl<D: TransactionData> TransactionView<true, D> {
    pub fn program_instructions_iter<'a>(
        &'a self,
    ) -> impl Iterator<Item = (&'a Pubkey, SVMInstruction<'a>)> {
        self.instructions_iter().map(|ix| {
            (
                &self.static_account_keys()[usize::from(ix.program_id_index)],
                ix,
            )
        })
    }
}

You mentioned a few examples, so I'll just comment on those here as well.

  • get_durable_nonce shouldn't be a member fn imo. It will need to become generic over SVMTransaction or similar trait so we can abstract the processing pipeline. And it can be simply derived from the common methods of that trait.
  • program_instructions_iter could be useful as a member fn, and does in fact rely on sanitized input. My plan right now is to only add "extensions" if they are necessary or would make things convenient. So far I haven't needed this method (since there's no actual uses of TransactionView) so haven't added it yet.

// Most sanitization checks are based on the counts stored in this metadata
// rather than checking actual data. Using a mutable dummy object allows for
// far simpler testing.
fn dummy_metadata() -> TransactionMeta {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made several fields pub(crate) so that this style of testing is much simpler.
I think the above justifies that change, and public-ness is limited to this crate itself.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

@apfitzge apfitzge marked this pull request as ready for review August 27, 2024 19:27
@apfitzge apfitzge requested review from jstarry and tao-stones August 27, 2024 19:28
Copy link

@jstarry jstarry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed the tests yet but the sanitization parts look solid to me

/// Creates a new `TransactionView`, running sanitization checks.
pub fn try_new_sanitized(data: D) -> Result<Self> {
let unsanitized_view = TransactionView::try_new_unsanitized(data)?;
sanitize(unsanitized_view)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than making this sanitize function public can we add a sanitize method to UnsanitizedTransactionView? Seems more convenient for users to use a method than importing that function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2eebb4d

That seems like a better interface - I kept the function defined in sanitize.rs so that all sanitization checks are in a single place, but made it a member function of UnsanitizedTransactionView<D>.

That somewhat contradicts my earlier comment that

the Sanitized version should have access to at least the same functions as the Unsanitized

but think this is a situation where it makes sense we shouldn't sanitize the already sanitized struct.

pub struct TransactionView<D: TransactionData> {
data: D,
meta: TransactionMeta,
pub struct TransactionView<const SANITIZED: bool, D: TransactionData> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using const generic, both have exactly the same fns and data, it's just one has gone through neccessary checks vs not.

I don't think we actually want exactly the same functions and data for sanitized vs unsanitized do we? The current separation of Sanitized types and their unsanitized counterparts allowed us to add certain functions which we knew were only safe if the transaction was sanitized.

It looks like you've added some sanitization to TransactionMeta regardless of whether the sanitized or unsanitized view is used which allows us to know a transaction has at least one signature, for example. This would allow us to add a signature method to both sanitized and unsanitized views, which is nice..

But what about methods like get_durable_nonce or program_instructions_iter which rely on sanitized inputs? Are you not planning to add methods on the SanitizedTransactionView for those? If you are, I don't think it's a big difference to use separate structs vs this const generic approach.

pub mod result;
mod signature_meta;
pub mod sanitize;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer needs to be pub

// Most sanitization checks are based on the counts stored in this metadata
// rather than checking actual data. Using a mutable dummy object allows for
// far simpler testing.
fn dummy_metadata() -> TransactionMeta {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

Comment on lines 22 to 23
pub(crate) data: D,
pub(crate) meta: TransactionMeta,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like these to stay private since they are quite sensitive. I could be convinced that pub(crate) is safe and fine but it doesn't seem necessary.

}

#[test]
fn test_sanitize_instructions() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a test that ensures that if even when there are non static keys from ATLs, the program index still cannot be greater or equal to the number of static keys. The tests you have now don't use ATLs at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see this particular test. Basically a test like your signature sanitization test "Not enough static accounts - with look up accounts" (thanks for adding that btw!) but for program id indexes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh it just has a slightly different comment, and the ATLs were defined above it since I clone them.

Look at the test with this comment: // Invalid account index with v0.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That test doesn't have anything to do with program indexes. It just checks that account indexes can't be greater than or equal to the number of static + ATL accounts. I would like a test that shows that a program id index cannot reference an ATL account and therefore cannot be greater than or equal to the number of static accounts.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I missed a word when reading. yeah, I get what you're saying now.

a1600ca

transaction_view::UnsanitizedTransactionView,
};

pub(crate) fn sanitize(view: &UnsanitizedTransactionView<impl TransactionData>) -> Result<()> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actual sanitization checks done here. Data movement for type-change done in TransactionView.

@apfitzge
Copy link
Author

@jstarry

I don't love the way you set up the tests using this dummy_metadata because it creates transactions views over invalid data slices. It looks like all of those tests could modify an unsanitized transaction directly and serialize to bytes to test the same sanitization errors.

Can't respond inline to this because files changed.
I think your assessment is fair - I adjusted tests so I could more easily construct invalid (from perspective of sanitization) transactions. They are longer, but I think this is a more correct method of testing.

}

#[test]
fn test_sanitize_instructions() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see this particular test. Basically a test like your signature sanitization test "Not enough static accounts - with look up accounts" (thanks for adding that btw!) but for program id indexes

Comment on lines 16 to 26
pub(crate) signature: SignatureMeta,
/// Message header metadata.
message_header: MessageHeaderMeta,
pub(crate) message_header: MessageHeaderMeta,
/// Static account keys metadata.
static_account_keys: StaticAccountKeysMeta,
pub(crate) static_account_keys: StaticAccountKeysMeta,
/// Recent blockhash offset.
recent_blockhash_offset: u16,
pub(crate) recent_blockhash_offset: u16,
/// Instructions metadata.
instructions: InstructionsMeta,
pub(crate) instructions: InstructionsMeta,
/// Address table lookup metadata.
address_table_lookup: AddressTableLookupMeta,
pub(crate) address_table_lookup: AddressTableLookupMeta,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these can all be private again

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! 6d02337

@apfitzge apfitzge merged commit 2a59e61 into anza-xyz:master Aug 29, 2024
40 checks passed
@apfitzge apfitzge deleted the transaction-view-sanitization branch August 29, 2024 15:52
ray-kast pushed a commit to abklabs/agave that referenced this pull request Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants