Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SlotHashesSysvar API for more flexibility #1948

Closed
wants to merge 10 commits into from

Conversation

buffalojoec
Copy link

Problem

The new SlotHashesSysvar API - used to query slot hashes using the new sol_get_sysvar
syscall - works well, but there are a few limitations that can arise when testing on-chain
programs.

For starters, the creation of a Vec<PodSlotHash> in the API's internals doesn't play well
with the bump allocator. On-chain programs who wish to run multiple queries will easily
max out their heap size, since the vector won't deallocate. They'd benefit from being able
to access this created vector directly.

Secondly, when the slot hashes sysvar is smaller than its typical fixed-size - such as in
program-test scenarios - the new API won't work properly, since it expects the full
size.

And lastly, the previous implementation was initializing the Vec<PodSlotHash> with
PodSlotHash::default(), which yields a slot 0 for each entry,. This means that no matter
what was in the slot hashes sysvar data, a query for slot 0 would succeed and yield you
the entry from the middle of the vector.

Summary of Changes

First make get_pod_slot_hashes public, returning direct access to the vector.

Then, redesign the get_pod_slot_hashes method to first query the existing length of
the sysvar data, before making another call to get_sysvar to initiate a properly-sized
Vec<PodSlotHash>.

Finally, add some more test coverage.

@buffalojoec buffalojoec requested a review from joncinque July 1, 2024 20:34
@buffalojoec buffalojoec marked this pull request as ready for review July 1, 2024 20:34
Copy link

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guts look good! Let's aim to reduce this to one syscall always to improve CU usage in the general case

sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved
sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved
sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved
sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved
// Finally, convert to a vector of `PodSlotHash`.
data.get(start..end)
.and_then(|data| bytemuck::try_cast_slice(data).ok())
.map(|pod_hashes| pod_hashes.to_vec())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something, but I believe to_vec() will redo the huge allocation -- we should instead find a way to reuse the existing buffer that was already fetched to avoid blowing up the heap.

I initially thought that SlotHashesSysvar would hold onto the underlying Vec<u8> and that it would also expose a way to get the underlying PodSlotHash slice using something like as_slice(), which would do the bytemuck cast.

Otherwise, we'll need to do some unsafe stuff here to lop off the first 8 bytes and return the rest as Vec<PodSlotHash> without doing another allocation. Although, looking at the Vec interface, I think it'll be pretty gross do that unfortunately.

Any better ideas for this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about doing a slice from raw parts for the initial mutable buffer, since we know it's a fixed size, and then that call .to_vec() should do just one heap allocation, since the original slice will be stack?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I might be misunderstanding -- we can't have a 20kb variable on the stack until dynamic stack frames are added, so what would the "initial mutable buffer" be?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh darn, yeah that probably won't work then. I think I see what you're getting at in your first message. I've added a commit that hangs onto the initial buffer, and then the commit right after uses as_slice. Let me know if this is what you had in mind.

Copy link

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good, but now we'll need to figure out how best to make the change, since it's breaking

Comment on lines 126 to 145
/// Get a value from the sysvar entries by its key.
/// Returns `None` if the key is not found.
pub fn get(&self, slot: &Slot) -> Result<Option<Hash>, ProgramError> {
self.get_pod_slot_hashes().map(|pod_hashes| {
pod_hashes
.binary_search_by(|PodSlotHash { slot: this, .. }| slot.cmp(this))
.map(|idx| pod_hashes[idx].hash)
.ok()
})
}

/// Get the position of an entry in the sysvar by its key.
/// Returns `None` if the key is not found.
pub fn position(&self, slot: &Slot) -> Result<Option<usize>, ProgramError> {
self.get_pod_slot_hashes().map(|pod_hashes| {
pod_hashes
.binary_search_by(|PodSlotHash { slot: this, .. }| slot.cmp(this))
.ok()
})
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These, along with the addition of fields to the struct, are breaking changes. Do you plan to break and backport the breaking change? We're still doing a little bit of that, and since it's a new API, it should be OK if that's how you want to go.

If that's the case, how about making SlotHashesSysvar implement the Sysvar trait for get(), and then add the other functions as before?

Or, we could avoid a breaking change by creating a new PodSlotHashes struct and deprecate the old one, which would make a nice analogue to the normal SlotHashes. What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, we could avoid a breaking change by creating a new PodSlotHashes struct and deprecate the old one, which would make a nice analogue to the normal SlotHashes. What do you think?

I think this is probably the better way to go. We can deprecate SlotHashesSysvar and add a PodSlotHashes that implements Sysvar and offers a similar API to SlotHashes. Nice idea!

I think it maybe makes more sense at this point to open a new PR where I can add the deprecation and new struct/API. Work for you?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, whatever you prefer!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants