Update `SlotHashesSysvar` API for more flexibility #1948

buffalojoec · 2024-07-01T19:54:28Z

Problem

The new SlotHashesSysvar API - used to query slot hashes using the new sol_get_sysvar
syscall - works well, but there are a few limitations that can arise when testing on-chain
programs.

For starters, the creation of a Vec<PodSlotHash> in the API's internals doesn't play well
with the bump allocator. On-chain programs who wish to run multiple queries will easily
max out their heap size, since the vector won't deallocate. They'd benefit from being able
to access this created vector directly.

Secondly, when the slot hashes sysvar is smaller than its typical fixed-size - such as in
program-test scenarios - the new API won't work properly, since it expects the full
size.

And lastly, the previous implementation was initializing the Vec<PodSlotHash> with
PodSlotHash::default(), which yields a slot 0 for each entry,. This means that no matter
what was in the slot hashes sysvar data, a query for slot 0 would succeed and yield you
the entry from the middle of the vector.

Summary of Changes

First make get_pod_slot_hashes public, returning direct access to the vector.

Then, redesign the get_pod_slot_hashes method to first query the existing length of
the sysvar data, before making another call to get_sysvar to initiate a properly-sized
Vec<PodSlotHash>.

Finally, add some more test coverage.

joncinque

The guts look good! Let's aim to reduce this to one syscall always to improve CU usage in the general case

sdk/program/src/sysvar/slot_hashes.rs

joncinque · 2024-07-24T17:35:29Z

sdk/program/src/sysvar/slot_hashes.rs

+        // Finally, convert to a vector of `PodSlotHash`.
+        data.get(start..end)
+            .and_then(|data| bytemuck::try_cast_slice(data).ok())
+            .map(|pod_hashes| pod_hashes.to_vec())


I might be missing something, but I believe to_vec() will redo the huge allocation -- we should instead find a way to reuse the existing buffer that was already fetched to avoid blowing up the heap.

I initially thought that SlotHashesSysvar would hold onto the underlying Vec<u8> and that it would also expose a way to get the underlying PodSlotHash slice using something like as_slice(), which would do the bytemuck cast.

Otherwise, we'll need to do some unsafe stuff here to lop off the first 8 bytes and return the rest as Vec<PodSlotHash> without doing another allocation. Although, looking at the Vec interface, I think it'll be pretty gross do that unfortunately.

Any better ideas for this?

How about doing a slice from raw parts for the initial mutable buffer, since we know it's a fixed size, and then that call .to_vec() should do just one heap allocation, since the original slice will be stack?

Sorry, I might be misunderstanding -- we can't have a 20kb variable on the stack until dynamic stack frames are added, so what would the "initial mutable buffer" be?

Ahh darn, yeah that probably won't work then. I think I see what you're getting at in your first message. I've added a commit that hangs onto the initial buffer, and then the commit right after uses as_slice. Let me know if this is what you had in mind.

joncinque

The implementation looks good, but now we'll need to figure out how best to make the change, since it's breaking

joncinque · 2024-08-05T21:35:08Z

sdk/program/src/sysvar/slot_hashes.rs

+    /// Get a value from the sysvar entries by its key.
+    /// Returns `None` if the key is not found.
+    pub fn get(&self, slot: &Slot) -> Result<Option<Hash>, ProgramError> {
+        self.get_pod_slot_hashes().map(|pod_hashes| {
+            pod_hashes
+                .binary_search_by(|PodSlotHash { slot: this, .. }| slot.cmp(this))
+                .map(|idx| pod_hashes[idx].hash)
+                .ok()
+        })
+    }
+
+    /// Get the position of an entry in the sysvar by its key.
+    /// Returns `None` if the key is not found.
+    pub fn position(&self, slot: &Slot) -> Result<Option<usize>, ProgramError> {
+        self.get_pod_slot_hashes().map(|pod_hashes| {
+            pod_hashes
+                .binary_search_by(|PodSlotHash { slot: this, .. }| slot.cmp(this))
+                .ok()
+        })
+    }


These, along with the addition of fields to the struct, are breaking changes. Do you plan to break and backport the breaking change? We're still doing a little bit of that, and since it's a new API, it should be OK if that's how you want to go.

If that's the case, how about making SlotHashesSysvar implement the Sysvar trait for get(), and then add the other functions as before?

Or, we could avoid a breaking change by creating a new PodSlotHashes struct and deprecate the old one, which would make a nice analogue to the normal SlotHashes. What do you think?

Or, we could avoid a breaking change by creating a new PodSlotHashes struct and deprecate the old one, which would make a nice analogue to the normal SlotHashes. What do you think?

I think this is probably the better way to go. We can deprecate SlotHashesSysvar and add a PodSlotHashes that implements Sysvar and offers a similar API to SlotHashes. Nice idea!

I think it maybe makes more sense at this point to open a new PR where I can add the deprecation and new struct/API. Work for you?

No problem, whatever you prefer!

buffalojoec requested a review from joncinque July 1, 2024 20:34

buffalojoec marked this pull request as ready for review July 1, 2024 20:34

joncinque reviewed Jul 3, 2024

View reviewed changes

sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved

sdk/program/src/sysvar/slot_hashes.rs Outdated Show resolved Hide resolved

buffalojoec added 5 commits July 23, 2024 11:44

make get_pod_slot_hashes public

2b814be

use length of slot hashes sysvar

f347d25

update tests

fadfaeb

redesign to use one syscall

9f675e0

rename to pod_slot_hashes

eda2322

buffalojoec force-pushed the slot-hashes-api-v2 branch from 0cfa0bb to eda2322 Compare July 23, 2024 18:23

rename back to get_pod_slot_hashes

2d1b856

joncinque reviewed Jul 24, 2024

View reviewed changes

buffalojoec added 4 commits August 5, 2024 16:51

review feedback

8c356db

refactor to hang onto buffer

3bf3e44

rename get_pod_slot_hashes to as_slice

dea2a49

make fields

b43cb0a

joncinque reviewed Aug 5, 2024

View reviewed changes

buffalojoec closed this Aug 6, 2024

buffalojoec mentioned this pull request Aug 8, 2024

SDK: Add PodSlotHashes API #2481

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update `SlotHashesSysvar` API for more flexibility #1948

Update `SlotHashesSysvar` API for more flexibility #1948

buffalojoec commented Jul 1, 2024

joncinque left a comment

joncinque Jul 24, 2024

buffalojoec Jul 24, 2024

joncinque Jul 24, 2024

buffalojoec Aug 5, 2024

joncinque left a comment

joncinque Aug 5, 2024

buffalojoec Aug 6, 2024

joncinque Aug 6, 2024

Update SlotHashesSysvar API for more flexibility #1948

Update SlotHashesSysvar API for more flexibility #1948

Conversation

buffalojoec commented Jul 1, 2024

Problem

Summary of Changes

joncinque left a comment

Choose a reason for hiding this comment

joncinque Jul 24, 2024

Choose a reason for hiding this comment

buffalojoec Jul 24, 2024

Choose a reason for hiding this comment

joncinque Jul 24, 2024

Choose a reason for hiding this comment

buffalojoec Aug 5, 2024

Choose a reason for hiding this comment

joncinque left a comment

Choose a reason for hiding this comment

joncinque Aug 5, 2024

Choose a reason for hiding this comment

buffalojoec Aug 6, 2024

Choose a reason for hiding this comment

joncinque Aug 6, 2024

Choose a reason for hiding this comment

Update `SlotHashesSysvar` API for more flexibility #1948

Update `SlotHashesSysvar` API for more flexibility #1948