-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Add iter_from()
method in CyclicBoundedVec
#686
Conversation
044a8bc
to
bfc8c26
Compare
81646b8
to
10fbd44
Compare
you might need to regenerate accounts. |
@@ -335,7 +336,8 @@ where | |||
Self { | |||
capacity, | |||
length: 0, | |||
next_index: 0, | |||
first_index: 0, | |||
last_index: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a first and a last index?
Don't we always start from index 0 except when providing a different index to start from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to do that, but I think it's not a correct thing to do.
When we don't provide a different index, I think we should start from the oldest element. The oldest element can be somewhere in the middle, if we went over capacity.
Let's assume that we have a bounded vec of integers incremented by 1. The capacity is 16. We append 22 elements. It ends up looking like:
$ ^
[16, 17, 18, 20, 21, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
^ - first_index
$ - last_index
Every time we are over capacity and add more elements, we need to increment first_index
and last_index
.
In my opinion, a default iterator should iterate from 5 to 15, then from 0 to 21. Overall - from 5 to 21 with a reset.
To be precise, my way of thinking is that for the cyclic vector above, the result of iter().collect()
should be:
[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21]
What you are proposing, would iterate... how exactly? From 16 to 21? From 16 to 15?
We used to iterate from 16 to 15, which I think it's incorrect. This approach made the previous implementation of .to_slice()
a bit useless and it was the reason why we ended up with doing all this manual iteration dance in ConcurrentMerkleTree
.
Maybe I should rename first_index
and last_index
to head
and tail
to make it clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your example makes it clear what it means
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last index is always first index minus 1 except if first index = 0 then its length - 1, and it is 0 if it is empty.
So what is the reasoning to put it into a variable and not just compute it when needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair I can see that it makes sense for the iterator to start at the current index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrt heads and tail it's confusing in a different way, I think we just need a doc comment then it's clear.
self.length += 1; | ||
} else if self.next_index == self.capacity() { | ||
self.next_index = 0; | ||
self.last_index += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the looks suspicious to me since in line 446 self.first_index is used as current_index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 446 initializes an iterator. Our iterator (CyclicBoundedVecIterator
) is a wrapper containing a reference to CyclicBoundedVec
. current_index
is the internal counter, belonging only to the iterator.
And there are two cases how we initialize current_index
:
iter()
- we just set it to the current value offirst_index
(line 446 you pointed). So the iterator starts withfirst_index
. It incrementscurrent_index
by eachnext()
call. It stops whenlast_index
is reached, after this point eachnext()
call returnsNone
.iter_from(start: usize)
- we set it to the providedstart
value. It incrementscurrent_index
by eachnext()
call. It stops whenlast_index
is reached, after this point eachnext()
call returnsNone
.
And BTW, to be clear - only iter_from()
is being used in ConcurrentMerkleTree
for patching changelogs.
self.current = new_current; | ||
element | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this achieve equivalence to what it is meant to replace?
Can you still iterate over the cyclic vec in a normal way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This iterator struct makes it possible to do
for changelog_entry in changelog_entries.iter()
Which will iterate from the oldest (not necessarily 0
!) to the newest element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If your question is more about "how the heck all this code creates an iterator" - it was confusing for me as well some time ago. This was helpful
https://dev.to/wrongbyte/implementing-iterator-and-intoiterator-in-rust-3nio
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// | ||
/// Should iterate over elements 6..7 and 8..11 - 6 iterations. | ||
#[test] | ||
fn test_cyclic_bounded_vec_iter_reset() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am missing a test for the edge case that we need to cover in the Merkle tree.
Consider an example:
// capacity 5 [0,1,2,3,4]
// start with index 2 and want to do 4 iterations
let elements = cyclic_bounded_vec.iter_from(2);
let expected_result = [&3, &4, &0, &1];
for i in 0..4 {
assert_eq!(elements.next(), expected_result[i]);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your expected_result is not correct for such example though. The [0, 1, 2, 3, 4]
vector didn't cycle over yet. Starting from 2 should result in [2, 3, 4]
. 0 and 1 are elements which are older than than 2. Why would you want to iterate over changelog entries which predate your requested index? We want to use only newer changelogs, not older ones.
That's exactly why I'm keeping track of first_index
and last_index
- to figure out if I really need to jump to the beginning. These counters are making the algorithm aware of whether the vector already cycled or not.
Your example would be somehow correct if we did an explicit push of 0
and 1
elements afterwards and the vector would cycle over. But in these tests, I'm always incrementing pushed elements by 1 though, just to make it crystal clear which elements are newer and older.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test for the input you mentioned, but with output which IMO is correct. LMK WDYT :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair I didn't intend to put any meaning into the numbers itself.
And I got the two wrong it should have been included since we are starting from index two inclusive.
// capacity 5 [43,22,55,13,42]
// using first and last index these would be: first 2, last 1
// without first and last, self.current_index would be 1,
// and the user requests to start at index 2
let elements = cyclic_bounded_vec.iter_from(2);
let expected_result = [55,13,42, 43, 22];
for i in 0..4 {
assert_eq!(elements.next(), expected_result[i]);
}
10fbd44
to
7aa9fd2
Compare
7aa9fd2
to
83ad050
Compare
`ConcurrentMerkleTree` uses `CyclicBoundedVec` for storing changelog entries. And it often needs to iterate over changelogs which are full and started overwriting elements from the start. Even though `CyclicBoundedVec` was doing good job in preserving the bounds, before this change, it didn't provide any safe way of iterating over elements with a custom index as a start. That unfortunately resulted in attempts to solve that problem in the code of `ConcurrentMerkleTree`, which eventually resulted in bugs. The bugs were harder to spot, because mixing the logic of Merkle proof patching with leaked internals of `CyclicBoundedVec` made it confusing to read, debug and test. To avoid such issues, introduce the `iter_from()` method which allows iteration from the given index and simplify the `ConcurrentMerkleTree` to just use that iterator. Other changes: * Stop returning `Result` in `push()`. `push()` can never panic, unless we have a bug in the implementation. Our tests should make sure that the structure is bulletproof.
07899a8
to
ab79d1f
Compare
ConcurrentMerkleTree
usesCyclicBoundedVec
for storing changelog entries. And it often needs to iterate over changelogs which are full and started overwriting elements from the start.Even though
CyclicBoundedVec
was doing good job in preserving the bounds, before this change, it didn't provide any safe way of iterating over elements with a custom index as a start.That unfortunately resulted in attempts to solve that problem in the code of
ConcurrentMerkleTree
, which eventually resulted in bugs. The bugs were harder to spot, because mixing the logic of Merkle proof patching with leaked internals ofCyclicBoundedVec
made it confusing to read, debug and test.To avoid such issues, introduce the
iter_from()
method which allows iteration from the given index and simplify theConcurrentMerkleTree
to just use that iterator.