Fix reshare timing issue #1211
Conversation
@@ -295,7 +295,7 @@ pub async fn validate_new_reshare(
        .await?
        .ok_or_else(|| ValidatorErr::ChainFetch("Not Currently in a reshare"))?;

-    if chain_data.block_number != reshare_data.block_number.saturating_sub(1)
+    if chain_data.block_number != reshare_data.block_number
The saturating_sub was removed, as now it matches the running testnet.
@@ -74,8 +74,5 @@ pub const MAX_SIGNERS: u8 = 15;
/// Threshold for those signers
pub const SIGNER_THRESHOLD: u8 = 2;
-
-/// For testing to line up chain mock data and reshare_test
-pub const TEST_RESHARE_BLOCK_NUMBER: u32 = 10;
Oh, I am happy to say goodbye to this one.
⭐
I like this way of doing things without using the chainspec, as you can modify things for just one test without meddling with all the other tests. I really hope this now means the reshare test will pass consistently, because I am pretty disappointed that it still fails after a lot of fiddling around, as does the jumpstart test sometimes.
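(For the record, a rough sketch of what that storage override boils down to at the raw key/value level. The pallet prefix StakingExtension, the item name ReshareData, and the field layout of ReshareInfo here are assumptions read off this diff; the real test goes through the generated entropy module and the call_set_storage helper shown further down rather than hand-rolling keys.)

```rust
use parity_scale_codec::Encode;
use sp_core::hashing::twox_128;

// Assumed shape of the on-chain value, per the diff below; field types are a guess.
#[derive(Encode)]
struct ReshareInfo {
    block_number: u32,
    new_signers: Vec<Vec<u8>>,
}

fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

fn main() {
    // Plain storage values live at twox128(pallet prefix) ++ twox128(item name).
    let mut key = twox_128(b"StakingExtension").to_vec();
    key.extend(twox_128(b"ReshareData"));

    // SCALE-encoded value that a sudo `system.set_storage` call would write at that
    // key, with a dummy 32-byte public key standing in for the new signer.
    let value = ReshareInfo { block_number: 10, new_signers: vec![vec![0u8; 32]] }.encode();

    println!("key   = 0x{}", to_hex(&key));
    println!("value = 0x{}", to_hex(&value));
}
```

Writing that key/value pair through sudo is what lets a single test fake a reshare without touching the shared chainspec.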
@@ -101,14 +132,19 @@ async fn test_reshare_basic() {
        if new_signer_ids != old_signer_ids {
            break Ok(new_signer_ids);
        }
-        if i > 120 {
+        if i > 240 {
Interesting. Was this needed to get the tests to pass? Disappointingly, we are still often getting this timeout error, and I kinda thought it couldn't be that the reshare takes more than two minutes.
Probably not. I changed it when I was doing two reshares, so it took more time, but it doesn't hurt.
            break Err("Timed out waiting for reshare");
        }
        i += 1;
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    }
    .unwrap();

    // wait for rotate keyshare
    tokio::time::sleep(std::time::Duration::from_secs(10)).await;
Ah, now I see what you mean about rotate keyshare happening after the signers have changed. Yeah, it totally makes sense that you have to wait a block here before checking the keyshare has changed.
    let storage_address_reshare_data = entropy::storage().staking_extension().reshare_data();
    let value_reshare_info =
        ReshareInfo { block_number, new_signers: vec![alice_stash.public().encode()] };
    // Add another reshare
This comment made me think you are doing a third reshare here - but this is part of the second reshare, right? We need to modify next_signers and reshare_data.
yup, I can try to find better comments
    call_set_storage(&api, &rpc, call).await;

    // wait for rotate keyshare
    tokio::time::sleep(std::time::Duration::from_secs(60)).await;
Seems a bit brittle - why one minute here, when above we have 4 minutes plus 10 seconds for the key rotation?
Yes, on the second one I am just checking that the rotate keyshare happened; I got to 60 seconds from trial and error. The one above exits if it finishes early, so I didn't try to make it shorter.
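(Not what this PR does — the test sleeps a flat 60 seconds at this point — but one hedged alternative would be to bound the wait with tokio::time::timeout around a small poll loop, so the happy path returns early. keyshare_has_rotated below is a hypothetical stand-in for whatever check the test performs afterwards.)

```rust
use std::time::Duration;

// Hypothetical check: the real test would query the TSS servers / chain and
// compare the keyshare before and after the reshare.
async fn keyshare_has_rotated() -> bool {
    false
}

// Poll once per second, but never for longer than 60 seconds in total.
async fn wait_for_rotation() -> Result<(), &'static str> {
    tokio::time::timeout(Duration::from_secs(60), async {
        loop {
            if keyshare_has_rotated().await {
                break;
            }
            tokio::time::sleep(Duration::from_secs(1)).await;
        }
    })
    .await
    .map_err(|_| "Timed out waiting for rotate keyshare")
}
```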
            return Ok(());
        }

-        let deadline = sp_io::offchain::timestamp().add(Duration::from_millis(2_000));
+        let deadline = sp_io::offchain::timestamp().add(Duration::from_millis(10_000));
No harm in leaving this here, I guess, but in theory it should not be needed outside of the hot fix, because resharing happens in a separate task.
true, gonna leave it tho cuz why not
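(For context, this deadline feeds the standard Substrate offchain-worker HTTP pattern, sketched below along the lines of the example-offchain-worker pallet; the URL and payload here are placeholders, not the propagation pallet's actual request to the TSS server.)

```rust
use sp_runtime::offchain::{http, Duration};

// Post `payload` to `url` from an offchain worker, giving up after 10 seconds
// rather than the previous 2.
fn post_with_deadline(url: &str, payload: Vec<u8>) -> Result<Vec<u8>, http::Error> {
    let deadline = sp_io::offchain::timestamp().add(Duration::from_millis(10_000));
    let pending = http::Request::post(url, vec![payload])
        .deadline(deadline)
        .send()
        .map_err(|_| http::Error::IoError)?;
    // Blocks the offchain worker until a response arrives or the deadline passes.
    let response = pending.try_wait(deadline).map_err(|_| http::Error::DeadlineReached)??;
    if response.code != 200 {
        return Err(http::Error::Unknown);
    }
    Ok(response.body().collect::<Vec<u8>>())
}
```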
@@ -876,7 +858,7 @@ pub mod pallet {
            // trigger reshare at next block
            let current_block_number = <frame_system::Pallet<T>>::block_number();
            let reshare_info = ReshareInfo {
-                block_number: current_block_number + sp_runtime::traits::One::one(),
+                block_number: current_block_number - sp_runtime::traits::One::one(),
It seems kinda weird that we subtract one here and then add one again in the propagation pallet. But since it works, definitely don't touch it.
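(A tiny illustration of the round trip described here, using illustrative names rather than the pallets' real items: storing current_block - 1 and then re-adding one on the propagation side lands back on the block that triggered the reshare.)

```rust
// Illustrative only; not the pallets' actual code.
fn stored_reshare_block(current_block: u32) -> u32 {
    current_block - 1 // what staking_extension now puts in ReshareInfo
}

fn propagation_block(stored: u32) -> u32 {
    stored + 1 // the "+ 1" the reviewer mentions in the propagation pallet
}

fn main() {
    let current = 100;
    assert_eq!(propagation_block(stored_reshare_block(current)), current);
}
```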
This is pretty much a mirror of #1201 into master.
Also tagging #1210; this is the error that was related to that issue.
Adds a few regression tests and a better way to mock reshares by manipulating storage.