Single-shot DKG #3776
The currently used DKG retry mechanism based on random exclusion turned out to be ineffective for a higher number of participating operators. Such retries have a very small chance of success and produce a lot of unnecessary network traffic that consumes bandwidth and CPU excessively.
Here we aim to improve the situation. First, we are making DKG a single-shot process that fails fast if the result cannot be produced during the first attempt. Second, we are doubling the announcement period to maximize the participation chances of all selected operators, even those at the edge of the network. Last but not least, we are reducing the submission delay that is preserved between operators attempting to submit the final result on-chain.
All those changes combined allow us to achieve shorter DKG iterations that can be timed out more quickly. This way, we will be able to repeat DKG more often, with different operator sets.
Solidity API documentation preview available in the artifacts of the https://github.com/keep-network/keep-core/actions/runs/7820364827 check.
All other DKG states use the `BackoffRetransmissionStrategy` strategy. There is no point in making an exception for the `resultSigningState`. This should reduce network load in case one of the participants fails at the end of the protocol.
Solidity API documentation preview available in the artifacts of the https://github.com/keep-network/keep-core/actions/runs/7820549852 check.
Looks good to me, left just one comment.
@@ -28,7 +28,7 @@ const (
 	// is used to calculate the submission delay period that should be respected
 	// by the given member to avoid all members submitting the same DKG result
 	// at the same time.
-	dkgResultSubmissionDelayStepBlocks = 15
+	dkgResultSubmissionDelayStepBlocks = 3
I am concerned this is too low. The gas price bump happens after one minute by default, so if the initial estimation was not successful, there will not be enough time to do even one price bump.
I tested that yesterday on Sepolia, where gas prices were quite high and volatile. I haven't observed any problems, so I think we should be good. It is important to reduce the DKG timeout to the minimum, and this factor strongly contributes to that value. We will monitor the situation on mainnet and increase it if necessary.
Worth noting that the price bump will actually be done. The only downside is that the next member may front-run with their own transaction. This may sometimes lead to a collision but is not harmful to the protocol.
This pull request backports #3776 to the `releases/mainnet/v2.0.0-m7` branch.
Refs: #3770
Depends on: #3775
The currently used DKG retry mechanism based on random exclusion turned out to be ineffective for a higher number of participating operators. Such retries have a very small chance of success and produce a lot of unnecessary network traffic that consumes bandwidth and CPU excessively.
Here we aim to improve the situation. First, we are making DKG a single-shot process that fails fast if the result cannot be produced during the first attempt. Second, we are doubling the announcement period to maximize participation chances for all selected operators, even those at the edge of the network. Last but not least, we are reducing the submission delay that is preserved between operators attempting to submit the final result on-chain.
All those changes combined allow us to achieve shorter DKG iterations that can be timed out quicker. This way, we will be able to repeat DKG more often, with different operator sets.
Finally, we are also changing the retransmission strategy for the `resultSigningState`, which was still using `StandardRetransmissionStrategy` with retransmissions occurring on each tick. All other DKG states use the `BackoffRetransmissionStrategy` strategy, which leads to a sparse distribution of retransmissions and is considered more lightweight. There is no point in making an exception for the `resultSigningState`. This should reduce network load in case one of the participants fails at the end of the protocol.