-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect usage of timeouts in gossip #70
Comments
@rahulghangas Not sure I fully understand the issue/implication of the dual timeouts. Do you mind listing out a concrete example? |
The call to gossip has the following signature func (g *Gossiper) Gossip(ctx context.Context, contentID []byte, subnet *id.Hash) where the context provided usually has a timeout of, say 5-10 seconds. On the other hand, if you look at func (g *Gossiper) didReceiveSync(from id.Signatory, msg wire.Msg) {
...
...
...
ctx, cancel := context.WithTimeout(context.Background(), g.opts.Timeout)
defer cancel()
g.Gossip(ctx, msg.Data, &subnet) after receiving the sync message, we regossip the message, but using the internal timeout. The internal timeout is supposed to be used to define the time limit of sending a single message, not a whole round of gossipping. |
@rahulghangas Got it. Could we send to the recipients in parallel inside the |
That could potentially solve the issue, and the call to gossip wouldn't require a context as input since we'll use the internal timeout to define a context and use it on all messages (in parallel) |
While gosspping, two different timeouts are used
- defined by context supplied by user
- defined by value of field in the
GossipOptions
structThe first one defies the total time for a round of gossipping, while the second one defines the timeout for sending each message. When a sync message is received, the gossipper tries to propagate the new message by re-gossipping it. However, the gossipper now has no notion of the timeout supplied by the user (and that particular context has probably gone out scope!). Currently we use the second timeout to a call to
Gossip
, but the gossipping might fail if the first attempt to message a peer fails (since that uses the same timeout)The text was updated successfully, but these errors were encountered: