Gossipsub v2.0 spec: Lower or zero duplicates by lazy mesh propagation #653

Draft: wants to merge 8 commits into base: master

ppopth
Contributor

@ppopth ppopth commented Dec 14, 2024

This PR supersedes #652

This extension allows lazy propagation to mesh peers, which reduces the number of duplicates in the network (possibly to zero) at the cost of higher latency.

Instead of sending messages to mesh peers right away, the router tosses a coin to decide whether to send each one lazily or eagerly.

If it decides to send eagerly, it just forwards the message right away.

If it decides to send lazily, it sends IANNOUNCE instead and waits for INEED before sending the actual message.

Note that if the probability of sending lazily is configured to be 1, each node is guaranteed to receive exactly one copy of each message, which means no duplicates.
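
As a rough illustration of the coin toss described above (not the spec's reference code), the per-peer decision could look like the Python sketch below. The names `forward_to_mesh` and `p_lazy`, and the tuple-based frame format, are made up for this example.

```python
import random


def forward_to_mesh(msg_id, payload, mesh_peers, p_lazy, rng=random):
    """Decide, per mesh peer, whether to forward eagerly or announce lazily.

    p_lazy is a hypothetical name for the probability of lazy sending.
    Returns a list of (peer, frame) pairs to put on the wire.
    """
    out = []
    for peer in mesh_peers:
        if rng.random() < p_lazy:
            # Lazy: announce only; the payload follows once the peer sends INEED.
            out.append((peer, ("IANNOUNCE", msg_id)))
        else:
            # Eager: forward the full message right away.
            out.append((peer, ("MESSAGE", msg_id, payload)))
    return out
```

With `p_lazy=1.0` every mesh peer gets only an IANNOUNCE, and with `p_lazy=0.0` the behaviour falls back to plain eager forwarding.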


Authored by: @ppopth, @nisdas, @chirag-parmar

I made this PR as a draft first just to gain some visibility. We need to do simulations and further analysis to compare it with Gossipsub v1.2.

Our next step would be to implement it in go-libp2p-pubsub and run simulations.

ppopth and others added 3 commits December 14, 2024 02:37
gossipsub v2.0 allows you to reduce the number of duplicates, possibly to
zero, at the cost of more latency.

It works by probabilistically deciding to forward the message eagerly
or lazily to mesh peers.

If it decides to send eagerly, it just forwards the message right away.

If it decides to send lazily, it sends IANNOUNCE instead and waits for
INEED before sending the actual message.

Notice that if the probability is configured to be 1, it guarantees that
each node will receive exactly one copy of each message, which means no
duplicates.

## Future Improvements

- Penalize peers that don't send the message in time, after sending `INEED`.
- Let publishers just send the full content of messages to mesh peers, rather than `IANNOUNCE`, because no one has seen the message before. This saves one RTT but would break publisher anonymity, so we are not yet sure whether to do it.

Contributor

I think publishers should just flood publish the message, as we currently do.

We have allowed that as long as D_announce is less than D
https://github.com/libp2p/specs/pull/653/files#diff-f85861c1fe2084ec5cd59445f67d62b9bfd20e6eb0d879801833af26ebf8c107R139

However, we stop it if the protocol becomes fully announcement based

message ControlMessage {
// messages from v1.2
repeated ControlIAnnounce iannounce = 6;
repeated ControlINeed ineed = 7;

Contributor

can't we just do this with IWANT?
In fact we could just send IHAVEs instead of using INEED.

The reasoning was that this would function very differently from IHAVE/IWANT, which are emitted periodically, whereas message announcements are meant to always be immediate and to be used primarily for message propagation.

Contributor

uhm ok, fair enough.

We also need to consider the interplay of IDONTWANT and IANNOUNCE.

Contributor Author

We can just say that if we have already received an IDONTWANT for a message from a peer, we won't send IANNOUNCE for that message to that peer.
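
A minimal sketch of that rule, assuming the router keeps a per-peer set of message IDs for which it has received IDONTWANT. The function name and the `dont_want` mapping are hypothetical, not something taken from the spec.

```python
def peers_to_announce(msg_id, mesh_peers, dont_want):
    """Return the mesh peers that should still get an IANNOUNCE for msg_id.

    dont_want is an assumed mapping: peer -> set of message IDs that the
    peer has already sent IDONTWANT for.
    """
    return [p for p in mesh_peers if msg_id not in dont_want.get(p, set())]
```

For example, `peers_to_announce("m1", ["a", "b"], {"a": {"m1"}})` skips peer `a` and keeps only `b`.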

@vyzo
Contributor

vyzo commented Dec 14, 2024

In general I like the direction of where this is going.

However, it raises the question: do we need the new control messages? We could do it by reusing IHAVE/IWANT.

@ufarooqstatus

IMO, we may need to consider a few issues related to the duplicate-count problem in GossipSub:

  1. Typically, a peer is unaware that it is already receiving a message and may generate MANY IWANT requests (GossipSub v1.1 allows that).
  2. The same applies to IDONTWANT messages. A peer already receiving a message can only send IDONTWANTs once it finishes downloading that message. During this window (usually several hundred milliseconds), other mesh peers start relaying to that peer.

The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

@ppopth
Contributor Author

ppopth commented Dec 16, 2024

> The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

I would like to note that this can happen only when D_announce < D (allowing some eager forwarding).

If D_announce = D (every forwarding is lazy) and I'm about to send another INEED while already downloading a message, it can only be because the timeout has fired for the peer currently sending me the message, so that peer is deemed misbehaving.

@ppopth
Contributor Author

ppopth commented Dec 16, 2024

> Do we need the new control messages? We could do it by reusing IHAVE/IWANT.

We have thought about that and concluded that:

  1. The logic of IANNOUNCE/INEED is very different from IHAVE/IWANT (for example, a potential penalty for not sending the message after receiving INEED).
  2. IANNOUNCE/INEED is supposed to contain only a single message ID rather than a list.
  3. It's easier to distinguish between the two.

@ufarooqstatus

> If D_announce = D (every forwarding is lazy) and I'm about to send another INEED while already downloading a message, it can only be because the timeout has fired for the peer currently sending me the message, so that peer is deemed misbehaving.

It depends on the message size and situation:
Early receivers will get around 'D' INEED and many IWANT requests.
If a peer tries to respond to all these requests for a large message, it might miss the 400ms deadline. Perhaps the deadline can be inferred from the message size.
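
One possible way to infer the deadline from the message size is to add an estimated transfer time to a fixed base timeout. The sketch below is only an assumption about how that could be done: the 400ms base comes from the deadline mentioned above, while `min_bandwidth_bps` and the function name are invented for illustration.

```python
def ineed_timeout(msg_size_bytes, base_timeout_s=0.4, min_bandwidth_bps=1_000_000):
    """Size-aware INEED deadline in seconds (illustrative only).

    base_timeout_s:    fixed allowance for round trips and processing.
    min_bandwidth_bps: assumed lower bound on the sender's upload rate, in bits/s.
    """
    transfer_s = msg_size_bytes * 8 / min_bandwidth_bps
    return base_timeout_s + transfer_s
```

Under these assumed numbers, a 125,000-byte message gets roughly one extra second on top of the base timeout.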

@nisdas

nisdas commented Dec 16, 2024

@ufarooqstatus

> The same can happen to INEED messages. A peer is already downloading a message, and it sends another INEED request.

We would queue the multiple announcements we receive and only send INEED messages to one peer at a time. If the first peer is unable to send us the full message within the timeout, we then send the INEED to the next peer who sent us the announcement, and so on.

> After the router sends INEED, it will time out if it doesn't receive the message back in time, as indicated by timeout. If the timeout happens, the router will pop acache[msgid] and send INEED to the next peer. If it still times out, keep going with the next peers until the cache runs out of peers.
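
A minimal sketch of that queueing behaviour, assuming a per-message queue of announcers like the `acache` mentioned above. The class and method names are hypothetical, and timer wiring is left out: the caller is expected to invoke `on_timeout` when the configured timeout fires.

```python
from collections import defaultdict, deque


class AnnounceTracker:
    """Tracks IANNOUNCEs so that only one INEED per message is in flight."""

    def __init__(self):
        self.acache = defaultdict(deque)  # msg_id -> announcers not yet asked
        self.pending = {}                 # msg_id -> peer we sent INEED to
        self.seen = set()                 # msg_ids already fully received

    def on_iannounce(self, msg_id, peer):
        """Return the peer to send INEED to now, or None if nothing to send."""
        if msg_id in self.seen:
            return None
        if msg_id in self.pending:
            # An INEED is already in flight; remember this peer as a fallback.
            self.acache[msg_id].append(peer)
            return None
        self.pending[msg_id] = peer
        return peer

    def on_timeout(self, msg_id):
        """The pending peer missed the deadline; return the next peer or None."""
        self.pending.pop(msg_id, None)
        queue = self.acache.get(msg_id)
        if not queue:
            # The cache ran out of peers for this message.
            self.acache.pop(msg_id, None)
            return None
        nxt = queue.popleft()
        self.pending[msg_id] = nxt
        return nxt

    def on_message(self, msg_id):
        """The full message arrived; drop all bookkeeping for it."""
        self.seen.add(msg_id)
        self.pending.pop(msg_id, None)
        self.acache.pop(msg_id, None)
```

Penalizing the peer that timed out is not modelled here; it would hook into `on_timeout`.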

@ppopth
Contributor Author

ppopth commented Dec 16, 2024

> it might miss the 400ms deadline. Perhaps the deadline can be inferred from the message size.

Yeah, you have to configure the timeout carefully.

@ufarooqstatus

> We would queue the multiple announcements we receive and only send INEED messages to one peer at a time. If the first peer is unable to send us the full message within the timeout, we then send the INEED to the next peer who sent us the announcement, and so on.

Yes, but the peer responding to 'INEEDs+IWANTs' can get overwhelmed and may need much more time to respond to these requests (depending on the message size).

@vyzo
Contributor

vyzo commented Dec 16, 2024

Another thing we could consider is the size of the messages; maybe small messages should always be eagerly forwarded and large messages could be just announced.
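
To make the idea concrete, a size-based policy could be as small as the sketch below. The threshold value and the name `should_announce` are purely illustrative assumptions; nothing in the PR fixes a cutoff.

```python
def should_announce(msg_size_bytes, eager_threshold_bytes=1024):
    """Return True if the message should be announced (lazy) rather than
    forwarded eagerly. The 1 KiB default threshold is an arbitrary placeholder.
    """
    return msg_size_bytes > eager_threshold_bytes
```

Such a check could also be combined with the probabilistic choice, applying the coin toss only to messages above the threshold.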

@nisdas

nisdas commented Dec 17, 2024

> Yes, but the peer responding to 'INEEDs+IWANTs' can get overwhelmed and may need much more time to respond to these requests (depending on the message size).

Responding to INEED would be bounded by your degree, so we would only be providing data that is actually useful to our mesh peers. Currently we eagerly forward to them anyway, so the worst case, where your whole mesh responds with the maximum number of INEED messages (D), is the status quo for every message forwarded as of gossipsub v1.2.
