Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pimd: Ensure upstream points at the correct rpf #14707

Merged
merged 1 commit into from
Nov 2, 2023

Conversation

donaldsharp
Copy link
Member

In the scenario on an intermediate router where a *,G join has been received and a S,G stream is being sent through that router on the *,G stream, there exists a situation when the *,G in has been pruned but the stream is still being received on on incoming interface towards
the RP for the *,G. In this situation PIM will see the S,G stream
initially as a NOCACHE from the dataplane, PIM will then do a RPF
for the S and notice that it is supposed to be coming in on adifferent
interface. In this case PIM the original PIM code would create
a blackhole mroute towards the RPF of the *,G( the interface the
stream is being received on ). The original reason for this is that
if there is a scenario where this particular S1,G stream is sending
at basically line rate, and there also happens to be a different
S2,G stream that is sending at a very low rate. With certain
dataplanes there is no way to really rate limit the S1 -vs- S2
stream and the S1 stream completely overwhelms the S2 stream
for sending up to the control plane for proper pim handling.
The problem then becomes that FRR never properly responds
to the situation where the *,G is rereceived and the S,G
stream switches back over to the SPT for itself and FRR ends
up with a dead mroute that stops everything from working properly.

This code change, installs the blackhole mroute with the RPF towards the RP for the G and then resets the RPF to the correct RPF for the Stream but does not modify the mroute. When the *,G is rereceived and we attempt to transition to the S,G stream this now works.

As a note: Both David L and myself do not necessarily believe we fully understand the problem yet. What this does do is fix all the inconsistent CI issues we are seeing in the topotests at this time. Internally I am seeing other test failures in PIM that I don't fully understand and we suspect that there are other problems in the state machine. We plan to revisit this problem as we are able to debug the issue better. In the meantime both David and Myself agree that this gets the CI working again and Streams end up in the right state.

In the scenario on an intermediate router where a *,G join has
been received and a S,G stream is being sent through that router
on the *,G stream, there exists a situation when the *,G in has been pruned
but the stream is still being received on on incoming interface towards
the RP for the *,G.   In this situation PIM will see the S,G stream
initially as a NOCACHE from the dataplane, PIM will then do a RPF
for the S and notice that it is supposed to be coming in on adifferent
interface.  In this case PIM the original PIM code would create
a blackhole mroute towards the RPF of the *,G( the interface the
stream is being received on ).  The original reason for this is that
if there is a scenario where this particular S1,G stream is sending
at basically line rate, and there also happens to be a different
S2,G stream that is sending at a very low rate.  With certain
dataplanes there is no way to really rate limit the S1 -vs- S2
stream and the S1 stream completely overwhelms the S2 stream
for sending up to the control plane for proper pim handling.
The problem then becomes that FRR never properly responds
to the situation where the *,G is rereceived and the S,G
stream switches back over to the SPT for itself and FRR ends
up with a dead mroute that stops everything from working properly.

This code change, installs the blackhole mroute with the RPF
towards the RP for the G and then resets the RPF to the correct
RPF for the Stream but does not modify the mroute.  When the
*,G is rereceived and we attempt to transition to the S,G stream
this now works.

As a note:  Both David L and myself do not necessarily believe
we fully understand the problem yet.  What this does do is fix
all the inconsistent CI issues we are seeing in the topotests
at this time.  Internally I am seeing other test failures
in PIM that I don't fully understand and we suspect that
there are other problems in the state machine.  We plan to
revisit this problem as we are able to debug the issue better.
In the meantime both David and Myself agree that this gets
the CI working again and Streams end up in the right state.

Signed-off-by: Donald Sharp <[email protected]>
@donaldsharp
Copy link
Member Author

@Mergifyio backport dev/9.1

Copy link

mergify bot commented Oct 31, 2023

backport dev/9.1

✅ Backports have been created

@ton31337 ton31337 merged commit 081d1a4 into FRRouting:master Nov 2, 2023
80 checks passed
donaldsharp added a commit that referenced this pull request Nov 2, 2023
pimd: Ensure upstream points at the correct rpf (backport #14707)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants