BFD session down does not trigger BGP session down #14266

Closed
1 task
sbrs3 opened this issue Aug 24, 2023 · 33 comments

Comments

@sbrs3

sbrs3 commented Aug 24, 2023


Describe the bug

BFD does not seem to notify BGP when a peer goes down.

I configure a BFD peer with default parameters and use the same peer IP in an eBGP session as follows:

router bgp 65000
 bgp router-id 10.1.1.1
 no bgp ebgp-requires-policy
 neighbor 172.16.1.7 remote-as 65001
 neighbor 172.16.1.7 bfd
bfd
 peer 172.16.1.7
 exit
 !
exit

Result: when the remote peer interface (172.16.1.7) goes down, the BFD session goes down within 1 second. The BGP session, however, stays up until the BGP hold timer expires (about 3 minutes). I see the following messages from bgpd:

2023-08-24 10:44:30.762 [DEBG] bgpd: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 172.16.1.7/32 (interface enp0s1f0d1) VRF default(0) (CPI bit no): Down
2023-08-24 10:44:30.762 [DEBG] bgpd: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 0

It seems that the function zclient_bfd_session_update in bfd.c skips the BGP session. Why?

  • [x] Did you check if this is a duplicate issue?
  • Did you test it on the latest FRRouting/frr master branch?

Versions

  • FRR version: 8.5
  • OS version: Rocky Linux release 9.1
  • Linux 5.15.0
@sbrs3 sbrs3 added the triage Needs further investigation label Aug 24, 2023
@ton31337
Member

Did you try explicitly defining an interface for bfd peer? peer 172.16.1.7 interface enp0s1f0d1.

@sbrs3
Author

sbrs3 commented Aug 24, 2023

Did you try explicitly defining an interface for bfd peer? peer 172.16.1.7 interface enp0s1f0d1.

Yes, the result is the same.

Extra info: I have a second BFD and BGP peer in a different subnet but reachable in the same L2 network. I assume that would not interfere, right?

@ton31337
Member

Could you also show ip route 172.16.1.7, show ip bgp 172.16.1.7?

@sbrs3
Author

sbrs3 commented Aug 24, 2023

Could you also show ip route 172.16.1.7, show ip bgp 172.16.1.7?

# show ip route 172.16.1.7
Routing entry for 172.16.1.0/24
  Known via "connected", distance 0, metric 0, best
  Last update 23:54:58 ago
  * directly connected, enp0s1f0d3

# show ip bgp 172.16.1.7
% Network not in table

This was captured while BFD is down and BGP is still up.

@ton31337
Member

Interface names are different: enp0s1f0d3 vs. enp0s1f0d1. Is this expected?

@sbrs3
Author

sbrs3 commented Aug 24, 2023

Interface names are different: enp0s1f0d3 vs. enp0s1f0d1. Is this expected?

Interesting. The interface name in the log message seems wrong; it remains the same even if I configure the BFD peer with an explicit interface:

peer 172.16.1.7 interface enp0s1f0d3

Addresses and routes:

enp0s1f0d1       UP             100.0.6.48/24 172.16.0.1/24
enp0s1f0d3       UP             172.16.1.1/24
C>* 172.16.0.0/24 is directly connected, enp0s1f0d1, 1d00h01m
C>* 172.16.1.0/24 is directly connected, enp0s1f0d3, 1d00h01m

@ton31337
Member

ton31337 commented Aug 24, 2023

Can you also show the output of show ip bgp neighbors 172.16.1.7 and show bfd peer?

@sbrs3
Author

sbrs3 commented Aug 24, 2023

show ip bgp neighbors 172.16.1.7

# show ip bgp neighbors 172.16.1.7
BGP neighbor is 172.16.1.7, remote AS 65001, local AS 65000, external link
  Local Role: undefined
  Remote Role: undefined
Hostname: test-pe
  BGP version 4, remote router ID 10.7.7.7, local router ID 10.1.1.1
  BGP state = Established, up for 00:02:03
  Last read 00:02:02, Last write 00:00:03
  Hold time is 180 seconds, keepalive interval is 60 seconds
  Configured hold time is 180 seconds, keepalive interval is 60 seconds
  Configured conditional advertisements interval is 60 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    Extended Message: advertised and received
    AddPath:
      IPv4 Unicast: RX advertised and received
      L2VPN EVPN: RX advertised and received
    Long-lived Graceful Restart: advertised and received
      Address families by peer:
    Route refresh: advertised and received(old & new)
    Enhanced Route Refresh: advertised and received
    Address Family IPv4 Unicast: advertised and received
    Address Family L2VPN EVPN: advertised and received
    Hostname Capability: advertised (name: pe1,domain name: n/a) received (name: test-pe,domain name: n/a)
    Graceful Restart Capability: advertised and received
      Remote Restart timer is 120 seconds
      Address families by peer:
        none
  Graceful restart information:
    End-of-RIB send: IPv4 Unicast, L2VPN EVPN
    End-of-RIB received: IPv4 Unicast, L2VPN EVPN
    Local GR Mode: Helper*

    Remote GR Mode: Helper

    R bit: False
    N bit: False
    Timers:
      Configured Restart Time(sec): 120
      Received Restart Time(sec): 120
    IPv4 Unicast:
      F bit: False
      End-of-RIB sent: Yes
      End-of-RIB sent after update: Yes
      End-of-RIB received: Yes
      Timers:
        Configured Stale Path Time(sec): 360
  Message statistics:
    Inq depth is 0
    Outq depth is 0
                         Sent       Rcvd
    Opens:                 19         19
    Notifications:          9          6
    Updates:              297        201
    Keepalives:          1385       1365
    Route Refresh:          1          2
    Capability:             0          0
    Total:               1711       1593
  Minimum time between advertisement runs is 0 seconds
  Update source is 172.16.1.1

 For address family: IPv4 Unicast
  Update group 5, subgroup 21
  Packet Queue length 0
  Community attribute sent to this neighbor(all)
  1 accepted prefixes

 For address family: L2VPN EVPN
  Update group 6, subgroup 22
  Packet Queue length 0
  Inbound soft reconfiguration allowed
  NEXT_HOP is propagated unchanged to this neighbor
  Community attribute sent to this neighbor(all)
  advertise-all-vni
  3 accepted prefixes

  Connections established 19; dropped 18
  Last reset 00:02:05,  Peer closed the session
  External BGP neighbor may be up to 1 hops away.
Local host: 172.16.1.1, Local port: 46037
Foreign host: 172.16.1.7, Foreign port: 179
Nexthop: 172.16.1.1
Nexthop global: ::
Nexthop local: ::
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 120
Estimated round trip time: 0 ms
Read thread: on  Write thread: on  FD used: 33

  BFD: Type: single hop
  Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
  Status: Unknown, Last update: never
# show bfd peer
BFD Peers:
        peer 172.16.0.7 vrf default
                ID: 1473560464
                Remote ID: 149030257
                Active mode
                Status: up
                Uptime: 31 minute(s), 27 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: configured
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms

        peer 172.16.1.7 vrf default
                ID: 723377540
                Remote ID: 0
                Active mode
                Status: down
                Downtime: 4 minute(s), 39 second(s)
                Diagnostics: control detection time expired
                Remote diagnostics: ok
                Peer Type: configured
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms

@ton31337
Member

  Status: Unknown, Last update: never

Are you blocking something?

@sbrs3
Author

sbrs3 commented Aug 24, 2023

  Status: Unknown, Last update: never

Are you blocking something?

As in: firewall? Not that I am aware of. What would have to be blocked to cause this?

@ton31337
Member

Can you check which packets you see in tcpdump on those interfaces, and whether any BFD packets show up at all?

@sbrs3
Author

sbrs3 commented Aug 24, 2023

Can you check which packets you see in tcpdump on those interfaces, and whether any BFD packets show up at all?

Yes, but the output differs between the two ports.


# tcpdump -lnvi enp0s1f0d3 ! port ssh and ! arp
dropped privs to tcpdump
tcpdump: listening on enp0s1f0d3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:05:28.914310 IP (tos 0xc0, ttl 255, id 30959, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.1.1.49153 > 172.16.1.7.bfd-control: BFDv1, length: 24
        Control, State Up, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 3 (900 ms Detection time), BFD Length: 24
        My Discriminator: 0x2b1ddd84, Your Discriminator: 0x8ca0da24
          Desired min Tx Interval:     300 ms
          Required min Rx Interval:    300 ms
          Required min Echo Interval:   50 ms
12:05:29.205321 IP (tos 0xc0, ttl 255, id 30980, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.1.1.49153 > 172.16.1.7.bfd-control: BFDv1, length: 24
        Control, State Up, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 3 (900 ms Detection time), BFD Length: 24
        My Discriminator: 0x2b1ddd84, Your Discriminator: 0x8ca0da24
          Desired min Tx Interval:     300 ms
          Required min Rx Interval:    300 ms
          Required min Echo Interval:   50 ms
12:05:29.442341 IP (tos 0xc0, ttl 255, id 30996, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.1.1.49153 > 172.16.1.7.bfd-control: BFDv1, length: 24
        Control, State Up, Flags: [none], Diagnostic: No Diagnostic (0x00)
        Detection Timer Multiplier: 3 (900 ms Detection time), BFD Length: 24
        My Discriminator: 0x2b1ddd84, Your Discriminator: 0x8ca0da24
          Desired min Tx Interval:     300 ms
          Required min Rx Interval:    300 ms
          Required min Echo Interval:   50 ms

# tcpdump -lnvi enp0s1f0d1 ! port ssh and ! arp
dropped privs to tcpdump
tcpdump: listening on enp0s1f0d1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:05:31.488052 IP (tos 0xc0, ttl 255, id 3574, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.0.1.49152 > 172.16.0.7.bfd-control:
    BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype untagged, li-id 792
     (invalid)
12:05:31.731068 IP (tos 0xc0, ttl 255, id 3579, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.0.1.49152 > 172.16.0.7.bfd-control:
    BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype untagged, li-id 792
     (invalid)
12:05:31.974084 IP (tos 0xc0, ttl 255, id 3581, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.0.1.49152 > 172.16.0.7.bfd-control:
    BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype untagged, li-id 792
     (invalid)
12:05:32.223099 IP (tos 0xc0, ttl 255, id 3585, offset 0, flags [DF], proto UDP (17), length 52)
    172.16.0.1.49152 > 172.16.0.7.bfd-control:
    BCM-LI-SHIM: direction unused, pkt-type unknown, pkt-subtype untagged, li-id 792
     (invalid)
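
The "900 ms Detection time" that tcpdump prints in the first capture can be reproduced by hand. Below is a minimal sketch in C, assuming the asynchronous-mode rule from RFC 5880: the detection time is the Detect Mult received from the remote system multiplied by the larger of the local Required Min RX interval and the remote Desired Min TX interval. The function and parameter names are illustrative only and are not taken from FRR.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (not FRR code): local detection time in
 * milliseconds for BFD asynchronous mode, following RFC 5880.
 */
static uint32_t bfd_detection_time_ms(uint32_t remote_detect_mult,
				      uint32_t local_required_min_rx_ms,
				      uint32_t remote_desired_min_tx_ms)
{
	/* A Detect Mult of zero is invalid in a BFD control packet. */
	assert(remote_detect_mult != 0);

	/* The remote side transmits no faster than the slower of the two. */
	uint32_t agreed_tx_ms =
		local_required_min_rx_ms > remote_desired_min_tx_ms
			? local_required_min_rx_ms
			: remote_desired_min_tx_ms;

	return remote_detect_mult * agreed_tx_ms;
}
```

With the timers shown in this thread (multiplier 3, 300 ms in both directions), `bfd_detection_time_ms(3, 300, 300)` gives 900 ms. The BGP hold time in the neighbor output is 180 seconds, which is why the gap between the BFD session and the BGP session going down is so visible.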

@ton31337
Member

Are these interfaces enp0s1f0d1/enp0s1f0d3 physically separate or is this a breakout cable? (Just trying to gather information before trying to replicate it locally).

@sbrs3
Author

sbrs3 commented Aug 24, 2023

The capture on enp0s1f0d1 shows what look like broken packets. Let me investigate our data plane; it is non-standard.

@ton31337
Member

And I only see BFD packets in one direction...

@sbrs3
Author

sbrs3 commented Aug 30, 2023

The messages were not corrupted; that must have been a tcpdump problem. In Wireshark they looked OK. Messages were only seen in one direction because of asymmetric routing.
The real cause of the problem was that both of the router's interfaces were in a single L2 broadcast domain, although in different IP subnets. After isolating them pairwise into separate VLANs, the problem disappeared. Now the BGP session promptly goes down when BFD goes down. This is somewhat surprising. One would expect the L2 topology not to influence the outcome, because both BGP and BFD operate above the IP layer.

@sbrs3 sbrs3 closed this as completed Oct 18, 2023
@HareshKhandelwal

Hi,

I observed this issue in my environment as well and am wondering whether this is expected, since the ticket was closed stating that putting the peers in separate VLANs solves the problem. See below.

computebgp-0# show ip bgp neighbors
BGP neighbor on eno2: fe80::e643:4bff:fe4a:c22, remote AS 64999, local AS 64999, internal link
Hostname: rhos-nfv-07.lab.eng.rdu2.redhat.com
Member of peer-group uplink for session parameters
BGP version 4, remote router ID 9.9.9.13, local router ID 192.168.80.115
BGP state = Established, up for 00:09:17 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<NOTE THIS
Last read 00:00:17, Last write 00:00:17
Hold time is 180, keepalive interval is 60 seconds
Configured conditional advertisements interval is 60 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
Extended Message: advertised and received
AddPath:
IPv4 Unicast: RX advertised and received
IPv6 Unicast: RX advertised and received
Extended nexthop: advertised and received
Address families by peer:
IPv4 Unicast
Long-lived Graceful Restart: advertised and received
Address families by peer:
Route refresh: advertised and received(old & new)
Enhanced Route Refresh: advertised and received
Address Family IPv4 Unicast: advertised and received
Address Family IPv6 Unicast: advertised and received
Hostname Capability: advertised (name: computebgp-0,domain name: n/a) received (name: rhos-nfv-07.lab.eng.rdu2.redhat.com,domain name: n/a)
Graceful Restart Capability: advertised and received
Remote Restart timer is 120 seconds
Address families by peer:
none
Graceful restart information:
End-of-RIB send: IPv4 Unicast, IPv6 Unicast
End-of-RIB received: IPv4 Unicast, IPv6 Unicast
Local GR Mode: Helper*
Remote GR Mode: Helper
R bit: False
N bit: False
Timers:
Configured Restart Time(sec): 120
Received Restart Time(sec): 120
IPv4 Unicast:
F bit: False
End-of-RIB sent: Yes
End-of-RIB sent after update: Yes
End-of-RIB received: Yes
Timers:
Configured Stale Path Time(sec): 360
IPv6 Unicast:
F bit: False
End-of-RIB sent: Yes
End-of-RIB sent after update: Yes
End-of-RIB received: Yes
Timers:
Configured Stale Path Time(sec): 360
Message statistics:
Inq depth is 0
Outq depth is 0
Sent Rcvd
Opens: 158 52
Notifications: 45 16
Updates: 181 166
Keepalives: 47545 47545
Route Refresh: 0 0
Capability: 0 0
Total: 47929 47779
Minimum time between advertisement runs is 0 seconds

For address family: IPv4 Unicast
uplink peer-group member
Update group 67, subgroup 67
Packet Queue length 0
Community attribute sent to this neighbor(all)
Outbound path policy configured
Outgoing update prefix filter list is *only-host-prefixes
0 accepted prefixes

For address family: IPv6 Unicast
uplink peer-group member
Update group 68, subgroup 68
Packet Queue length 0
Community attribute sent to this neighbor(all)
Outbound path policy configured
Outgoing update prefix filter list is *only-host-prefixes
0 accepted prefixes

Connections established 34; dropped 33
Last reset 00:09:19, No AFI/SAFI activated for peer
Message received that caused BGP to send a NOTIFICATION:
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
00850104 FDE700B4 C0A85081 68020601
04000100 01020805 06000100 01000202
06010400 02000102 02800002 02020002
02460002 06410400 00FDE702 02060002
0A450800 01010100 02010102 10490E0C
636F6E74 726F6C6C 65722D30 00020440
02007802 10470E00 01018000 00000002
01800000 00
Internal BGP neighbor may be up to 1 hops away.
Local host: fe80::b533:41e0:a0b9:b7ba, Local port: 179
Foreign host: fe80::e643:4bff:fe4a:c22, Foreign port: 38124
Nexthop: 192.168.80.115
Nexthop global: fe80::b533:41e0:a0b9:b7ba
Nexthop local: fe80::b533:41e0:a0b9:b7ba
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 120
Estimated round trip time: 2 ms
Read thread: on Write thread: on FD used: 29

BFD: Type: single hop
Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
Status: Down, Last update: 0:00:09:19 <<<<<<<<<<<<<<<<<<<<<<<<<<NOTE THIS

computebgp-0# show bfd peers
BFD Peers:
peer fe80::e643:4bff:fe4a:c22 local-address fe80::b533:41e0:a0b9:b7ba vrf default interface eno2
ID: 1578519708
Remote ID: 0
Active mode
Status: down <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<NOTE THIS
Downtime: 0 second(s)
Diagnostics: control detection time expired
Remote diagnostics: control detection time expired
Peer Type: dynamic
Local timers:
Detect-multiplier: 3
Receive interval: 300ms
Transmission interval: 300ms
Echo receive interval: 50ms
Echo transmission interval: disabled
Remote timers:
Detect-multiplier: 3
Receive interval: 1000ms
Transmission interval: 1000ms
Echo receive interval: 50ms

My config is:
router bgp 64999
bgp router-id 192.168.80.115
bgp log-neighbor-changes
no bgp ebgp-requires-policy
no bgp suppress-duplicates
no bgp hard-administrative-reset
no bgp default ipv4-unicast
bgp graceful-shutdown
no bgp graceful-restart notification
no bgp network import-check
neighbor uplink peer-group
neighbor uplink remote-as internal
neighbor uplink bfd
neighbor uplink bfd profile tripleo
neighbor uplink ttl-security hops 1
neighbor eno2 interface peer-group uplink

My use case here is very simple: an unnumbered iBGP session between two peers connected via an uplink switch in the same VLAN.
eno2 (a physical interface) is a plain Ethernet interface.

I brought BFD down on purpose here. However, bringing BFD up or down does not affect the BGP session.

Shall we re-open the ticket and fix this issue?

Thanks

@sbrs3
Author

sbrs3 commented Mar 5, 2024

The problem has reoccurred for us as well, in a different topology, even though the interfaces are pairwise isolated on different VLANs. So this was not the solution.

In our case there is asymmetric routing, i.e. from FRR's perspective, incoming and outgoing BFD (and BGP) packets are on different logical interfaces associated with the same physical port.

While the physical port is up, I continuously get the following message in the log:
bfdd[3033083]: [JCG1D-X7VTW] BFD: getting peer's mac on [interface-name] failed error No such device or address

When I bring down the interface, this log message stops, the BFD peer is immediately down, but the BGP peer stays up, as reported before.

I tried configuring the BFD session as multihop, but with the same result.

Our FRR version is now 8.5.4.

@sbrs3 sbrs3 reopened this Mar 5, 2024
@ton31337
Member

ton31337 commented Mar 7, 2024

Can you give us the configuration and a quick snapshot of what the interfaces look like?

@ton31337 ton31337 self-assigned this Mar 7, 2024
@sbrs3
Author

sbrs3 commented Mar 7, 2024

Here are the configs and a printout of the interfaces.

When I bring down the link on RouterB, BFD goes down immediately, but BGP stays up until the holdtime expires.

RouterA-config.txt

RouterB-config.txt

interfaces.txt

[Edit] Adding captures of the interfaces:

RouterA:
p0.zip
repGRD-parent.zip

RouterB:
ens3.zip

@ton31337
Member

@sbrs3 how is enp0s1f0d4 related to enp0s1f0d1? I'm not sure I understood the topology here.

@sbrs3
Author

sbrs3 commented Mar 12, 2024

enp0s1f0d1 and enp0s1f0d4 are interfaces of Router A's control plane module, where FRR is running. Both logical interfaces are associated with the same physical port. Router B is a VM, which is why it is simpler. I am adding a picture to make the topology clearer.
[Image: Capture]

@ton31337
Member

I'm still not sure about the RouterA configuration. Is there a bridge, or what is the actual underlying configuration? Can you show /etc/network/interfaces or its equivalent (depending on the OS)?

@sbrs3
Author

sbrs3 commented Mar 13, 2024

There is a hardware pipeline connecting enp0s1f0d1/4 with the physical port. It forwards outgoing packets transparently. For incoming packets, it rewrites the DMAC from "phy port MAC" to "enp0s1f0d4 MAC". Could that be a problem for BFD? There are no other changes done to the packet. You can also see what arrives on the other end (Router B ens3) in the capture I uploaded above.

Router A's control plane module is an ARM platform running Rocky Linux 9.3/kernel 5.15 and FRR 8.5.4 compiled for ARM.

I am uploading the NetworkManager system-connections files for the two interfaces above.

enp0s1f0d1.nmconnection.txt
enp0s1f0d4.nmconnection.txt

@sbrs3
Author

sbrs3 commented Mar 26, 2024

I think I found the cause of the problem in our case. If I comment out the following lines in lib/bfd.c, the coupling between BFD and BGP works:

        /* Skip different interface. */
        if (bsp->args.ifnamelen && ifp
            && strcmp(bsp->args.ifname, ifp->name) != 0)
                continue;

This is obviously only a workaround for our specific case, with two interfaces for ingress and egress traffic. So I am wondering what the above if-clause is needed for, and what the side effects of removing it would be.
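
To make the effect of that check concrete, here is a simplified, self-contained model of it. This is a sketch, not FRR code: `struct session_args` and `update_applies` are hypothetical stand-ins, the buffer size is arbitrary, and only the condition itself is copied from the lines quoted above. It assumes the client registered the session on one interface while the status update arrives tagged with another, as in the log at the top of this issue (update reported on enp0s1f0d1, route via enp0s1f0d3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the session arguments in lib/bfd.c. */
struct session_args {
	char ifname[32];  /* interface the client bound the session to */
	size_t ifnamelen; /* 0 means no interface was specified */
};

/*
 * Returns true when a BFD status update reported on update_ifname
 * would be applied to the session. update_ifname may be NULL when
 * the update carries no interface.
 */
static bool update_applies(const struct session_args *args,
			   const char *update_ifname)
{
	assert(args != NULL);

	/* Skip different interface (same condition as the quoted code). */
	if (args->ifnamelen && update_ifname
	    && strcmp(args->ifname, update_ifname) != 0)
		return false;

	return true;
}
```

Under these assumptions, a session bound to enp0s1f0d3 rejects an update tagged enp0s1f0d1, which would match the "sessions updated: 0" line in the debug log. A session with no interface name set accepts the update regardless of where it arrived.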

@ton31337
Member

I'm not an expert on BFD; we need to check what the RFCs say about such a scenario (if anything). Maybe @rzalamena has an opinion here?

@rzalamena
Member

@sbrs3

There is a hardware pipeline connecting enp0s1f0d1/4 with the physical port. It forwards outgoing packets transparently. For incoming packets, it rewrites the DMAC from "phy port MAC" to "enp0s1f0d4 MAC". Could that be a problem for BFD?

Yes, BFD expects to receive packets on the same (OS) interface it sent them from. The way your OS is configured makes it look, from BFD's point of view, as if you are doing LACP (link aggregation), and that is not going to work. We also don't support RFC 7130 yet.

A possible hack would be to ignore the interface name as you suggested (e.g. add a configuration knob for it), but I'm not sure what others might think of this option (or how common such a data plane is). You also have the option to implement your own BFD data plane handler using distributed BFD and hide these details from FRR. There is a library to help build distributed BFD data plane handlers here.

@sbrs3
Author

sbrs3 commented Mar 27, 2024

I think you have just given the best argument for such a knob, namely the current lack of support for BFD on LAG. But there could also be other scenarios of asymmetric routing with multiple links that are not in a LAG. So a configuration knob like the following could benefit multiple usecases:

bfd
   peer A.B.C.D ignore-interface-names

If "ignore-interface-names" is specified, the BFD implementation would simply skip the interface name check in the code that I quoted above.
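
A rough sketch of how such a knob could change the match, purely for illustration. Nothing here exists in FRR: the `ignore_ifname` field, the struct, and the function name are all hypothetical, and a real change would also need CLI and client-side plumbing that is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-session match settings; not an FRR structure. */
struct session_match_cfg {
	char ifname[32];    /* interface the session was bound to */
	size_t ifnamelen;   /* 0 means no interface was specified */
	bool ignore_ifname; /* would be set by the proposed knob */
};

/*
 * Returns true when a status update reported on update_ifname
 * should be applied. With ignore_ifname set, the interface
 * comparison is skipped entirely.
 */
static bool update_applies_with_knob(const struct session_match_cfg *cfg,
				     const char *update_ifname)
{
	assert(cfg != NULL);

	if (!cfg->ignore_ifname && cfg->ifnamelen && update_ifname
	    && strcmp(cfg->ifname, update_ifname) != 0)
		return false;

	return true;
}
```

The default (flag unset) keeps today's behavior, so only peers that explicitly opt in would accept updates from a different interface.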

@mjstapp
Contributor

mjstapp commented Mar 27, 2024

The test is really just comparing interfaces. I think the code that tests the interface name is sort of internal; it's not really the right thing to expose in a config. Maybe the config should be ignore-interface or skip-interface-match?

I think you have just given the best argument for such a knob, namely the current lack of support for BFD on LAG. But there could also be other scenarios of asymmetric routing with multiple links that are not in a LAG. So a configuration knob like the following could benefit multiple usecases:

bfd
   peer A.B.C.D ignore-interface-names

If "ignore-interface-names" is specified, the BFD implementation would simply skip the interface name check in the code that I quoted above.

@ton31337 ton31337 removed their assignment Mar 30, 2024
@ton31337 ton31337 added bfd enhancement and removed triage Needs further investigation labels Mar 30, 2024
@ton31337 ton31337 self-assigned this Jul 3, 2024
@ton31337
Member

ton31337 commented Jul 4, 2024

Could you check the latest master? We pushed one fix regarding BFD/BGP integration (when the session is down/admin down).

@tera2603

How can this issue be reproduced? Please provide the configuration steps.

@HareshKhandelwal

Hi @ton31337, unfortunately I don't have the environment right now and have moved on to a different project.
@tera2603 The steps are very simple. Below is what I did.

My config is:
router bgp 64999
bgp router-id 192.168.80.115
bgp log-neighbor-changes
no bgp ebgp-requires-policy
no bgp suppress-duplicates
no bgp hard-administrative-reset
no bgp default ipv4-unicast
bgp graceful-shutdown
no bgp graceful-restart notification
no bgp network import-check
neighbor uplink peer-group
neighbor uplink remote-as internal
neighbor uplink bfd
neighbor uplink bfd profile tripleo
neighbor uplink ttl-security hops 1
neighbor eno2 interface peer-group uplink

My use case here is very simple: an unnumbered iBGP session between two peers connected via an uplink switch in the same VLAN.
eno2 (a physical interface) is a plain Ethernet interface.

I brought BFD down on purpose here. However, bringing BFD up or down does not affect the BGP session.

Thanks

@ton31337 ton31337 removed their assignment Jul 30, 2024
@ton31337
Member

ton31337 commented Aug 7, 2024

It should be.

@ton31337 ton31337 closed this as completed Nov 9, 2024