ipv6 route-map not marking correct routes as a match #15274

Open
2 tasks done
mruprich opened this issue Feb 1, 2024 · 19 comments
Labels
triage Needs further investigation

Comments

@mruprich
Contributor

mruprich commented Feb 1, 2024

Describe the bug
I am using an IPv6 prefix-list in a route-map, and the route-map then uses set src to choose the source address for outgoing packets. I believe that FRR is not matching the outgoing IPv6 prefixes correctly. Current configuration:

frr version 8.5.3
frr defaults traditional
hostname test-host
log file /var/log/frr/frr.log debugging
log timestamp precision 3
no ip forwarding
no ipv6 forwarding
!
debug zebra events
debug zebra packet
debug zebra kernel
debug zebra rib
debug zebra dplane
debug bgp updates in
debug bgp updates out
!
debug route-map
!
router bgp 1111111111
 bgp router-id 192.168.0.1
 bgp log-neighbor-changes
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 neighbor 1111:1111:1111:1a3a::15 remote-as external
 neighbor 1111:1111:1111:1a3a::15 timers 1 3
 neighbor 1111:1111:1111:1a3a::15 timers connect 10
 neighbor 1111:1111:1111:1a3b::15 remote-as external
 neighbor 1111:1111:1111:1a3b::15 timers 1 3
 neighbor 1111:1111:1111:1a3b::15 timers connect 10
 neighbor 1111:1111:1111:1a3e::15 remote-as external
 neighbor 1111:1111:1111:1a3e::15 timers 1 3
 neighbor 1111:1111:1111:1a3e::15 timers connect 10
 neighbor 1111:1111:1111:1a3f::15 remote-as external
 neighbor 1111:1111:1111:1a3f::15 timers 1 3
 neighbor 1111:1111:1111:1a3f::15 timers connect 10
 !
 address-family ipv6 unicast
  redistribute connected
  neighbor 1111:1111:1111:1a3a::15 activate
  neighbor 1111:1111:1111:1a3a::15 soft-reconfiguration inbound
  neighbor 1111:1111:1111:1a3a::15 route-map route-for-internal out
  neighbor 1111:1111:1111:1a3b::15 activate
  neighbor 1111:1111:1111:1a3b::15 soft-reconfiguration inbound
  neighbor 1111:1111:1111:1a3b::15 route-map route-for-internal out
  neighbor 1111:1111:1111:1a3e::15 activate
  neighbor 1111:1111:1111:1a3e::15 soft-reconfiguration inbound
  neighbor 1111:1111:1111:1a3e::15 route-map route-for-datasync out
  neighbor 1111:1111:1111:1a3f::15 activate
  neighbor 1111:1111:1111:1a3f::15 soft-reconfiguration inbound
  neighbor 1111:1111:1111:1a3f::15 route-map route-for-datasync out
 exit-address-family
exit
!
ipv6 prefix-list prefix-list-for-datasync seq 5 permit 1111:1111:1111:1001::/64 le 128
ipv6 prefix-list prefix-list-for-internal seq 5 permit 1111:1111:1111:1000::/64 le 128
ipv6 prefix-list prefix-list-for-internal seq 10 permit ::/0 le 128
!
no route-map set_src_lo optimization
!
route-map set_src_lo permit 1
 match ipv6 address prefix-list prefix-list-for-datasync
 set src 1111:1111:1111:1001::3:e003
exit
!
route-map set_src_lo permit 5
 match ipv6 address prefix-list prefix-list-for-internal
 set src 1111:1111:1111:1000::3:e003
exit
!
route-map set_src_lo permit 10
exit
!
route-map route-for-datasync permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set metric 100
exit
!
route-map route-for-datasync permit 20
 set metric 200
exit
!
route-map route-for-internal permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set metric 200
exit
!
route-map route-for-internal permit 20
 set metric 100
exit
!
ipv6 protocol bgp route-map set_src_lo
!

As you can see, I have two IPv6 prefix lists: prefix-list-for-datasync for prefix 1111:1111:1111:1001::/64 le 128, and prefix-list-for-internal for prefix 1111:1111:1111:1000::/64 le 128 plus a catch-all ::/0 le 128. The route-map should set the source address based on these prefix lists: anything from 1111:1111:1111:1001::/64 le 128 should get the source 1111:1111:1111:1001::3:e003, and everything else should get 1111:1111:1111:1000::3:e003. The route-map output looks good to me:

# sh route-map set_src_lo
ZEBRA:
route-map: set_src_lo Invoked: 771 Optimization: enabled Processed Change: false
 permit, sequence 1 Invoked 40
  Match clauses:
    ipv6 address prefix-list prefix-list-for-datasync   <- sequence 1 for 1111:1111:1111:1001::/64 le 128
  Set clauses:
    src 1111:1111:1111:1001::3:e003
  Call clause:
  Action:
    Exit routemap
 permit, sequence 5 Invoked 731
  Match clauses:
    ipv6 address prefix-list prefix-list-for-internal <- anything else should fall here
  Set clauses:
    src 1111:1111:1111:1000::3:e003
  Call clause:
  Action:
    Exit routemap
 permit, sequence 10 Invoked 0
  Match clauses:
  Set clauses:
  Call clause:
  Action:
    Exit routemap
BGP:
route-map: set_src_lo Invoked: 0 Optimization: enabled Processed Change: false
 permit, sequence 1 Invoked 0
  Match clauses:
    ipv6 address prefix-list prefix-list-for-datasync
  Set clauses:
  Call clause:
  Action:
    Exit routemap
 permit, sequence 5 Invoked 0
  Match clauses:
    ipv6 address prefix-list prefix-list-for-internal
  Set clauses:
  Call clause:
  Action:
    Exit routemap
 permit, sequence 10 Invoked 0
  Match clauses:
  Set clauses:
  Call clause:
  Action:
    Exit routemap

But, when I try to ping 1111:1111:1111:1001::3:e001, the address from sequence 5 is used instead of sequence 1:

vlan2009 Out IP6 1111:1111:1111:1000::3:e003 > 1111:1111:1111:1001::3:e001: ICMP6, echo request, id 58, seq 7, length 64
vlan2008 In  IP6 1111:1111:1111:1001::3:e001 > 1111:1111:1111:1000::3:e003: ICMP6, echo reply, id 58, seq 7, length 64

The log says that the first sequence is not a match:

Best match route-map: set_src_lo, sequence: 1 for pfx: 1111:1111:1111:1001::3:e001/128, result: no match  <- sequence 1 should match
Route-map: set_src_lo, sequence: 5, prefix: 1111:1111:1111:1001::3:e001/128, result: match
Route-map: set_src_lo, prefix: 1111:1111:1111:1001::3:e001/128, result: permit

I would really appreciate any insight with this issue. I think the configuration is OK but I might be mistaken in the route-map/prefix-lists usage.

  • Did you check if this is a duplicate issue?
  • Did you test it on the latest FRRouting/frr master branch?

I did not test this on the master branch, unfortunately I can't do that at the moment. Using FRR-8.5.3 now.

Expected behavior
I would expect that the first sequence in the route-map would get a hit at this point.
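
To make the expectation concrete, here is a minimal Python sketch of first-match route-map evaluation over the two prefix lists from the configuration above. It is only a model of the intended behavior, not FRR code: the `ROUTE_MAP` table, `rule_matches`, and `expected_src` are hypothetical names introduced for illustration, and the `le` handling assumes the usual prefix-list meaning (the route must lie inside the entry and its length must not exceed `le`).

```python
import ipaddress

# Hypothetical model of the set_src_lo route-map from the configuration above.
# Each entry: (sequence, [(prefix-list network, le), ...], source address to set).
ROUTE_MAP = [
    (1, [("1111:1111:1111:1001::/64", 128)], "1111:1111:1111:1001::3:e003"),
    (5, [("1111:1111:1111:1000::/64", 128), ("::/0", 128)],
     "1111:1111:1111:1000::3:e003"),
]


def rule_matches(route, network, le):
    """True if route lies inside network and its length is within [len(network), le]."""
    net = ipaddress.ip_network(network)
    return route.subnet_of(net) and net.prefixlen <= route.prefixlen <= le


def expected_src(prefix):
    """Return (sequence, src) of the first entry whose prefix list permits prefix."""
    route = ipaddress.ip_network(prefix)
    for seq, rules, src in ROUTE_MAP:
        if any(rule_matches(route, network, le) for network, le in rules):
            return seq, src
    return None  # would fall through to the empty "permit 10" entry, no src set


print(expected_src("1111:1111:1111:1001::3:e001/128"))  # (1, '1111:1111:1111:1001::3:e003')
print(expected_src("1111:1111:1111:1000::3:d001/128"))  # (5, '1111:1111:1111:1000::3:e003')
```

Under this model the datasync host route is claimed by sequence 1 before sequence 5 is ever consulted, which is the behavior the report expects but does not observe.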

Versions

  • OS Version: RHEL-9.2.0
  • Kernel: 5.14.0-284.11.1
  • FRR Version: 8.5.3

EDIT: I had to edit the IP addresses a bit so that they are a bit more anonymous. Sorry.

@mruprich mruprich added the triage Needs further investigation label Feb 1, 2024
@mruprich
Contributor Author

mruprich commented Feb 1, 2024

Would it make sense to try 1c950f3 and cc09ba4 ?

@takesaito

Hello,

I ran into this issue, and mruprich is helping me with it.

It seems that the "consolidated result of func_apply" logic [1] does not work correctly in this environment.
"MATCH" was expected, but the result seems to be "NOMATCH".
I do not know whether this is a bug or not.

[1]  https://github.com/FRRouting/frr/blob/frr-8.5.3/lib/routemap.c#L1691-L1766

@donaldsharp
Member

Can you turn off optimization for the route-map?
no route-map set_src_lo optimization

and see if the problem goes away?

@mruprich
Contributor Author

mruprich commented Feb 6, 2024

Hi Donald, I think we tried that at some point but the result was even more strange (I do not have the log for that at the moment but I can get it):

14:22:17.247058 vlan2004 In  IP6 1111:1111:1111:1000::1:f002 > 1111:1111:1111:1000::3:e003: ICMP6, echo request, id 44852, seq 7, length 64
14:22:17.247084 vlan2005 Out IP6 1111:1111:1111:1000::3:e003 > 1111:1111:1111:1000::1:f002: ICMP6, echo reply, id 44852, seq 7, length 64
14:22:17.473321 vlan2009 Out IP6 1111:1111:1111:1a3f::e003 > 1111:1111:1111:1001::3:e001: ICMP6, echo request, id 37281, seq 117, length 64
14:22:17.473377 vlan2009 In  IP6 1111:1111:1111:1001::3:e001 > 1111:1111:1111:1a3f::e003: ICMP6, echo reply, id 37281, seq 117, length 64
14:22:18.497297 vlan2009 Out IP6 1111:1111:1111:1a3f::e003 > 1111:1111:1111:1001::3:e001: ICMP6, echo request, id 37281, seq 118, length 64
14:22:18.497345 vlan2009 In  IP6 1111:1111:1111:1001::3:e001 > 1111:1111:1111:1a3f::e003: ICMP6, echo reply, id 37281, seq 118, length 64

The 1111:1111:1111:1a3f::e003 address was used instead, which is the IPv6 address of the actual outgoing interface. The route-map output looks very similar without optimization:

# sh route-map set_src_lo
ZEBRA:
route-map: set_src_lo Invoked: 764 Optimization: disabled Processed Change: false
 permit, sequence 1 Invoked 764
  Match clauses:
    ipv6 address prefix-list prefix-list-for-datasync
  Set clauses:
    src 1111:1111:1111:1001::3:e003
  Call clause:
  Action:
    Exit routemap
 permit, sequence 5 Invoked 764
  Match clauses:
    ipv6 address prefix-list prefix-list-for-internal
  Set clauses:
    src 1111:1111:1111:1000::3:e003
  Call clause:
  Action:
    Exit routemap
 permit, sequence 10 Invoked 0
  Match clauses:
  Set clauses:
  Call clause:
  Action:
    Exit routemap
BGP:
route-map: set_src_lo Invoked: 0 Optimization: disabled Processed Change: false
 permit, sequence 1 Invoked 0
  Match clauses:
    ipv6 address prefix-list prefix-list-for-datasync
  Set clauses:
  Call clause:
  Action:
    Exit routemap
 permit, sequence 5 Invoked 0
  Match clauses:
    ipv6 address prefix-list prefix-list-for-internal
  Set clauses:
  Call clause:
  Action:
    Exit routemap
 permit, sequence 10 Invoked 0
  Match clauses:
  Set clauses:
  Call clause:
  Action:
    Exit routemap

@takesaito

takesaito commented Feb 7, 2024

Hello Donald,

Thank you for your comment. As Michal said, they tried with "no route-map set_src_lo optimization".

With "no route-map set_src_lo optimization" configured:

- ping to "1111:111:1111:1001::3:e001" uses the source address "1111:111:1111:1a3f::e003"
  ("1111:111:1111:1a3f::e003" is the physical interface address, not a loopback address.
  This is not what I expected.)
- ping to "1111:111:1111:1000::3:d001" uses the source address "1111:111:1111:1000::3:e003"
  (This is what I expected.)
# tcpdump -nn -r ping.pcap | grep -i icmp
...
14:22:17.247058 vlan2004 In  IP6 1111:111:1111:1000::1:f002 > 1111:111:1111:1000::3:e003: ICMP6, echo request, id 44852, seq 7, length 64
14:22:17.247084 vlan2005 Out IP6 1111:111:1111:1000::3:e003 > 1111:111:1111:1000::1:f002: ICMP6, echo reply, id 44852, seq 7, length 64
14:22:17.473321 vlan2009 Out IP6 1111:111:1111:1a3f::e003 > 1111:111:1111:1001::3:e001: ICMP6, echo request, id 37281, seq 117, length 64
14:22:17.473377 vlan2009 In  IP6 1111:111:1111:1001::3:e001 > 1111:111:1111:1a3f::e003: ICMP6, echo reply, id 37281, seq 117, length 64

Without "no route-map set_src_lo optimization" (optimization enabled):

- ping to "1111:111:1111:1001::3:e001" uses the source address "1111:111:1111:1000::3:e003"
  (This is not what I expected.)
- ping to "1111:111:1111:1000::3:d001" uses the source address "1111:111:1111:1000::3:e003"
  (This is what I expected.)

# tcpdump -nn -r ping6.pcap | grep -i icmp
...
19:34:32.961330 vlan2009 Out IP6 1111:111:1111:1000::3:e003 > 1111:111:1111:1001::3:e001: ICMP6, echo request, id 40475, seq 23, length 64
19:34:32.961373 vlan2004 In  IP6 1111:111:1111:1001::3:e001 > 1111:111:1111:1000::3:e003: ICMP6, echo reply, id 40475, seq 23, length 64
...
19:35:37.153336 vlan2009 Out IP6 1111:111:1111:1000::3:e003 > 1111:111:1111:1000::3:d001: ICMP6, echo request, id 40478, seq 9, length 64
19:35:37.153382 vlan2004 In  IP6 1111:111:1111:1000::3:d001 > 1111:111:1111:1000::3:e003: ICMP6, echo reply, id 40478, seq 9, length 64

Here is what we would like to achieve.
We would like to use two loopback addresses, one per destination network.
For example,

  • Use loopback address A for communicating with network C.
  • Use loopback address B for communicating with network D.

Our current route-map configuration does not achieve this.
If there are any ideas for resolving this, such as changing the route-map configuration, could you please share them?

@takesaito

takesaito commented Feb 8, 2024

Hello

I have an update.

When the prefix-list entry is configured as a "/128", it works.
But if "/64 le 128" (an address range) is configured, it does not work.

I configured the "/128" prefix as below (*) and tested.

ipv6 prefix-list prefix-list-for-internal seq 5 permit ::/0
ipv6 prefix-list prefix-list-for-internal seq 10 permit 1111:111:1111:1000::/64 le 128
ipv6 prefix-list prefix-list-for-datasync seq 5 permit 1111:111:1111:1001::2:e002/128(*)

The results were what I expected.
Here are the ping results.

Ping to 1111:111:1111:1000::2:e001 (prefix-list-for-internal): source address "1111:111:1111:1000::3:e003" is used.
[root@test-host ~]# tcpdump -nn -r ping3.pcap | grep -i icmp
reading from file ping3.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
dropped privs to tcpdump
15:33:36.897331 vlan9004 Out IP6 1111:111:1111:1000::3:e003 > 1111:111:1111:1000::2:e001: ICMP6, echo request, id 4524, seq 12, length 64
15:33:36.897382 vlan9004 In  IP6 1111:111:1111:1000::2:e001 > 1111:111:1111:1000::3:e003: ICMP6, echo reply, id 4524, seq 12, length 64
15:33:37.921322 vlan9004 Out IP6 1111:111:1111:1000::3:e003 > 1111:111:1111:1000::2:e001: ICMP6, echo request, id 4524, seq 13, length 64
15:33:37.921362 vlan9004 In  IP6 1111:111:1111:1000::2:e001 > 1111:111:1111:1000::3:e003: ICMP6, echo reply, id 4524, seq 13, length 64
[root@test-host ~]#

Ping to 1111:111:1111:1001::2:e002 (prefix-list-for-datasync): source address "1111:111:1111:1001::3:e003" is used.
[root@test-host ~]# tcpdump -nn -r ping5.pcap | grep -i icmp
reading from file ping5.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
dropped privs to tcpdump
15:35:29.409317 vlan9009 Out IP6 1111:111:1111:1001::3:e003 > 1111:111:1111:1001::2:e002: ICMP6, echo request, id 4532, seq 9, length 64
15:35:29.409377 vlan9005 In  IP6 1111:111:1111:1001::2:e002 > 1111:111:1111:1001::3:e003: ICMP6, echo reply, id 4532, seq 9, length 64
15:35:30.433335 vlan9009 Out IP6 1111:111:1111:1001::3:e003 > 1111:111:1111:1001::2:e002: ICMP6, echo request, id 4532, seq 10, length 64
15:35:30.433396 vlan9005 In  IP6 1111:111:1111:1001::2:e002 > 1111:111:1111:1001::3:e003: ICMP6, echo reply, id 4532, seq 10, length 64
[root@test-host ~]#

My current conclusion: when "prefix-list-for-datasync" is configured as an address range (1111:111:1111:1001::/64 le 128), it does not work.
But when "prefix-list-for-datasync" is configured as "1111:111:1111:1001::2:e002/128", it works.

This seems to be a bug. What do you think?
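
For comparison, this is a small Python sketch of how a single prefix-list entry is generally expected to match, with and without `ge`/`le`. It is a model under the assumption that FRR follows the conventional semantics (an entry without `ge`/`le` requires an exact length match); `entry_matches` is a hypothetical helper, not an FRR function.

```python
import ipaddress


def entry_matches(route, entry, ge=None, le=None):
    """Model one prefix-list entry: containment plus a prefix-length check.

    Without ge/le the lengths must be equal; with them the route length
    must fall inside the given bounds (assumed conventional semantics).
    """
    route = ipaddress.ip_network(route)
    entry = ipaddress.ip_network(entry)
    if not route.subnet_of(entry):
        return False
    if ge is None and le is None:
        return route.prefixlen == entry.prefixlen
    low = ge if ge is not None else entry.prefixlen
    high = le if le is not None else route.max_prefixlen
    return low <= route.prefixlen <= high


HOST = "1111:111:1111:1001::2:e002/128"

# The exact /128 entry and the "/64 le 128" range entry should both permit the host route.
print(entry_matches(HOST, "1111:111:1111:1001::2:e002/128"))   # True
print(entry_matches(HOST, "1111:111:1111:1001::/64", le=128))  # True

# A bare /64 (no le) would only match the /64 itself, not host routes inside it.
print(entry_matches(HOST, "1111:111:1111:1001::/64"))          # False
```

If this model is right, both forms of the datasync entry should permit the host route, so the difference observed between them would not come from the prefix-list definition itself. The same model also suggests that `permit ::/0` without `le 128` matches only the default route, which may be worth keeping in mind when comparing the configurations in this thread.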

Here is the current configuration.

# cat /etc/frr/frr.conf
frr version 8.3.1
frr defaults traditional
hostname test-host
log file /var/log/frr/frr.log informational
log timestamp precision 3
no ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
line vty
!
router bgp 1111111111
 bgp router-id 172.15.0.1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 no bgp ebgp-requires-policy
 neighbor 1111:111:1111:1a3a::15 remote-as external
 neighbor 1111:111:1111:1a3a::15 timers 3 9
 neighbor 1111:111:1111:1a3a::15 timers connect 10
 neighbor 1111:111:1111:1a3b::15 remote-as external
 neighbor 1111:111:1111:1a3b::15 timers 3 9
 neighbor 1111:111:1111:1a3b::15 timers connect 10
 neighbor 1111:111:1111:1a3e::15 remote-as external
 neighbor 1111:111:1111:1a3e::15 timers 3 9
 neighbor 1111:111:1111:1a3e::15 timers connect 10
 neighbor 1111:111:1111:1a3f::15 remote-as external
 neighbor 1111:111:1111:1a3f::15 timers 3 9
 neighbor 1111:111:1111:1a3f::15 timers connect 10
 !
 address-family ipv6 unicast
  redistribute connected
  neighbor 1111:111:1111:1a3a::15 activate
  neighbor 1111:111:1111:1a3a::15 soft-reconfiguration inbound
  neighbor 1111:111:1111:1a3a::15 route-map route-for-internal out
  neighbor 1111:111:1111:1a3a::15 route-map route-for-internal in
  neighbor 1111:111:1111:1a3b::15 activate
  neighbor 1111:111:1111:1a3b::15 soft-reconfiguration inbound
  neighbor 1111:111:1111:1a3b::15 route-map route-for-internal out
  neighbor 1111:111:1111:1a3b::15 route-map route-for-internal in
  neighbor 1111:111:1111:1a3e::15 activate
  neighbor 1111:111:1111:1a3e::15 soft-reconfiguration inbound
  neighbor 1111:111:1111:1a3e::15 route-map route-for-datasync out
  neighbor 1111:111:1111:1a3e::15 route-map route-for-datasync in
  neighbor 1111:111:1111:1a3f::15 activate
  neighbor 1111:111:1111:1a3f::15 soft-reconfiguration inbound
  neighbor 1111:111:1111:1a3f::15 route-map route-for-datasync out
  neighbor 1111:111:1111:1a3f::15 route-map route-for-datasync in
 exit-address-family
exit
!
ipv6 prefix-list prefix-list-for-internal seq 5 permit ::/0
ipv6 prefix-list prefix-list-for-internal seq 10 permit 1111:111:1111:1000::/64 le 128
ipv6 prefix-list prefix-list-for-datasync seq 5 permit 1111:111:1111:1001::2:e002/128
!
route-map route-for-datasync permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set metric 100
exit
!
route-map route-for-datasync permit 20
 set metric 200
exit
!
route-map route-for-internal permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set metric 200
exit
!
route-map route-for-internal permit 20
 set metric 100
exit
!
route-map set_src_lo permit 1
 match ipv6 address prefix-list prefix-list-for-datasync
 set src 1111:111:1111:1001::3:e003
exit
!
route-map set_src_lo permit 5
 match ipv6 address prefix-list prefix-list-for-internal
 set src 1111:111:1111:1000::3:e003
exit
!
route-map set_src_lo permit 10
exit
!
ipv6 protocol bgp route-map set_src_lo

@mruprich
Contributor Author

mruprich commented Feb 9, 2024

Just a note, I've been able to reproduce this with FRR-9.1 as well.

@mruprich
Contributor Author

mruprich commented Feb 9, 2024

One more note, I can't reproduce this for IPv4, only for IPv6 prefixes...

@mruprich
Contributor Author

mruprich commented Feb 9, 2024

It seems to me that the problem is actually with the set src clause. I've been experimenting with the setup, and the matched sequences look right in the debug log. However, with the config below, I tried switching around the sequence numbers of the route-map statements, and at one point I was getting the source 'fd00:268:70fd:1001::3:e003' for every prefix and could not get the 'fd00:268:70fd:1000::3:e003' source at all:

16:17:49.234082 enp1s0 Out IP6 fd00:268:70fd:1001::3:e003 > fd00:268:70fd:1001::3:e001: ICMP6, echo request, id 46, seq 1, length 64
16:17:50.278395 enp1s0 Out IP6 fd00:268:70fd:1001::3:e003 > fd00:268:70fd:1001::3:e001: ICMP6, echo request, id 46, seq 2, length 64
16:17:53.779028 enp1s0 Out IP6 fd00:268:70fd:1001::3:e003 > fd00:268:70fd:1000::3:e001: ICMP6, echo request, id 47, seq 1, length 64
16:17:54.822385 enp1s0 Out IP6 fd00:268:70fd:1001::3:e003 > fd00:268:70fd:1000::3:e001: ICMP6, echo request, id 47, seq 2, length 64
16:18:12.508780 enp1s0 Out IP6 fd00:268:70fd:1001::3:e003 > fd00:268:70fd:1000::3:e001: ICMP6, echo request, id 48, seq 1, length 64

I tried to restart FRR and the set src switched to 'fd00:268:70fd:1000::3:e003' again for every prefix :D

16:19:03.445311 enp1s0 Out IP6 fd00:268:70fd:1000::3:e003 > fd00:268:70fd:1001::3:e001: ICMP6, echo request, id 50, seq 1, length 64
16:19:04.454469 enp1s0 Out IP6 fd00:268:70fd:1000::3:e003 > fd00:268:70fd:1001::3:e001: ICMP6, echo request, id 50, seq 2, length 64
16:19:08.285678 enp1s0 Out IP6 fd00:268:70fd:1000::3:e003 > fd00:268:70fd:1000::3:e001: ICMP6, echo request, id 51, seq 1, length 64
16:19:09.318368 enp1s0 Out IP6 fd00:268:70fd:1000::3:e003 > fd00:268:70fd:1000::3:e001: ICMP6, echo request, id 51, seq 2, length 64

Current reproducible config below:

frr version 9.1
frr defaults traditional
hostname host
log file /var/log/frr/frr.log
no ip forwarding
no ipv6 forwarding
!
debug bgp neighbor-events
debug bgp updates in
debug bgp updates out
debug bgp zebra
!
debug route-map
!
router bgp 65000
 no bgp ebgp-requires-policy
 neighbor 10.1.217.36 remote-as external
 neighbor 10.1.217.36 ebgp-multihop
 !
 address-family ipv6 unicast
  neighbor 10.1.217.36 activate
 exit-address-family
exit
!
ip prefix-list prefix-list-for-datasync-ipv4 seq 1 permit 192.168.100.0/24 le 32
ip prefix-list prefix-list-for-internal-ipv4 seq 1 permit 192.168.200.0/24 le 32
ip prefix-list prefix-list-for-internal-ipv4 seq 10 permit 0.0.0.0/0
!
ipv6 prefix-list prefix-list-for-datasync seq 5 permit fd00:268:70fd:1001::/64 le 128
ipv6 prefix-list prefix-list-for-internal seq 15 permit fd00:268:70fd:1000::/64 le 128
ipv6 prefix-list prefix-list-for-internal seq 20 permit ::/0
!
route-map set_src_lo permit 5
 match ipv6 address prefix-list prefix-list-for-internal
 set src fd00:268:70fd:1000::3:e003
exit
!
route-map set_src_lo permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set src fd00:268:70fd:1001::3:e003
exit
!
route-map set_src_lo permit 1000
exit
!
route-map set_src_lo_ip4 permit 1
 match ip address prefix-list prefix-list-for-datasync-ipv4
 set src 192.168.10.1
exit
!
route-map set_src_lo_ip4 permit 5
 match ip address prefix-list prefix-list-for-internal-ipv4
 set src 192.168.20.1
exit
!
route-map set_src_lo_ip4 permit 10
exit
!
ip protocol bgp route-map set_src_lo_ip4
!
ipv6 protocol bgp route-map set_src_lo
!
end

The log for the config above regarding route-maps seems ok to me:

2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo_ip4, sequence: 1 for pfx: 192.168.100.0/24, result: match
2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo_ip4, prefix: 192.168.100.0/24, result: permit
2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo_ip4, sequence: 5 for pfx: 192.168.200.0/24, result: match
2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo_ip4, prefix: 192.168.200.0/24, result: permit
2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, sequence: 5 for pfx: fd00:268:70fd:1000::3:e001/128, result: match
2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo, prefix: fd00:268:70fd:1000::3:e001/128, result: permit
2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, sequence: 10 for pfx: fd00:268:70fd:1001::3:e001/128, result: match
2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo, prefix: fd00:268:70fd:1001::3:e001/128, result: permit

This is why I think that the main issue might be with the applying of the set src rule.
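
To check a longer log the same way, a short Python sketch like the following could tabulate which sequence matched each prefix. The regular expression is based only on the "Best match route-map" lines quoted above; `BEST_MATCH` and `matched_sequences` are hypothetical names, and other FRR versions may format these lines differently.

```python
import re

# Pattern derived from the "Best match route-map" debug lines quoted above.
BEST_MATCH = re.compile(
    r"Best match route-map: (?P<rmap>[^,\s]+), sequence: (?P<seq>\d+) "
    r"for pfx: (?P<pfx>[^,\s]+), result: (?P<result>no match|match)"
)


def matched_sequences(lines):
    """Map each prefix to the (route-map, sequence, result) of its best-match line."""
    found = {}
    for line in lines:
        m = BEST_MATCH.search(line)
        if m:
            found[m["pfx"]] = (m["rmap"], int(m["seq"]), m["result"])
    return found


log = [
    "2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, "
    "sequence: 5 for pfx: fd00:268:70fd:1000::3:e001/128, result: match",
    "2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo, "
    "prefix: fd00:268:70fd:1000::3:e001/128, result: permit",
    "2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, "
    "sequence: 10 for pfx: fd00:268:70fd:1001::3:e001/128, result: match",
]

for pfx, (rmap, seq, result) in matched_sequences(log).items():
    print(f"{pfx}: {rmap} seq {seq} -> {result}")
```

For the lines above this reports sequence 5 for the 1000:: host route and sequence 10 for the 1001:: host route, which agrees with the configured prefix lists.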

@takesaito

takesaito commented Feb 13, 2024

Thank you Michal for your support.

>I've been trying to switch around the sequence numbers of the route-map statements and at one point I was getting the source 'fd00:268:70fd:1001::3:e003' for every prefix and I was not able to get the 'fd00:268:70fd:1000::3:e003' prefix there:
>I tried to restart FRR and the set src switched to 'fd00:268:70fd:1000::3:e003' again for every prefix :D

>The log for the config above regarding route-maps seems ok to me:

I understood that

  • It was reproduced in your environment when the prefix is configured as a range.
  • The source address changes each time the FRR service is restarted.
  • The route-map matching seems OK: "result: no match" does not occur, but the source address is not replaced as the configuration describes, so it does not work as expected (*1).
(*1)
route-map set_src_lo permit 5
 match ipv6 address prefix-list prefix-list-for-internal
 set src fd00:268:70fd:1000::3:e003
exit
!
route-map set_src_lo permit 10
 match ipv6 address prefix-list prefix-list-for-datasync
 set src fd00:268:70fd:1001::3:e003
exit
!
route-map set_src_lo permit 1000
exit
!

>2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, sequence: 5 for pfx: fd00:268:70fd:1000::3:e001/128, result: match
>2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo, prefix: fd00:268:70fd:1000::3:e001/128, result: permit
>2024/02/09 16:32:08 ZEBRA: [MT1SJ-WEJQ1] Best match route-map: set_src_lo, sequence: 10 for pfx: fd00:268:70fd:1001::3:e001/128, result: match
>2024/02/09 16:32:08 ZEBRA: [H5AW4-JFYQC] Route-map: set_src_lo, prefix: fd00:268:70fd:1001::3:e001/128, result: permit
>This is why I think that the main issue might be with the applying of the set src rule.

I understand the situation as well.

Does anyone have an idea for a solution to this issue, or a workaround?

@takesaito

Hello

Are there any ideas for a solution to this?
It seems that the "set src" clause of the route-map misbehaves for IPv6, but I do not know how to resolve this.
I am stuck.

@mruprich
Contributor Author

mruprich commented Mar 6, 2024

Just adding another piece of information. It seems that zebra may be to blame here, setting the wrong src in the routing table?

# ip -6 route
...
fd00:268:70fd:1000::3:e001 nhid 91 via fe80::d28e:79ff:feb7:a08e dev eno1 proto bgp src fd00:268:70fd:1001::3:e003 metric 20 pref medium
fd00:268:70fd:1000::3:e003 dev lo proto kernel metric 256 pref medium
fd00:268:70fd:1001::3:e001 nhid 91 via fe80::d28e:79ff:feb7:a08e dev eno1 proto bgp src fd00:268:70fd:1001::3:e003 metric 20 pref medium
fd00:268:70fd:1001::3:e003 dev lo proto kernel metric 256 pref medium
...

On the other hand, ipv4 table looks fine (taking this from the last example here):

# ip r
192.168.100.0/24 nhid 142 via 10.1.217.128 dev eno1 proto bgp src 192.168.10.1 metric 20 
192.168.200.0/24 nhid 144 via 10.1.217.128 dev eno1 proto bgp src 192.168.20.1 metric 20

Any pointers as to where could the problem be? Did anyone have any time to take a look at this?
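
For anyone trying to reproduce this, a quick way to compare the installed src values against the intended ones is a small Python sketch like the one below. It only parses `ip -6 route` text of the shape shown above; `route_sources` and the `expected` table are hypothetical and would need adjusting to the local setup.

```python
def route_sources(ip_route_output):
    """Map destination -> 'src' value for every route line that carries one."""
    sources = {}
    for line in ip_route_output.splitlines():
        fields = line.split()
        if "src" in fields[1:]:
            idx = fields.index("src", 1)
            if idx + 1 < len(fields):
                sources[fields[0]] = fields[idx + 1]
    return sources


# Output copied from the "ip -6 route" listing above.
output = """\
fd00:268:70fd:1000::3:e001 nhid 91 via fe80::d28e:79ff:feb7:a08e dev eno1 proto bgp src fd00:268:70fd:1001::3:e003 metric 20 pref medium
fd00:268:70fd:1000::3:e003 dev lo proto kernel metric 256 pref medium
fd00:268:70fd:1001::3:e001 nhid 91 via fe80::d28e:79ff:feb7:a08e dev eno1 proto bgp src fd00:268:70fd:1001::3:e003 metric 20 pref medium
fd00:268:70fd:1001::3:e003 dev lo proto kernel metric 256 pref medium
"""

# Intended source per destination, taken from the route-map in the config above.
expected = {
    "fd00:268:70fd:1000::3:e001": "fd00:268:70fd:1000::3:e003",
    "fd00:268:70fd:1001::3:e001": "fd00:268:70fd:1001::3:e003",
}

installed = route_sources(output)
mismatches = {
    dst: (installed.get(dst), want)
    for dst, want in expected.items()
    if installed.get(dst) != want
}
print(mismatches)
```

With the listing above, only the fd00:268:70fd:1000::3:e001 route is reported as a mismatch: it carries the datasync source instead of the internal one.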

@mruprich
Contributor Author

mruprich commented Mar 7, 2024

I tried to take a look at the netlink messages that are being sent from zebra to the kernel and zebra sets the wrong source there:

          {nlmsg_len=84, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_REQUEST|NLM_F_CREATE, nlmsg_seq=852, nlmsg_pid=-238531476},
          {rtm_family=AF_INET6, rtm_dst_len=128, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_BGP, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0},
            [
[
                {nla_len=20, nla_type=RTA_DST},
                inet_pton(AF_INET6, "fd00:268:70fd:1000::3:e001")
              ],
              [
                {nla_len=8, nla_type=RTA_PRIORITY},
                20
              ],
              [
                {nla_len=8, nla_type=RTA_NH_ID},
                "\xb2\x00\x00\x00"
              ],
              [
                {nla_len=20, nla_type=RTA_PREFSRC},
                inet_pton(AF_INET6, "fd00:268:70fd:1000::3:e003")
              ]
            ]
          ],
[
            {nlmsg_len=84, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_REQUEST|NLM_F_CREATE, nlmsg_seq=853, nlmsg_pid=-238531476},
            {rtm_family=AF_INET6, rtm_dst_len=128, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_BGP, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0},
              [
                [
                  {nla_len=20, nla_type=RTA_DST},
                  inet_pton(AF_INET6, "fd00:268:70fd:1001::3:e001")
                ],
                [
                  {nla_len=8, nla_type=RTA_PRIORITY},
                  20
                ],
                [
                  {nla_len=8, nla_type=RTA_NH_ID},
                  "\xb2\x00\x00\x00"
                ],
                [
                  {nla_len=20, nla_type=RTA_PREFSRC},
                  inet_pton(AF_INET6, "fd00:268:70fd:1000::3:e003")
                ]
              ]
            ]
          ],
  iov_len=288}],

The RTA_PREFSRC passed to the kernel is the same in both cases. I am just not able to find out whether bgp is passing wrong info to zebra or zebra is passing wrong info to the kernel. From the output further above, it seems to me that the route-map is actually populated correctly, so the culprit must be somewhere between bgp <-> zebra or zebra <-> kernel.

I think I have already tried every possible debug option that I could think of, if you have any other debug options that might help, please let me know. Currently I have these:
!
debug zebra packet
debug zebra kernel msgdump send
debug zebra kernel
debug zebra nexthop detail
debug zebra neigh
debug bgp neighbor-events
debug bgp updates in
debug bgp updates out
debug bgp zebra prefix fd00:268:70fd:1001::3:e001/128
debug bgp zebra prefix fd00:268:70fd:1000::3:e001/128
!
debug route-map

@mruprich
Contributor Author

Hi,
I am really sorry to bother you with this, but could anyone spare a moment and take a look here? I am really stuck; I think I've tried every debug option that is relevant. I am now trying to figure out where the source address is taken from when zebra passes the route to the kernel, but without someone who knows the code better, this is really complicated.

Thanks and Regards,
Michal Ruprich

@mruprich
Contributor Author

Hi,
@donaldsharp could I trouble you for any assistance here, please?

@takesaito

Hello @donaldsharp,

I am sorry to bother you. Would you please take a look at this issue and give me your opinion?
I need your help.

@mruprich
Contributor Author

I tested it again on the recently released 10.0 version so I marked the 'Did you test it on the latest FRRouting/frr master branch?' as true.

@donaldsharp I would like to ask again if there is any pointer you could give us, any debug option we haven't used yet, any config option that might be helpful, anything?

@takesaito

Hello @donaldsharp

I have an update for this.

If I match on the interface name instead of "ipv6 address prefix-list", it works as expected (*).
Both source addresses are used individually, as configured.

(*)
route-map set_src_lo permit 1
 match interface vlan9994
 set src 1111:1111:1111:1000::3:e003
exit
!
route-map set_src_lo permit 5
 match interface vlan9995
 set src 1111:1111:1111:1000::3:e003
exit
!
route-map set_src_lo permit 10
 match interface vlan9998
 set src 1111:1111:1111:1001::3:e003
exit
!
route-map set_src_lo permit 15
 match interface vlan9999
 set src 1111:1111:1111:1001::3:e003
exit

# tcpdump -nn -r ping.pcap | grep -i icmp | grep -i out
reading from file ping.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
dropped privs to tcpdump
15:37:01.781475 vlan9995 Out IP6 1111:1111:1111:1000::3:e003 > 1111:1111:1111:1000::2:f002: ICMP6, echo reply, id 23600, seq 5, length 64
15:37:01.963406 vlan9999 Out IP6 1111:1111:1111:1001::3:e003 > 1111:1111:1111:1000::3:e005: ICMP6, echo request, id 26558, seq 7, length 64
15:37:01.980383 vlan9999 M   IP6 fe80::9aa2:c0ff:fe5d:3e6e > ff02::1: ICMP6, router advertisement, length 32
15:37:02.781489 vlan9005 Out IP6 1111:1111:1111:1000::3:e003 > 1111:1111:1111:1000::2:f002: ICMP6, echo reply, id 23600, seq 12, length 64
15:37:02.987417 vlan9999 Out IP6 1111:1111:1111:1001::3:e003 > 1111:1111:1111:1000::3:e005: ICMP6, echo request, id 26558, seq 8, length 64
# 

It seems to be an issue with the "ipv6 address prefix-list" parameter. What do you think of it?

@mruprich
Contributor Author

Hi, could anyone please give us any pointers as to what is missing in the bug? Is there some debug info that we did not provide? Any configuration that is crucial to the debugging process?


3 participants