FRR >=9.0.0 crashes upon show rpki $prefix #14646

Closed
1 of 2 tasks
ichdasich opened this issue Oct 24, 2023 · 12 comments
Labels
bgp triage Needs further investigation
@ichdasich

ichdasich commented Oct 24, 2023

Describe the bug
FRR >= 9.0.0 crashes on Linux (vyos/Debian stable) upon show rpki prefix $prefix when RPKI is consumed via RTR.

This has been tested on FRR 9.0.0 and FRR 9.0.1

  • Did you check if this is a duplicate issue?
  • Did you test it on the latest FRRouting/frr master branch?

I have an strace for all daemons in case the attached crash log does not suffice.
messages.txt

Example


[email protected]:~$ vtysh

Hello, this is FRRouting (version 9.0).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

gw03# show rpki cache-connection
Connected to group 1
rpki tcp cache 195.191.196.5 323 pref 1 (connected)
rpki tcp cache 195.191.197.8 323 pref 2
gw03# show rpki prefix 38.0.0.0/8
vtysh: error reading from bgpd: Success (0)Warning: closing connection to bgpd because of an I/O error!
gw03# 

To Reproduce

  1. Enable RTR/RPKI
  2. Consume a full table
  3. Check RTR is enabled with show rpki cache-connection
  4. Run show rpki prefix 192.0.2.0/24
  5. FRR crashes
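
The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: it presumes RPKI is already configured and in sync, it treats the I/O error text from the Example as the crash indicator, and `VTYSH` and `check_rpki_prefix` are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at a different vtysh binary.
VTYSH="${VTYSH:-vtysh}"

# check_rpki_prefix PREFIX
# Runs the two show commands from the reproduction steps in one vtysh call.
# Prints "bgpd-io-error" and returns 1 if the output contains the
# "closing connection to bgpd" warning; otherwise prints "ok" and returns 0.
check_rpki_prefix() {
    out=$("$VTYSH" -c "show rpki cache-connection" -c "show rpki prefix $1" 2>&1)
    case "$out" in
        *"closing connection to bgpd"*)
            echo "bgpd-io-error"
            return 1
            ;;
        *)
            echo "ok"
            return 0
            ;;
    esac
}
```

For example, `check_rpki_prefix 192.0.2.0/24` could be run once RTR is connected; a non-zero return suggests vtysh lost its connection to bgpd.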

Expected behavior
FRR does not crash when executing show rpki prefix

Versions

  • OS Version: Debian 12 / VyOS
  • Kernel: 6.1.50
  • FRR Version: 9.0.0 and 9.0.1 tested

ichdasich added the triage (Needs further investigation) label Oct 24, 2023
ton31337 added the bgp label Oct 25, 2023
ton31337 self-assigned this Oct 25, 2023

@donaldsharp
Member

This command is not crashing for me when I issue it. Can you give us the decode of the actual crash you are seeing?

@ichdasich
Author

What do you mean by the decode? The strace output? Or did you miss the FRR trace attached to the ticket (messages.txt)?

@ichdasich
Author

(If you want, I can also whip up a test box and give you a login, btw.)

@donaldsharp
Member

I totally missed the messages.txt file. In any event, it's crashing in a librtr function, straight out of the CLI command, which makes this harder to debug remotely from my perspective since I cannot reproduce it. Having access would allow me to look at this more closely.

@ton31337
Member

What is the librtr version installed?

@ichdasich
Author

@donaldsharp let me get a box up for you. Can I just use your GH SSH key?

@ichdasich
Author

One system (running 9.0.0) has:

ii  frr-rpki-rtrlib                      9.0-23-g9d1ee3ade                amd64        FRRouting suite - BGP RPKI support (rtrlib)
ii  libqrtr-glib0:amd64                  1.2.2-1                          amd64        Support library to use the QRTR protocol
ii  librtr0:amd64                        0.8.0                            amd64        Small extensible RPKI-RTR-Client C library.

The other one (9.0.1) has:

$ dpkg -l | grep rtr
ii  frr-rpki-rtrlib                      9.0.1-51-g8a6b43262              amd64        FRRouting suite - BGP RPKI support (rtrlib)
ii  libqrtr-glib0:amd64                  1.2.2-1                          amd64        Support library to use the QRTR protocol
ii  librtr0:amd64                        0.8.0                            amd64        Small extensible RPKI-RTR-Client C library.
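
When comparing hosts, the relevant versions can be pulled out of `dpkg -l` output with a small filter. This is a sketch that assumes the column layout shown above; `rpki_pkg_versions` is a hypothetical helper name. It keeps only frr-rpki-rtrlib and librtr*, since libqrtr-glib0 matches `grep rtr` by name only (its description says it is for the QRTR protocol).

```shell
#!/bin/sh
# rpki_pkg_versions: read `dpkg -l` output on stdin and print
# "package version" for installed (ii) RPKI-related packages only.
rpki_pkg_versions() {
    awk '$1 == "ii" && ($2 ~ /^frr-rpki-rtrlib/ || $2 ~ /^librtr/) { print $2, $3 }'
}

# Usage:
#   dpkg -l | rpki_pkg_versions
```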

@ichdasich
Author

@donaldsharp ssh [email protected]

It is a VyOS system, but I pre-filled /etc/apt/sources.list (this will be gone after a reboot). Log in with your GitHub SSH key.

sudo works without a password, vtysh can be run as the vyos user.

The system currently ingests a full table and has two RPKI caches configured. After a reboot or an FRR restart you need to start RPKI manually (open VyOS bug):

vtysh -c 'rpki start'

At the moment, RPKI is in sync, and a simple vtysh -c 'show rpki prefix 195.191.197.0/24' should kill frr.

Feel free to install what you need and/or reboot the box. SSH/mgmt is in its own VRF, so you also cannot break anything if you fnord the default VRF. Please don't announce anything funny, and let me know when you no longer need the box. ;-)

@donaldsharp
Member

I went on the box and can confirm the crash. I need the FRR debug symbols installed, as well as the librtr0 debug symbols. I do not know what those packages are called on VyOS. I'm asking around to see if I can get some help here too.

ton31337 removed their assignment Oct 26, 2023
ton31337 added this to the 9.1 milestone Oct 26, 2023

@ichdasich
Author

Moin,
I am rolling my own builds. I will take a look at whether I can build the debug symbols; this might upgrade the box to 9.0.1, though.

In parallel, I can see whether I can reproduce this on a more standard system.

@ichdasich
Author

I have now built a new image with 9.0.1-61-gd5d6be1d8, and FRR no longer crashes, as far as I can tell. The previously failing image was running 9.0.1-51-g8a6b43262.

I suspect that this is either a) something specific to the VyOS builds of FRR that does not occur with self-built packages, or b) something that was (accidentally) fixed between 9.0.1-51-g8a6b43262 and 9.0.1-61-gd5d6be1d8.

I am somewhat at a loss, but I guess this fixes it? Feel free to close the issue.

@donaldsharp
Member

sharpd@eva ~/frr3 (stable/9.0)> git log --oneline 8a6b432..d5d6be1
d5d6be1 (HEAD -> stable/9.0, origin/stable/9.0) Merge pull request #14654 from FRRouting/mergify/bp/stable/9.0/pr-14645
77e4046 bgpd: Check mandatory attributes more carefully for UPDATE message
d19727b bgpd: Handle MP_REACH_NLRI malformed packets with session reset
8d04bac Merge pull request #14638 from FRRouting/mergify/bp/stable/9.0/pr-14628
b7f3fa8 tests: Check if BGP conditional advertisement works fine with static routes
ccb10cd bgpd: Do not suppress conditional advertisement updates if triggered
7691067 Merge pull request #14622 from FRRouting/mergify/bp/stable/9.0/pr-14616
93e7fbe doc: add "enforce-first-as" to BGP doc
6a925a3 Merge pull request #14612 from FRRouting/mergify/bp/stable/9.0/pr-14607
8fa35b3 pim6d: valgrind issue fixes

None of these commits are related to the code in question.
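
The version suffixes quoted in this thread (for example 9.0.1-51-g8a6b43262) look like `git describe` output, where the part after `-g` is an abbreviated commit hash. Assuming that holds, the range used in the `git log` above can be derived from the two package versions, and the log can be narrowed to specific files. This is a sketch: `describe_hash` and `describe_range` are hypothetical helper names, and `bgpd/bgp_rpki.c` as the path of interest is an assumption, not something stated in the thread.

```shell
#!/bin/sh
# describe_hash VERSION
# Print the abbreviated commit hash from a git-describe style string
# ("<tag>-<n>-g<hash>"); print nothing if there is no "-g<hash>" suffix.
describe_hash() {
    printf '%s\n' "$1" | sed -n 's/.*-g\([0-9a-f]\{1,\}\)$/\1/p'
}

# describe_range OLD_VERSION NEW_VERSION
# Print "<old-hash>..<new-hash>" for use with git log; return 1 if
# either version string carries no hash.
describe_range() {
    old=$(describe_hash "$1")
    new=$(describe_hash "$2")
    [ -n "$old" ] && [ -n "$new" ] || return 1
    printf '%s..%s\n' "$old" "$new"
}

# Usage (inside an FRR checkout; the path filter is an assumption):
#   git log --oneline "$(describe_range 9.0.1-51-g8a6b43262 9.0.1-61-gd5d6be1d8)" -- bgpd/bgp_rpki.c
```

An empty result from the path-limited log would be consistent with the observation that none of the commits in the range touch the code in question.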

3 participants