Workaround for stupid systemd-networkd behaviour #2479

zviratko · 2024-10-11T06:42:32Z

I know issues connected to this have been discussed before, but could keepalived maybe handle systemd-networkd deleting things on reload better?

In particular

reinstating VIPs that get deleted (it does notice, so why not reinstall them right away?)
routes (not sure if it notices)
routing rules (this one it doesn't notice and debugging it was not fun)
whatever else it installs (at least in the network stack)

Unfortunately, systemd-networkd is not only the engine behind most other stuff (netplan, networkmanager), but also the most featureful network manager if one needs stuff like vlan aware bridges, routing rules, special network settings (and it is somewhat declarative in its behaviour which is nice).

It would be better for systemd-networkd to allow fixing this (it already somewhat does for VIPs but rules just disappear on me), but that's not feasible (I would file a bug in their GitHub but Lennart banned me for making a good argument years ago and it would get ignored anyway because they know better).

Feel free to include a derogatory log message aimed at systemd when keepalived fixes stuff in this instance :-)

Thanks.

pqarmitage · 2024-10-23T14:32:14Z

I have run some tests, and keepalived detects the deletion of IP addresses, routes and routeing rules alike. If such an event occurs (and it shouldn't, because the addresses, routes and rules are keepalived's and not anyone else's), then keepalived will revert to backup state, and almost certainly then become master again (the exception would be if there is a higher priority VRRP instance that was held back from becoming master due to nopreempt being configured). The code was written like this since it was much simpler to handle the reinstatement of the addresses/routes/rules by using the existing code for the backup to master transition than to add explicit code to handle each individual deletion and reinstatement.

There really is no excuse for any other process, whether it be systemd-networkd or not, to delete addresses, routes or rules that do not belong to it (in other words, that it did not create).

A while ago we requested, and were allocated, a routeing protocol identifier for keepalived (value 18), and all routes and rules installed by keepalived are specified with that protocol id (see the description of protocol in the ip-route(8) man page). Unfortunately there is no equivalent for IP addresses.
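As a rough way to check this on a running node, the protocol tag can be used to pick keepalived's routes and rules out of the kernel tables. The sketch below assumes iproute2 is installed; the function names are made up for illustration and are not part of keepalived.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of keepalived.

# Routes in any table that were installed with protocol id 18 (keepalived).
# Without -6 this lists IPv4 routes only.
list_keepalived_routes() {
    ip route show table all proto 18
}

# Policy rules tagged with the same protocol. `ip rule show` prints the tag
# as "proto keepalived" or "proto 18" depending on whether the installed
# iproute2 knows the name, so match either spelling.
list_keepalived_rules() {
    ip rule show | grep -E 'proto (keepalived|18)( |$)'
}
```

Running `list_keepalived_rules` before and after a `networkctl reload` is one way to see whether a rule has gone missing without keepalived changing state.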

@zviratko Can you please provide some specific examples of the problems you are experiencing. In other words, provide your keepalived configuration files, along with what actions are happening/commands being executed that cause the problem, what impact it has on keepalived, and ideally the keepalived log entries at the time.

zviratko · 2024-10-23T15:44:37Z

I see (and understand). Some docs talk about "reinstating" addresses and routes, but I wasn't able to confirm whether it was really implemented that way.

Interesting note about nopreempt - I have it set, so that my firewalls don't flip/flop (it should stick to last healthy node). Not sure what the correct setup for that is then? Keepalived for sure either doesn't notice a rule missing or didn't transition to BACKUP due to my misconfiguration.

I'm not sure I can provide anything truly reproducible, except trying to delete something by hand (which I'm willing to do one day during maintenance, this is in production). Sometimes when a VM goes up/down and its interface is deleted, or when I do "networkctl reload", or maybe on a full moon, systemd-networkd just decides to delete something. keepalived usually transitioned to BACKUP; this was probably the first time it didn't, yet a crucial ip rule was missing.

Over time, I added:
KeepConfiguration=yes
To all my .network files
This kept it from deleting VIPs from the interfaces

Now I also added
ManageForeignRoutes=no
and ManageForeignRoutingPolicyRules=no

to systemd-networkd config, which should prevent it from deleting routes and rules.
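For anyone wanting to reproduce these settings, here is a minimal sketch that writes them as drop-in files. It assumes the option names as documented in networkd.conf(5) (`ManageForeignRoutes=`, `ManageForeignRoutingPolicyRules=`) and systemd.network(5) (`KeepConfiguration=`); the function names, the directory arguments and the file name `10-keepalived.conf` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; names and file names are illustrative only.

# Write a networkd.conf drop-in telling systemd-networkd to leave alone the
# routes and policy rules it did not create. $1 is the drop-in directory.
write_networkd_dropin() {
    dir=${1:-/etc/systemd/networkd.conf.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/10-keepalived.conf" <<'EOF'
[Network]
ManageForeignRoutes=no
ManageForeignRoutingPolicyRules=no
EOF
}

# Write a per-link drop-in so that configuration already present on the
# interface (such as VIPs) is kept. $1 is the <name>.network.d directory
# that sits next to the matching .network file.
write_keepconfig_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/10-keepalived.conf" <<'EOF'
[Network]
KeepConfiguration=yes
EOF
}
```

For example, `write_keepconfig_dropin /etc/systemd/network/peering.network.d` would cover a link configured by a file named `peering.network` (that file name is an assumption; use whatever your .network files are called).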

> Unfortunately there is no equivalent for ip addresses.
You could make your own IP scope :-) but systemd-networkd would delete it anyway.

The weird thing is, that sometimes it (networkd) just doesn't do that and everything works. Sometimes it goes crazy. But that's not really an issue for this repository (it is too civilized for this debate).

I know the "right" thing to do is to boycott systemd or at least not use the networkd component, but it's going to be hard (and there's nothing to switch to unless I want to run Gentoo with openrc/netrc).

configfile below (public IPs redacted)
Thanks for any insight!

! Configuration File for keepalived

global_defs {
  notification_email {
    [email protected]
  }
  notification_email_from [email protected]
  smtp_server 127.0.0.1
  smtp_connect_timeout 30

  vrrp_startup_delay 30
  vrrp_lower_prio_no_advert true
  vrrp_garp_master_delay 1
  vrrp_garp_interval 0.005
  vrrp_gna_interval 0.0005
  script_user root root
  enable_script_security
  max_auto_priority 99
  dynamic_interfaces allow_if_changes
}

interface_up_down_delays {
    peering 2
    prodint 2
    devint 2
    prodsvc 2
    devsvc 2
    devpublic 2
    devpxe 2
    prodpxe 2
    heartbeat 2
    vlan992 2
    vlan991 2
    drbd 2
    bridge 2
    lacp0 2
}

vrrp_script shgw {
  script "/usr/bin/fping -u 1.1.1.1"
  interval 5
  timeout 5
  rise 3
  fall 3
}


vrrp_instance PROD {
    state BACKUP
    nopreempt
    dont_track_primary
    promote_secondaries
    interface peering
    virtual_router_id 1
    priority "120"
    advert_int 1
    smtp_alert
    virtual_ipaddress {
    1.2.3.4/26 dev public96
    10.255.255.1/28 dev peering
    10.64.0.1/22 dev prodint
    10.64.4.1/22 dev prodsvc
    10.64.15.1/24 dev prodpxe
    192.168.100.72/24 dev vlan992
    10.1.0.2/24 dev vlan991
    10.64.20.1/22 dev devsvc
    10.64.24.1/22 dev devpublic
    10.64.31.1/24 dev devpxe
    10.64.16.1/22 dev devint
    }
    virtual_routes {
    10.3.0.0/16 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
    10.9.0.0/16 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
    10.32.32.0/20 via 10.255.255.14 dev peering src 10.64.0.6 metric 1000
    10.64.4.0/22 dev prodsvc src 10.64.0.6
    10.64.20.0/22 dev devsvc src 10.64.0.6

    }
    virtual_rules {
       to 0.0.0.0/0 priority 100 lookup main
    }
    track_script {
      shgw
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
    notify_stop "/etc/keepalived/backup.sh"
}

pqarmitage · 2024-10-26T08:39:02Z

Just a couple of comments on your configuration.

Using dont_track_primary is unusual. Do you really want that?
You have configured interface delays on interfaces you are not using, e.g. lacp0.

I said in my previous post that it was not possible to set a "protocol" for ip addresses in the way that can be done for routes and rules. I have since discovered that kernel commit 47f0bd503210 added exactly that feature, which first appeared in Linux v5.18 and was first supported by v6.4.0 of the iproute utility. I will add support for this in keepalived.

zviratko · 2024-10-26T09:48:15Z

Thank you for taking a look

> Using dont_track_primary is unusual. Do you really want that?
>
> You have configured interface delays on interfaces you are not using, e.g. lacp0.

I did both of these in an attempt to make keepalived as "lenient" as possible. I am surprised this is the only obvious extra stuff that's in there :-)
A networkctl reload or a systemd-networkd restart sometimes cycles the interfaces (generally unpredictable behaviour). In the log I saw that keepalived noticed this just before a failover, so in the end I just added everything in case it was ever needed (or in case keepalived cared for some reason). It's also easier to just put all the interfaces in there with ansible...

Btw, with this config keepalived sometimes just doesn't execute the backup script on failover. Sadly it was not reproducible, and it occurred only after a firewall had been running in MASTER state for some time (like a week). I didn't open an issue because I know the right thing to do is to use the FIFO, but in case you see anything in there that might be causing that... It could be useful to at least log that keepalived is trying to execute it (or isn't, for some reason), as all I can say is that it never reached the first line in the script.

> I said in my previous post that it was not possible to set a "protocol" for ip addresses in the way that can be done for routes and rules. I have since discovered that kernel commit 47f0bd503210 added exactly that feature, which first appeared in Linux v5.18 and was first supported by v6.4.0 of the iproute utility. I will add support for this in keepalived.

Cool, but systemd-networkd still requires configuration for that (and there's no filtering for "ignore proto keepalived"). Maybe it would be better to use "proto kernel" by default when running under systemd so it gets ignored? I can't imagine anyone not wanting everything to survive systemd-networkd interference. It took me a good while to realize what is happening...
