Add retrieval of VRRP instance status information via CLI option #2001
@ChrLau Probably the best way to query the status currently is to use SNMP. For example, executing:
will produce output:
(to see all available OIDs execute or you can use the RFC based MIB:
shows all available output. The man page for proc states:

A little over 4 years ago I started experimenting with implementing FUSE (filesystem in userspace) to do the sort of thing you are requesting, though not mounted under /proc; I seem to remember there was some difficulty around passing control to the fuse library, but I think I can see a way around that now.

@ChrLau Would using SNMP work for you, at least for now? Implementing a filesystem based solution could take a little while.
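For reference, a minimal sketch of such an SNMP query, assuming keepalived was built with SNMP support and started with SNMP enabled, and that snmpd is running as an AgentX master; the OIDs and community string below are illustrative:

```sh
# Sketch only: assumes keepalived was built with SNMP support, started with
# SNMP enabled, and snmpd is running as an AgentX master
# ("master agentx" in snmpd.conf) with a read-only "public" community.

# Walk keepalived's own MIB (KEEPALIVED-MIB, rooted at enterprises.9586.100.5);
# the per-instance VRRP state objects live in its vrrpInstanceTable.
snmpwalk -v2c -c public localhost .1.3.6.1.4.1.9586.100.5

# With the RFC-based MIB support compiled in, the standard VRRPv3 MIB
# can be walked instead (requires the MIB file to be installed):
snmpwalk -v2c -c public localhost VRRPV3-MIB::vrrpv3OperationsTable
```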
Hi @pqarmitage Oh dammit, yes, you are right about /proc being for the kernel. When I opened the ticket I was under the misconception that this information is provided by IPVS, which of course is a kernel module, so /proc would have been an option. So let's stick to the alternative solution: the command line option. ;-)

Regarding SNMP: it is an option I could implement, yes. In general I think keepalived would really benefit from integrating such a CLI option, as I see this as a constantly recurring topic for people new to loadbalancing/keepalived when they try to get familiar with or understand keepalived, or want to implement monitoring solutions.
A healthcheck/liveness/readiness probe would be ideal for cloud and on-prem environments. This would aid service discovery and monitoring.
@ChrLau You state
@mister2d I think SNMP provides the functionality you are looking for.
Not exactly. I was hoping for something less legacy (OIDs) and simpler (an API endpoint). Something like an HTTP
@pqarmitage I finally had the time to test my planned setup and noticed that the notify_stop parameter isn't working with the keepalived versions from Debian Stretch (version 1.3.2) and Debian Buster (version 2.0.10). Nothing is logged upon stopping keepalived, neither via the notify parameter nor via notify_stop. I left a comment with a usable workaround in #185, as I quickly found that issue (among some StackOverflow threads ;-) ), and maybe this information helps someone else.
I do support the need for a modern API to get the full runtime status of keepalived (via HTTP/REST, CLI etc.). Notify scripts provide information about state changes only (no other stats), and if a notification is lost, it's lost forever. SNMP requires spawning snmpd just for this purpose (which is overkill and may not be allowed on some systems due to security requirements). And signalling with SIGJSON (or USR1/2) is restricted to root (plus the file keepalived writes is readable by root only) and requires a monitoring script to wait an arbitrary number of seconds, hoping that the state file has been written fully. It would be much more convenient to just curl keepalived over TCP or a unix socket to get all the state and stats (even the form of the current json or data/stats dumps would suffice), or to call a CLI command that gets the current status and prints it to stdout.
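For comparison, a minimal sketch of the signal-based workaround described above, assuming keepalived was built with JSON support and uses the default pid and output file locations (root privileges required):

```sh
# Sketch of the signal-based workaround (root only); paths assume the
# default pid file and JSON output location.
kill -s "$(keepalived --signum=JSON)" "$(cat /run/keepalived.pid)"  # SIGJSON is a realtime signal, so look its number up
sleep 1                           # the arbitrary wait the comment complains about
cat /tmp/keepalived.json          # JSON dump written by the VRRP process
```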
Does a request
look to be the right sort of thing? I have also experimented with
and
and
and
These are all examples, and can easily be added to, both in terms of what fields are contained in the output and additional URLs. The port to connect to will be configurable, as will the server name. It will only use https, and valid certificates for the server name will be required (which can be obtained from letsencrypt.org, for example). All requests will need to be authenticated using basic authentication to ensure that data cannot be inappropriately leaked.

It currently only implements HTTP/1.1, but I may add HTTP/2 support. So far I have only implemented GET requests, but I plan to also implement POST, PUT and DELETE as appropriate. I also haven't yet implemented versioning. Please note this will not be released until authentication is implemented.

I would welcome any and all feedback, since it is probably easier to make any modifications before the functionality is released than to make subsequent changes, although as I indicated above, adding additional fields and URLs should be quite straightforward.
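Purely as an illustration of the shape such a request might take (the host, port, path and credentials below are placeholders, not part of any released interface):

```sh
# Hypothetical request against the prototype described above: https only,
# basic authentication; host, port, path and credentials are placeholders.
curl --user admin:secret https://lb1.example.com:8080/vrrp
```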
A healthy status would be useful, with your example:

```json
[
  {
    "healthy": true,
    "reason": ""
  },
  [
    {
      "instance": "VI_1",
      "state": "Master",
      "interface": "vrrp.253@eth0",
      "priority": 200,
      "number of config faults": 0,
      "last state transition": "2021-12-08T11:46:08.803281Z",
      "source ip address": "10.1.0.3"
    },
    {
      "instance": "VI_2",
      "state": "Master",
      "interface": "vrrp.252@eth0",
      "priority": 200,
      "number of config faults": 0,
      "last state transition": "2021-12-08T11:46:08.802212Z",
      "source ip address": "10.1.0.3"
    },
    {
      "instance": "VI_6",
      "state": "Master",
      "interface": "vrrp6.253@eth0",
      "priority": 200,
      "number of config faults": 0,
      "last state transition": "2021-12-08T11:46:08.798882Z",
      "source ip address": "fe80::4005:5dff:fe72:f1a1"
    }
  ]
]
```

But it adds logic that might not fit all deployments...
@elfranne Many thanks for your response. Could you please explain what circumstances you would expect to lead to `healthy` being `false`?
@pqarmitage If the backup instance is offline, for example. And for the real_servers: if a backend no longer responds, healthy would be set to false (this could be based on a percentage or a maximum number of offline backends).
@pqarmitage Wow, I'm impressed by what has come out of my ticket. And yes, an API would be absolutely stunning to have, as it makes it much easier to integrate keepalived (and consequently ipvsadm too). Regarding the "healthy: true"/false topic: I remember Apache Solr using a more layered approach, see https://solr.apache.org/guide/8_11/cluster-node-management.html. You could use the same, which would allow for a more flexible configuration of health status levels. And I'm on your side when you say "But it adds logic that might not fit all deployments...", as the VRRP part can be used very differently. So "healthy: green" could mean: all instances are up, all configured nodes are up, and there is 1 master for each instance.
Hi,
currently the status of a keepalived VRRP instance cannot be extracted in an easy, general way.
The status changes are logged in the logfile (for example /var/log/messages), but as these get rotated, the state cannot reliably be extracted from there either.
Another solution is to create local files containing the state using the keepalived notify parameter in the vrrp_sync_group.
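A minimal sketch of such a notify script, assuming it is referenced by a notify line in the vrrp_sync_group; the script path and state file name are illustrative:

```sh
#!/bin/sh
# Sketch of the notify-based workaround; path and file name are illustrative.
# keepalived calls the script configured with
#   notify /usr/local/bin/vrrp-state.sh
# in the vrrp_sync_group (or vrrp_instance) as:
#   <script> "GROUP"|"INSTANCE" <name> <state>
echo "$3" > "/var/run/vrrp_${2}_status"   # state is MASTER, BACKUP or FAULT
```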
It would be handy to, for example, have /proc/net/vrrp_INSTANCENAME_status which outputs only the state (like: MASTER, BACKUP, FAULT).
Where does it help?
I have my keepalived loadbalancers submitting their state into a key-value store (etcd). This etcd is queried by our Rundeck to get a list of all loadbalancers in the selected state (MASTER or BACKUP), so that actions are only performed on loadbalancers which are in the desired state. (We only have 1 VRRP instance per loadbalancer.)
For example: Only do package updates and a reboot on loadbalancers in the state of BACKUP.
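For illustration, a notify script fragment could push the state straight into etcd; the endpoint and key layout below are placeholders, and the etcd v3 API is assumed:

```sh
# Illustrative fragment of a notify script publishing the VRRP state to etcd
# (etcd v3 API assumed; endpoint and key layout are placeholders).
etcdctl --endpoints=http://etcd.example.com:2379 \
    put "/loadbalancers/$(hostname -s)/vrrp_state" "$3"
```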
Here it would be handy to have an entry which is maintained by keepalived itself, as the daemon is the authoritative source for this kind of information. Setups via the notify parameters can fail due to human error.
Alternatives:
Of course some kind of CLI option like keepalived --status $VRRP_INSTANCENAME which outputs the state of the VRRP instance would also be useful.