oom-notifier is a small daemon that notifies you about processes killed by the Linux oom-killer. It can report the full command line of the killed process, which is not printed in the kernel ring buffer.
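For context on what "full command line" means: on Linux the arguments of a live process are exposed in /proc/<pid>/cmdline as NUL-separated bytes. The sketch below shows one way to turn that file into a printable string. The function names are hypothetical and this is not necessarily how oom-notifier does it; in particular, the file disappears once the process is gone, so any tool has to read it while the process is still alive.

```rust
/// Turn the raw bytes of a `/proc/<pid>/cmdline` file into a printable
/// command line. The kernel separates arguments with NUL bytes and usually
/// ends the buffer with one. Empty arguments are dropped for simplicity.
fn format_cmdline(raw: &[u8]) -> String {
    raw.split(|&b| b == 0)
        .filter(|arg| !arg.is_empty())
        .map(|arg| String::from_utf8_lossy(arg).into_owned())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Read and format the command line of a process that is still running.
/// (Hypothetical helper, shown only to illustrate the /proc layout.)
fn read_cmdline(pid: u32) -> std::io::Result<String> {
    let raw = std::fs::read(format!("/proc/{}/cmdline", pid))?;
    Ok(format_cmdline(&raw))
}
```

Joining with spaces is lossy when an argument itself contains spaces; a real implementation may prefer to keep the arguments as a list.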
You need a working installation of the Rust compiler. You can then build the service with the following command:
cargo build --release
If the build completes successfully, you will find the compiled service at the following path, relative to the current directory: target/release/oom-notifier
You can also build a Docker image of the service with the following command:
docker build -t oom-notifier .
and then run it:
docker run -v /proc:/proc --privileged oom-notifier /oom-notifier
The daemon needs to run with enough privileges to read /dev/kmsg (the kernel log), which is how it learns about OOM kills happening on the system. Events can be sent to different backends; at the moment Syslog, Elasticsearch, Kafka and Slack are supported.
Send events to an Elasticsearch cluster:
./oom-notifier --elasticsearch-server https://my-elasticsearch-cluster:9200 --elasticsearch-index my-index
Send events to a remote syslog server (over TCP):
./oom-notifier --syslog-server my-syslog-server:9999 --syslog-proto tcp
Send events to a Kafka cluster:
./oom-notifier --kafka-topic oom-events --kafka-brokers broker1:9092,broker2:9092,broker3:9092
Send events to a Slack channel (see Slack's documentation on incoming webhooks to learn more):
./oom-notifier --slack-webhook https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX --slack-channel '#oom-notifications'
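Whatever backend is used, the events originate from /dev/kmsg. As a rough illustration of that detection step, the sketch below parses a single /dev/kmsg record and extracts the PID and process name when the record reports a kill. The `OomKill` type and `parse_kmsg_record` function are hypothetical names, not taken from the oom-notifier source, and the "Killed process" wording is an assumption based on what recent kernels print; it is not a stable interface.

```rust
/// A kill reported by the kernel oom-killer (hypothetical type).
#[derive(Debug, PartialEq)]
struct OomKill {
    pid: u32,
    comm: String,
}

/// Parse one /dev/kmsg record and return `Some` if it reports an oom kill.
/// Record layout: "<prio>,<seq>,<timestamp>,<flags>;<message>".
fn parse_kmsg_record(record: &str) -> Option<OomKill> {
    // Everything after the first ';' is the human-readable message.
    let (_header, message) = record.split_once(';')?;

    // Assumed wording: "... Killed process 4242 (stress) total-vm:..."
    const MARKER: &str = "Killed process ";
    let start = message.find(MARKER)? + MARKER.len();
    let rest = &message[start..];

    // The PID runs up to the first space.
    let (pid, after_pid) = rest.split_once(' ')?;
    let pid = pid.parse::<u32>().ok()?;

    // The process name is wrapped in parentheses; use the last ')' so that
    // names which themselves contain ')' are kept whole.
    let comm = after_pid.strip_prefix('(')?;
    let end = comm.rfind(')')?;
    Some(OomKill { pid, comm: comm[..end].to_string() })
}
```

Note that the kernel message only carries the short process name, which is why a tool that wants the full command line has to obtain it separately.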
You can adjust the logging level of the daemon by setting the environment variable LOGGING_LEVEL (the default level is info).
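The lookup with a fallback can be pictured as follows. This is only a sketch of the documented behaviour (use LOGGING_LEVEL when set, otherwise info); the function names are hypothetical, and treating an empty or blank value as unset is an assumption.

```rust
/// Pick the logging level from an optional raw value, falling back to
/// "info". Blank values are treated as unset (an assumption of this sketch).
fn logging_level_from(value: Option<&str>) -> String {
    match value.map(str::trim) {
        Some(v) if !v.is_empty() => v.to_string(),
        _ => "info".to_string(),
    }
}

/// Resolve the level from the LOGGING_LEVEL environment variable.
fn logging_level() -> String {
    logging_level_from(std::env::var("LOGGING_LEVEL").ok().as_deref())
}
```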
The service can run as a DaemonSet on a Kubernetes cluster. It must run as a privileged DaemonSet with the hostPID option enabled (see the Kubernetes documentation on privileged containers and on hostPID). A YAML template, ready to be deployed after you adapt it to your environment, is available at k8s/daemonset.yaml.
The tool can only notify about kills performed by the Linux kernel oom-killer. If you use a userspace mechanism, the tool will not be able to detect its kills. Examples of userspace services that act as an oom-killer include earlyoom and systemd-oomd.
If you want to prevent the daemon itself from being killed by the oom-killer, you can adjust its oom_adj parameter as described in the proc(5) manual page. On current kernels oom_adj is deprecated in favor of oom_score_adj.
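As a minimal sketch of that adjustment, the code below writes to /proc/self/oom_score_adj, the modern replacement for oom_adj, rather than to oom_adj itself. The value range is -1000 to 1000, and -1000 exempts the process from the oom-killer. The function names are hypothetical, and lowering the value normally requires CAP_SYS_RESOURCE.

```rust
/// Write an OOM score adjustment to the given file. The path is a parameter
/// so the function can be exercised against a regular file as well.
fn write_oom_score_adj(path: &std::path::Path, value: i32) -> std::io::Result<()> {
    // The kernel accepts values from -1000 (never kill) to 1000.
    if !(-1000..=1000).contains(&value) {
        return Err(std::io::Error::new(
            std::io::ErrorKind::InvalidInput,
            "oom_score_adj must be between -1000 and 1000",
        ));
    }
    std::fs::write(path, value.to_string())
}

/// Exempt the current process from the oom-killer.
/// Fails without sufficient privileges.
fn protect_self_from_oom_killer() -> std::io::Result<()> {
    write_oom_score_adj(std::path::Path::new("/proc/self/oom_score_adj"), -1000)
}
```

The same effect can be had without code by writing the value to /proc/<pid>/oom_score_adj from a privileged shell once the daemon is running.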
Angelo Poerio [email protected]