description |
---|
This document describes the properties a NAIS application should have. |
{% hint style="info" title="TL;DR" %}
The application should make sure it listens for the SIGTERM
signal, and prepares for shutdown (closing connections etc.) when it is received.
{% endhint %}
When running on NAIS (or Kubernetes in general), your application must be able to handle being shut down at any given time. This is because the platform might have to reboot the node your application is running on (e.g. because of an OS patch requiring a restart), in which case it will reschedule your application on a different node.
To best be able to handle this in your application, it helps to be aware of the relevant parts of the termination lifecycle.
1. Application (pod) gets status `TERMINATING`, and grace period starts (default 30s)
2. (simultaneous with 1) If the pod has a `preStop` hook defined, this is invoked
3. (simultaneous with 1) The pod is removed from the list of endpoints, i.e. taken out of load balancing
4. (simultaneous with 1, but after `preStop` if defined) Container receives `SIGTERM`, and should prepare for shutdown
5. Grace period ends, and container receives `SIGKILL`
6. Pod disappears from the API, and is no longer visible for the client.
The platform will automatically add a `preStop`-hook that pauses the termination long enough that e.g. the ingress controller has time to update its list of endpoints (thus avoiding sending traffic to an application while it is terminating).
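
As a rough illustration of the SIGTERM handling described above, the following Python sketch records the shutdown request in a signal handler and lets the main loop finish its current unit of work before cleaning up. The names `install_handler`, `serve`, `do_work` and `cleanup` are hypothetical and not part of NAIS; a real application would close its own connections in `cleanup`, and must finish within the grace period.

```python
import signal
import threading

# Set once the platform has asked the process to stop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the main loop performs the actual cleanup."""
    shutdown_requested.set()


def install_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(do_work, cleanup, poll_interval=0.1):
    """Call do_work() repeatedly until shutdown is requested, then cleanup().

    do_work and cleanup are hypothetical callables supplied by the application.
    """
    while not shutdown_requested.is_set():
        do_work()
        # Sleep between iterations, but wake up immediately on shutdown.
        shutdown_requested.wait(poll_interval)
    cleanup()
```

An application would call `install_handler()` once at startup and then `serve(...)` with its own work and cleanup functions. Keeping the handler itself minimal avoids running cleanup code in the middle of whatever the main loop was doing when the signal arrived.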
The application should be instrumented using Prometheus, exposing the relevant application metrics. See the metrics documentation for more info.
The application should emit `json`-formatted logs by writing directly to standard output. This will make it easier to index, view and search the logs later. See more details in the logs documentation.
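
One possible way to get single-line JSON logs on standard output, using only the Python standard library, is sketched below. The field names (`timestamp`, `level`, `logger`, `message`, `exception`) are assumptions made for the example; the logs documentation may prescribe different ones.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object on a single line."""

    def format(self, record):
        entry = {
            # Field names are illustrative assumptions, not a NAIS requirement.
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(level=logging.INFO, stream=None):
    """Send all log records to standard output (or `stream`) as JSON."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

After calling `configure_logging()` once at startup, ordinary calls such as `logging.getLogger("app").info("started")` produce one JSON object per line.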
The `readiness`-probe is used by Kubernetes to determine if the application should receive traffic, while the `liveness`-probe lets Kubernetes know if your application is alive. If it's dead, Kubernetes will kill the container and start a new one.
Useful resources on the topic:
- https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
- https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-setting-up-health-checks-with-readiness-and-liveness-probes
- https://medium.com/metrosystemsro/kubernetes-readiness-liveliness-probes-best-practices-86c3cd9f0b4a
{% hint style="info" title="TL;DR" %}
- `readiness` and `liveness` should be implemented as separate services, and they usually have different characteristics
- `liveness`-probe should simply return `HTTP 200 OK` if the main loop is running, and `HTTP 5xx` if not
- `readiness`-probe returns `HTTP 200 OK` if the application is able to process requests, and `HTTP 5xx` if not. If the application depends on e.g. a database to serve traffic, it's a good idea to check if the database is available in the `readiness`-probe
{% endhint %}
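
The separation between the two probes can be sketched in Python with the standard library HTTP server. Everything named here is an assumption made for illustration: the paths `/isalive` and `/isready`, the `AppState` class and the `dependency_checks` list are hypothetical, and the actual paths are whatever the application's own configuration declares.

```python
from http.server import BaseHTTPRequestHandler


class AppState:
    """Hypothetical holder for the facts the probes report on."""

    def __init__(self):
        self.main_loop_running = True
        # Callables returning True when a dependency (e.g. a database) is reachable.
        self.dependency_checks = []


def liveness_status(state):
    """200 while the main loop is running, 500 otherwise."""
    return 200 if state.main_loop_running else 500


def readiness_status(state):
    """200 only when every dependency check passes, 503 otherwise."""
    for check in state.dependency_checks:
        try:
            if not check():
                return 503
        except Exception:
            # A check that raises is treated as an unavailable dependency.
            return 503
    return 200


def make_handler(state):
    """Build a request handler class bound to the given application state."""

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/isalive":  # assumed liveness path
                status = liveness_status(state)
            elif self.path == "/isready":  # assumed readiness path
                status = readiness_status(state)
            else:
                status = 404
            self.send_response(status)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the application logs

    return ProbeHandler
```

The handler class returned by `make_handler(state)` can be passed to `http.server.HTTPServer` together with an address of your choice. Note that the liveness check deliberately ignores dependencies: a database outage should take the pod out of load balancing, not trigger a restart.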