-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CNI: use check command when restoring from restart #24658
base: main
Are you sure you want to change the base?
Conversation
98343be
to
66dcfcb
Compare
66dcfcb
to
9daf267
Compare
Your reproduction steps say:
is the podman driver actually affected by it? I thought that one would not use a pause container and always let nomad configure the network namespaces on it's own. |
You know what, I assumed that it used pause containers too (I hardly ever have run it myself) but that's not the case: https://github.com/hashicorp/nomad-driver-podman/blob/main/driver.go#L87. So yeah, just Docker here. Will edit the repro steps. |
Luckily no, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this using an AWS environment with 1 server and 1 client instance. The client instance has CNI v1.3.0 plugins installed.
JobSpec:
job "example" {
group "nginx" {
network {
mode = "bridge"
port "http" {
to = 80
}
}
task "nginx" {
driver = "docker"
config {
image = "nginx:latest"
ports = ["http"]
}
resources {
cpu = 500
memory = 256
}
}
}
}
Running through the steps using a build from main f4529485563924462dbdccdd1b4cacbd9d68616e
when I rebooted the instance the allocation was restarted, but the networking was broken and I was unable to perform a simple curl to NGINX.
I then tested this patch 9daf267e33bd502d4a386dc1dd6d4a728894c194
I performed the same steps as detailed in the PR and performed in the previous test. When the instance was rebooted, the allocation failed and a new allocation scheduled in its place.
Failed allocation events:
2024-12-13T08:59:18Z Terminated Exit Code: 0
2024-12-13T08:59:18Z Setup Failure failed to setup alloc: pre-run hook "network" failed: failed to configure networking for alloc: Interface name nomad not found
2024-12-13T08:57:08Z Started Task started by client
2024-12-13T08:57:08Z Driver Downloading image
2024-12-13T08:57:07Z Task Setup Building Task Directory
2024-12-13T08:57:07Z Received Task received by client
Client logs around time of failure marking:
2024-12-13T08:59:18.385Z [ERROR] allocrunner/alloc_runner.go:403: client.alloc_runner: prerun failed: alloc_id=8f3b3d5a-906a-7f53-8d72-e9c782bb4dee error="pre-run hook \"network\" failed: failed to configure networking for alloc: Interface name nomad not found"
2024-12-13T08:59:18.385Z [INFO] taskrunner/task_runner.go:1468: client.alloc_runner.task_runner: Task event: alloc_id=8f3b3d5a-906a-7f53-8d72-e9c782bb4dee task=nginx type="Setup Failure" msg="failed to setup alloc: pre-run hook \"network\" failed: failed to configure networking for alloc: Interface name nomad not found" failed=true
2024-12-13T08:59:18.387Z [INFO] docker/handle.go:229: client.driver_mgr.docker: stopped container: container_id=9d6cc3a56a51596885f36fec7c4d74ea1002e8727c1fc213851ac42fed53e8cb driver=docker
10760d7
to
04a444a
Compare
I've updated this PR with a retry where if the check fails the first time, we attempt to recreate the network namespace (attempting this once). This should reduce the cases where, after a reboot, the allocation cannot be restored and gets failed. And I've added tests covering that logic. |
04a444a
to
137e7d6
Compare
When the Nomad client restarts and restores allocations, the network namespace for an allocation may exist but no longer be correctly configured. For example, if the host is rebooted and the task was a Docker task using a pause container, the network namespace may be recreated by the docker daemon. When we restore an allocation, use the CNI "check" command to verify that any existing network namespace matches the expected configuration. This requires CNI plugins of at least version 1.2.0 to avoid a bug in older plugin versions that would cause the check to fail. If the check fails, destroy the network namespace and try to recreate it from scratch once. If that fails in the second pass, fail the restore so that the allocation can be recreated (rather than silently having networking fail). This should fix the gap left #24650 for Docker task drivers and any other drivers with the `MustInitiateNetwork` capability. Fixes: #24292 Ref: #24650
137e7d6
to
a26e6c4
Compare
When the Nomad client restarts and restores allocations, the network namespace for an allocation may exist but no longer be correctly configured. For example, if the host is rebooted and the task was a Docker task using a pause container, the network namespace may be recreated by the docker daemon.
When we restore an allocation, use the CNI "check" command to verify that any existing network namespace matches the expected configuration. This requires CNI plugins of at least version 1.2.0 to avoid a bug in older plugin versions that would cause the check to fail. So we check the plugin fingerprint before performing the check.
If the check fails, destroy the network namespace and try to recreate it from scratch once. If that fails in the second pass, fail the restore so that the allocation can be recreated (rather than silently having networking fail).
This should fix the gap left #24650 for Docker task drivers and any other drivers with the
MustInitiateNetwork
capability.Fixes: #24292
Ref: #24650
Testing & Reproduction steps
Run a cluster on a set of VMs, with at least one client. This can't be a server+client because we need to reboot the hosts. You should probably set the server.heartbeat_grace = "5m" to give yourself time to work.
Contributor Checklist
changelog entry using the
make cl
command.ensure regressions will be caught.
and job configuration, please update the Nomad website documentation to reflect this. Refer to
the website README for docs guidelines. Please also consider whether the
change requires notes within the upgrade guide.
Reviewer Checklist
backporting document.
in the majority of situations. The main exceptions are long-lived feature branches or merges where
history should be preserved.
within the public repository.