CNI: use check command when restoring from restart #24658

tgross · 2024-12-12T20:48:44Z

When the Nomad client restarts and restores allocations, the network namespace for an allocation may exist but no longer be correctly configured. For example, if the host is rebooted and the task was a Docker task using a pause container, the network namespace may be recreated by the docker daemon.

When we restore an allocation, use the CNI "check" command to verify that any existing network namespace matches the expected configuration. This requires CNI plugins of at least version 1.2.0 to avoid a bug in older plugin versions that would cause the check to fail. So we check the plugin fingerprint before performing the check.

If the check fails, destroy the network namespace and try to recreate it from scratch once. If that fails in the second pass, fail the restore so that the allocation can be recreated (rather than silently having networking fail).

This should fix the gap left #24650 for Docker task drivers and any other drivers with the MustInitiateNetwork capability.

Fixes: #24292
Ref: #24650

Testing & Reproduction steps

Run a cluster on a set of VMs, with at least one client. This can't be a server+client because we need to reboot the hosts. You should probably set the server.heartbeat_grace = "5m" to give yourself time to work.

Run a Docker task with network.mode = "bridge". Wait for it to be healthy.
Restart the agent.
Make sure the alloc is untouched and networking works.
Reboot the client host.
Make sure the alloc is restored, the tasks are restarted, and networking works.

Contributor Checklist

Changelog Entry If this PR changes user-facing behavior, please generate and add a
changelog entry using the make cl command.
Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
ensure regressions will be caught.
Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
and job configuration, please update the Nomad website documentation to reflect this. Refer to
the website README for docs guidelines. Please also consider whether the
change requires notes within the upgrade guide.

Reviewer Checklist

Backport Labels Please add the correct backport labels as described by the internal
backporting document.
Commit Type Ensure the correct merge method is selected which should be "squash and merge"
in the majority of situations. The main exceptions are long-lived feature branches or merges where
history should be preserved.
Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
within the public repository.

client/allocrunner/networking_cni.go

client/allocrunner/network_hook.go

apollo13 · 2024-12-12T21:22:31Z

Your reproduction steps say:

Run a Docker/Podman task with network.mode = "bridge". Wait for it to be healthy.

is the podman driver actually affected by it? I thought that one would not use a pause container and always let nomad configure the network namespaces on it's own.

tgross · 2024-12-12T21:31:15Z

is the podman driver actually affected by it? I thought that one would not use a pause container and always let nomad configure the network namespaces on it's own.

You know what, I assumed that it used pause containers too (I hardly ever have run it myself) but that's not the case: https://github.com/hashicorp/nomad-driver-podman/blob/main/driver.go#L87. So yeah, just Docker here. Will edit the repro steps.

apollo13 · 2024-12-12T21:39:00Z

You know what, I assumed that it used pause containers too

Luckily no, podman run supports --network=ns:/path/for/ns/to/join in addition to the half-baked --network=container:id that docker has (and obviously via the API as well).

jrasell

I tested this using an AWS environment with 1 server and 1 client instance. The client instance has CNI v1.3.0 plugins installed.

JobSpec:

job "example" {

  group "nginx" {
    network {
      mode = "bridge"
      port "http" {
        to = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

Running through the steps using a build from main f4529485563924462dbdccdd1b4cacbd9d68616e when I rebooted the instance the allocation was restarted, but the networking was broken and I was unable to perform a simple curl to NGINX.

I then tested this patch 9daf267e33bd502d4a386dc1dd6d4a728894c194 I performed the same steps as detailed in the PR and performed in the previous test. When the instance was rebooted, the allocation failed and a new allocation scheduled in its place.

Failed allocation events:

2024-12-13T08:59:18Z  Terminated     Exit Code: 0
2024-12-13T08:59:18Z  Setup Failure  failed to setup alloc: pre-run hook "network" failed: failed to configure networking for alloc: Interface name nomad not found
2024-12-13T08:57:08Z  Started        Task started by client
2024-12-13T08:57:08Z  Driver         Downloading image
2024-12-13T08:57:07Z  Task Setup     Building Task Directory
2024-12-13T08:57:07Z  Received       Task received by client

Client logs around time of failure marking:

2024-12-13T08:59:18.385Z [ERROR] allocrunner/alloc_runner.go:403: client.alloc_runner: prerun failed: alloc_id=8f3b3d5a-906a-7f53-8d72-e9c782bb4dee error="pre-run hook \"network\" failed: failed to configure networking for alloc: Interface name nomad not found"
2024-12-13T08:59:18.385Z [INFO]  taskrunner/task_runner.go:1468: client.alloc_runner.task_runner: Task event: alloc_id=8f3b3d5a-906a-7f53-8d72-e9c782bb4dee task=nginx type="Setup Failure" msg="failed to setup alloc: pre-run hook \"network\" failed: failed to configure networking for alloc: Interface name nomad not found" failed=true
2024-12-13T08:59:18.387Z [INFO]  docker/handle.go:229: client.driver_mgr.docker: stopped container: container_id=9d6cc3a56a51596885f36fec7c4d74ea1002e8727c1fc213851ac42fed53e8cb driver=docker

tgross · 2024-12-13T18:42:03Z

I've updated this PR with a retry where if the check fails the first time, we attempt to recreate the network namespace (attempting this once). This should reduce the cases where, after a reboot, the allocation cannot be restored and gets failed. And I've added tests covering that logic.

When the Nomad client restarts and restores allocations, the network namespace for an allocation may exist but no longer be correctly configured. For example, if the host is rebooted and the task was a Docker task using a pause container, the network namespace may be recreated by the docker daemon. When we restore an allocation, use the CNI "check" command to verify that any existing network namespace matches the expected configuration. This requires CNI plugins of at least version 1.2.0 to avoid a bug in older plugin versions that would cause the check to fail. If the check fails, destroy the network namespace and try to recreate it from scratch once. If that fails in the second pass, fail the restore so that the allocation can be recreated (rather than silently having networking fail). This should fix the gap left #24650 for Docker task drivers and any other drivers with the `MustInitiateNetwork` capability. Fixes: #24292 Ref: #24650

tgross commented Dec 12, 2024

View reviewed changes

client/allocrunner/networking_cni.go Outdated Show resolved Hide resolved

tgross changed the title ~~cni: use check command when restoring from restart~~ CNI: use check command when restoring from restart Dec 12, 2024

tgross added theme/networking type/bug backport/ent/1.7.x+ent Changes are backported to 1.7.x+ent backport/ent/1.8.x+ent Changes are backported to 1.8.x+ent backport/1.9.x backport to 1.9.x release line labels Dec 12, 2024

tgross added this to the 1.9.x milestone Dec 12, 2024

tgross force-pushed the b-24292-netns-restore branch from 98343be to 66dcfcb Compare December 12, 2024 20:52

tgross requested review from jrasell and shoenig December 12, 2024 20:52

vercel bot deployed to Preview – nomad-ui December 12, 2024 20:53 View deployment

tgross mentioned this pull request Dec 12, 2024

CNI: Docker container lose network configuration after host reboot #24292

Open

tgross force-pushed the b-24292-netns-restore branch from 66dcfcb to 9daf267 Compare December 12, 2024 21:12

vercel bot deployed to Preview – nomad-ui December 12, 2024 21:14 View deployment

tgross commented Dec 12, 2024

View reviewed changes

client/allocrunner/network_hook.go Outdated Show resolved Hide resolved

jrasell reviewed Dec 13, 2024

View reviewed changes

vercel bot deployed to Preview – nomad-ui December 13, 2024 17:32 View deployment

tgross force-pushed the b-24292-netns-restore branch 2 times, most recently from 10760d7 to 04a444a Compare December 13, 2024 18:40

vercel bot deployed to Preview – nomad-ui December 13, 2024 18:41 View deployment

tgross marked this pull request as ready for review December 13, 2024 18:42

tgross requested review from a team as code owners December 13, 2024 18:42

tgross force-pushed the b-24292-netns-restore branch from 04a444a to 137e7d6 Compare December 13, 2024 18:53

vercel bot deployed to Preview – nomad-ui December 13, 2024 18:55 View deployment

tgross mentioned this pull request Dec 13, 2024

Weird routing behavior on mode="bridge" when trying to reach alloc on same host. #24648

Closed

tgross requested review from jrasell, schmichael and Juanadelacuesta December 16, 2024 14:59

tgross force-pushed the b-24292-netns-restore branch from 137e7d6 to a26e6c4 Compare December 18, 2024 15:24

tgross requested review from a team as code owners December 18, 2024 15:24

tgross requested a review from pbortnick December 18, 2024 15:24

vercel bot deployed to Preview – nomad-ui December 18, 2024 15:26 View deployment

tgross requested a review from aimeeu December 18, 2024 15:27

vercel bot deployed to Preview – nomad December 18, 2024 15:31 View deployment

aimeeu approved these changes Dec 18, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CNI: use check command when restoring from restart #24658

CNI: use check command when restoring from restart #24658

tgross commented Dec 12, 2024 •

edited

Loading

apollo13 commented Dec 12, 2024

tgross commented Dec 12, 2024

apollo13 commented Dec 12, 2024 •

edited

Loading

jrasell left a comment

tgross commented Dec 13, 2024

CNI: use check command when restoring from restart #24658

Are you sure you want to change the base?

CNI: use check command when restoring from restart #24658

Conversation

tgross commented Dec 12, 2024 • edited Loading

Testing & Reproduction steps

Contributor Checklist

Reviewer Checklist

apollo13 commented Dec 12, 2024

tgross commented Dec 12, 2024

apollo13 commented Dec 12, 2024 • edited Loading

jrasell left a comment

Choose a reason for hiding this comment

tgross commented Dec 13, 2024

tgross commented Dec 12, 2024 •

edited

Loading

apollo13 commented Dec 12, 2024 •

edited

Loading