qemu job failed at "rpc error: code = Unknown desc = unable to configure cgroups: no such file or directory" #23250
Comments
I am also seeing the same behavior on Nomad v1.8.0 when attempting to use the qemu example at https://github.com/angrycub/nomad_example_jobs/blob/main/qemu/tc_ssh.nomad
Job file (if appropriate)
Nomad Client logs (if appropriate)
Hi @joechchen! It looks like the task driver is trying to configure cgroups for the process we use to manage QEMU, but that's failing for some reason related to the cgroup setup on the host. A couple of questions:
Hello @tgross - I am seeing this in my homelab as well, running Nomad on bare-metal Ubuntu 22.04 with KVM support. I see this log output in the alloc directory:
root@host2:/opt/nomad/client/client_datadir/alloc/24e03b91-5d09-d7bc-4149-71eafdc4fa89/virtual# ls
My job definition is here: https://github.com/mwright-pivotal/learn-terraform-cloud-agents/blob/main/windows2022vm-job.hcl
I am running the client as root but have not done anything to set up a custom cgroup.
Ok, thanks folks. I did what I should have done a few days ago when I first touched this issue, which is to run the example job @fjoenichols mentioned here: #23250 (comment). 🤦 I can reproduce the problem exactly now in my local development environment on Linux. I'll dig into this and report back when I know more.
Draft PR is up here: #23466
As part of the work for 1.7.0 we moved portions of the task cgroup setup down into the executor. This requires that the executor constructor get the `TaskConfig.Resources` struct, and this was missing from the `qemu` driver. We fixed a panic caused by this change in #19089 before we shipped, but this fix was effectively undone after we added plumbing for custom cgroups for `raw_exec` in 1.8.0. As a result, running `qemu` tasks always fails on Linux.
This was undetected in testing because our CI environment doesn't have QEMU installed. I've got all the unit tests running locally again and have added QEMU installation when we're running the drivers tests.
Fixes: #23250
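To illustrate the failure mode described in that PR summary, here is a minimal Go sketch. It is not Nomad's code: `Resources`, `ExecCommand`, `configureCgroups`, and the example cgroup path are hypothetical stand-ins, and the error text is simply borrowed from this issue's title. The only point is that an executor-side setup step which depends on resources supplied by the driver will fail on every launch if the driver never passes them.

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical stand-in for the per-task resources
// that a driver is expected to hand to the executor.
type Resources struct {
	CgroupPath string // illustrative field, not Nomad's actual layout
}

// ExecCommand is a hypothetical, trimmed-down launch request.
type ExecCommand struct {
	Cmd       string
	Resources *Resources // left nil when the driver forgets to pass it
}

// errNoCgroup reuses the message from the issue title for illustration only.
var errNoCgroup = errors.New("unable to configure cgroups: no such file or directory")

// configureCgroups models an executor-side step that cannot succeed
// without resources from the driver.
func configureCgroups(cmd *ExecCommand) error {
	if cmd.Resources == nil || cmd.Resources.CgroupPath == "" {
		return errNoCgroup
	}
	// A real executor would write to the cgroup filesystem here.
	return nil
}

func main() {
	// Driver that omits the resources: every launch fails.
	broken := &ExecCommand{Cmd: "qemu-system-x86_64"}
	fmt.Println("without resources:", configureCgroups(broken))

	// Driver that passes the resources through: setup succeeds.
	fixed := &ExecCommand{
		Cmd:       "qemu-system-x86_64",
		Resources: &Resources{CgroupPath: "/sys/fs/cgroup/nomad.slice/example.scope"},
	}
	fmt.Println("with resources:", configureCgroups(fixed))
}
```

In this sketch the fix lives entirely on the driver side: the executor logic is unchanged and only the launch request gains the missing field, which matches the shape of the problem described above, though the actual change in #23466 may differ in detail.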
#23466 has been merged and will ship in Nomad 1.8.2 (with backports to Nomad Enterprise 1.7.x and 1.6.x).
Sorry - I created a new issue here:
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Nomad v1.8.0
BuildDate 2024-05-28T17:38:17Z
Revision 28b82e4
Operating system and Environment details
Ubuntu 22.04.4 LTS
Issue
qemu job failed at:
The same job runs fine on v1.7.1 and prior.
Reproduction steps
Expected Result
The job should run.
Actual Result
The job failed.
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)