workload identity auth failure when audit is enabled #15768
Comments
Hi @louievandyke 👋 I think this may have been fixed by #15140. Would you mind trying this upgrade path but going to 1.4.3+ent instead of 1.4.0+ent? Thanks!
Hi @lgfa29 I just tried the upgrade path going to 1.4.3+ent (from 1.3.2+ent) and I ran into the same situation. Service discovery fails with the messages below shown in the events.
I investigated this and was able to reproduce on 1.4.3+ent but not on
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Output from nomad version:
root@ubuntu-focal:/home/vagrant# nomad --version
Nomad v1.4.0+ent (ea16107)
Operating system and Environment details
root@ubuntu-focal:/home/vagrant# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
Issue
After upgrading to Nomad v1.4.0+ent, I noticed that restarting certain jobs left them pending with an error related to the Nomad service referenced in my jobspec template:
Template | Missing: nomad.services
Before the upgrade, I had acl { enabled = true } and audit { enabled = true } in place in the agent configs of both the Servers and the Clients. This was working with no issues.
After the upgrade I restarted the two jobs. One of the jobs (sleep1), which registers the Nomad service and then just sleeps forever, started fine. The second job, where I try to discover that service, fails, complaining about a missing service.
A workaround I found: keep acl { enabled = true } and audit { enabled = true } on the Servers, and on the Client disable ACLs while leaving audit enabled, i.e. acl { enabled = false } and audit { enabled = true }. It starts working after an agent restart on the Client. I believe this may leave only the /v1/client endpoints vulnerable, as you still need a token to get to the UI and to run CLI commands on both the Servers and Clients.
Reproduction steps
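For reference, the agent configuration blocks named in this report can be sketched as follows. This is a minimal sketch showing only the two blocks the report mentions; the rest of each agent config file is omitted and is not known from the report.

```hcl
# Servers and Clients, as configured before and immediately after the upgrade
# (only the blocks named in the report; all other settings omitted).
acl {
  enabled = true
}

audit {
  enabled = true
}
```

```hcl
# Client agents only, with the workaround applied:
# ACLs disabled, audit left enabled. Servers keep the config above.
acl {
  enabled = false
}

audit {
  enabled = true
}
```

The Client agent needs a restart for the change to take effect, as noted in the report.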
On v1.3.2+ent I ran the two specs pasted below, sleep1 and service-discovery, while I had acl { enabled = true } and audit { enabled = true } on the Servers and Clients.
I then upgraded the binary to Nomad v1.4.0+ent on both Server and Client and restarted the Nomad agent. Both jobs continued to run fine, but when I restarted each, the service-discovery job would not start due to errors about discovering the service. The only fix I found is to remove the acl block from the Client's agent config and restart the Nomad agent.
Expected Result
Audit and ACL behavior to remain consistent on upgrade paths
Actual Result
ACL appears to block the client after upgrading to v1.4.0+ent
Job file (if appropriate)
see above
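The original job files are not reproduced in this text. As a purely hypothetical sketch of the kind of pair described, the two jobs might look roughly like the following. Only the job names sleep1 and service-discovery come from the report; the datacenter, driver, group and task names, service name, and template body are all assumptions made for illustration.

```hcl
# Hypothetical sketch: registers a Nomad-native service, then sleeps forever.
job "sleep1" {
  datacenters = ["dc1"] # assumed

  group "sleep" {
    service {
      name     = "sleep1" # assumed service name
      provider = "nomad"  # Nomad native service discovery
    }

    task "sleep" {
      driver = "exec" # assumed driver

      config {
        command = "/bin/sleep"
        args    = ["infinity"]
      }
    }
  }
}
```

```hcl
# Hypothetical sketch: renders a template that lists Nomad services.
job "service-discovery" {
  datacenters = ["dc1"] # assumed

  group "discover" {
    task "discover" {
      driver = "exec" # assumed driver

      config {
        command = "/bin/sleep"
        args    = ["infinity"]
      }

      template {
        # Assumed template body: iterates over registered Nomad services.
        data        = <<EOH
{{ range nomadServices }}
service: {{ .Name }}
{{ end }}
EOH
        destination = "local/services.txt"
      }
    }
  }
}
```

With a pair like this, the second job's task stays pending until the template can query the Nomad services API, which is the step the report describes as failing with Missing: nomad.services.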
Nomad Server logs (if appropriate)
Some logs related to this behavior...
Nomad Client logs (if appropriate)