Nomad Service Discovery unable to find service #16983
Comments
The workaround is to restart the Nomad service/agent on the client node. |
Hi @vincenthuynh so far I haven't been able to reproduce what you're seeing. In my tests the template is always rendered successfully once the upstream task has started and its service is registered. Before I dig in further, could you post a complete job file that experiences the issue? I want to make sure we're not missing something (e.g. using group vs. task services, etc.).
The test job file I've been using: bug.hcl
job "bug" {
  group "group" {
    network {
      port "http" {
        to = 8080
      }
    }

    # Upstream task: serves HTTP and registers the "python" service with the Nomad provider.
    task "python" {
      driver = "raw_exec"
      config {
        command = "python3"
        args    = ["-m", "http.server", "8080"]
      }
      service {
        provider = "nomad"
        name     = "python"
        port     = "http"
      }
      resources {
        cpu    = 10
        memory = 32
      }
    }

    # Downstream task: renders a template that looks up the "python" service.
    task "client" {
      driver = "raw_exec"
      template {
        data = <<EOH
{{range nomadService "python"}}
blah.host: {{ .Address }}
blah.port: {{ .Port }}
{{end}}
EOH
        destination = "local/config/application.yml"
      }
      config {
        command = "sleep"
        args    = ["infinity"]
      }
    }
  }
} |
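For anyone comparing what Nomad has registered against what the template produced, here is a minimal Python sketch, not part of the original report. It assumes the agent's HTTP API is reachable on the default address `http://127.0.0.1:4646` and that the service is called `python` as in the job above. The helper names `fetch_registrations` and `expected_render` are hypothetical. The second function only approximates the template output and ignores the blank lines the real template emits.

```python
import json
import urllib.request


def fetch_registrations(service, base_url="http://127.0.0.1:4646", token=None):
    """Ask the Nomad HTTP API (GET /v1/service/<name>) for a service's registrations.

    base_url is an assumed default; pass an ACL token if the cluster requires one.
    """
    req = urllib.request.Request(f"{base_url}/v1/service/{service}")
    if token:
        req.add_header("X-Nomad-Token", token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def expected_render(registrations):
    """Approximate the template body: one host line and one port line per registration."""
    lines = []
    for reg in registrations:
        lines.append(f"blah.host: {reg['Address']}")
        lines.append(f"blah.port: {reg['Port']}")
    return "\n".join(lines)
```

If `fetch_registrations("python")` returns entries while `local/config/application.yml` in the client task stays empty, that would match the symptom described in this issue.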
Hi @shoenig, We've noticed that it takes a few days (2-3 days) before it starts happening. Here's another reproduction:
Here's our job file: myservice.hcl
Hope that helps. Thanks! |
I encountered a similar issue caused by having My errors happened very consistently, so different I think from this case, but wanted to mention here for anyone else who finds this issue like I did. My solution was to ensure |
This issue still consistently happens for us every 2-3 days. I observe exactly the same behavior as @IamTheFij; however, we run Nomad 1.6.3.
|
I have observed the same issue on Nomad 1.7.3 through 1.7.7 |
I observe the same problem in v1.8.1 |
Seeing the same problem: services are clearly visible in the Nomad UI but cannot be used by templating. Nomad 1.7.7 (multi-region, multi-dc and ACL enabled). Consul-based service templating works fine and reliably, as opposed to Nomad-based services. |
I've seen this occurring frequently under poor network conditions where
I don't know if this is intended and if it is the same issue people are having here. As a workaround, I was restarting the Nomad agent on the client every 20 mins. (I didn't need HA) |
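A narrower variant of the periodic-restart workaround could restart the agent only when the rendered file looks stale. The Python sketch below is an illustration under assumptions, not something from this thread: it assumes the agent runs as a systemd unit named `nomad`, and the helpers `is_stale` and `restart_agent` are hypothetical. The staleness check is a crude substring match on address and port.

```python
import subprocess


def is_stale(registrations, rendered_text):
    """Return True if Nomad lists a registration that the rendered file does not mention.

    registrations: list of dicts with "Address" and "Port" keys, as returned by
    the Nomad service API. rendered_text: current contents of the rendered template.
    """
    for reg in registrations:
        if str(reg["Address"]) not in rendered_text:
            return True
        if str(reg["Port"]) not in rendered_text:
            return True
    return False


def restart_agent(unit="nomad"):
    """Restart the agent through systemd; the unit name is an assumption."""
    subprocess.run(["systemctl", "restart", unit], check=True)
```

A cron job or timer could call `is_stale` with fresh API data and the file contents, and call `restart_agent` only when it returns True.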
Nomad version
Nomad v1.4.7
Operating system and Environment details
Debian 4.19.260-1 (2022-09-29) x86_64 GNU/Linux
Issue
Allocation is unable to find Nomad service when it exists. It seems to start happening on a client after an uptime of 2-3 days.
Reproduction steps
myservice using the Nomad provider
NomadService function to reference the service that was registered in Task 1
Able to list service:
Expected Result
Able to discover a service consistently
Actual Result
Task log:
Job file (if appropriate)
Task 1:
Task 2:
Nomad Client logs