Support Consul Service Mesh on CNI networks #8953
Comments
Hey @timotheenicolas, unfortunately this is currently not supported with CNI, but I don't know of any technical limitation. Pinging @shoenig to see if this is a simple validation change or a more in-depth one.
Thanks :) I think it would be a cool feature to be able to create ingress gateways that can bind to their own IP on a macvlan network.
Just wanted to check if you had a chance to look at this further? Our use case here is similar to @timotheenicolas's. We'd like to expose a Connect sidecar service on a CNI-based overlay network. Thank you!
Would like to see this happen too, similar use case. Thank you!
Hi everyone 👋 I've been looking into this issue, but I can't seem to get Connect working. That may indicate that more work is needed than just removing the validation, or it could be that I'm not configuring my CNI network properly (probably more likely 😅). I have some custom binaries at the bottom of this page https://github.com/hashicorp/nomad/actions/runs/4059725660 that were built with my changes. The diff d375f60...1207475 shows what is in the binaries. Would anyone with more experience with CNI be able to test them? One important note: these binaries are for development purposes only and should not be run in production, so make sure you don't accidentally run them with your production data. I used the sample job file that is generated from
Thanks in advance!
I have edited the title here to expand the scope to all CNI networks (so not just macvlan) and to Consul Service Mesh in general (not just ingress gateways).
I'm trying to use Consul Connect on my Nomad clusters. Since it is hard-coded I'm not able to do that, so I thought of using a custom configuration and referring to it using But that fails because of this issue. @lgfa29 Is there something I can do to help advance this (older) issue?
Hi @netdata-be 👋 We're not currently working on this issue. I didn't receive feedback on the attempted fix mentioned in #8953 (comment) and haven't had the time to validate it further. If I were to build another set of binaries with those changes, would you be able to help validate whether the changes work?
@lgfa29 - It looks like this would solve most of my questions at https://discuss.hashicorp.com/t/configure-network-pinning-for-jobs/63434. I would be happy to test a patched version of 1.6.x or 1.7.x to validate the changes.
Hi @lgfa29 I have done some tests using your patch (applied to nomad 1.7.7). A job which includes CNI and consul connect starts correctly, but the health check uses the incorrect address. I am using a macvlan cni config:
There is a process (envoy?) listening on port 29130 (this is on IP 172.17.107.139
Consul is trying to health check the nomad client's address though (
From the docs (https://developer.hashicorp.com/nomad/docs/job-specification/service#address_mode), I expected the Consul check of the sidecar to use the IP provided by CNI.
My service stanza is:
I can do more tests. Please let me know if anything more would help.
Ah nice, thanks for testing it @nakermann1973, I'm glad it kind of works 😅
Health checks are an interesting point. First, you need to make sure the Consul agent is able to reach the service at the IP:port allocated by the CNI plugin. Next, we need a way to tell Nomad to use that IP:port as well.
For the first part, I'm not sure there's a single way to fix it. Each environment will need to be configured to fulfill this requirement. The second part may require some code changes in how Nomad registers the service (and its health check) in Consul.
If you run
And as a last note, I no longer work for HashiCorp, so I probably won't be able to help much on this issue any more.
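For the first requirement mentioned in this comment (the Consul agent being able to reach the service at the IP:port allocated by the CNI plugin), a quick check run from the Consul agent's host might look like the sketch below. It is only a plain TCP connect test in Python, not something Nomad or Consul provides; the function name `can_reach` and the default timeout are made up for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. It checks basic TCP
    reachability, which is a prerequisite for a Consul TCP or HTTP check,
    but it does not reproduce what Consul itself does.
    """
    try:
        # create_connection resolves the host, connects, and raises
        # OSError (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling `can_reach("172.17.107.139", 29130)` on the machine running the Consul agent would test the address and port reported earlier in this thread. A `False` result suggests a routing or firewall problem between the agent and the CNI network rather than a registration problem in Nomad.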
I rolled back to the prod release, as it seemed like with this patch that health checks were failing across multiple services. I didn't dig into it too much, as my focus was to recover the failing services.
I don't recall seeing any when I inspected the job |
@tgross - Any chance of getting someone to have a look at this?
@nakermann1973 it's not on the roadmap currently. I can't really speak to what would get it on the roadmap, either, but I'll flag it for @arodd and @jrasell to chat about (James is out this week though).
There are patches for CNI + Connect support on my fork of Nomad 1.6 here (Apache license): https://github.com/jupitercloud/nomad/commits/main/ I've been using Nomad with Consul Connect and Calico CNI for a couple of years, and it works great. The one thing I haven't yet resolved is the health check problem; it's on my roadmap as a priority. Sad to say, since the BSL license change I haven't felt motivated to try to upstream them. I was hoping that when IBM came in that decision might get reversed.
Nomad version
Nomad v0.12.5 (514b0d6)
Operating system and Environment details
Ubuntu Focal amd64 20.04.1
Issue
Hi !
I would like to add a CNI macvlan network to use with Consul Connect to enable an ingress gateway to be part of this publicly available network for clients.
However, after setting up the CNI config file, Nomad says that only "bridge" or "host" is allowed.
Thanks !
My CNI config:
And my job file: