Upgrading from v1.16.0-eksbuild.1 to v1.17 or v1.18 results in failure to assign IP address to container #2872
Do you see any errors in the ipamd.log that stand out? Can you run the node log collector?
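If the collector in question is the standard EKS log collector script (an assumption here; presumably the one from awslabs/amazon-eks-ami), running it on a node looks roughly like this:

```
# Fetch and run the EKS log collector on the affected node
# (URL assumes the awslabs/amazon-eks-ami script; adjust if a different collector was linked)
curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/main/log-collector-script/linux/eks-log-collector.sh
sudo bash eks-log-collector.sh
```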
I'm having trouble gathering the logs. I walked through some steps while we were on v1.16.0 to get a feel for what was available. I found that I could access the ipamd.log in the aws-node pod when v1.16.0 was running, but could not get a shell when v1.18.0 was running. We're also using Bottlerocket as the AMI, and the steps in the linked collector produced more missing-command errors than useful output. I was connected to the node where the IP could not be allocated and was unable to find the information with the script or manually. Do you have any guidance on what I could do differently?
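For context, the shell attempt that works on v1.16.0 but fails on v1.18.0 looks roughly like this; the pod name is a placeholder, and newer CNI images may simply not ship a shell:

```
# Try to open a shell in the aws-node pod on the affected node
# (pod name is a placeholder; find it with: kubectl -n kube-system get pods -o wide)
kubectl -n kube-system exec -it aws-node-xxxxx -- /bin/sh
```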
I tried to reproduce this issue using a new cluster and Bottlerocket image, but I could not.
To look into your Bottlerocket logs, log in to your instance and do the following.
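A minimal sketch of what that typically looks like, assuming the Bottlerocket admin container is enabled and the CNI writes to the default log path on the host:

```
# From the Bottlerocket control container: enable and enter the admin container
apiclient set host-containers.admin.enabled=true
enter-admin-container

# Inside the admin container: drop to a root shell on the host
sudo sheltie

# Inspect the ipamd log (default path; adjust if your setup differs)
tail -n 200 /var/log/aws-routed-eni/ipamd.log
```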
Thanks, @orsenthil, for your guidance. I have reproduced the issue and submitted the ipamd.log to the triage email. There are messages logged saying the ENI on the node "does not have available addresses" and "IP address pool stats: total 18, assigned 18", as well as "IP pool is too low: available (0) < ENI target (1) * addrsPerENI (9)" ... yet the cluster did not scale to create another node.
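Those numbers line up with the CNI's warm-pool check: with all 18 of 18 addresses assigned, the 0 available addresses fall below the warm target of 1 ENI × 9 addresses per ENI, so ipamd wants more capacity that the node cannot provide. If it helps, the warm-pool settings can be read off the daemonset (assuming the default aws-node daemonset in kube-system):

```
# Show the warm-pool tuning variables on the VPC CNI daemonset
kubectl -n kube-system describe daemonset aws-node \
  | grep -E 'WARM_ENI_TARGET|WARM_IP_TARGET|MINIMUM_IP_TARGET'
```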
Do you mean that an additional ENI wasn't created, or that the cluster didn't scale out another node? If the latter, that is auto-scaling functionality, not networking.
All that's changed is the VPC CNI driver. On v1.16, the cluster scales out when nodes reach their limits; on v1.17 and later it does not, and we see the IP assignment error described above.
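When this happens, the stuck pods are visible from the API; a quick way to find them and their events (namespace and pod names are placeholders):

```
# List pods stuck in Pending across all namespaces
kubectl get pods -A --field-selector=status.phase=Pending

# Show the scheduling/sandbox events for one stuck pod
kubectl -n <namespace> describe pod <pending-pod>
```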
I think I will close this after seeing no evidence that the VPC CNI is not working as expected. There was an inability to add an IP, but I believe that is because the nodes were oversubscribed and the cluster autoscaler did not add a new node. I also found a discrepancy between the number of EC2 instances in the node group and the number of nodes seen in Kubernetes.
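One way to check for that kind of discrepancy, assuming a managed node group backed by an Auto Scaling group (the group name is a placeholder):

```
# Nodes registered with the Kubernetes API
kubectl get nodes --no-headers | wc -l

# Instances in the backing Auto Scaling group (name is a placeholder)
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names <your-node-group-asg> \
  --query 'length(AutoScalingGroups[0].Instances)'
```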
This issue is now closed. Comments on closed issues are hard for our team to see.
What happened:
Upgrading from v1.16.0 to v1.17 or higher results in scheduled pods that cannot obtain an IP address. Downgrading back to v1.16.0 restores functionality. While in this state, the EKS cluster also does not scale out.
Attach logs
What you expected to happen:
The CNI should be able to assign IP addresses to pods, or the cluster should scale out and then be able to assign IP addresses to pods.
How to reproduce it (as minimally and precisely as possible):
On an EKS cluster running Kubernetes 1.28, upgrade the VPC CNI add-on from v1.16 to v1.17 or v1.18. It may be necessary to add more pods, but at some point a pod will be scheduled to an existing node and will sit in a Pending state because aws-cni could not assign an IP address to the container.
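A sketch of the upgrade step using the EKS managed add-on API; the cluster name and exact eksbuild version suffix are placeholders to adjust for your account:

```
# Upgrade the vpc-cni managed add-on (version string is a placeholder)
aws eks update-addon \
  --cluster-name <your-cluster> \
  --addon-name vpc-cni \
  --addon-version v1.18.0-eksbuild.1 \
  --resolve-conflicts OVERWRITE
```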
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): 1.28
- OS (e.g: cat /etc/os-release): Amazon Linux Bottlerocket
- Kernel (e.g. uname -a): bottlerocket-aws-k8s-1.28-x86_64-v1.19.2-29cc92cc