Disk partition is too low on vSphere clusters in wallaby workload clusters #3777
On this worker I see the full disk is partitioned:
On the one Nick mentions it is not:
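The inspected output is not reproduced here, so as a minimal sketch (node names are placeholders, not taken from the issue), the partition layout and root-filesystem usage on each worker can be compared with:

```sh
# Placeholder node names; run against both the healthy and the problematic worker.
# Show whether the root partition spans the whole virtual disk:
ssh <worker-node> lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# Show free space on the root filesystem:
ssh <worker-node> df -h /
```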
Disk size was increased on October 10th: https://github.com/WEPA-digital/gitops-mc-wallaby/commit/3b3d6f8cd6d4cedb49dded3dc443357a81400636. The problematic node is 54 days old.
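As a hedged sketch of how to cross-check node age against that change (the cluster namespace is a placeholder; `kubectl get machines` runs against the management cluster, `kubectl get nodes` against the workload cluster):

```sh
# Machines created before the disk-size change still carry the old template values.
kubectl get machines -n <cluster-namespace> -o wide
kubectl get nodes -o wide
```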
The CAPI controller has a lot of errors like:
Followed by:
So it looks like this node rollout got stuck for some reason.
I deleted the machine objects from the old machine deployment. The machines are gone, but there are now only 2 machines and the machine deployment still shows the wrong values.
That is because the old MachineSet still exists, so its replicas are still counted into the totals (3+2, as it turns out).
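A sketch of the inspection that surfaces the stale MachineSet (the namespace is a placeholder):

```sh
# A leftover MachineSet from the stuck rollout shows up next to the current one,
# and its replicas are still counted into the MachineDeployment totals.
kubectl get machinedeployments,machinesets,machines -n <cluster-namespace>
```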
Restarting the CAPI controller didn't help. I manually deleted the machine set, and the machine deployment picked up the first machine set only and is now scaling up the 2 extra machines. I also manually cleaned up the old leftovers. Not sure if this ☝ is perfectly clean, but I couldn't think of anything else.
The CAPI controller no longer logs errors, so I think we are OK-ish for now.
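A hedged sketch of the remediation steps described above (resource names and namespaces are placeholders, and the controller location assumes upstream CAPI defaults):

```sh
# Restart the CAPI controller (did not help in this case):
kubectl rollout restart -n capi-system deployment/capi-controller-manager

# Delete the stale MachineSet so the MachineDeployment tracks only the current one:
kubectl delete machineset <old-machineset-name> -n <cluster-namespace>
```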
I got paged because the disk space on the root partition was too low.
If the customer runs Java applications, there is already a very high chance that this will kill nodes.
From talking with @vxav, this should already have been increased to 64 GB.
Could you please investigate why this hasn't been adjusted?
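For context, a sketch of where the root-disk size is configured for vSphere (CAPV) node pools; the namespace is a placeholder and the field path assumes the upstream VSphereMachineTemplate API:

```sh
# If the change is in place, diskGiB should read 64 for the current templates;
# existing nodes keep the disk size they were created with.
kubectl get vspheremachinetemplates -n <cluster-namespace> \
  -o custom-columns=NAME:.metadata.name,DISK_GIB:.spec.template.spec.diskGiB
```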
Affected cluster:
wallaby/plant-cassino-dev
Additional information on that:
https://kubernetes.slack.com/archives/CKFGK3SSD/p1622112033033700?thread_ts=1622108128.030700&cid=CKFGK3SSD