gcp: k8s version updates, transitions to pd-balanced disks, towards n2- nodes #3131
I'd eventually like us to do a sweep through the various clusters and look at resizing Prometheus as well, now that things are less broken there. That doesn't need to block this PR, although I'd have preferred this PR to deal only with the 2i2c cluster.
Ah, I see that this resizing is perhaps in response to the oscillating PagerDuty alerts? It's a bit unclear to me, but OK to try if that is the case.
Yes! It was apparently very broken: for example, a user pod was stuck for 36 hours with DNS issues while trying to mount NFS.
This will force a recreation of the core nodes, but not having this has turned out to break the pilot-hubs and meom-ige clusters, so we really need to do this for any project that doesn't already have it.
Merging this PR will trigger the following deployment actions.
Support and Staging deployments
Production deployments
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/6179039712
This resolves #2947: the 2i2c cluster's core node pool now has a pd-balanced disk and can therefore run the ingress-nginx controller with adequate performance.
Since the cloudbank and 2i2c clusters were actively in use, I first moved all existing core node pool workloads to a temporary node pool created via the cloud console, as an intermediate step.
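For anyone repeating this kind of intermediate step from the command line rather than the cloud console, a rough sketch is below. It is not what was run for this PR. It only builds the `kubectl cordon` and `kubectl drain` commands that would push workloads off a node pool so they reschedule elsewhere. The pool name `core-pool` is a hypothetical placeholder, and the sketch assumes nodes carry GKE's `cloud.google.com/gke-nodepool` label.

```python
import shlex

# Label GKE sets on nodes to indicate which node pool they belong to.
NODEPOOL_LABEL = "cloud.google.com/gke-nodepool"


def drain_commands(old_pool: str) -> list[str]:
    """Return the kubectl commands to cordon, then drain, every node in old_pool."""
    selector = shlex.quote(f"{NODEPOOL_LABEL}={old_pool}")
    return [
        # Stop new pods from being scheduled onto the old pool.
        f"kubectl cordon --selector {selector}",
        # Evict existing pods so they reschedule onto other (e.g. temporary) nodes.
        f"kubectl drain --selector {selector} --ignore-daemonsets --delete-emptydir-data",
    ]


if __name__ == "__main__":
    # "core-pool" is a hypothetical node pool name used for illustration only.
    for command in drain_commands("core-pool"):
        print(command)
```

The commands are printed rather than executed so they can be reviewed first; the temporary node pool must already exist and have capacity before draining.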
k8s cluster upgrades
standard disk -> pd-balanced on core nodes
transitions to n2