Document procedure and migrate existing AWS EKS based hubs from k8s 1.21+ to 1.24+ #2057
Comments
Re: GKE, I specifically put us in 'unspecified' because if we were in a release channel, not just the control plane but all the nodes would restart at an arbitrary time that GCP chooses. In practice, since we haven't been upgrading them ourselves, they've just been upgraded at a much slower pace. Figuring out a way to minimize disruption here would be great. Using GKE release channels could recreate entire large dask clusters or notebook nodes at inopportune times, so I think keeping the upgrade process manual, but actually doing the upgrades, is the way to go.
@yuvipanda maybe we could communicate to users that maintenance will be done automatically in pre-agreed time windows, which we may need anyhow if we do the upgrades ourselves. Maybe Terraform would have trouble if we let GKE do it instead of us, though? I'm not sure. But let's table this deliberation for another issue and keep this one focused on upgrading EKS clusters manually. I think it would be good to upgrade the GKE clusters manually at least once as well, to gain experience with doing it by hand, so I've opened #2157 about it. A rough sketch of what a manual GKE upgrade could look like is below.
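For reference, a manual GKE upgrade along those lines could look roughly like this. This is a minimal sketch only; the cluster name, zone, node pool name, and target version are placeholders for illustration, not taken from this issue:

```bash
# Upgrade the control plane first (GKE requires it to be at or ahead of the nodes).
gcloud container clusters upgrade my-gke-cluster \
  --zone us-central1-b \
  --master \
  --cluster-version 1.24

# Then upgrade each node pool to the control plane's version, one pool at a time
# to limit disruption to running user servers and dask workers.
gcloud container clusters upgrade my-gke-cluster \
  --zone us-central1-b \
  --node-pool core-pool
```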
I removed myself from being assigned to this. There is now documentation on how to do the upgrade, so I think it would be good if I'm not the one doing all the AWS upgrades.
It's done!!! All EKS k8s clusters are now on v1.24 or v1.25!
This is the current status of the AWS EKS clusters:
2i2c-aws-us: k8s 1.25, highmem nodes, node sharing profile list, ssh-keys #2343
carbonplan: update k8s from 1.19 to 1.24 is made, now update eksctl cluster config template #2085
gridsst: k8s 1.22 to 1.25, core node from m5 to r5, dask nodes from 4 different m5 to one r5.4xlarge #2373
discussed for upgrade in https://2i2c.freshdesk.com/a/tickets/543
nasa-cryo: k8s 1.22 to 1.25, node sharing setup #2374
nasa-veda: upgrade to k8s 1.25, highmem nodes, profile list with node sharing #2340
openscapes: update EKS cluster config templates from k8s 1.21 to 1.24 #2139
Created at 1.24
victor: k8s 1.22 to 1.25, core node from m5 to r5, dask nodes from 4 different m5 to one r5.4xlarge #2375
Action points
docs: add an aws k8s cluster upgrade guide #2142
Related
Original outdated issue
Such migration will require some additional steps related to #2054 and #2056.
In practice, I think it involves updating the .jsonnet eksctl cluster configuration files we have in the eksctl folder to match how they would look if we regenerated them from their jinja2 template (template.jsonnet).
It also involves manually adding the EKS addon via an eksctl command, and recreating the node pools, etc. This will cause disruption. I've outlined the steps I took when I updated the JMTE hub in the eksctl cluster config file that is part of the JMTE PR branch. These may not be the exact steps we ought to take, but they can help guide the steps we should take. A rough sketch of the kind of commands involved follows.
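As an illustration of the kind of commands involved (a sketch only; the cluster name and config file are placeholders, and the authoritative procedure is the guide from #2142):

```bash
# Regenerate the eksctl cluster config from the .jsonnet template, then upgrade
# the control plane one minor version at a time (EKS can't skip minor versions).
jsonnet carbonplan.jsonnet > carbonplan.eksctl.yaml
eksctl upgrade cluster --config-file=carbonplan.eksctl.yaml --approve

# Recreate the node groups against the new version, then delete the old ones
# that are no longer listed in the config. This is the disruptive part.
eksctl create nodegroup --config-file=carbonplan.eksctl.yaml
eksctl delete nodegroup --config-file=carbonplan.eksctl.yaml --only-missing --approve
```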
Update: for the ebs driver stuff, I think we need to add the ebs driver addon rather than adding iam stuff to the nodeGroups. If I'm wrong, we may need to revert 8fe4009 from #2056.
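A hedged sketch of what adding the EBS CSI driver as an EKS-managed addon could look like; the cluster name and IAM role ARN below are placeholders, and the exact flags we settle on belong in the upgrade guide from #2142:

```bash
# Install the EBS CSI driver as an EKS addon instead of attaching IAM policies
# to the nodeGroups themselves.
eksctl create addon \
  --cluster carbonplan \
  --name aws-ebs-csi-driver \
  --service-account-role-arn arn:aws:iam::123456789012:role/ebs-csi-driver-role \
  --force
```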