Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: al2 support #119
base: master
Are you sure you want to change the base?
feat: al2 support #119
Changes from 2 commits
a229422
4a1d6ce
c15075f
ee34390
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious on why restarting containerd is needed; it's non-obvious to me because we are not changing any containerd config, and after this script switches kubelet to use CRI-O, containerd will no longer be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i'm not quite sure, I know this was needed at some point (may be AL2 specific), but unsure if it was finally needed or not.. it shouldn't hurt though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to leave it out, since there's no change to any containerd config in the script, so restarting containerd is confusing I think; unless it's actually needed of course, in which case a comment as to why would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it helps on the sysbox uninstall from the cluster, where we revert back from CRI-O -> containerd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ctalledo, I made a new build today on top of sysbox v0.6.3 but applying the new sysbox-runc patch + this PR + reverting the change that deprecated v1.25 (we are still on that, it is officially supported by AWS/EKS until '24 October) and I can confirm that this line (restarting containerd) seems to be needed. Without this, the installer daemonset finishes with the first run and the node just fails (the kubelet is down).
Upon SSH-ing to the failing node, I see the following issue:
This is probably because the kubelet systemd unit has containerd as a dependency (After & Requires):
So I added a new
sed
command to replace any potential dependencies on 'containerd' with 'crio' to fix it. But even after this replacement, followed by thesystemctl daemon-reload
command at the end of the config_kubelet() function, reloading kubelet still fails until thecontainerd
service is stopped. That is why addingservice containerd restart
helped, but I just replaced it with a call to the existingstop_containerd()
func instead.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @szab100, thank you very much for the detailed explanation, let's keep it there (and possibly add a short comment that is needed).
Oh I had missed that, my mistake; we can bring it back then (it's a pretty simple change in the Makefiles IIRC), but I don't think we can actively test on it (but since it's been tested before, things should continue to work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine, actually in the meantime we upgraded to 1.26.6 as well, so no need to bring it back.