You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
I'm using the system-upgrade-controller to bump RKE2 version from v1.23.10+rke2r1 to v1.24.10+rke2r1 (https://github.com/rancher/rke2/releases/tag/v1.24.10+rke2r1) following RKE2 documentation and everything works smooth if using the official images. However, when I try to run it on a disconnected environment where we don't have direct access to public repos, it doesn't work. What we have is a proxy cache implemented by harbor which allows us to pull from public repos. However, the job that spins up does the cordon, but then Jobs gets terminated probably by the controller.
If I check logs from system-upgrade-controller it's constantly complaining:
Expected behavior
It should the same it works when using official images, as the only change we do it's doing the proxy cache.
Actual behavior
The init container works as it cordons the node:
❯ kubectl get node
NAME STATUS ROLES AGE VERSION
node Ready,SchedulingDisabled control-plane,etcd,master,worker 177m v1.23.10+rke2r1
However, the main container it's not even started. We don't even have time to check logs, but our guess is that there might be an issue with SHAs as it might internally do a check internally (system-upgrade-controller) as per SYSTEM_UPGRADE_PLAN_LATEST_HASH.
The text was updated successfully, but these errors were encountered:
Version
Release 0.10.0 (https://github.com/rancher/system-upgrade-controller/releases/tag/v0.10.0)
Platform/Architecture
RKE2 cluster
Describe the bug
I'm using the system-upgrade-controller to bump RKE2 version from
v1.23.10+rke2r1
tov1.24.10+rke2r1
(https://github.com/rancher/rke2/releases/tag/v1.24.10+rke2r1) following RKE2 documentation and everything works smooth if using the official images. However, when I try to run it on a disconnected environment where we don't have direct access to public repos, it doesn't work. What we have is a proxy cache implemented by harbor which allows us to pull from public repos. However, the job that spins up does the cordon, but then Jobs gets terminated probably by the controller.If I check logs from system-upgrade-controller it's constantly complaining:
Error logs:
So basically, it says
level=error msg="error syncing 'system-upgrade/server-check11': handler system-upgrade-controller: DesiredSet - Replace Wait batch/v1, Kind=Job system-upgrade/apply-server-check11-on-node-with-8cb523f17d0363daa544-1c8f9 for system-upgrade-controller system-upgrade/server-check11, requeuing"
Jobs are permanently recreated and so pod:
To Reproduce
Our plan:
Expected behavior
It should the same it works when using official images, as the only change we do it's doing the proxy cache.
Actual behavior
The init container works as it cordons the node:
However, the main container it's not even started. We don't even have time to check logs, but our guess is that there might be an issue with SHAs as it might internally do a check internally (system-upgrade-controller) as per
SYSTEM_UPGRADE_PLAN_LATEST_HASH
.The text was updated successfully, but these errors were encountered: