Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue with installation with create management flag set to false #996

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kylewuolle
Copy link
Contributor

Fixed an issue where the provider templates depend on the management object and the management object depends on the provider templates making it impossible to deploy kcm with the create management flag set to false.

Resolves issue #940

Copy link
Contributor

@zerospiel zerospiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylewuolle could you elaborate on the fix? I can't verify it'd work as desired. I do not see how the managements and providertemplates resources are related to the initial Release creation process which is part of the release-controller. I follow the next steps:

  • prepare an instance, in my case make kind-deploy registry-deploy dev-push
  • install with the createManagement=false: helm install kcm oci://127.0.0.1:5001/charts/kcm --version 0.1.0 -n kcm-system --create-namespace --set="controller.createManagement=false" --set="image.repository=localhost/kcm/controller" --set="velero.enabled=false"

The releases resource list is empty and the release-controller is trying to create a Release object indeed (because of the if !r.CreateManagement { return nil; }; code removal`). The problem still persists though:

HelmRelease kcm-system/kcm-0-0-7-54-g612b62e-tpl is not ready yet. HelmChart 'kcm-system/kcm-0-0-7-54-g612b62e-tpl' is not ready: latest generation of object has not been reconciled

And the chart 0-0-7-... has the following error: chart pull error: failed to download chart for remote reference: failed to get 'oci://ghcr.io/k0rdent/kcm/charts/kcm-templates:0.0.7-54-g612b62e': ghcr.io/k0rdent/kcm/charts/kcm-templates:0.0.7-54-g612b62e: not found

This is obviously because there ain't such a version because it is the build version from the initial installation during release creation. The problem that we have is related, in my opinion, somewhere to our ops and how we are handling the local builds between tagging the product. To sum up, I think the issue should be addressed in the other way.

My thought on how to address it: push the "dev-build" kcm-templates chart with the actual version something like

package-chart-%: lint-chart-%
	@chart_name=$(patsubst package-chart-%,%,$@); \
	if [ "$$chart_name" = "kcm-templates" ]; then \
		$(HELM) package --destination $(CHARTS_PACKAGE_DIR) $(TEMPLATES_SUBDIR)/$* --version $(VERSION); \
	else \
		$(HELM) package --destination $(CHARTS_PACKAGE_DIR) $(TEMPLATES_SUBDIR)/$*; \
	fi

and of course, adjust the release-controller's if statement to something if !r.CreateRelease && !r.CreateManagement { return nil;};

perform make REGISTRY_REPO="http://127.0.0.1:5001/charts" kind-deploy registry-deploy dev-push install with the controller registry URL set to the localhost, e.g. via helm:

helm install kcm oci://127.0.0.1:5001/charts/kcm  --version 0.1.0 -n kcm-system --create-namespace --set="controller.createManagement=false" --set="image.repository=localhost/kcm/controller" --set="velero.enabled=false" --set="controller.defaultRegistryURL=127.0.0.1:5001/charts/kcm" --set="controller.insecureRegistry=true"

But I am not sure whether this:

  1. the only existing approach to address
  2. would actually work (highly likely no according to the Makefile, there are a lot of changes to be done regarding installing the cm plugin, proper uname/pass propagation, and probably other keys)

I'd appreciate it if you could elaborate on all of the things above, I might be wrong but it seems to me to be the right direction for addressing the issue

@kylewuolle
Copy link
Contributor Author

@kylewuolle could you elaborate on the fix? I can't verify it'd work as desired. I do not see how the managements and providertemplates resources are related to the initial Release creation process which is part of the release-controller. I follow the next steps:

  • prepare an instance, in my case make kind-deploy registry-deploy dev-push
  • install with the createManagement=false: helm install kcm oci://127.0.0.1:5001/charts/kcm --version 0.1.0 -n kcm-system --create-namespace --set="controller.createManagement=false" --set="image.repository=localhost/kcm/controller" --set="velero.enabled=false"

The releases resource list is empty and the release-controller is trying to create a Release object indeed (because of the if !r.CreateManagement { return nil; }; code removal`). The problem still persists though:

HelmRelease kcm-system/kcm-0-0-7-54-g612b62e-tpl is not ready yet. HelmChart 'kcm-system/kcm-0-0-7-54-g612b62e-tpl' is not ready: latest generation of object has not been reconciled

And the chart 0-0-7-... has the following error: chart pull error: failed to download chart for remote reference: failed to get 'oci://ghcr.io/k0rdent/kcm/charts/kcm-templates:0.0.7-54-g612b62e': ghcr.io/k0rdent/kcm/charts/kcm-templates:0.0.7-54-g612b62e: not found

This is obviously because there ain't such a version because it is the build version from the initial installation during release creation. The problem that we have is related, in my opinion, somewhere to our ops and how we are handling the local builds between tagging the product. To sum up, I think the issue should be addressed in the other way.

My thought on how to address it: push the "dev-build" kcm-templates chart with the actual version something like

package-chart-%: lint-chart-%
	@chart_name=$(patsubst package-chart-%,%,$@); \
	if [ "$$chart_name" = "kcm-templates" ]; then \
		$(HELM) package --destination $(CHARTS_PACKAGE_DIR) $(TEMPLATES_SUBDIR)/$* --version $(VERSION); \
	else \
		$(HELM) package --destination $(CHARTS_PACKAGE_DIR) $(TEMPLATES_SUBDIR)/$*; \
	fi

and of course, adjust the release-controller's if statement to something if !r.CreateRelease && !r.CreateManagement { return nil;};

perform make REGISTRY_REPO="http://127.0.0.1:5001/charts" kind-deploy registry-deploy dev-push install with the controller registry URL set to the localhost, e.g. via helm:

helm install kcm oci://127.0.0.1:5001/charts/kcm  --version 0.1.0 -n kcm-system --create-namespace --set="controller.createManagement=false" --set="image.repository=localhost/kcm/controller" --set="velero.enabled=false" --set="controller.defaultRegistryURL=127.0.0.1:5001/charts/kcm" --set="controller.insecureRegistry=true"

But I am not sure whether this:

  1. the only existing approach to address
  2. would actually work (highly likely no according to the Makefile, there are a lot of changes to be done regarding installing the cm plugin, proper uname/pass propagation, and probably other keys)

I'd appreciate it if you could elaborate on all of the things above, I might be wrong but it seems to me to be the right direction for addressing the issue

If you check out the original issue #940 you'll see that this problem occurs in the current helm release unrelated to any versioning issues. ProviderTemplates as is now require the Management object to be there in order to be valid and the Management object requires ProviderTemplates to be ready/valid in order to be ready. A classic circular dependency. There's actually several others between Management and other objects.

@zerospiel
Copy link
Contributor

Got it, thanks. The aforementioned steps are from the initial issue, and since I still can reproduce the bug doing the steps, I assume that the provided code doesn't address the issue. Could you elaborate on it? Have I done something wrong?

@kylewuolle
Copy link
Contributor Author

kylewuolle commented Feb 3, 2025

Got it, thanks. The aforementioned steps are from the initial issue, and since I still can reproduce the bug doing the steps, I assume that the provided code doesn't address the issue. Could you elaborate on it? Have I done something wrong?

You could manually push the aforementioned chart to your local registry (localhost:5001). How I deploy it locally is :

  1. make VERSION=0.0.7 dev-destroy kind-deploy registry-deploy kcm-chart-release dev-push
  2. helm install kcm oci://127.0.0.1:5001/charts/kcm --version 0.0.7 -n kcm-system --create-namespace --set="controller.createManagement=false" --set="image.repository=localhost/kcm/controller" --set="image.tag=latest"
  3. kubectl -n kcm-system -f <management yaml>
apiVersion: k0rdent.mirantis.com/v1alpha1
kind: Management
metadata:
  name: kcm
spec:
  core:
    capi: {}
    kcm: {}   
  providers:
  - name: k0smotron
  - name: cluster-api-provider-aws
  - name: cluster-api-provider-azure
  - name: cluster-api-provider-openstack
  - name: cluster-api-provider-vsphere
  - name: projectsveltos
  release: kcm-0-0-7

@zerospiel
Copy link
Contributor

Now I see that the extra target kcm-chart-release (which I was unaware of) with VERSION conjunction gives the desired effect. I think that should be covered in the dev docs, from my understanding, it looks a bit like a hack to avoid that exact problem with the "build" version (for instance, 0.0.7 or 0.1.0 are in the remote repository, which is used by default, setting some yet-inexistent-version or leaving it as is with git describe... would break the approach). Also, mention performing git checkout templates/provider/kcm afterward to clean up the changes caused by the kcm-chart-release target.

Anyway, I'd rather wait for other reviewers to take a look.

@kylewuolle kylewuolle force-pushed the fix-mgmt-deployment branch 2 times, most recently from c3ee767 to 1564fe4 Compare February 5, 2025 18:01
…object and the management object depends on the provider templates making it impossible to deploy kcm with the create management flag set to false. Resolves issue k0rdent#940
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] kcm controller doesn't create release and crashes if createManagement is set to false
2 participants