Module management improvements #870

Closed
pbochynski opened this issue Dec 18, 2023 · 5 comments
Labels
decision Related to all issues that need a decision

Comments

@pbochynski
Contributor

pbochynski commented Dec 18, 2023

Created on 2023-12-18 by Piotr Bochynski (@pbochynski)

Decision log

| Name | Description |
|------|-------------|
| Title | Module management improvements |
| Due date | 2024-01-15 |
| Status | Proposed on 2023-12-20, Accepted on 2024-03-29 |
| Decision type | Choice |
| Affected decisions | |

Context

Modularization of Kyma components is complete. Each module release now contains two artifacts:

  • module manifest: a YAML file with the Kubernetes resources for the module operator deployment and auxiliary resources, such as service account, role, and role binding
  • default configuration: a custom resource describing the default configuration of the module

These two artifacts can be installed directly by the user in any Kubernetes cluster using the kubectl apply command. Users who have access to SAP Kyma Runtime (SKR) can choose to get selected modules as managed software by adding the module name to the spec of the Kyma custom resource (CR) in their SKR cluster. When a module is listed in the Kyma CR, the central meta-operator called Kyma Lifecycle Manager (KLM) takes care of installing the module operator and upgrading it to the version assigned to the preferred release channel.
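
As a rough illustration of the managed path, the sketch below adds a module name to the modules list of a Kyma CR represented as a plain Python dict. The field names (`spec.channel`, `spec.modules[].name`), the channel value `regular`, and the helper `add_module` are assumptions made for illustration only; the actual Kyma CR schema should be checked before relying on them.

```python
import copy


def add_module(kyma_cr: dict, module_name: str) -> dict:
    """Return a copy of the Kyma CR with `module_name` listed under spec.modules.

    The layout spec.modules[].name is an assumed, simplified shape of the CR.
    """
    updated = copy.deepcopy(kyma_cr)
    modules = updated.setdefault("spec", {}).setdefault("modules", [])
    # Adding a module that is already listed is a no-op.
    if not any(m.get("name") == module_name for m in modules):
        modules.append({"name": module_name})
    return updated


# Hypothetical Kyma CR with no modules enabled yet.
kyma = {"kind": "Kyma", "spec": {"channel": "regular", "modules": []}}
managed = add_module(kyma, "serverless")
```

The helper returns a new dict instead of mutating its input, so the original CR can still be compared against the updated one before anything is applied to the cluster.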

flowchart TD
    S((not installed))
    I((installed))
    M((managed))
    D[delete]
    C{"do managed 
       objects 
       exist?"}
    V{check version}   
    U[install/update]
    KA[kubectl apply]
    KD[kubectl delete]
    CRA[add to Kyma CR]
    CRRD["remove from kyma CR
         delete strategy"]
    CRRI["remove from kyma CR
         ignore strategy"]
    S --> KA
    KA --> I
    I --> KD
    KD --> S
    S --> CRA
    CRA --> M
    I-->CRA
    CRRI --> I
    M --> CRRD
    M --> CRRI
    CRRD --> C
    C --> |yes|C
    C --> |no| D
    M --> |reconcile|V
    V -->|ok|M
    V --> |version not found|U
    U --> M
    D -->S
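
The flowchart above can also be read as a small state machine. The following is a minimal sketch of that reading, not KLM code: the names `State`, `TRANSITIONS`, `next_state`, and the action strings are hypothetical, and a blocked delete is modeled as staying in the managed state, which is a simplification of the "do managed objects exist?" loop.

```python
from enum import Enum


class State(Enum):
    NOT_INSTALLED = "not installed"
    INSTALLED = "installed"
    MANAGED = "managed"


# (current state, action) -> next state, following the edges of the flowchart.
TRANSITIONS = {
    (State.NOT_INSTALLED, "kubectl apply"): State.INSTALLED,
    (State.INSTALLED, "kubectl delete"): State.NOT_INSTALLED,
    (State.NOT_INSTALLED, "add to Kyma CR"): State.MANAGED,
    (State.INSTALLED, "add to Kyma CR"): State.MANAGED,
    (State.MANAGED, "remove from Kyma CR (ignore)"): State.INSTALLED,
    (State.MANAGED, "remove from Kyma CR (delete)"): State.NOT_INSTALLED,
    (State.MANAGED, "reconcile"): State.MANAGED,
}


def next_state(state: State, action: str, managed_objects_exist: bool = False) -> State:
    """Return the state reached by `action`, or raise if the flowchart has no such edge."""
    if (state, action) not in TRANSITIONS:
        raise ValueError(f"{action!r} is not allowed in state {state.value!r}")
    # The delete strategy waits while managed objects still exist.
    if action == "remove from Kyma CR (delete)" and managed_objects_exist:
        return state
    return TRANSITIONS[(state, action)]
```

For example, `next_state(State.MANAGED, "remove from Kyma CR (ignore)")` yields `State.INSTALLED`: the module stays in the cluster but is no longer reconciled.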

Decision

The following improvements are proposed to simplify and clarify the module management domain in Kyma.

  1. As the module configuration is not managed, KLM should not consider the module configuration status when calculating the Kyma CR status.
  2. KLM should block deletion of the module operator deployment (manifest) until all managed resources are deleted (including the module configuration). The list of managed resources contains at least the CRDs included in the module manifest (usually the module configuration), but can also include other CRDs (like functions for serverless).
  3. KLM should not recreate the module configuration, as it is a managed resource that blocks module deletion (the CreateAndDelete strategy should only create the default config and not reconcile it later).
  4. With the Ignore strategy, KLM should not delete the module deployment, as that can lead to orphaned resources with a misleading status (module config with state Ready, but operator undeployed). With the CreateAndDelete strategy, deletion should be blocking (do not delete the operator manifest before all managed resources are gone, see point 2).
  5. Lifecycle management is not mandatory for SKR. Removing a module from the Kyma CR disables auto-update but does not prevent users from installing it manually.
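
Points 2 and 4 together describe a small decision rule for what happens to the module operator once a module is removed from the Kyma CR. The sketch below is one way to express that rule under stated assumptions: the strategy names `Ignore` and `CreateAndDelete` come from the text above, while the function `plan_removal` and its return values are hypothetical and do not reflect KLM's actual code.

```python
def plan_removal(strategy: str, remaining_managed_resources: list) -> str:
    """Decide what to do with the module operator after the module leaves the Kyma CR."""
    if strategy == "Ignore":
        # Point 4: keep the operator so that no resources are orphaned.
        return "keep-operator"
    if strategy == "CreateAndDelete":
        if remaining_managed_resources:
            # Point 2: block until every managed resource is gone.
            return "wait"
        return "delete-operator"
    raise ValueError(f"unknown strategy: {strategy!r}")
```

A reconciler built on this rule would call `plan_removal` on every reconciliation and delete the operator manifest only once the list of remaining managed resources is empty.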

Consequences

  1. Kyma control plane components cannot get information about the module configuration status from the Kyma CR. This problem is going to be resolved with KCP ModuleConfig lifecycle-manager#1104
  2. Users should check the module configuration directly instead of checking the Kyma CR (part of the UI story: Community modules in managed and non-managed environment kyma#18450)
  3. The module descriptor has to be enhanced with information about managed custom resources.
  4. SLA constraints should be well documented. It should be clear which versions are supported (those present in the release channels), and that smooth migration is offered only with KLM. Downgrades are not supported (removing the module and installing the previous version is the recommended approach).

pbochynski changed the title from "DRAFT: module management improvements" to "Module management improvements" on Dec 19, 2023

@janmedrek
Contributor

Hey @pbochynski, the points 1-3 are clear and we've already agreed that it is the desired behaviour. I need some clarification regarding the deletion process of the modules.

From what I understand, the module that is listed in the Kyma CR (is managed) will have two deletion modes:

  1. Ignore - this just means that the module was removed from the Kyma CR and KLM does not take any further action. It just stops the reconciliation and all the resources are as they were.
  2. Delete - this means that KLM takes care of the full deletion process in which all the created and managed resources are to be removed from the cluster.

If yes - it would be great if you could expand point 4 with both of these options. It would be better to have it explicitly listed instead of having to deduce it from the diagram. 🙂

As a side note about the current implementation - right now, the strategy for the modules is in the Kyma CR modules list, which means that removing a module from Kyma CR also removes the information on which deletion strategy should be used. I guess we would need to move that somewhere else (replicate it to the status perhaps?).

What I was also worried about was the scenario in which:

  • The module in version 2.0 is manually installed in the cluster
  • The module is added to the Kyma CR
  • The module is available only in version 1.0 for the managed scenario (2.0 is not yet supported as a managed module)
  • KLM goes into error state due to module downgrades not being supported

What do you think about such a case? To me, this is a corner-case scenario, but I suppose that at some point it will be a support case that we need to solve.

pbochynski added the decision label (Related to all issues that need a decision) on Jan 2, 2024

@pbochynski
Contributor Author

@janmedrek
I extended point 4 about deletion to cover both strategies.
The downgrade protection scenario you explained does not work well right now. I have a cluster that cannot be upgraded after I did the upgrade - downgrade - upgrade sequence. I think what really matters is the operator manager version, and KLM should compare the actual version (running in the SKR right now) with the one that is supposed to be installed from the release channel. This way it will work in any scenario - with/without KLM.
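
A minimal sketch of such a comparison, assuming plain dotted numeric versions like `1.2.0` (pre-release suffixes are not handled). The names `parse_version` and `reconcile_action` and the returned action strings are hypothetical, and reporting a newer running version instead of downgrading it is an assumption based on downgrades not being supported.

```python
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn "1.10.0" or "v1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def reconcile_action(running: Optional[str], channel: str) -> str:
    """Compare the version running in the cluster with the release channel version."""
    if running is None:
        return "install"
    running_v, channel_v = parse_version(running), parse_version(channel)
    if running_v < channel_v:
        return "upgrade"
    if running_v > channel_v:
        # Assumption: a newer manual install is reported, never downgraded.
        return "unsupported-downgrade"
    return "none"
```

Because the input is the version actually running in the cluster, the same check applies whether the module was installed manually or by KLM.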

@ptesny

ptesny commented Jan 2, 2024

What about having a reset or panic button to restore a given cluster's modules to their initial set?

@pbochynski
Contributor Author

> What about having a reset or panic button to restore a given cluster's modules to their initial set?

We can mark default modules so that people can recognize them in the UI. Anyway, if you don't know what you want, you can install all of them :)

@pbochynski
Contributor Author

Accepted.
