Skip to content
This repository has been archived by the owner on Mar 4, 2024. It is now read-only.

EVEREST-458 Restart everest-operator pod during the provisioning #163

Merged
merged 20 commits into from
Oct 9, 2023

Conversation

gen1us2k
Copy link
Contributor

@gen1us2k gen1us2k commented Oct 4, 2023

CHANGE DESCRIPTION

Problem:
EVEREST-458
EVEREST-458

Short explanation of the problem.

Everest operator adds all CRDs only once during the boot. When the user installed one database engine operator and installed another one by restarting the provisioning process the everest-operator doesn't know about added CRDs.

Solution:
Short explanation of the solution we are providing with this PR.

Restarting the everest-operator as the last step of provisioning operators solves this issue.

Also, since integration tests were failing I fixed EVEREST-458 as part of this

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?

Tests

  • Is an Integration test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

@codecov-commenter
Copy link

codecov-commenter commented Oct 4, 2023

Codecov Report

Merging #163 (136475a) into main (419a56d) will decrease coverage by 0.18%.
The diff coverage is 0.00%.

@@           Coverage Diff            @@
##            main    #163      +/-   ##
========================================
- Coverage   5.10%   4.93%   -0.18%     
========================================
  Files         12      12              
  Lines       2802    2899      +97     
========================================
  Hits         143     143              
- Misses      2635    2732      +97     
  Partials      24      24              
Files Coverage Δ
pkg/kubernetes/client/client.go 3.37% <0.00%> (-0.02%) ⬇️
pkg/install/operators.go 4.67% <0.00%> (-0.04%) ⬇️
pkg/kubernetes/client/monitoring.go 0.00% <0.00%> (ø)
...kg/kubernetes/client/mock_kube_client_connector.go 0.00% <0.00%> (ø)
pkg/kubernetes/kubernetes.go 16.09% <0.00%> (-1.60%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

pkg/install/operators.go Outdated Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
pkg/kubernetes/client/client.go Outdated Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Show resolved Hide resolved
@gen1us2k gen1us2k marked this pull request as ready for review October 5, 2023 12:12
@gen1us2k gen1us2k requested a review from a user October 5, 2023 12:12
@gen1us2k gen1us2k enabled auto-merge (squash) October 5, 2023 12:16
pkg/kubernetes/client/client.go Show resolved Hide resolved
pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this approach tries to restart the everest operator when we provision the cluster.
I'm wondering if cli is a good place for this logic.
There are other ways to install the upstream operators in which case we run into the same issue with the everest operator not being aware of the new upstream operators.

Have you considered placing the logic in the everest operator instead?
Can you elaborate on what the advantages/drawbacks are?

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 6, 2023

Have you considered placing the logic in the everest operator instead?

We have a ticket to halt the installation process if there are any operators present in the cluster that were installed without using everest.

Can you elaborate on what the advantages/drawbacks are?

Can you elaborate on what you mean by advantages/drawbacks? For what exactly?

@gen1us2k gen1us2k requested a review from a user October 6, 2023 10:08
@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 6, 2023

Have you considered placing the logic in the everest operator instead?

Having this on the operator side is not possible right now because we watch for the custom resources and we set watchers during the boot of the operator and the operator checks that CRD is available in the cluster and then adds it to the client.

The result will be a failure to start for the operator if it's trying to watch for custom resources without this test.

The operator can have a different client in that case with already configured custom schemes but this can be setup only during the run because the client needs to fill caches and everything else.

@ghost
Copy link

ghost commented Oct 6, 2023

I think it would be better to figure out how to implement this logic in the everest operator. From the design perspective this is the proper place. Implementation in cli seems like patching an issue without resolving the root cause and since this is not a pressing issue, I don't think we need to solve it with a quick fix.

Can we figure out a way for the operator to know when an upstream operator has been added? Such as watching for CRDs or some OLM resources?
Could we maybe reuse the same approach in case an operator has been upgraded?
What do you think?

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 6, 2023

Can we figure out a way for the operator to know when an upstream operator has been added? Such as watching for CRDs or some OLM resources?
Could we maybe reuse the same approach in case an operator has been upgraded?
What do you think?

You can open a Jira ticket for that to handle it in future releases. Right now, the issue is fixed for the end users with those changes. I think that there might be other solutions to that and the effort might be huge with the same benefit for the end user.

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 6, 2023

@michal-kralik
Kubernetes unfortunately does not have a way to watch the discovery API. We can try polling it but it will be harder to guarantee that once an operator is installed the end user can create a database cluster because either we will face rate limit issues or we will be requesting k8s API too often causing unnecessary load

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 6, 2023

@michal-kralik another thing that might help is once kubernetes-sigs/controller-runtime#2456 will be implemented in the controller-runtime

pkg/kubernetes/kubernetes.go Outdated Show resolved Hide resolved
@gen1us2k gen1us2k requested a review from a user October 9, 2023 09:01
@ghost
Copy link

ghost commented Oct 9, 2023

Can we also please revisit the tests?
The condition was not correct but the tests passed which is not expected, right?

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 9, 2023

@michal-kralik I revisited them and they look good. Names are not the same

@ghost
Copy link

ghost commented Oct 9, 2023

@michal-kralik I revisited them and they look good. Names are not the same

Right, but why didn't they fail before? Are we missing any test?

@gen1us2k
Copy link
Contributor Author

gen1us2k commented Oct 9, 2023

Right, but why didn't they fail before? Are we missing any test?

No idea but it worked for me and I ensured that names are different for the pods. The test is fine

For the CLI actually operator tests were not failing but they should be and this changed last week afaik.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this but the integration tests don't provide much value if they don't fail if the logic is wrong 🙂

@gen1us2k gen1us2k merged commit 0a42f4c into main Oct 9, 2023
@gen1us2k gen1us2k deleted the EVEREST-458 branch October 9, 2023 10:30
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants