
Kubernetes Secrets: Use a cached client #3464

Closed
wants to merge 10 commits

Conversation

barkbay
Contributor

@barkbay barkbay commented Sep 22, 2023

What does this PR do?

Use a namespace-scoped cached client to read Kubernetes Secrets.
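The core idea (one lazily created, cached reader per namespace, reused across lookups) can be sketched in plain Go. This is a minimal illustration with hypothetical names, not the PR's actual code; the real reader would wrap a controller-runtime cache instead of the stub below.

```go
package main

import (
	"fmt"
	"sync"
)

// reader stands in for a namespace-scoped cached client; in the real PR it
// would be backed by an informer cache rather than this stub.
type reader struct {
	namespace string
}

func (r *reader) GetSecret(name string) string {
	// A real reader serves this from its local cache instead of hitting
	// the Kubernetes API on every call.
	return fmt.Sprintf("secret %q from cache of namespace %q", name, r.namespace)
}

// provider lazily creates one cached reader per namespace, so repeated secret
// lookups in the same namespace never reach the API server again.
type provider struct {
	mu      sync.Mutex
	readers map[string]*reader
}

func (p *provider) readerFor(namespace string) *reader {
	p.mu.Lock()
	defer p.mu.Unlock()
	if r, ok := p.readers[namespace]; ok {
		return r // reuse the existing cached reader
	}
	r := &reader{namespace: namespace}
	p.readers[namespace] = r
	return r
}

func main() {
	p := &provider{readers: map[string]*reader{}}
	a := p.readerFor("default")
	b := p.readerFor("default")
	fmt.Println(a == b) // the same cached reader is reused
	fmt.Println(a.GetSecret("db-password"))
}
```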

Why is it important?

To avoid hammering the K8s API, refer to #3442

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Author's Checklist

  • Not sure about the log levels...

How to test this PR locally

I don't know yet tbh 🤷

Related issues

Fixes #3442

Use cases

n/a

Screenshots

n/a

Logs

I'll update once I have a better understanding of the log levels in this project

Questions to ask yourself

  • How are we going to debug this?

Logs should be enough, if not using the k8s audit API can be an alternative.

  • What are the metrics I should take care of?

Readers created, non-cached requests, ...

@barkbay barkbay requested a review from a team as a code owner September 22, 2023 12:49
@barkbay barkbay requested review from blakerouse and faec September 22, 2023 12:49
@mergify
Contributor

mergify bot commented Sep 22, 2023

This pull request does not have a backport label. Could you fix it @barkbay? 🙏
To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 8.\d branch. \d is the digit

NOTE: backport-skip has been added to this pull request.

@elasticmachine
Contributor

elasticmachine commented Sep 22, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-09-25T13:56:23.161+0000

  • Duration: 28 min 52 sec

Test stats 🧪

Test Results
Failed 0
Passed 6313
Skipped 59
Total 6372

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages.

  • run integration tests : Run the Elastic Agent Integration tests.

  • run end-to-end tests : Generate the packages and run the E2E Tests.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@elasticmachine
Contributor

elasticmachine commented Sep 22, 2023

🌐 Coverage report

Name          Metrics % (covered/total)   Diff
Packages      98.78% (81/82)              👍
Files         66.102% (195/295)           👍
Classes       65.574% (360/549)           👎 -0.12
Methods       52.791% (1135/2150)         👍 0.022
Lines         38.214% (12873/33687)       👎 -0.014
Conditionals  100.0% (0/0)                💚

@barkbay
Contributor Author

barkbay commented Sep 22, 2023

Unscientific comparison with a Docker image I built myself on a small dev cluster:

[image: API request graph]

The remaining spikes come from an external secret sync operator.

@cmacknz cmacknz requested a review from gizas September 22, 2023 17:50
@cmacknz cmacknz added the Team:Elastic-Agent Label for the Agent team label Sep 22, 2023
@elasticmachine
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@barkbay barkbay changed the title Kubernetes Secrets: Used a cached client Kubernetes Secrets: Use a cached client Sep 25, 2023
@gizas
Contributor

gizas commented Sep 25, 2023

@barkbay are you gonna provide doc updates and samples on how to configure Elastic Agent with the secrets client?

I would expect some simple commands in the test section here to help me configure the reader for a given namespace. Can you please provide those?

@barkbay
Contributor Author

barkbay commented Sep 25, 2023

@barkbay are you gonna provide doc updates and samples on how to configure Elastic Agent with the secrets client?

This PR doesn't change anything from a user's point of view; we are just replacing one K8s client implementation with another. Do you have something specific in mind?

@gizas
Contributor

gizas commented Sep 25, 2023

@barkbay are you gonna provide doc updates and samples on how to configure Elastic Agent with the secrets client?

This PR doesn't change anything from a user's point of view; we are just replacing one K8s client implementation with another. Do you have something specific in mind?

Was referring to this page https://www.elastic.co/guide/en/fleet/current/kubernetes_secrets-provider.html as for now it is the only relevant one I can find. It could be enhanced with an example manifest referencing a secret, or with some logs to help troubleshoot references, like https://github.com/elastic/elastic-agent/pull/3464/files#diff-8ce13c7c88aad2289f0a95358cafa95589e40293ba2dddeb14832bd2d604737fR141

kind: enhancement

# Change summary; a 80ish characters long description of the change.
summary: Use a cached client to retrieve Kubernetes Secrets
Contributor

This doesn't really help our users when reading the changelog. Can this be updated to describe the benefits this provides? Fewer API calls, better performance, etc.

p.readers[namespace] = newReader
reader = newReader
go func() {
if err := newReader.Start(p.ctx); err != nil {
Contributor

How does this ever get stopped? Looks to me that it leaks. If a secret is pulled from a namespace, and then the policy is updated to pull from another namespace this will stay running for the life of the Elastic Agent.

Contributor

This is the blocker, by the way. I didn't make that clear in the review. Sorry about that.

Contributor Author

As mentioned in the main struct:

	// ctx is the context used to start the informers. Informers will be stopped once the context is done.

and then the policy is updated to pull from another namespace this will stay running for the life of the Elastic Agent.

I was assuming that the context was canceled in that case. Is there a way to get notified about such a change?

Contributor

The context for a provider is only cancelled when the Elastic Agent is being stopped.

The kubernetes context provider was the only "fetch" provider added to Elastic Agent, and it was added after the fact. We didn't want the Elastic Agent to read/watch every secret (and it doesn't). The original design of all context and dynamic providers was for each to keep a local cache of all keys/values and then signal the composable manager when a key/value changes.

Based on that, it was never required to alert a provider about a re-render (which happens any time the policy is updated, or when a key/value is changed by any provider in the Elastic Agent composable manager). I believe you will need to handle re-rendering for this type of caching to work correctly.

See my comment here #3442 (comment). Everything in that comment still applies even with the usage of https://github.com/kubernetes-sigs/controller-runtime/blob/d5bc8734caccabddac6a1bea250b0b9d771d318d/pkg/cache/cache.go#L61-L70

}

// ContextProviderBuilder builds the context provider.
func ContextProviderBuilder(logger *logger.Logger, c *config.Config, managed bool) (corecomp.ContextProvider, error) {
func ContextProviderBuilder(logger *logger.Logger, c *config.Config, _ bool) (corecomp.ContextProvider, error) {
Contributor

I assume that this change has the Elastic Agent now watching for changes in secrets, it would be great to see this also now trigger a re-render of the policy when a secret changes. Seems that the code is already listening to the API for those events, why not take it to the next level and improve it some more?

Contributor Author

it would be great to see this also now trigger a re-render of the policy when a secret changes.

I’m afraid I don’t know what "trigger a re-render of the policy" means.

Contributor

See my comment above for more context. It means that when a secret value is updated, you would call the comm Communicator to report that a value has changed. That signals the Elastic Agent composable manager to re-render the component model because something changed.

Think of it like this:

  1. Elastic Agent is running with a policy and inside of that policy it references a password from a kubernetes secret.
  2. The kubernetes secret is updated to change the password.
  3. Elastic Agent gets the event that the password has changed.
  4. It updates its cache for the secret.
  5. It alerts the composable manager that a value has changed.
  6. The Elastic Agent re-renders the policy with that new password substituted.
  7. The Elastic Agent sends the new configuration to the running component.
  8. The running component starts using the new password.

Elastic Agent already works that way today, just not for kubernetes secrets (everything else does). The kubernetes secrets provider just needs to do step 5 in that list and Elastic Agent would do the rest for you.
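Steps 3-5 of the flow above can be sketched with stdlib Go: an event handler updates the local cache and signals a change over a channel, coalescing duplicate signals. The names here are illustrative; the real agent would signal through its Communicator, not this stand-in.

```go
package main

import "fmt"

// notifier is a minimal stand-in for signaling the composable manager that a
// cached secret changed (step 5 above).
type notifier struct {
	changed chan string
}

// onSecretEvent is what an informer event handler would call when a watched
// secret is updated: refresh the local cache, then signal a re-render.
func (n *notifier) onSecretEvent(cache map[string]string, name, newValue string) {
	if cache[name] == newValue {
		return // value unchanged, no re-render needed
	}
	cache[name] = newValue
	select {
	case n.changed <- name:
	default: // a signal is already pending; coalesce instead of blocking
	}
}

func main() {
	n := &notifier{changed: make(chan string, 1)}
	cache := map[string]string{"db-password": "old"}
	n.onSecretEvent(cache, "db-password", "new")
	fmt.Println(<-n.changed, cache["db-password"])
}
```

In the real provider, whatever consumes the `changed` signal would be the composable manager, which then re-renders the policy and pushes the new configuration to the running components (steps 6-8).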

Contributor Author

I assume that this change has the Elastic Agent now watching for changes in secrets,

The benefit of using the controller-runtime client is to get an abstraction over the watch mechanism, and therefore an eventually up-to-date cache without managing watches. Given your explanations, I feel less and less confident in continuing down this path.


Labels
backport-skip Team:Elastic-Agent Label for the Agent team

Successfully merging this pull request may close these issues.

Performance issue with K8S Secret Provider
5 participants