So the `step-issuer` workload gets OOMKilled when the `CertificateRequest` object count in Kubernetes cluster X reaches 4334. We experienced this over the past weekend.
We're on:
step-issuer v0.6.0
Kubernetes K3s v1.24.6+k3s1
cert-manager v1.9.1
While troubleshooting the issue we restarted the Pod by simply deleting it. We then followed its startup flow and saw that, as it comes up healthy, it starts parsing through all `CertificateRequest` objects on the cluster. This naturally uses memory. Apparently so much memory that the `step-issuer` workload is OOMKilled.
We managed to work around it by bumping the resources that the `step-issuer` can use, raising the memory limit from the chart default of 128Mi (https://github.com/smallstep/helm-charts/blob/master/step-issuer/values.yaml#L34) to 500Mi. This allowed the `step-issuer` workload to parse all the `CertificateRequest`s and stay healthy.
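For anyone hitting the same thing, this is roughly what the override looked like. A minimal sketch only, assuming the chart's standard `resources` block: the 500Mi limit is the value mentioned above, while the request value is just an illustrative placeholder.

```yaml
# step-issuer-values.yaml -- sketch of the workaround.
# Only the 500Mi memory limit comes from the report above; the request is an example value.
resources:
  limits:
    memory: 500Mi
  requests:
    memory: 256Mi
```

Applied with something like `helm upgrade step-issuer smallstep/step-issuer -f step-issuer-values.yaml` (repo and release names depend on how the chart was installed).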
A more permanent and better solution would be to use the cert-manager `cert-manager.io/revision-history-limit: "5"` Ingress annotation, as this will seriously limit the number of `CertificateRequest` objects on the cluster.
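To make that concrete, here's a sketch of an Ingress carrying the annotation. Hostnames, secret and service names are placeholders, and the issuer annotations assume a `StepIssuer` in step-issuer's `certmanager.step.sm` API group.

```yaml
# Sketch: ingress-shim annotations on an Ingress; names and hosts are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  annotations:
    cert-manager.io/issuer: step-issuer               # placeholder StepIssuer name
    cert-manager.io/issuer-kind: StepIssuer
    cert-manager.io/issuer-group: certmanager.step.sm
    # Keep at most 5 historical CertificateRequests per Certificate:
    cert-manager.io/revision-history-limit: "5"
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```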
With that somewhat long intro, here's my hot take:
1. Why is the `step-issuer` parsing all the `CertificateRequest`s on the cluster in the first place?
2. Why not limit it to only the `CertificateRequest` created by the event that triggered the issuance or renewal of a `Certificate`?
What's the reasoning? Or am I misunderstanding how things work under the hood?
Looking forward to some input and replies on this issue.
🙏🏿 you and have a ☀️ day.
Hi @LarsBingBong, thanks for reporting this. Right now we don't have the resources to fix this issue, but you've found a workaround by raising the memory limits.
We will investigate if there's a way to reduce memory usage in the future.