So the `step-issuer` workload gets OOMKilled when the `CertificateRequest` object count in Kubernetes cluster X reaches 4334. We experienced this over the past weekend.
We're on:
step-issuer v0.6.0
Kubernetes K3s v1.24.6+k3s1
cert-manager v1.9.1
While troubleshooting the issue we restarted the Pod by simply deleting it. We then followed its startup flow and saw that, as it comes up healthy, it starts parsing through all `CertificateRequest` objects on the cluster. This naturally uses memory. Apparently so much memory that the `step-issuer` workload is OOMKilled.
We managed to work around it by bumping the resources that the `step-issuer` can use, raising the memory limit from the chart default of 128Mi (https://github.com/smallstep/helm-charts/blob/master/step-issuer/values.yaml#L34) to 500Mi. This allowed the `step-issuer` workload to parse all the `CertificateRequest`s and stay healthy.
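For anyone hitting the same thing, this is roughly what the override looked like. A minimal sketch only, assuming the chart's standard `resources` block: the 500Mi limit is the value mentioned above, while the request value is just an illustrative placeholder.

```yaml
# step-issuer-values.yaml -- sketch of the workaround.
# Only the 500Mi memory limit comes from the report above; the request is an example value.
resources:
  limits:
    memory: 500Mi
  requests:
    memory: 256Mi
```

Applied with something like `helm upgrade step-issuer smallstep/step-issuer -f step-issuer-values.yaml` (repo and release names depend on how the chart was installed).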
A more permanent and better solution would be to use the cert-manager `cert-manager.io/revision-history-limit: "5"` Ingress annotation, as this will seriously limit the number of `CertificateRequest` objects on the cluster.
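To make that concrete, here's a sketch of an Ingress carrying the annotation. Hostnames, secret and service names are placeholders, and the issuer annotations assume a `StepIssuer` in step-issuer's `certmanager.step.sm` API group.

```yaml
# Sketch: ingress-shim annotations on an Ingress; names and hosts are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  annotations:
    cert-manager.io/issuer: step-issuer               # placeholder StepIssuer name
    cert-manager.io/issuer-kind: StepIssuer
    cert-manager.io/issuer-group: certmanager.step.sm
    # Keep at most 5 historical CertificateRequests per Certificate:
    cert-manager.io/revision-history-limit: "5"
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```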
With that somewhat long intro, here's my hot take:
1. Why is the `step-issuer` parsing all the `CertificateRequest`s on the cluster in the first place?
2. Why not limit it to only the `CertificateRequest` created by the event that triggered the issuance or renewal of a `Certificate`?
What's the reasoning? Or am I misunderstanding how things work under the hood?
Looking forward to some input and replies on this issue.
🙏🏿 you and have a ☀️ day.
Hi @LarsBingBong, thanks for reporting this. Right now we don't have the resources to fix this issue, but you've found a workaround by raising the memory limits.
We will investigate if there's a way to reduce memory usage in the future.