From b674563dc2764d02ab19ec2da9596ef9480fa600 Mon Sep 17 00:00:00 2001
From: YuviPanda
Date: Wed, 3 Jan 2024 17:51:21 -0800
Subject: [PATCH 1/2] Bring in newer cryptnono version

I've been upgrading cryptnono quite a bit over the last few months,
bringing in new detectors that have been quite effective on mybinder.org.
We automatically bump cryptnono on our clusters
(https://github.com/2i2c-org/infrastructure/pull/3482), but recent
progress has included some breaking changes to the helm chart config.

This PR just brings in the new config changes, but does not change
behavior in any real way. No new detectors are enabled.

I've re-measured resource usage for the individual daemonset container
(rather than the initContainer), as that can now be set separately. This
probably requires us to redo some of the generated resource allocation
profiles, which I'll do once this is merged. However, it is an overall
reduction in daemonset requests, so deploying this shouldn't result in
any profile being undeployable.

Merging this should allow
https://github.com/2i2c-org/infrastructure/pull/3482 to move forward as
well.
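
For orientation, here is a sketch of how the cryptnono values end up
nested once this series is applied. It is reconstructed from the diffs in
this series, so the exact indentation and the top-level `cryptnono:` key
placement are assumptions rather than a copy of the file. The CPU values
shown are the milliCPU ones from the second patch.

```yaml
# Sketch only: layout inferred from the diffs in this series.
cryptnono:
  # Init container that fetches kernel headers; keeps the older limits.
  fetchKernelHeaders:
    resources:
      limits:
        cpu: 800m
        memory: 2Gi
      requests:
        cpu: 5m
        memory: 100Mi

  # One container per detector, each with its own resources.
  detectors:
    execwhacker:
      enabled: false
    monero:
      enabled: true
      resources:
        limits:
          memory: 128Mi
          cpu: 5m
        requests:
          memory: 64Mi
          cpu: 1m
```

The point of the split is that the short-lived init container and the
long-running detector container no longer share one `resources` block.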
---
 helm-charts/support/Chart.yaml  |  2 +-
 helm-charts/support/values.yaml | 53 +++++++++++++++++++++------------
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/helm-charts/support/Chart.yaml b/helm-charts/support/Chart.yaml
index 4d9b651192..47e40ba73d 100644
--- a/helm-charts/support/Chart.yaml
+++ b/helm-charts/support/Chart.yaml
@@ -42,6 +42,6 @@ dependencies:
   # cryptnono, counters crypto mining
   # Source code: https://github.com/yuvipanda/cryptnono/
   - name: cryptnono
-    version: "0.0.1-0.dev.git.27.h01b4f25"
+    version: "0.3.1-0.dev.git.107.heb504bc"
     repository: https://yuvipanda.github.io/cryptnono/
     condition: cryptnono.enabled
diff --git a/helm-charts/support/values.yaml b/helm-charts/support/values.yaml
index 5c236c636e..f463dbb6c8 100644
--- a/helm-charts/support/values.yaml
+++ b/helm-charts/support/values.yaml
@@ -396,30 +396,45 @@ cryptnono:
   # resources for cryptnono was set after inspecting cpu and memory use via
   # prometheus and grafana.
   #
-  # cryptnono has an init container (kubectl-trace-init) and another container
-  # (trace). The init container has been found using up to 1.6Gi and up to about
-  # 600m for 4 minutes. The main container has been found using up to 150Mi but
-  # typically below 100Mi, and miniscule amounts of CPU (0-3m).
+  # cryptnono has an init container (fetch-kernel-headers) and one container per
+  # detector. We currently only use one detector (monero).
+  #
+  # In the past, the init container init container has been found using up to 1.6Gi and up to about
+  # 600m for 4 minutes. However, recent changes seem to have made this much faster,
+  # and there's no record of the initcontainer because our prometheus scrape interval
+  # is 1minute, and the init container seems to complete by then. We retain the older
+  # measured metrics until we can make new measurements.
   #
   # Since cryptnono is a non-critical service, we are at the moment allowing it
   # to be evicted during node memory pressure by providing a low memory request
   # compared to the limit. We are also not requesting significant amounts of CPU
   # so that it doesn't compete well with others initially.
-  #
-  # Note that as of now 2023-03-31 (8367fa5 in yuvipanda/cryptnono), the
-  # resources configuration configure both containers.
-  #
-  # PromQL queries for CPU and memory use:
-  # - CPU: sum(rate(container_cpu_usage_seconds_total{container="kube-trace-init", namespace="support"}[5m])) by (pod)
-  # - Memory: sum(container_memory_usage_bytes{container="kube-trace-init", namespace="support"}) by (pod)
-  #
-  resources:
-    limits:
-      cpu: 800m
-      memory: 2Gi
-    requests:
-      cpu: 5m
-      memory: 100Mi
+  fetchKernelHeaders:
+    resources:
+      limits:
+        cpu: 800m
+        memory: 2Gi
+      requests:
+        cpu: 5m
+        memory: 100Mi
+
+  detectors:
+    # Disable the execwhacker detector for now, as it matures by being deployed on mybinder.org
+    execwhacker:
+      enabled: false
+    monero:
+      enabled: true
+      resources:
+        # Measured with the following prometheus queries:
+        # Memory: sum(container_memory_usage_bytes{container="monero", namespace="support"}) by (instance)
+        # CPU: sum(rate(container_cpu_usage_seconds_total{container="trace", namespace="support"}[5m])) by (instance)
+        # Seems to hover mostly around the 60Mi mark for memory, and generally less than 0.0002 in CPU
+        limits:
+          memory: 128Mi
+          cpu: 0.005
+        requests:
+          memory: 64Mi
+          cpu: 0.0001
 
 # Configuration of templates provided directly by this chart
 # -------------------------------------------------------------------------------

From 5595b13ce0a4e6aa592a01b8d274a3d208ffe8d9 Mon Sep 17 00:00:00 2001
From: YuviPanda
Date: Thu, 4 Jan 2024 07:22:04 -0800
Subject: [PATCH 2/2] Use milliCPU as units for clarity

---
 helm-charts/support/values.yaml | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/helm-charts/support/values.yaml b/helm-charts/support/values.yaml
index f463dbb6c8..39996dc6ad 100644
--- a/helm-charts/support/values.yaml
+++ b/helm-charts/support/values.yaml
@@ -428,13 +428,14 @@ cryptnono:
         # Measured with the following prometheus queries:
         # Memory: sum(container_memory_usage_bytes{container="monero", namespace="support"}) by (instance)
         # CPU: sum(rate(container_cpu_usage_seconds_total{container="trace", namespace="support"}[5m])) by (instance)
-        # Seems to hover mostly around the 60Mi mark for memory, and generally less than 0.0002 in CPU
+        # Seems to hover mostly around the 60Mi mark for memory, and generally less than 0.0002 in CPU. But
+        # 1m (or 0.001) is the lowest that can be specified in kubernetes, so we use that.
         limits:
           memory: 128Mi
-          cpu: 0.005
+          cpu: 5m
         requests:
           memory: 64Mi
-          cpu: 0.0001
+          cpu: 1m
 
 # Configuration of templates provided directly by this chart
 # -------------------------------------------------------------------------------