CP-24886: ensure KSM service and KSM target always match (#143)
* CP-24886: ensure ksm svc and target match

* Update NOTES.txt

---------

Co-authored-by: Thomas Evans <[email protected]>
dmepham and teevans authored Jan 17, 2025
1 parent 2470ca7 commit 78a1e66
Showing 6 changed files with 57 additions and 12 deletions.
14 changes: 14 additions & 0 deletions charts/cloudzero-agent/docs/releases/1.0.0-beta-10.md
@@ -0,0 +1,14 @@
## [Release 1.0.0-beta-10](https://github.com/Cloudzero/cloudzero-agent/compare/v0.0.28...v1.0.0-beta-10) (2025-01-17)

This release adds logic to ensure that the static target used in the `env-validator` and in the Prometheus configuration always matches the internal Service created by the `kube-state-metrics` subchart.

### Upgrade Steps
Upgrade using the following command:
```console
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-10
```

### Improvements
* **Static Target and KSM Service Always Match:** Both the `env-validator` and the Prometheus agent require an address for a `kube-state-metrics` Service. By default, the `kube-state-metrics` subchart generates a Service name that matches the target value generated by the chart.

However, if the user overrides the name of the `kube-state-metrics` Service using `kubeStateMetrics.fullnameOverride`, there can be a mismatch between the names. This change attempts to mirror the logic used by the internal `kube-state-metrics` chart so that the target and Service names will match regardless of user input.
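
As a minimal sketch of the override case described above, the values fragment below renames the KSM Service and notes the target the chart would be expected to derive from it. The namespace `cloudzero` is a hypothetical placeholder, not something this release prescribes.

```yaml
# Sketch only: the namespace mentioned in the comments is an assumed example.
kubeStateMetrics:
  enabled: true
  # Renames the Service created by the kube-state-metrics subchart.
  fullnameOverride: "kube-state-metrics"
  service:
    port: 8080

# If the release is installed into a namespace named "cloudzero" (hypothetical),
# the static target used by the validator and the scrape config should resolve to:
#   kube-state-metrics.cloudzero.svc.cluster.local:8080
```

The target follows the Service name rather than the release name, which is the mismatch this release is meant to remove.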
15 changes: 15 additions & 0 deletions charts/cloudzero-agent/templates/NOTES.txt
@@ -0,0 +1,15 @@
{{- if and .Values.kubeStateMetrics.targetOverride .Values.kubeStateMetrics.enabled }}
***************


****WARNING****

This chart has been installed with both `kubeStateMetrics.targetOverride` and `kubeStateMetrics.enabled`. This is almost certainly not a correct configuration.

The purpose of targetOverride is for you to bring your own kube-state-metrics. If `kubeStateMetrics.enabled` is true, and `kubeStateMetrics.targetOverride` is not null,
it is likely you will not receive the required metrics and data in the CloudZero platform since the agent may be looking for the wrong service address for KSM.

Please refer to the documentation for guidance on `kubeStateMetrics` settings.

***************
{{- end }}
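
For contrast with the configuration the warning flags, here is a hedged sketch of a bring-your-own KSM setup. The address is the example value that appears as a comment in the chart's `values.yaml`; substitute the Service name, namespace, and port of the existing deployment.

```yaml
# Sketch only: the address below is an example, not a required value.
kubeStateMetrics:
  # Do not deploy the bundled kube-state-metrics subchart.
  enabled: false
  # Address of the existing KSM Service, in the form
  # <service-name>.<namespace>.svc.cluster.local:port
  targetOverride: kube-state-metrics.monitors.svc.cluster.local:8080
```

With `enabled: false`, the condition guarding the warning is not met, so the notice should not be printed.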
27 changes: 20 additions & 7 deletions charts/cloudzero-agent/templates/_helpers.tpl
@@ -167,16 +167,29 @@ Required metric labels
  {{- end -}}

  {{/*
- KubeStateMetrics target override
+ The name of the KSM service target that will be used in the scrape config and validator
  */}}
- {{- define "cloudzero-agent.kubeStateMetrics.targetOverride" -}}
- {{- if .Values.kubeStateMetrics.enabled -}}
- {{ printf "%s-%s.%s.svc.cluster.local:%d" .Release.Name .Values.kubeStateMetrics.nameOverride .Release.Namespace (int .Values.kubeStateMetrics.service.port) }}
- {{- else -}}
- {{- if not .Values.kubeStateMetrics.targetOverride }}
+ {{- define "cloudzero-agent.kubeStateMetrics.kubeStateMetricsSvcTargetName" -}}
+ {{- $name := "" -}}
+ {{/* If the user specifies an override for the service name, use it. */}}
+ {{- if .Values.kubeStateMetrics.targetOverride -}}
+ {{ .Values.kubeStateMetrics.targetOverride }}
+ {{/* After the first override option is not used, try to mirror what the KSM chart does internally. */}}
+ {{- else if .Values.kubeStateMetrics.fullnameOverride -}}
+ {{- $svcName := .Values.kubeStateMetrics.fullnameOverride | trunc 63 | trimSuffix "-" -}}
+ {{ printf "%s.%s.svc.cluster.local:%d" $svcName .Release.Namespace (int .Values.kubeStateMetrics.service.port) | trim }}
+ {{/* If KSM is not enabled, and they haven't set a targetOverride, fail the installation */}}
+ {{- else if not .Values.kubeStateMetrics.enabled -}}
  {{- required "You must set a targetOverride for kubeStateMetrics" .Values.kubeStateMetrics.targetOverride -}}
+ {{/* This is the case where the user has not tried to change the name and are still using the internal KSM */}}
+ {{- else if .Values.kubeStateMetrics.enabled -}}
+ {{- $name = default .Chart.Name .Values.kubeStateMetrics.nameOverride -}}
+ {{- if contains $name .Release.Name -}}
+ {{- $svcName := .Release.Name | trunc 63 | trimSuffix "-" -}}
+ {{ printf "%s.%s.svc.cluster.local:%d" $svcName .Release.Namespace (int .Values.kubeStateMetrics.service.port) | trim }}
  {{- else -}}
- {{ .Values.kubeStateMetrics.targetOverride }}
+ {{- $svcName := printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
+ {{ printf "%s.%s.svc.cluster.local:%d" $svcName .Release.Namespace (int .Values.kubeStateMetrics.service.port) | trim }}
  {{- end }}
  {{- end }}
  {{- end }}
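
To make the final branch of the helper concrete (no `targetOverride`, no `fullnameOverride`, bundled KSM enabled), the sketch below pairs the chart's default values with the targets the helper would be expected to produce. The release names and the `cloudzero` namespace are hypothetical, chosen only to exercise both sides of the `contains` check.

```yaml
# Sketch only: release names and namespace in the comments are assumptions.
kubeStateMetrics:
  enabled: true
  nameOverride: "cloudzero-state-metrics"
  service:
    port: 8080

# Hypothetical release "cz-agent" in namespace "cloudzero":
#   the release name does not contain nameOverride, so the Service name is
#   "<release>-<nameOverride>":
#     cz-agent-cloudzero-state-metrics.cloudzero.svc.cluster.local:8080
#
# Hypothetical release "my-cloudzero-state-metrics" in the same namespace:
#   the release name already contains nameOverride, so it is used on its own:
#     my-cloudzero-state-metrics.cloudzero.svc.cluster.local:8080
```

In both cases the Service name is truncated to 63 characters and any trailing `-` is trimmed before the namespace and port are appended.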
2 changes: 1 addition & 1 deletion charts/cloudzero-agent/templates/cm.yaml
@@ -65,7 +65,7 @@ data:
  action: labelkeep
  static_configs:
  - targets:
- - {{ include "cloudzero-agent.kubeStateMetrics.targetOverride" . }}
+ - {{ include "cloudzero-agent.kubeStateMetrics.kubeStateMetricsSvcTargetName" . }}
  {{- end }}
  {{- if .Values.prometheusConfig.scrapeJobs.cadvisor.enabled }}
  - job_name: cloudzero-nodes-cadvisor # container_* metrics
2 changes: 1 addition & 1 deletion charts/cloudzero-agent/templates/validatorcm.yaml
@@ -33,7 +33,7 @@ data:
  {{- if .Values.validator.serviceEndpoints.kubeStateMetrics }}
  kube_state_metrics_service_endpoint: http://{{ .Values.validator.serviceEndpoints.kubeStateMetrics }}/
  {{- else }}
- kube_state_metrics_service_endpoint: http://{{ include "cloudzero-agent.kubeStateMetrics.targetOverride" . }}
+ kube_state_metrics_service_endpoint: http://{{ include "cloudzero-agent.kubeStateMetrics.kubeStateMetricsSvcTargetName" . }}
  {{- end }}
  executable: /bin/prometheus
  kube_metrics:
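
The first branch of this template means an explicit validator endpoint takes precedence over the derived target. A minimal sketch of that override follows, reusing the example address from the chart's `values.yaml` comment as a stand-in.

```yaml
# Sketch only: the address is an example stand-in, not a required value.
validator:
  serviceEndpoints:
    # When set, the validator uses this address instead of the derived KSM target.
    kubeStateMetrics: kube-state-metrics.monitors.svc.cluster.local:8080

# Expected rendering in the validator ConfigMap:
#   kube_state_metrics_service_endpoint: http://kube-state-metrics.monitors.svc.cluster.local:8080/
```

This setting only affects the validator. The Prometheus scrape target in `cm.yaml` is still resolved through the helper.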
9 changes: 6 additions & 3 deletions charts/cloudzero-agent/values.yaml
@@ -105,16 +105,19 @@ kubeStateMetrics:
  repository: kube-state-metrics/kube-state-metrics
  tag: "v2.10.1"
  nameOverride: "cloudzero-state-metrics"
- # Disable CloudZero KSM as a Scrape Target since the service endpoint is explicity defined
+ # Disable CloudZero KSM as a Scrape Target since the service endpoint is explicitly defined
  # by the Validators config file.
  prometheusScrape: false
  # Set a default port other than 8080 to avoid collisions with any existing KSM services.
  service:
  port: 8080

- # Overriding static scrape target address for an existing KSM.
- # Set to service <servie-name>.<namespace>.svc.cluster.local:port if built-in is disabled (enable=false above)
+ # -- Overriding static scrape target address for an existing KSM.
+ # -- Set to service <service-name>.<namespace>.svc.cluster.local:port if built-in is disabled (enable=false above)
  # targetOverride: kube-state-metrics.monitors.svc.cluster.local:8080
+ # -- If targetOverride is set and kubeStateMetrics.enabled is true, it is likely that fullnameOverride below must be set as well.
+ # -- This should not be a common configuration
+ # fullnameOverride: "kube-state-metrics"

  # -- Annotations to be added to the Secret, if the chart is configured to create one
  secretAnnotations: {}