Releases: Cloudzero/cloudzero-charts
1.0.0-rc1
Release 1.0.0-rc1 (2025-01-23)
This release contains several improvements over `1.0.0-beta-10`:
- The name of the initialization Job that gathers information about the existing state of a cluster now includes the chart version and the image tag used in the Pod.
- The `initScrapeJob` field is deprecated in favor of `initBackfillJob`. However, this is not a breaking change; `initScrapeJob` can still be used without issue.
- The `server.agentMode` boolean argument is now provided.
- The resource consumption of the agent-server pod is improved.
- Metrics from the agent-server pod are made available for monitoring.
Upgrade Steps
Optionally rename the `initScrapeJob` field in any override files to `initBackfillJob`. `initBackfillJob` is the preferred field, but configurations using `initScrapeJob` will still work.
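As a minimal sketch of the optional rename, assume an override file that currently sets `imagePullSecrets` under `initScrapeJob`. The secret name `my-registry-secret` is hypothetical, and the Kubernetes-style list form of `imagePullSecrets` is an assumption about the chart's schema; check the chart's `values.yaml` for the supported sub-fields. Only the top-level key needs to change:

```yaml
# Before (deprecated name, still accepted in 1.0.0-rc1):
# initScrapeJob:
#   imagePullSecrets:
#     - name: my-registry-secret   # hypothetical secret name

# After (preferred name):
initBackfillJob:
  imagePullSecrets:
    - name: my-registry-secret     # hypothetical secret name
```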
Upgrade using the following command:
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc1
Improvements
- Initialization Job Name Changes With Releases: It was previously possible to have failures in release upgrades if the container image used in the Job changed. This is because the `image` field in a Job spec is immutable. To prevent this, a new Job is created every time the Helm chart version is changed and/or when the image used in the Job is changed. This also ensures that changes to the underlying `insights-controller` application will be used in the new backfill of existing cluster state data.
- Clarified Field Names: The Job used for gathering existing cluster data was previously controlled via a field named `initScrapeJob`. This is an overloaded term given that this chart also uses the term "scrape job" in the context of Prometheus. This has caused some confusion, so the field is now renamed to `initBackfillJob`. `initScrapeJob` is still usable, and values from `initScrapeJob` are merged with `initBackfillJob`, with the latter having precedence.
- Easier Debugging: The `server.agentMode` field can be toggled to `false`; by default it is set to `true` so that the Prometheus server runs in `agent` mode to keep resource usage manageable. Setting the field to `false` takes the Prometheus server out of agent mode. This is helpful for debugging issues with the Prometheus agent-server.
- Resource Consumption Reduction: The Prometheus scrape job used to gather metrics from the `insights-controller` pods now restricts the metrics scraped to the ones explicitly set in the `values.yaml`. This means that the internal TSDB must hold less data.
- Improved Observability: The agent-server now scrapes itself for metrics and exports them for monitoring by the CloudZero platform. This means that issues within a cluster can be detected much sooner and with greater visibility into the cause of the issue.
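For the debugging case described above, a values override along these lines should take the Prometheus server out of agent mode. This is a sketch based only on the `server.agentMode` field named in these notes; the rest of the `server` section is omitted.

```yaml
# Override file fragment: disable agent mode temporarily for debugging.
# The chart default is true, which keeps resource usage manageable.
server:
  agentMode: false
```

Since agent mode is the default in order to limit resource usage, it is probably best to revert this override once debugging is complete.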
1.0.0-beta-10
Release 1.0.0-beta-10 (2025-01-17)
This release adds logic to ensure that the static target used in the `env-validator` and in the Prometheus configuration always matches the internal Service created by the `kube-state-metrics` subchart.
Upgrade Steps
Upgrade using the following command:
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-10
Improvements
- Static Target and KSM Service Always Match: Both the `env-validator` and the Prometheus agent require an address for a `kube-state-metrics` Service. By default, the `kube-state-metrics` subchart generates a Service name that matches the target value generated by the chart. However, if the user overrides the name of the `kube-state-metrics` Service using `kubeStateMetrics.fullnameOverride`, there can be a mismatch between the names. This change attempts to mirror the logic used by the internal `kube-state-metrics` chart so that the target and Service names will match regardless of user input.
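The override that previously triggered the mismatch looks roughly like the following. The name `my-custom-ksm` is a hypothetical value chosen for illustration. As of this release, the static target is expected to follow the overridden name.

```yaml
# Override file fragment: rename the kube-state-metrics Service.
kubeStateMetrics:
  fullnameOverride: my-custom-ksm   # hypothetical Service name
```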
1.0.0-beta-9
Release 1.0.0-beta-9 (2025-01-15)
This release adds the ability to set the log level via the `insightsController.server.logging.level` field. Additionally, the interval at which data is written to the CloudZero platform and the timeout for writing data are configurable via `insightsController.server.send_interval` and `insightsController.server.send_timeout`, respectively. The default timeout is increased from `10s` to `1m`.
The `kube-state-metrics` subchart section now explicitly includes container image information. This introduces no functional changes; it is intended to make it clearer to the user which images will be used and from where they will be pulled.
Upgrade Steps
Upgrade using the following command:
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-9
Bug Fixes
- KSM Address: Fixes an issue in which the internal `kube-state-metrics` service address can be templated incorrectly.
Improvements
- More Configurable Server Settings: The log level, remote write interval, and remote write timeout are now configurable in the chart values. See the `insightsController.server` section in the `values.yaml` for more details.
- Default Setting for Send Timeout: The default remote write timeout is increased to `1m`, which allows for backfilling data from larger clusters.
- Container Image Information Added: The values passed to the internal `kube-state-metrics` subchart now explicitly set the container image registry, repository, and tag information for the purposes of documentation.
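A sketch of how the new server settings might be set in an override file. The field paths come from these notes, and `1m` for `send_timeout` is the stated default. The `debug` level and the `send_interval` value are assumptions for illustration; consult the `insightsController.server` section of `values.yaml` for accepted values and defaults.

```yaml
# Override file fragment: tune insights-controller server settings.
insightsController:
  server:
    logging:
      level: debug      # assumed to be an accepted level name
    send_interval: 1m   # hypothetical value; default not stated in these notes
    send_timeout: 1m    # default as of 1.0.0-beta-9 (previously 10s)
```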
1.0.0-beta-8
Release 1.0.0-beta-8 (2025-01-14)
This release adds the `imagePullSecrets` field to the `initCertJob` so that the pod from the job can use an image from a private repository. Additionally, the `imagePullSecrets` settings for `insightsController`, `initCertJob`, and `initScrapeJob` now have reasonable default values in the case that they are not set.
Upgrade Steps
- If required, set the `initCertJob.imagePullSecrets` to the desired value.
- Alternatively, set only the top level `imagePullSecrets` to configure all pods to use that `imagePullSecrets` setting.
Upgrade using the following command:
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-8
Bug Fixes
- imagePullSecrets Field Added to initCertJob: The `initCertJob` Job previously did not allow `imagePullSecrets` to be configured, preventing use with private registries.
Improvements
- Default Settings for Images: If `imagePullSecrets` is not set in the `insightsController`, `initCertJob`, and `initScrapeJob` sections, the value from `insightsController.imagePullSecrets` or the top level `imagePullSecrets` will be used; this reduces the amount of configuration needed.
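With the fallback described above, a single top-level entry can be enough for all pods. The sketch below assumes the chart accepts the Kubernetes-style list of secret references; the secret name `my-registry-secret` is hypothetical.

```yaml
# Override file fragment: one pull secret used by all pods via fallback.
imagePullSecrets:
  - name: my-registry-secret   # hypothetical secret name

# Optional per-Job override, only if the Job needs a different secret:
# initCertJob:
#   imagePullSecrets:
#     - name: another-secret   # hypothetical secret name
```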
1.0.0-beta-7
Release 1.0.0-beta-7 (2025-01-08)
This release changes the default behavior for certificate management. By default, the chart now creates and manages a self-signed certificate itself. `cert-manager` is removed as a dependency.
Upgrade Steps
- Update TLS preferences; preferences for the TLS certificate used by the `ValidatingWebhookConfiguration` configurations and webhook-server are now managed by the `insightsController.tls` section. See the README.md and values.yaml for configuration details.
  - If TLS preferences are set in the `insightsController.server.tls` or `insightsController.webhooks.caBundle` section(s), remove them and review the README.md and values.yaml for new options in the `insightsController.tls` section.
  - It is likely that no changes will need to be made, unless there is a preference for using an external `cert-manager` or externally created certificates.
- If settings in the `initJob` field are set, rename the `initJob` field to `initScrapeJob`.
Breaking Changes
- `initJob` field renamed to `initScrapeJob`.
- `insightsController.server.tls` section is removed in favor of `insightsController.tls`.
- `tls.issuer` and `tls.certificate` can no longer be individually toggled; instead, set `insightsController.tls.useCertManager` to toggle both the `Issuer` and `Certificate` resources at the same time.
- `insightsController.webhooks.caBundle` is moved to `insightsController.tls.caBundle`.
- `cert-manager` is removed as a dependency. The chart will no longer use `cert-manager` as a default for certificate management. If there is a preference to manage the TLS certificate with `cert-manager`, see the README.md for details.
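For clusters that still prefer `cert-manager`, the single toggle named above would be set roughly as follows. This sketch shows only the `useCertManager` field mentioned in these notes. It assumes `cert-manager` is already installed separately, and further `insightsController.tls` options described in the README.md may be required.

```yaml
# Override file fragment: opt back in to cert-manager managed TLS.
# Toggles both the Issuer and Certificate resources together.
insightsController:
  tls:
    useCertManager: true
```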
New Features
- Internal Certificate Creation: Previous versions of the beta agent attempted to deploy `cert-manager` and depended on `cert-manager` to provision and manage the TLS certificate used by the `ValidatingWebhookConfiguration` configurations and webhook-server. As of this beta version, the default behavior is changed such that the TLS certificate is created by the `<RELEASE-NAME>-webhook-server-init-cert` Job.
  - The `ValidatingWebhookConfiguration` resources and the Secret created to hold the TLS certificate information are automatically patched to use this certificate.
Improvements
- Internal KSM Names Properly Prefixed: The internal KSM (cloudzero-state-metrics) managed by the chart now properly prefixes all created resources with the chart release name.
Other Changes
- Expanded ClusterRole Permissions: The `ClusterRole` used by the agent now requires `patch` permissions on `validatingwebhookconfigurations` and `secrets` for the respective resources created by the chart.
1.0.0-beta-5
1.0.0-beta-5 (2024-12-19)
New Features
- Automatic detection and reconfiguration of secrets rotation
- Automatic detection and reconfiguration of TLS Certificate rotation
- Default insights controller logging level set to "info"
- AntiAffinity support for insights replicaset (best effort)
- Insights controller cleans Cloud Account ID configuration value upon start
Upgrade Steps
To upgrade, run the following command:
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-5
For more details, see the beta installation instructions.
Improvements
- Availability Enhancement: Healthcheck support ensures that requests are only forwarded to replica instances that are ready to accept work.
- Security Policy Enhancements: The application can now react to changes in the CloudZero API Secret or TLS Certificates. In production environments, these secret values will rotate and update periodically. Instead of restarting the service, which can be costly, the application can now react to key changes and reinitialize the related layer.
- Monitoring Statistics: Added monitoring statistics on the insights controller.
- Performance Improvements: Various performance improvements have been made.
- Metrics Service: Now possible to override if required by a customer.
Security Scan Results
Image | Scanner | Scan Date | Critical | High | Medium | Low | Negligible |
---|---|---|---|---|---|---|---|
ghcr.io/cloudzero/cloudzero-insights-controller/cloudzero-insights-controller:0.0.6 | Grype | 2024-12-19 | 0 | 0 | 0 | 0 | 0 |
ghcr.io/cloudzero/cloudzero-agent-validator/cloudzero-agent-validator:0.10.0 | Grype | 2024-12-19 | 0 | 0 | 0 | 0 | 0 |
[DEPRECATED] 0.0.29-beta
Installation Instructions
Please follow the Installation Instructions provided in this release's README.
Release 0.0.29-beta Changes
- add new cloudzero-metrics-service
0.0.28
Installation Instructions
Please follow the Installation Instructions provided in this release's README.
Release 0.0.28 Changes
b9aee9f Update Chart.yaml to version 0.0.28
0.0.27
Installation Instructions
Please follow the Installation Instructions provided in this release's README.
Release 0.0.27 Changes
- Pins the version of the CloudZero Agent Validator container image, which fixes the issue in which the `env-validator` container prevents the server pod from starting and results in a CrashLoopBackoff state
e7338cc Update Chart.yaml to version 0.0.27
0.0.26
Installation Instructions
Please follow the Installation Instructions provided in this release's README.
Release 0.0.26 Changes
- Update README.md by @wreckedred in #73
- updated scrape intervals to default to 120s by @roberthocking in #74
Full Changelog: 0.0.25...0.0.26