Skip to content

1.0.0-rc1

Latest
Compare
Choose a tag to compare
@github-actions github-actions released this 29 Jan 20:55

Release 1.0.0-rc1 (2025-01-23)

This release contains several improvements from 1.0.0-beta-10:

  • The name of the initialization Job that gathers information about existing state of a cluster now includes the version of the chart and the image tag used in the Pod.
  • The initScrapeJob field is deprecated in favor of initBackfillJob. However, this is not a breaking change; initScrapeJob can still be used without issue.
  • The server.agentMode boolean argument is now provided.
  • Improvements are made to the resource consumption of the agent-server pod.
  • Metrics from the agent-server pod are made available for monitoring.

Upgrade Steps

Optionally rename the initScrapeJob field in any override files with initBackfillJob. initBackfillJob is the preferred field, but configurations using initScrapeJob will still work.

Upgrade using the following command:

helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc1

Improvements

  • Initialization Job Name Changes With Releases: It was previously possible to have failures in release upgrades if the container image used in the Job changed. This is because the image field in a Job spec is immutable. To prevent this, a new Job is created every time the Helm chart version is changed and/or when the image used in the Job is changed. This also ensures that changes to the underlying insights-controller application will be used in the new backfill of existing cluster state data.

  • Clarified Field Names: The Job used for gathering existing cluster data was previously controlled via a field named initScrapeJob. This is an overloaded term given that this chart also uses the term "scrape job" in the context of Prometheus. This has caused some confusion, so the field is now renamed to initBackfillJob. initScrapeJob is still usable, and values from initScrapeJob are merged with initBackfillJob with the latter having precedence.

  • Easier Debugging: The server.agentMode field can be toggled to false; by default it is set to true so that the Prometheus server runs in agent mode to keep resource usage manageable. Setting the field to false takes the Prometheus server out of agent mode. This is helpful for debugging issues with the Prometheus agent-server.

  • Resource Consumption Reduction: The Prometheus scrape job used to gather metrics from the insights-controller pods now restricts the metrics scraped to ones explicitly set in the values.yaml. This means that the internal TSDB must hold less data.

  • Improved Observability: The agent-server now scrapes itself for metrics and exports them for monitoring by the CloudZero platform. This means that issues within a cluster can be detected much sooner and with greater visibility into the cause of the issue.