-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discuss] Elastic Agent Helm Chart release process #5486
Comments
As I see it, we have two basic choices here:
The important questions that should guide this choice are:
I don't feel like I know enough about the domain to have informed opinions about these myself. I think I'd default to option 1 just because it's simple, but it does sacrifice some flexibility. |
I like in general the first choice that @swiatekm proposes above, but I would like to add a twist to it Before elaborating more I will try to provide some answer to the raised questions
Changes to the Helm chart can be driven by the following:
I would prefer to have as default one the one that the helm chart was tested against
imo the helm chart should support the user specifying different agent versions/images but provide guarantees of nominal operation only with the default image that comes with it. Hence, following this guarantee, there will be chart versions that will support only a specific agent version and either the user needs to stay in an older version if they want to remain under an agent version or change with their own responsibility
I wouldn't expect this to be more painless that a lockfree release with agent (last famous words of me) My approach would be that the Helm chart releases can happen only from the
The above is just a draft of an my initial thinking more than happy to drill down the details if it sounds reasonable to all the interested parties |
@pkoutsovasilis do you mean literally just |
@swiatekm I was thinking
the helm chart will support agent versions incrementally thus when there is a patch version to the latest agent release I see no issue creating a PR that points the aforementioned |
How do we account for and include changes to the package? Automation that creates an appropriate PR in this repository that eventually leads to a new chart release? I like the idea of just releasing from main as it is simpler, but I worry we'd need a way to indicate which agent versions new features are compatible with eventually like the stack version constraint in integrations packages. It may be simpler to have a release of the helm chart for each active minor to eliminate this problem. If we want to account for package changes, we'd need a versioning scheme that can move faster than the stack release. We couldn't just have 8.16.1, 8.16.2, etc. |
at the moment this is a manual process so the accounting happens manually. That said, if we deem that this becomes "unbearable" we could try and automate it; ps: transitioning from handlebars to helm templates is non-trivial 😄 One of the reasons this Helm chart started with only the kubernetes integration built-in is this one
In my thinking the helm chart provides guarantees only for the combination of |
I think requiring an agent upgrade to get a guarantee for a new kubernetes integration version goes against the way we want package releases to work where you don't have to upgrade agent outside of this scenario. Often users cannot rapidly upgrade their Elastic Agent (because they have to upgrade ES, they have rigid compliance or QA procedures, etc) but they can upgrade their agent configuration or policy quickly. Whatever we decide to do has to support upgrading the k8s integration without upgrading the agent version, I think that will be quite a common ask. Users of this chart will frequently be k8s and Helm users but not k8s and Helm (or Elastic Agent) experts, we need to optimize for helping them get fixes with minimal toil or the need to escalate to engineering via support cases. |
ok I hear you 🙂 Then let's do a minor alternation to the approach; |
Yes that would work and let us backport fixes where applicable. |
ok then my next step will be to draw this "flow" in a diagram and thus help all the interested parties to have a better understanding how this is gonna look like. Then if there are no strong objections I will proceed to coding it |
I've experimented with a local MinIO as a Helm registry. Here are my findings:
Based on the above observations, this my proposed flow; each branch, namely flowchart TD
classDef default line-height:1.5,text-align:center;
B[Manual: Bump chart version and set app version<br>in <a href='https://github.com/elastic/elastic-agent/blob/main/deploy/helm/elastic-agent/Chart.yaml'>Chart.yaml</a>]
D[CI: Verify that agent <a href='https://github.com/elastic/elastic-agent/blob/main/version/version.go'>version</a> matches<br>the major.minor parts of the chart version<br>and fully the app version in Chart.yaml]
E["CI: Ensure that chart version<br>is pre-release (-SNAPSHOT) if no git tag exists"]
F[CI: Ensure that a chart with the same version<br>is not released]
H[CI: Lint and package Helm chart]
I[CI: Download index.yaml from<br>the elastic Helm registry and<br>add the new chart entry]
L[CI: Push Helm chart package<br>to repository]
K[CI: Push updated index.yaml<br>to repository]
B -.->|Pull Request with Chart.yaml changes| D
D --> E
E --> F
F --> H
H -.->|"''release'' comment under the PR (maybe PR merged?!)"| I
I --> L
L --> K
If the above proposal gets accepted, we need to figure out the following:
Things to consider:
|
@pkoutsovasilis in your example, you have: appVersion: 8.16.0 # elastic-agent version
version: 8.16.10 # chart version Does that mean that you want to always align the minor version of the Chart and elastic-agent, but not necessarily the patch version? If so, do you plan to release Chart patch versions without bumping the agent version? |
Yes pretty much; I want to keep the major and minor parts always aligned and the patch unlocked 🙂 One of the requirements for the Helm chart release is to able to backport fixes if required (here) which translates to bumping the chart version but not the agent one. To highlight such a case, imagine that there is an update to the built-in kubernetes integration that needs to get out but the agent version must be the same because the actual agent isn't updated. In the case there is indeed a new agent minor version is released, then again we bump the chart version and we update the app version to the former. |
Looking at your proposed process, I'm wondering if there is even a need to publish Chart snapshot versions. It sounds like for each Chart release, we'll have to update Chart.yaml twice:
That just sounds like a lot of updating for not much gain, especially if we're going to be releasing the Chart more often than agent itself. Could we get away with just keeping the normal version and only bumping on releases? The other question I have is how often do we actually expect to do patch releases of the Chart. The Otel Helm Chart simply bump the version and release on every change. Would that work for us? |
I see what you mean about the number of updates, but the gain for me is that we can deploy unreleased agent versions from each branch. Thus, we want to release also SNAPSHOT charts, which based on our current branches would be:
So if I understand this correctly for each commit on each branch, even when no chart-related files have changed, you propose to bump the Chart version and release it? |
I think we need to consider how most users are going to consume these charts and expect them to work. Having a Helm chart version 8.16.3 when the most recent Elastic stack patch release is 8.16.2 is going to create quite a lot of confusion. I don't think we would be allowed do this because of this confusion and this is main reason why the independent agent release versions never run ahead of the stack version and instead use the +buildYYYYMMDDHHMM qualifier. I like the idea of having the Helm chart able to release separate from the stack releases, but we need a less misleading version scheme than running ahead of the stack release like this. Let's say we used major.minor.YYYYMMDDHHMM with YYYYMMDDHHMM being the timestamp we published the chart at to get sortable versions (major.minor.$gitsha could also work but isn't sortable). We need to confirm with our PMs on what a reasonable versioning scheme is before committing to one. Then the simplest release process I can think of would be something like:
Using major.minor.YYYYMMDDHHMM avoids inventing a new versioning scheme entirely since we already use something like this for independent agent releases. |
The "I don't want to think about this anymore" option that everyone would accept is just hooking this up to the stack release process and following that, with the major con being we need stack releases to push fixes out. |
Unless I am missing something, this is almost the same with my proposal with the difference that instead of a patch version bump we will go with a timestamp. I have no issue, I thought that keeping the patch version to simple integers would cause less confusion but if you say otherwise no objection on my end 🙂
My 2 cents here, since there will be a pipeline that filters per chart-related or agent version changes (for normal releases), there is no overhead of utilising the same here but release a chart with
ha! 😄 I am not there yet but if there a need to go with that, sure let's go?! PS: who should we tag from Product to help decide about the most appropriate version scheme? |
I think we're aligned then, I wrote the process out again to make sure I understood. How would we handle release notes?
|
I considered using the existing fragments approach with |
That works, but we need to define how, when, and where they get published. Our existing release notes are in https://github.com/elastic/ingest-docs/tree/main/docs/en/ingest-management/release-notes but they are structured around the stack release. The current path for the Helm chart would have releases happen outside of that so we'd need to handle this in some way. |
I understand that this is difficult given that we have stack releases and potential package releases. Can I propose that we version and release based on each minor and patch release (aligned with stack) and not complicate it by having daily or snapshot releases or releases based on changes to the package. I think our patch releases are frequent enough to accommodate (or shorten the window) package releases that may happen between each patch release. As @strawgate also mentioned, if there's a need we can always release a modified helm as an emergency. |
@nimarezainia just to check if I my understanding is correct, you propose the Helm chart releases to happen alongside the stack ones because the frequency of the patch releases is frequent enough and thus we can keep the both version locked and fully synced? |
Given that the frequency of stack patch version releases is frequent enough, keeping the Helm chart and stack versions fully synced indeed makes sense, and I believe it simplifies the overall process. To implement this, I think we should extend the existing release process to include or invoke an additional step for the Helm chart release. We may need to reach out to the release team under platform engineering productivity (?!) to coordinate this effort. (cc @nimarezainia @ycombinator) Regarding documentation, I propose utilising the existing fragments mechanism for the changelog, with the parent folder located at Any thoughts on this approach? |
@pkoutsovasilis Your approach sounds reasonable to me. |
This sounds great to me too. I gather in the docs we'd just have a link to the changelog file, which will minimize any maintenance effort. |
In #5331, we added a Helm Chart for Elastic Agent. At the time, we deferred the discussion on how this chart should be released. Let's continue the discussion in this issue and try to arrive at a decision.
The text was updated successfully, but these errors were encountered: