🐛 add orderKappsValidateErr in crdupgradesafety preflight #1640

Conversation

azych
Contributor

@azych azych commented Jan 20, 2025

orderKappsValidateErr() is meant as a temporary solution to an external (i.e. dependency) problem. Validate() in carvel.dev/kapp/pkg/kapp/crdupgradesafety can return a multi-line error message whose lines come in random order. Until that is changed upstream, we need to fix this on our side to avoid falling into a cycle of constantly trying to reconcile the ClusterExtension's status because of the randomly ordered error message we set in its conditions.

An upstream PR has already been created, but it might be some time before it makes it into carvel's codebase and then into ours, hence this change.

For full context please see #1456 and carvel-dev/kapp#1047, especially carvel-dev/kapp#1047 (comment)

Description

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@azych azych requested a review from a team as a code owner January 20, 2025 15:40

netlify bot commented Jan 20, 2025

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit ed16ae9
🔍 Latest deploy log https://app.netlify.com/sites/olmv1/deploys/678e6f56573acf000704ab39
😎 Deploy Preview https://deploy-preview-1640--olmv1.netlify.app

@azych azych force-pushed the tmp-ensure-order-in-kapp-validation-error branch from 0845f38 to ed16ae9 Compare January 20, 2025 15:44

codecov bot commented Jan 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.60%. Comparing base (594cba3) to head (ed16ae9).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1640      +/-   ##
==========================================
+ Coverage   67.35%   67.60%   +0.24%     
==========================================
  Files          55       55              
  Lines        4555     4571      +16     
==========================================
+ Hits         3068     3090      +22     
+ Misses       1261     1257       -4     
+ Partials      226      224       -2     
Flag Coverage Δ
e2e 52.90% <3.12%> (-0.28%) ⬇️
unit 54.45% <100.00%> (+0.24%) ⬆️


// github.com/carvel-dev/kapp/pull/1047 (PR to ensure order in upstream)
//
// TODO: remove this once ordering has been handled by the upstream.
func orderKappsValidateErr(err error) error {
Contributor


Well written (as with the unit tests). Thank you! Qq: is it worth surfacing whether the function had to fall back on the original? My guess is that if we fall back, we may run into the continuous reconcile issue. Wondering if we should make that noisy somehow...?

Contributor Author


Thank you @perdasilva. I thought about it too and it's a tricky situation, because while the first two steps in the unwrapping logic hold for all internal validators (as of now at least), the last one (the most nested errors.Join) might only be valid for the CheckValidator internal validator. Since a fallback might as well be a false negative here, I'd normally log this via Debug, but I noticed Debug isn't being used anywhere in the code, so I ultimately decided against it.
In case of a non-empty error here, and regardless of fallback, we are still logging the full reconciliation error message every time we try to reconcile, and that should make spotting the random order possible. I agree, though, that an added hint from a failed fallback might definitely help, so maybe it would be ok.

WDYT?

Contributor


Yeah. Ultimately, I guess, it becomes a debug artifact. While it undoubtedly helps with debugging, we'd still be reconciling repeatedly. If we have exponential backoff, then maybe it's not a big issue, and it's also temporary. I was wondering whether we should panic (since the library shouldn't change out from under us during runtime), but then we'd need to be very careful about testing each error condition the underlying library checks for, which might not be a good use of time atm. Let's proceed ^^ thanks, dude =D

@perdasilva perdasilva added this pull request to the merge queue Jan 21, 2025
Merged via the queue into operator-framework:main with commit f14e9d0 Jan 21, 2025
23 checks passed