Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't ignore errors on query status API #1489

Merged

Conversation

akoshelev
Copy link
Collaborator

We may be observing issues trying to run Hybrid at scale that are not visible to us due to the nature how query status handles those. If error is not a mismatch, it is just silently ignored. This could lead to a variety of problems - most importantly it will mask the failures and made other shards think that everything is OK, and they can proceed with finalizing results.

This PR changes this and makes this API return an error if there is at least one error that is not a status mismatch. If all of them are, the behaviour is the same as before

We may be observing issues trying to run Hybrid at scale that are not visible to us due to the nature how query status handles those. If error is not a mismatch, it is just silently ignored.
This could lead to a variety of problems - most importantly it will mask the failures and made other shards think that everything is OK, and they can proceed with finalizing results.

This PR changes this and makes this API return an error if there is at least one error that is not a status mismatch. If all of them are, the behaviour is the same as before
@akoshelev akoshelev requested a review from cberkhoff December 11, 2024 06:46
Copy link

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.07%. Comparing base (ff37b8a) to head (c464883).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
ipa-core/src/query/processor.rs 91.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1489      +/-   ##
==========================================
+ Coverage   93.06%   93.07%   +0.01%     
==========================================
  Files         239      237       -2     
  Lines       43546    43509      -37     
==========================================
- Hits        40526    40496      -30     
+ Misses       3020     3013       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tyurek
Copy link
Collaborator

tyurek commented Dec 11, 2024

yeah, this seems like a good thing to fix

.into_iter()
.filter_map(|(_si, e)| Self::get_state_from_error(e))
.collect();
status = states.into_iter().fold(status, min_status);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my education, why does this cause errors to be ignored? What is a minimum status and what would folding an iterator of failures even do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea it is a bit subtle here. filter_map filters out all errors that are not status mismatch error. The closure maps them to None and filter_map excludes them

@cberkhoff
Copy link
Collaborator

cberkhoff commented Dec 11, 2024

Dont we need to handle this error in the RC loop in some different way? Are all error non-recoverable?

@akoshelev akoshelev merged commit cdb618d into private-attribution:main Dec 12, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants