-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: prevent using stale allocs #18601
Conversation
Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While `diffAlloc(existing, updated)` properly ignores stale alloc *updates*, alloc deletions have no such check. So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response. This change applies the same logic as #18269 and ensures only fresh responses are used. Glossary: * fresh - modified at an index > the query index * stale - modified at an index <= the query index
@stswidwinski I would love you weighing in on this as well if you have time since this is directly downstream from your fix in #18269 |
I think this is entirely analogous to #18269 and I agree that this should address the problem. Previously we haven't addressed this because of two reasons (neither very good...):
I believe that #18267 more broadly describes this as a problem with nearly every RPC issued such that we require a minimal index. That's because a stale response is not an error which is somewhat compounded by related #18266. While squashing the most painful samples seems like the right thing to do in the short term, perhaps it makes sense to mint a few wrappers which help issue queries with an index limit which convert stale responses into errors (such that the type checking helps us handle them). What do you think? Edit: I think my statement of "nearly every" is somewhat strong. I think it's a relatively common pattern in those places which allow stale responses, but I would hate to make quantitative statements without having audited them. @schmichael has done that though, so a future reader should refer to the issues mentioned above. |
Absolutely. I'm auditing all RPCs Clients make as we speak since as you pointed out this is likely part of a broader pattern. In the past we had audited these calls for properly setting the index, but during that audit we had neglected to account for checking the index after the response. Once I get the audit complete and see what the damage is, we can decide whether spot fixes or more holistic changes are best. Given the opportunity to design blocking RPC semantics from scratch I would have absolutely choose a default behavior of returning an error instead of a stale response. Giving callers valid looking data and entrusting them to double check the index is obviously error prone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Skipping changelog entry because #18269 covers it (and sadly I don't think I can link 2 PRs from 1 logline in our changelog building system?) |
Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While `diffAlloc(existing, updated)` properly ignores stale alloc *updates*, alloc deletions have no such check. So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response. This change applies the same logic as #18269 and ensures only fresh responses are used. Glossary: * fresh - modified at an index > the query index * stale - modified at an index <= the query index
Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While `diffAlloc(existing, updated)` properly ignores stale alloc *updates*, alloc deletions have no such check. So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response. This change applies the same logic as #18269 and ensures only fresh responses are used. Glossary: * fresh - modified at an index > the query index * stale - modified at an index <= the query index
Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While `diffAlloc(existing, updated)` properly ignores stale alloc *updates*, alloc deletions have no such check. So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response. This change applies the same logic as #18269 and ensures only fresh responses are used. Glossary: * fresh - modified at an index > the query index * stale - modified at an index <= the query index
Backport failed: https://github.com/hashicorp/nomad/actions/runs/6342982782 Manual 1.4.x backport: d2cd6db Manual 1.5.x backport: b1cf888 Manual 1.6.x backport: f02061f |
Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While
diffAlloc(existing, updated)
properly ignores stale alloc updates, alloc deletions have no such check.So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response.
This change applies the same logic as #18269 and ensures only fresh responses are used.
Glossary: