Skip to content

Commit

Permalink
client: prevent using stale allocs (#18601)
Browse files Browse the repository at this point in the history
Similar to #18269, it is possible that even if Node.GetClientAllocs
retrieves fresh allocs that the subsequent Alloc.GetAllocs call
retrieves stale allocs. While `diffAlloc(existing, updated)` properly
ignores stale alloc *updates*, alloc deletions have no such check.

So if a client retrieves an alloc created at index 123, and then a
subsequent Alloc.GetAllocs call hits a new server which returns results
at index 100, the client will stop the alloc created at 123 because it
will be missing from the stale response.

This change applies the same logic as #18269 and ensures only fresh
responses are used.

Glossary:
* fresh - modified at an index > the query index
* stale - modified at an index <= the query index
  • Loading branch information
schmichael committed Sep 29, 2023
1 parent 8d717ae commit f02061f
Show file tree
Hide file tree
Showing 2 changed files with 20 additions and 2 deletions.
20 changes: 19 additions & 1 deletion client/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -2339,7 +2339,7 @@ OUTER:
//
// For full context, please see https://github.com/hashicorp/nomad/issues/18267
if resp.Index <= req.MinQueryIndex {
c.logger.Debug("Received stale allocation information. Retrying.",
c.logger.Debug("received stale allocation information; retrying",
"index", resp.Index, "min_index", req.MinQueryIndex)
continue OUTER
}
Expand Down Expand Up @@ -2399,6 +2399,24 @@ OUTER:
}
}

// It is possible that Alloc.GetAllocs hits a different server than
// Node.GetClientAllocs which returns older results.
if allocsResp.Index <= allocsReq.MinQueryIndex {
retry := c.retryIntv(getAllocRetryIntv)
c.logger.Warn("failed to retrieve updated allocs; retrying",
"req_index", allocsReq.MinQueryIndex,
"resp_index", allocsResp.Index,
"num_allocs", len(pull),
"wait", retry,
)
select {
case <-time.After(retry):
continue
case <-c.shutdownCh:
return
}
}

// Ensure that we received all the allocations we wanted
pulledAllocs = make(map[string]*structs.Allocation, len(allocsResp.Allocs))
for _, alloc := range allocsResp.Allocs {
Expand Down
2 changes: 1 addition & 1 deletion nomad/alloc_endpoint.go
Original file line number Diff line number Diff line change
Expand Up @@ -257,7 +257,7 @@ func (a *Alloc) GetAllocs(args *structs.AllocsGetRequest,
reply.Allocs = allocs
reply.Index = maxIndex
} else {
// Use the last index that affected the nodes table
// Use the last index that affected the allocs table
index, err := state.Index("allocs")
if err != nil {
return err
Expand Down

0 comments on commit f02061f

Please sign in to comment.