coordinator: don't share CA instance with gRPC servers #520
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
At a high-level, this PR
meshAuthority
toauthority.Authority
ca.CA
instance away from the gRPC serversThe idea is that the methods called by the gRPC servers on the CA instance are always coupled to the current active manifest, thus getting the manifest from
meshAuthority
and the certificates fromCA
is inherently racy (think of concurrent calls toSetManifest
andNewMeshCert
, for example). This is a first step towards tighter coupling of the two, by handing out CA objects when required and removing the permanent members.In the meshAPI server, I'm moving all certificate operations to the
ValidateCallback
, and all certificate operations happen on the same CA instance. This does not make a practical difference now, but allows to swap out the permanent CA object with one per manifest generation.In the userAPI server, I'm adding the CA as a return to the manifest retrieval operation, also in preparation to swap it out in the future.
The API of the
Authority
should be considered transitory while we're working on the persistence. I could imagine it returning a struct with manifest and CA combined.The race conditions are still present (see associated TODOs), but it should be easier to mitigate them going forward.