dynamic host volumes: ensure unique name per node #24748
Conversation
LGTM!
func (l *nameLocker) lock(name, id string) error {
	// LoadOrStore atomically claims the name or returns the ID that already holds it
	currentID, exists := l.locks.LoadOrStore(name, id)
	if exists && id != currentID.(string) {
		return fmt.Errorf("%w: name=%q id=%q", ErrVolumeNameExists, name, id)
	}
	return nil
}
The caller has the name and ID and already logs, so we should probably return only `ErrVolumeNameExists` here and have the caller log the error, rather than formatting this by hand and having the formatting mismatch with JSON loggers.
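For illustration, here is a minimal sketch of that suggestion, assuming a `nameLocker` backed by `sync.Map` and an hclog-style logger at the caller; the `volumeManager` type and its field names are hypothetical stand-ins, not Nomad's actual code.

```go
package volumes

import (
	"errors"
	"sync"

	"github.com/hashicorp/go-hclog"
)

var ErrVolumeNameExists = errors.New("volume name already exists on this node")

type nameLocker struct {
	locks sync.Map // volume name -> volume ID
}

func (l *nameLocker) lock(name, id string) error {
	currentID, exists := l.locks.LoadOrStore(name, id)
	if exists && id != currentID.(string) {
		// return only the sentinel; the caller attaches structured context
		return ErrVolumeNameExists
	}
	return nil
}

// volumeManager is a hypothetical caller, standing in for the real one.
type volumeManager struct {
	locker *nameLocker
	logger hclog.Logger
}

func (m *volumeManager) create(name, id string) error {
	if err := m.locker.lock(name, id); err != nil {
		// a JSON logger renders these as separate fields rather than one
		// hand-formatted string
		m.logger.Error("failed to lock volume name", "name", name, "volume_id", id, "error", err)
		return err
	}
	// ... run the plugin's Create, fingerprint the volume, etc. ...
	return nil
}
```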
users may modify an existing volume (probably to change the size? why else?) by setting an `id` in the volume spec and re-issuing create
with the same name -- the static vol is discarded, dynamic takes precedence, and log to warn the user
force-pushed from e7a7f9f to d12a57b
LGTM! I feel like `sync.Map` is almost never what you want to use, but in this case it was the perfect choice.
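As a rough counterpart to the `lock` sketch above (still hypothetical, not Nomad's code), release is just a `Delete` on the same map; `LoadOrStore` gives an atomic claim-or-inspect in a single call, which is why `sync.Map` works here without a separate mutex.

```go
// release frees the name once the volume is deleted, so it can be reused.
func (l *nameLocker) release(name string) {
	l.locks.Delete(name)
}
```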
@@ -50,7 +51,8 @@ SEND_BATCH:
 	// host volume updates
 	var hostVolChanged bool
 	c.batchNodeUpdates.batchHostVolumeUpdates(func(name string, vol *structs.ClientHostVolumeConfig) {
-		hostVolChanged = hvm.UpdateVolumeMap(newConfig.Node.HostVolumes, name, vol)
+		hostVolChanged = hvm.UpdateVolumeMap(c.logger.Named("node_updater").With("method", "batchFirstFingerprint"),
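To make the logger discussion below concrete, here is a standalone toy (not Nomad code) showing roughly what hclog's `Named`/`With` chaining produces for a line emitted through the injected logger; the message and field values are made up.

```go
package main

import "github.com/hashicorp/go-hclog"

func main() {
	base := hclog.New(&hclog.LoggerOptions{Name: "client"})
	log := base.Named("node_updater").With("method", "batchFirstFingerprint")

	// prints something along the lines of:
	// 2024-12-01T00:00:00.000Z [WARN]  client.node_updater: duplicate volume name: method=batchFirstFingerprint name=example
	log.Warn("duplicate volume name", "name", "example")
}
```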
Non-blocking: I get where you're going with having the `method` field here, but I do wonder if the terminology is weird for a cluster administrator looking through their logs. They probably don't care about the Go method name so much as what component is emitting the log line.
Also, as far as I can tell there's only ever one possible log line we can emit from the function we're injecting this logger into, and that can only happen on first fingerprint? That would make adding the `method` field less useful, I think?
theoretically it should only happen in that one situation, yes, but `UpdateVolumeMap` is called from 3 spots, and I had such a heck of a time tracing the various call sites to construct this whole thing that I figured I'd just go extra specific with these. I could add a `Trace` in `UpdateVolumeMap` to shine light on some of the callbacky weirdness in this area of the codebase...
otherwise, while it should only happen one way presently, I tend to do a lot of guarding against potential future regressions -- it's not likely that someone will modify this at all (I'm sure it's perfect ;), much less cause breakage, but I'm a little paranoid about the future.
I thought of `log_include_location`, which is great, but it doesn't help in this case because the log isn't emitted right here where it would matter to know about it. I could un-DRY this like the other similar ones, but overall I do like the consistency that `UpdateVolumeMap` provides.
🤷
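For what that extra `Trace` might look like, here is a simplified stand-in for `UpdateVolumeMap`: the parameter order follows the diff above, but the `VolumeConfig` type and the body are invented so the snippet compiles on its own, not Nomad's actual implementation.

```go
package volumes

import "github.com/hashicorp/go-hclog"

// VolumeConfig stands in for structs.ClientHostVolumeConfig in this sketch.
type VolumeConfig struct {
	Name string
	Path string
}

// UpdateVolumeMap adds or removes a volume from the node's volume map and
// reports whether anything changed; the Trace line surfaces which caller's
// logger was injected.
func UpdateVolumeMap(log hclog.Logger, vols map[string]*VolumeConfig, name string, vol *VolumeConfig) bool {
	log.Trace("updating volume map", "name", name)

	current, exists := vols[name]
	switch {
	case vol == nil && exists:
		delete(vols, name)
		return true
	case vol != nil && (!exists || *current != *vol):
		vols[name] = vol
		return true
	}
	return false
}
```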
Aye, no worries. Let's go ahead and ship this.
Volume name must be unique per node.
For existing static host vols, the scheduler will prevent creation of duplicate names ("no node meets constraints"), and it will similarly prevent duplicate dynamic vols after fingerprint (node re-registration), but we need to prevent concurrent duplicate creation that may happen within a fingerprint interval, too.
Here a lock is set for name+id during `Create` (and restore), then released on `Delete`, to prevent such very-fast or concurrent creates.
If a volume spec is submitted again with a valid `id` set, it can still update the volume, and the plugin's `Create` will be run with the new spec. I'm guessing that folks may want to do this if either A) the plugin has been updated and they want to regenerate their vols in place, or B) to increase the vol size (in CSI parlance, "expand"). Not sure if there are other reasons, but I wanted to leave this door open.
A last funny caveat: if a static host volume is added to the agent config after a dynamic one has been created, and the agent is restarted to use it, the dynamic one will take precedence. A warning is logged to inform the user if this happens.