dynamic host volumes: fingerprint client plugins #24589

Merged 5 commits on Dec 2, 2024

@gulducat gulducat commented Dec 2, 2024

This finds executable files in a new client.dhv_plugin_dir (renamed to host_volume_plugin_dir during review) that respond to a Version call (arg $1 = version or OPERATION=version env) with a valid version string on stdout, and adds them to the client fingerprint. The fingerprint is refreshed when the agent receives a SIGHUP.
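A minimal sketch of the version probe described above, not the code from this PR. The names pluginVersion, validVersion, and versionRe are hypothetical, and the loose regular expression is only a stand-in for whatever version parser the real fingerprinter uses. It passes both the version argument and the OPERATION=version environment variable, then checks the trimmed stdout.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"regexp"
	"strings"
	"time"
)

// versionRe is an assumed, deliberately loose stand-in for a real version
// parser: an optional "v", one to three numeric parts, optional pre-release.
var versionRe = regexp.MustCompile(`^v?\d+(\.\d+){0,2}(-[0-9A-Za-z.-]+)?$`)

// validVersion reports whether s looks like a version string.
func validVersion(s string) bool { return versionRe.MatchString(s) }

// pluginVersion (hypothetical helper) runs the executable at path with
// "version" as its first argument and OPERATION=version in its environment,
// then validates whatever it printed on stdout.
func pluginVersion(ctx context.Context, path string) (string, error) {
	cmd := exec.CommandContext(ctx, path, "version")
	cmd.Env = append(os.Environ(), "OPERATION=version")
	out, err := cmd.Output() // captures stdout only
	if err != nil {
		return "", fmt.Errorf("error getting version from plugin %q: %w", path, err)
	}
	v := strings.TrimSpace(string(out))
	if !validVersion(v) {
		return "", fmt.Errorf("malformed version: %q", v)
	}
	return v, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe <path-to-plugin>")
		return
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	v, err := pluginVersion(ctx, os.Args[1])
	fmt.Println(v, err)
}
```

An executable that exits non-zero or prints something that is not a version is rejected, which matches the two failure cases in the logs further down.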

The main thing I'm wishy-washy on is the user-facing names, which feel a bit franken-named here.


E.g.: given this client agent config

client {
  enabled = true

  host_volume_plugin_dir = "/opt/nomad/plugins-dhv" # default would be <data-dir>/host_volume_plugins
...

or this CLI call

$ sudo nomad agent -dev -host-volume-plugin-dir /opt/nomad/plugins-dhv/

and with these contents in that directory:

drwxrwxr-x 4.0K Nov 22 18:42 .
drwxrwxr-x 4.0K Nov 22 17:18 dir-and-not-a-plugin
-rwxrwxr-x 2.9K Nov 19 15:47 example-host-volume      # our proper example plugin
-rwxrwxr-x   64 Nov 22 18:42 executable-not-a-plugin
-rw-rw-r--    0 Nov 22 17:18 file-but-not-a-plugin
-rwxrwxr-x   69 Nov 22 18:16 not-a-plugin-and-errors

here are the client agent logs on sighup:

[INFO]  client/fingerprint_manager.go:122: client.fingerprint_mgr: reloading fingerprinter: fingerprinter=host_volume_plugins
[DEBUG] fingerprint/dynamic_host_volumes.go:36: client.fingerprint_mgr.host_volume_plugins: detected plugin built-in: plugin_id=mkdir version=0.0.1
[DEBUG] hostvolumemanager/host_volume_plugin.go:133: client.fingerprint_mgr.host_volume_plugins: error with plugin: plugin_id=not-a-plugin-and-errors operation=version
  stdout=
  | no, stdout

  stderr=
  | no, stderr
   error="exit status 1"
[DEBUG] fingerprint/dynamic_host_volumes.go:108: client.fingerprint_mgr.host_volume_plugins: failed to get version from plugin: plugin_id=not-a-plugin-and-errors error="error getting version from plugin \"not-a-plugin-and-errors\": exit status 1"
[DEBUG] fingerprint/dynamic_host_volumes.go:108: client.fingerprint_mgr.host_volume_plugins: failed to get version from plugin: plugin_id=executable-not-a-plugin error="error with version from plugin: Malformed version: nah, stdout"
[DEBUG] fingerprint/dynamic_host_volumes.go:66: client.fingerprint_mgr.host_volume_plugins: detected plugin: plugin_id=example-host-volume version=0.0.1
[DEBUG] client/client.go:2548: client: state changed, updating node and re-registering
[INFO]  client/client.go:2081: client: node registration complete

and this is the resulting fingerprint:

$ nomad node status -verbose -self | grep host_vol
plugins.host_volume.version.example-host-volume = 0.0.1
plugins.host_volume.version.mkdir               = 0.0.1 # "mkdir" plugin is built-in
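For illustration, the attribute keys above could be assembled from a map of detected plugins roughly like this. The function name pluginAttributes and the constant attrPrefix are hypothetical. Only the key format is taken from the output above.

```go
package main

import (
	"fmt"
	"sort"
)

// attrPrefix mirrors the key prefix seen in the node status output.
const attrPrefix = "plugins.host_volume.version."

// pluginAttributes (hypothetical helper) turns plugin ID -> version into
// fingerprint attribute key -> version.
func pluginAttributes(versions map[string]string) map[string]string {
	attrs := make(map[string]string, len(versions))
	for id, v := range versions {
		attrs[attrPrefix+id] = v
	}
	return attrs
}

func main() {
	attrs := pluginAttributes(map[string]string{
		"mkdir":               "0.0.1", // built-in plugin
		"example-host-volume": "0.0.1", // external plugin
	})
	// Sort keys so the printed order is stable.
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s = %s\n", k, attrs[k])
	}
}
```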

@tgross tgross left a comment

I've left some bikesheddy comments about the naming of user-facing attributes and some log levels, but otherwise this looks great!

@@ -229,6 +229,10 @@ type ClientConfig struct {
	// AllocMountsDir is the directory for storing mounts into allocation data
	AllocMountsDir string `hcl:"alloc_mounts_dir"`

	// DHVPluginDir is the directory containing dynamic host volume plugins
	// db TODO(1.10.0): document default directory is alongside alloc_mounts
	DHVPluginDir string `hcl:"dhv_plugin_dir"` // db TODO(1.10.0): is this a good name?
@tgross (Member):

For user-facing configuration/fields I don't love "dhv" over "host volume". Maybe host_volume_plugin_dir?

@gulducat (Member Author):

more bike-shedding;

  1. where should it go by default? I put it next to alloc_mounts because the intention is for the plugins to put stuff in alloc_mounts, but it could go next to default plugin_dir within data_dir instead?
  2. I put plugins- as a prefix, so it would sort next to other "plugin" dirs, but none are present by default where I've put this, and host-volume-plugins would better match the config value...
  3. should the default dir name be joined with "-" (plugins-host-volume) or "_" (plugins_host_volume)?
  4. changing CLI flag to -host-volume-plugin-dir seems good, eh?

so many little fiddly naming bits 😅

@tgross (Member):

where should it go by default?

I'd follow the behavior we did for plugin_dir where it's a child of data_dir.

I put plugins- as a prefix, so it would sort next to other "plugin" dirs, but none are present by default where I've put this, and host-volume-plugins would better match the config value...

I don't have a super strong opinion on this so long as it's unambiguous for other plugin dirs. I was going to ask what we did for CNI but that's the fairly ambiguous cni_path 🤦

should the default dir name be joined with "-" (plugins-host-volume) or "_" (plugins_host_volume)?

Definitely _ for HCL field values.

changing CLI flag to -host-volume-plugin-dir seems good, eh?

Yeah

@gulducat (Member Author):

nice, I'll go with <data-dir>/host_volume_plugins then. it won't be sorted next to plugins, but I like the consistent word order in code and UX more than ls output of the data dir. might even be good to type /opt/nomad/data/hos<tab><tab> so autocomplete won't fight with plugins*
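The defaulting agreed on here could look something like the sketch below, assuming an empty config value means "use the default". The helper hostVolumePluginDir and the constant name are hypothetical. Only the host_volume_plugins directory name and the data-dir parent come from this thread.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultPluginDirName is the directory name settled on in the discussion.
const defaultPluginDirName = "host_volume_plugins"

// hostVolumePluginDir (hypothetical helper) returns the configured plugin
// directory, or <data-dir>/host_volume_plugins when none is configured.
func hostVolumePluginDir(configured, dataDir string) string {
	if configured != "" {
		return configured
	}
	return filepath.Join(dataDir, defaultPluginDirName)
}

func main() {
	// No explicit setting: falls back to a child of the data dir.
	fmt.Println(hostVolumePluginDir("", "/opt/nomad/data"))
	// Explicit setting, as in the agent config example: used as-is.
	fmt.Println(hostVolumePluginDir("/opt/nomad/plugins-dhv", "/opt/nomad/data"))
}
```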

	id, executable, targetPath string) (*HostVolumePluginExternal, error) {
	// this should only be called with already-detected executables,
	// but we'll double-check it anyway, so we can provide a tidy error message
	// if it has changed between fingerprinting and execution.
@tgross (Member):

Nice catch on this.
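A rough sketch of the kind of pre-execution double-check the code comment describes. The name checkExecutable and the error wording are assumptions, not the PR's actual implementation. It re-stats the file so a plugin that was removed, replaced by a directory, or lost its executable bit since fingerprinting yields a readable error.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// checkExecutable (hypothetical helper) verifies that path still points at
// a regular, executable file right before it is run.
func checkExecutable(path string) error {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("plugin executable %q no longer exists", path)
	}
	if err != nil {
		return fmt.Errorf("could not stat plugin executable %q: %w", path, err)
	}
	if info.IsDir() {
		return fmt.Errorf("%q is a directory, not a plugin executable", path)
	}
	// Any of the owner/group/other execute bits counts as executable here.
	if info.Mode().Perm()&0o111 == 0 {
		return fmt.Errorf("%q is not executable", path)
	}
	return nil
}

func main() {
	if err := checkExecutable("/nonexistent/example-host-volume"); err != nil {
		fmt.Println("refusing to run plugin:", err)
	}
}
```

The permission-bit test is Unix-oriented. A check like this narrows the window between fingerprinting and execution but cannot close it entirely.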

wg.Add(1)
go func(file, fullPath string) {
	defer wg.Done()
	// really should take way less than a second
@tgross (Member):

Someone is totally going to write a plugin on the JVM that takes 10s to assemble a DI framework on startup, and we'll 🤦 about it. But this seems like a reasonable assumption. 😁

@gulducat (Member Author):

yaknow, I wonder if plugin authors might even use the "responds to version" aspect as a way to auto-disable plugins that won't be able to fulfill requests, like if they can't reach an NFS server, or who knows. they might want to try until some timeout, and that might reasonably be longer than 1s...
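To make the timeout trade-off concrete, here is a minimal sketch of probing plugins concurrently with a per-plugin deadline. It is not the PR's code: probeVersions and probeTimeout are hypothetical names, and the one-second value is just the assumption under discussion. A plugin that is missing, fails, or exceeds the deadline is left out of the result.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"sync"
	"time"
)

// probeTimeout is an assumed per-plugin budget; a plugin that waits on an
// NFS server or a slow runtime may need considerably more than this.
var probeTimeout = time.Second

// probeVersions (hypothetical helper) runs "<path> version" for every plugin
// in its own goroutine and returns plugin ID -> stdout for those that
// answered successfully within the timeout.
func probeVersions(timeout time.Duration, plugins map[string]string) map[string]string {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	found := make(map[string]string)
	for id, path := range plugins {
		wg.Add(1)
		go func(id, path string) {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			// CommandContext kills the process once the deadline passes.
			out, err := exec.CommandContext(ctx, path, "version").Output()
			if err != nil {
				return // missing, failing, or too slow: skip this plugin
			}
			mu.Lock()
			defer mu.Unlock()
			found[id] = strings.TrimSpace(string(out))
		}(id, path)
	}
	wg.Wait()
	return found
}

func main() {
	found := probeVersions(probeTimeout, map[string]string{
		"missing": "/nonexistent/plugin", // placeholder path for the demo
	})
	fmt.Println("plugins detected:", len(found))
}
```

Because the probes run in parallel, the whole scan is bounded by roughly one timeout rather than one per plugin, so raising the budget costs less than it would in a sequential loop.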

@gulducat gulducat merged commit 5a9a9a8 into dynamic-host-volumes Dec 2, 2024
18 checks passed
@gulducat gulducat deleted the dhv-plugins-fingerprint branch December 2, 2024 21:27
2 participants