ZFS datasets to be used by the plugin can be specified on the CLI via the --dataset-name argument. These are added to ZfsDriver.rds in main.go:41 when the driver is created. When docker asks for a volume list, these rds entries are iterated in driver.go:63, checked, and returned to docker. This works. But if a dataset is created outside the root dataset(s) specified on the CLI, it is out of scope for these iterations.
Minimum viable example
zfs-plugin version: 0.5.0
docker version: 19.03.2, build 6a30dfc
Using the standard docker-zfs-plugin.service file
Expected Output
DRIVER VOLUME NAME
zfs tank/another/data
zfs tank/docker-volumes/data
Actual Output
DRIVER VOLUME NAME
zfs tank/docker-volumes/data
Why is this a problem?
Managing these volumes via the docker volume command then becomes impossible, since the created volumes are not even listed. They are also not removed by docker volume prune, and removing a stack does not destroy its volumes either.
Proposed Solution(s)
Somehow the list command needs to identify which datasets are used as volumes.
Requiring all used datasets to be present on the CLI would allow correct identification of datasets used as volumes. But it would break the ability to create volumes automatically on datasets outside of these. Deploys would suddenly fail for people who just used the docker-zfs-plugin.service file, which worked until now. This would be a breaking change, requiring a major version bump.
Using custom properties on the ZFS datasets could make them identifiable. On service startup, the datasets specified via CLI would be created if needed and this property would be set; every dataset created via docker would also get this property. A list command could then identify them, but would probably need to iterate through a possibly large number of datasets to do so. Also, as far as I know, transferring datasets (zfs send) does not necessarily copy all properties, which could lead to issues when restoring backups.
Writing the information somewhere else (e.g. into the required root dataset from the CLI) would also preserve it, but would violate the single-source-of-truth principle: the actual datasets could change without that change being reflected in the stored information.
Another option may be to query docker for this information. But I suspect docker does not provide the list of existing volumes, because that is exactly what the plugin is supposed to report to docker, not the other way around.
IMHO I would prefer solution 1: require all used datasets on the command line. It is an elegant way to define which datasets are in use, and it also allows ensuring that these are available and working. If this breaks a configuration, it is because there are volumes outside the datasets specified on the CLI, which means the docker volume command is already broken for that setup. I would much prefer a one-time broken configuration, followed by sensible error messages when I try to create datasets outside the specified root paths (which I can fix by adding these paths), over a silently failing (volume-hiding) docker volume ls.
Thanks again to y'all. If I'm completely wrong, please tell me why, or what to do better 😉
Greets,
Chris
Thanks for the feedback, I agree this isn't ideal. I'd prefer a facility to provide a warning rather than an outright error, but the docker API does not provide the ability to emit a warning. I could log the warning in the driver, but users will only see that if they check the service logs.
I also have use cases for interacting with volumes that are not specified on the command line. I have a dataset that holds shared data that is used by the host and by containers. I do not want this dataset to show up in a docker volume ls and I certainly don't want to risk it ever being pruned by docker. This dataset was not created by docker, but I do use the zfs plugin to mount this dataset into containers.