Create a ZFS dataset for each volume when running on ZFS #775

Open
icefo opened this issue Jan 9, 2025 · 1 comment
icefo commented Jan 9, 2025

Tell us about your request
Create a ZFS dataset for each volume. This makes volume snapshots possible, with the added benefit of being able to customize ZFS options per volume (like compression).

Which service(s) is this request for?
Docker Engine

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
This allows you to take volume snapshots that are easy and instantaneous to restore when something goes terribly wrong, like during a risky update / migration.

It's also a great way to achieve better backups. You can either snapshot a live container's storage (probably okay, but you may still have issues during a restore) or stop the container, take a snapshot (instantaneous), start it again, and back up the snapshot. This achieves consistent backups with minimal downtime. I'm already doing the live-snapshot approach to back up a backup server, and it works great.
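
To make the second workflow concrete, here is a minimal sketch in Go of the stop → snapshot → start → back up cycle, assuming one ZFS dataset per volume (the container name, dataset name, and backup path are made up for illustration):

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// run executes a command and returns a descriptive error on failure.
func run(name string, args ...string) error {
	out, err := exec.Command(name, args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("%s %v: %v: %s", name, args, err, out)
	}
	return nil
}

// backupVolume stops a container, snapshots the ZFS dataset backing its
// volume, restarts the container, and then sends the snapshot to a file.
func backupVolume(container, dataset string) error {
	snap := fmt.Sprintf("%s@backup-%s", dataset, time.Now().Format("2006-01-02-150405"))
	if err := run("docker", "stop", container); err != nil {
		return err
	}
	// The snapshot itself is instantaneous, so downtime is only the stop/start.
	if err := run("zfs", "snapshot", snap); err != nil {
		return err
	}
	if err := run("docker", "start", container); err != nil {
		return err
	}
	// Back up the snapshot while the container is already running again.
	return run("sh", "-c", fmt.Sprintf("zfs send %s | gzip > /backups/%s.zfs.gz", snap, container))
}

func main() {
	if err := backupVolume("my-app", "tank/docker-volumes/my-app-data"); err != nil {
		fmt.Println("backup failed:", err)
	}
}
```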

Are you currently working around the issue?
I'm not the first person to think of this, and I found a volume plugin that I improved to be easily usable with Docker Compose (https://github.com/icefo/docker-zfs-plugin). But since ZFS is a kernel-level driver, you have to resort to a smelly workaround to make it work in the v2 volume plugin architecture. It's more of a proof of concept (it logs events to a hard-coded path for debugging), but it works.

The smelly workaround
In short, volume plugins are containers and have to mount their volumes in a specific folder inside the container that is shared with the host. Sounds great in theory, but ZFS is a kernel-level driver, so the mountpoints are relative to the host and not the container (this breaks the encapsulation). The workaround the original plugin author found is:

  1. Define this path as the shared folder: /var/lib/docker/plugins/pluginHash/propagated-mount/
  2. Prepend ../../../../../.. to every path the plugin returns to Docker, so it resolves back to the true host root

This allows the plugin to mount the ZFS datasets wherever it wants on the system; I defined a folder for that under /mnt.
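
To make the trick concrete, here is a minimal sketch of just the path manipulation; in the actual plugin this logic lives in the volume driver's Mount handler, and the host mount root below is only an example:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// hostMountpointFor returns the path handed back to Docker for a volume,
// given that the plugin actually mounted the ZFS dataset on the *host*
// under hostMountRoot (e.g. /mnt/docker-zfs-volumes).
func hostMountpointFor(hostMountRoot, volumeName string) string {
	hostPath := filepath.Join(hostMountRoot, volumeName)
	// Docker resolves the returned mountpoint relative to the plugin's
	// shared folder, /var/lib/docker/plugins/<hash>/propagated-mount/.
	// Prepending six "../" climbs back out of that directory to the host's
	// root, so the path ends up pointing at the real host mountpoint.
	return filepath.Join("../../../../../..", hostPath)
}

func main() {
	fmt.Println(hostMountpointFor("/mnt/docker-zfs-volumes", "myvol"))
	// prints: ../../../../../../mnt/docker-zfs-volumes/myvol
}
```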

This is very brittle and a potential security vulnerability that Docker may fix in the future. I don't think it's that bad, since volume plugins seem to have CAP_SYS_ADMIN anyway, but apologies for not reporting it through the proper channels. I posted this on the Docker forums & Reddit a few days ago, so it's public now anyway.

Additional context
If this feature interests the Docker maintainers, I would be interested in writing it. I would need some guidance on the proper approach, though: do I improve the local driver to use datasets when the underlying filesystem supports them (ZFS, btrfs, bcachefs, ...), or do I create a new driver?

I quickly looked at the local driver, and it seems I would only have to modify the create & remove functions. I also need to check what the live restore function actually does.
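
For illustration only (this is not moby's actual code, and the parent dataset name and ZFS detection are assumptions), the create path could look roughly like this, with remove doing the symmetric `zfs destroy`:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// createVolumeData creates the backing storage for a named volume under
// volumesRoot (e.g. /var/lib/docker/volumes). When volumesRoot is backed by
// ZFS, a child dataset of parentDataset is created and mounted where the
// local driver already expects the volume's _data directory, so the rest of
// the driver stays unchanged; otherwise it falls back to a plain directory.
func createVolumeData(volumesRoot, parentDataset, name string, onZFS bool) (string, error) {
	path := filepath.Join(volumesRoot, name, "_data")
	if !onZFS {
		return path, os.MkdirAll(path, 0o755) // current behaviour: plain directory
	}
	dataset := parentDataset + "/" + name
	out, err := exec.Command("zfs", "create", "-o", "mountpoint="+path, dataset).CombinedOutput()
	if err != nil {
		return "", fmt.Errorf("zfs create %s: %v: %s", dataset, err, out)
	}
	return path, nil
}

func main() {
	p, err := createVolumeData("/var/lib/docker/volumes", "tank/docker/volumes", "myvol", true)
	fmt.Println(p, err)
}
```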

icefo added the community_new (New idea raised by a community contributor) label Jan 9, 2025
icefo commented Jan 21, 2025

Hello,

Do you have any update? I'd gladly give some time to do this the proper way in the Docker engine, but if it's not an interesting feature for the maintainers, I'll polish the plugin a bit and work with that.
