This project isolates a given command and its arguments using a container backend. Supported backends are bubblewrap, lxc, and chroot. The latter is mainly intended for updating the underlying Linux system root.
A unique, fresh ZFS clone is used as the dataset for each isolation and is discarded after use. Changes to the dataset are made on disk, preventing high memory usage under load. A privileged daemon cleans up used clones and provides fresh ones.
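Conceptually, each sandbox runs on a writable clone of the latest snapshot of the template dataset. A rough sketch of the ZFS operations that happen behind the scenes (the names are illustrative; `isolate` and the daemon manage these automatically):

```sh
zfs snapshot rpool/isolate/tmpl@base                        # freeze the template
zfs clone rpool/isolate/tmpl@base rpool/isolate/tmpl_12345  # fresh clone per sandbox
# ... the sandbox runs on the writable clone ...
zfs destroy rpool/isolate/tmpl_12345                        # discarded after use
```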
The project consists of the main script `isolate`, the common bindings `common.sh`, and `seccomp_wrapper.c`, which complements bubblewrap. The latter works around bubblewrap's `--new-session` option, which prevents feeding input from the isolated environment to the host terminal, but also prevents job control of the spawned shell.
`isolate` can work without ZFS. In this case, modifications to the backing Linux system root are persistent. For such a setup, see “Setup without ZFS”.
Copy `isolate` and `common.sh` to `/usr/local/bin`.
We expect a group `isolate` to exist. Users who should be able to use `isolate` must be members of this group.
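For instance, a sketch of creating the group and adding a user (`alice` is a placeholder user name):

```sh
groupadd isolate
usermod -aG isolate alice   # the user must log in again for the group to apply
```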
In the following, `rpool/isolate/debisl` is the dataset and `/isolate/debisl` its mountpoint. Both may be chosen arbitrarily.
Do the following steps as root user:
- Create a dataset for the Linux system root, e.g.
  ```sh
  zfs create rpool/isolate/debisl
  ```
- Create a template Linux system root, e.g. using
  ```sh
  debootstrap stable /isolate/debisl
  ```
- [only bwrap] Compile the seccomp wrapper. You need the libseccomp static libs for this step.
  - Gentoo: install `libseccomp` with `USE=static-libs`
  - Debian: install `libseccomp-dev`
  ```sh
  gcc seccomp_wrapper.c -static -o seccomp_wrapper /usr/lib/x86_64-linux-gnu/libseccomp.a
  ```
- [only bwrap] Copy the seccomp wrapper into the root dataset such that it is in the `PATH` of the spawned isolated environment, e.g. to `/usr/bin/seccomp_wrapper`.
- Chroot into the environment and install what you need:
  ```sh
  isolate -e chroot -t /isolate/debisl bash
  ```
- After you are done, create a snapshot:
  ```sh
  zfs snapshot "rpool/isolate/debisl@$(date "+%Y%m%dT%H%M%SZ")"
  ```
- Generate clones for usage by `isolate`:
  ```sh
  isolate -r rpool/isolate/debisl 5
  ```
Your ZFS structure will look somewhat like this:
```
NAME                                              USED  AVAIL  REFER  MOUNTPOINT
rpool/isolate                                    18.8G   488G  9.96G  /isolate
rpool/isolate/deb_gnuradio                       1.86G   488G  1.72G  /isolate/deb_gnuradio
rpool/isolate/deb_gnuradio_1677838587626715157    184K   488G  1.72G  /isolate/deb_gnuradio_1677838587626715157
rpool/isolate/deb_gnuradio_1677838587665325261    184K   488G  1.72G  /isolate/deb_gnuradio_1677838587665325261
rpool/isolate/deb_gnuradio_1677838587703660046    184K   488G  1.72G  /isolate/deb_gnuradio_1677838587703660046
rpool/isolate/debisl                             6.96G   488G  6.39G  /isolate/debisl
rpool/isolate/debisl_1677860441807895636            8K   488G  6.39G  /isolate/debisl_1677860441807895636
rpool/isolate/debisl_1677860480418781581            8K   488G  6.39G  /isolate/debisl_1677860480418781581
rpool/isolate/debisl_1677860488091186587            8K   488G  6.39G  /isolate/debisl_1677860488091186587
```
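With clones available, an isolated command can be spawned from the ZFS template. A minimal sketch, assuming the example dataset from above and the default `bwrap` engine:

```sh
# Spawn a shell in a fresh clone of rpool/isolate/debisl;
# the clone is discarded when the shell exits.
isolate -z rpool/isolate/debisl bash
```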
To update the template Linux system root later, do the following steps as root user:
- Chroot into the environment and install what you need:
  ```sh
  isolate -e chroot -t /isolate/debisl bash
  ```
- After you are done, create a snapshot:
  ```sh
  zfs snapshot "rpool/isolate/debisl@$(date "+%Y%m%dT%H%M%SZ")"
  ```
- Generate clones for usage by `isolate`:
  ```sh
  isolate -f -r rpool/isolate/debisl 5
  ```
The path and service units we provide expect the isolation Linux system roots to be at `rpool/isolate/xxx`. If your paths are different, please modify the service and path files. The number of clones is also hardcoded in the unit file; you might want to adjust it as well.
A systemd path unit watches a signal file (touched by `isolate` on each exit) and triggers clone regeneration whenever an isolated environment exits and frees its clone.
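As an illustration of the mechanism (a sketch only; the unit names, the signal file location, and the clone count here are assumptions, not the shipped files), such a pair could look like:

```ini
# regen@.path (hypothetical name): fires whenever the signal file for the
# instance changes; by default this activates the same-named service.
[Path]
PathChanged=/isolate/%i.signal

[Install]
WantedBy=multi-user.target

# regen@.service (hypothetical name): regenerates clones for the instance.
[Service]
Type=oneshot
ExecStart=/usr/local/bin/isolate -r rpool/isolate/%i 5
```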
Copy [email protected] and [email protected] to `/etc/systemd/system`.
To enable regeneration for `rpool/isolate/debisl`, enable [email protected] through `systemctl enable --now [email protected]`.
X11 access from the isolated environment is only supported with the `bwrap` engine, as is PulseAudio support. X11 should work without any configuration. For PulseAudio to work, you will have to allow anonymous access to the PulseAudio socket.
Add the following to `/etc/pulse/default.pa`:

```
load-module module-native-protocol-unix auth-anonymous=1 socket=/tmp/pulse-socket
```
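After restarting PulseAudio, pass `-2` (and `-1` for X11) when spawning. For example (a sketch; the dataset name and the command are placeholders):

```sh
# -1 binds the X11 socket, -2 the PulseAudio socket into the guest
isolate -1 -2 -z rpool/isolate/debisl mpv some_video.mp4
```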
`isolate` can function without ZFS by using template mode. Create a backing Linux system root in a directory of your choice. You can then spawn `isolate` as usual, but with the `-t` option instead of `-z`.
Modifications to the underlying system root are of course permanent in template mode.
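For example, a sketch assuming a system root prepared under the hypothetical directory `/srv/isolate/debisl`:

```sh
# Spawn directly on the template directory; changes to it persist.
isolate -t /srv/isolate/debisl bash
```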
See `isolate -h`:
```
spawn:                        ./isolate [-e engine] [-t templatedir] [-1] [-2] [-n] [-p pwd] [-d dir_host:target_ctr]... [-v env_name:env_value]... [command] [args] ...
spawn:                        ./isolate [-e engine] [-z zfs_template] [-1] [-2] [-n] [-p pwd] [-d dir_host:target_ctr]... [-v env_name:env_value]... [command] [args] ...
refresh available zfs clones: ./isolate [-r template] [-f] [-s] [num_should_avail]

[!] refresh ALWAYS uses the latest snapshot on the dataset

OPTIONS
  -e: engine: one of {bwrap, chroot, lxc}, defaults to bwrap
  -d: dirs to mount into the sandbox, mounts dir_host to target_ctr in the container
  -u: uid [integer] to use as base uid. uid and the following 65535 uids are mapped to [0:65535] in the sandbox.
  -g: gid [integer] to use as base gid. gid and the following 65535 gids are mapped to [0:65535] in the sandbox.
      NOTE: when using -u or -g, you should align the ownership of the template to the range specified.
  -i: ignore SIGINT, keep running. Currently the workaround until signal passing into the guest is implemented.
  -v: var:value to pass into the sandbox
  -p: pwd: switch to this directory on spawn. Defaults to /
  -1: bind X11 socket into guest
  -2: bind Pulseaudio socket into guest
  -n: share host network
  -x: trace
  -f: regenerate ALL templates of given zfs template
  -s: skip generating new clones
  -q: quiet: only print warnings and prompts

add all users that should be able to use zfs features to the `isolate' group

ENVIRONMENT VARIABLES
  PRE_SPAWN_HOOK: command that is run before [command args] are run in the isolated environment
                  $ROOTFS references the root of the isolated environment to be started
  POST_SPAWN_HOOK: command that is run after [command args] has completed in the isolated environment
                   $ROOTFS references the root of the isolated environment to be started
  DISABLE_SECCOMP_WRAPPER [=!'']: disable the bwrap seccomp wrapper that prevents IOCTL to host
  LXC_NET_BR [=!'']: bridge to be used by LXC, defaults to br_vm
  LXC_MAP_TUN [!='']: map /dev/net/tun into the container
  LXC_CPU_QUOTA: percentage (1-100), this amount of CPU processing time will be available to the container. Default: 20
  LXC_MEM_MB_QUOTA: memory available to guest, in MB
  DEBUG_LXC_SPAWN [=!'']: if nonzero len, write lxc start log to /tmp/isolate_lxc_log

SUPPORTED BY ENGINE
  | engine | env | {U/G}ID | net | cmd+args | dirs | pwd | X11 | Pulseaudio |
  |--------+-----+---------+-----+----------+------+-----+-----+------------|
  | chroot |     |         |  X  |    X     |      |     |     |            |
  | bwrap  |  X  |    *    |  X  |    X     |  X   |  X  |  X  |     X      |
  | lxc    |  X  |    X    |  X  |    X     |  X   |  X  |     |            |

  * this changes the uid in the container, which itself STILL RUNS AS UID 0!
```