author | Luke T. Shumaker <lukeshu@parabola.nu> | 2024-08-22 01:29:10 +0200 |
---|---|---|
committer | Luke T. Shumaker <lukeshu@parabola.nu> | 2024-09-07 18:18:35 +0200 |
commit | dc3223919f663b7c8b8d8d1d6072b4487df7709b (patch) | |
tree | b4192fbe82e73926a6e8bbde1d3e0e1ce272dfad /units/systemd-nspawn@.service.in | |
parent | nspawn: register_machine() and allocate_scope() bools to flags (diff) | |
nspawn: enable FUSE in containers
Linux kernel v4.18 (2018-08-12) added user-namespace support to FUSE, and
bumped the FUSE version to 7.27 (see: da315f6e0398 (Merge tag
'fuse-update-4.18' of
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse, Linus Torvalds,
2018-06-07)). This means that on such kernels it is safe to enable FUSE in
nspawn containers.
In outer_child(), before calling copy_devnodes(), check the FUSE version to
decide whether to enable (>=7.27) or disable (<7.27) FUSE in the container. We
look at the FUSE version instead of the kernel version in order to enable FUSE
support on older-versioned kernels that may have the mentioned patchset
backported ([as requested by @poettering][1]). However, I am not sure that
this is safe; user-namespace support is not a documented part of the FUSE
protocol, which is what FUSE_KERNEL_VERSION/FUSE_KERNEL_MINOR_VERSION are meant
to capture. While the same patchset
- added FUSE_ABORT_ERROR (which is all that the 7.27 version bump
is documented as including),
- bumped FUSE_KERNEL_MINOR_VERSION from 26 to 27, and
- added user-namespace support
these 3 things are not inseparable; it is conceivable to me that a backport
could include the first 2 of those things and exclude the 3rd; perhaps it would
be safer to check the kernel version.
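As an illustration of the version gate discussed above, here is a minimal sketch of the comparison in C. The helper name fuse_version_allows_userns() is hypothetical and is not taken from the patch; the sketch only encodes the ">= 7.27 enables, < 7.27 disables" rule, and inherits the caveat that a backport could bump the minor version without carrying user-namespace support.

```c
#include <stdbool.h>

/* Hypothetical helper (name and signature are assumptions, not from the
 * patch): returns true when the reported FUSE protocol version is at
 * least 7.27, which the commit message treats as implying
 * user-namespace support. */
static bool fuse_version_allows_userns(unsigned major, unsigned minor) {
        if (major != 7)
                return major > 7;   /* any later major version counts as newer */
        return minor >= 27;         /* 7.27 is the first version accepted */
}
```

With this rule, a kernel reporting 7.26 keeps FUSE disabled in the container, while 7.27 or anything newer enables it.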
Do note that our get_fuse_version() function uses the fsopen() family of
syscalls, which were not added until Linux kernel v5.2 (2019-07-07); so if
nothing has been backported, then the minimum kernel version for FUSE-in-nspawn
is actually v5.2, not v4.18.
Pass whether or not to enable FUSE to copy_devnodes(); have copy_devnodes()
copy in /dev/fuse if enabled.
Pass whether or not to enable FUSE back over fd_outer_socket to run_container()
so that it can pass that to append_machine_properties() (via either
register_machine() or allocate_scope()); have append_machine_properties()
append "DeviceAllow=/dev/fuse rw" if enabled.
For testing, simply check that /dev/fuse can be opened for reading and writing,
but that actually reading from it fails with EPERM. The test assumes that if
FUSE is supported (/dev/fuse exists), then the testsuite is running on a kernel
with FUSE >= 7.27; I am unsure how to go about writing a test that validates
that the version check disables FUSE on old kernels.
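The check described above can be sketched in C as follows. This is only an illustration of the open-then-read probe, not the test added by the commit; the function name check_fuse_device() and its return convention are assumptions made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe (not the testsuite's actual code):
 *   returns  1 if the device opens read-write and read() fails with EPERM,
 *   returns  0 if it behaves in any other way,
 *   returns -1 if the path does not exist (FUSE unsupported, nothing to check). */
static int check_fuse_device(const char *path) {
        char buf[4096];
        ssize_t n;
        int fd, ok;

        fd = open(path, O_RDWR);
        if (fd < 0)
                return errno == ENOENT ? -1 : 0;

        /* On a freshly opened /dev/fuse with no mount attached, the commit
         * message expects read() to fail with EPERM. */
        n = read(fd, buf, sizeof(buf));
        ok = (n < 0 && errno == EPERM);

        close(fd);
        return ok;
}
```

Inside a container, check_fuse_device("/dev/fuse") would be expected to return 1 when FUSE is enabled, and -1 when /dev/fuse was not copied in.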
[1]: https://github.com/systemd/systemd/issues/17607#issuecomment-745418835
Closes #17607
Diffstat (limited to 'units/systemd-nspawn@.service.in')
-rw-r--r-- | units/systemd-nspawn@.service.in | 10 |
1 files changed, 7 insertions, 3 deletions
diff --git a/units/systemd-nspawn@.service.in b/units/systemd-nspawn@.service.in
index ff66d4090a..c2f21c6cbb 100644
--- a/units/systemd-nspawn@.service.in
+++ b/units/systemd-nspawn@.service.in
@@ -30,12 +30,16 @@ CoredumpReceive=yes
 TasksMax=16384
 {{SERVICE_WATCHDOG}}
 
-{# Enforce a strict device policy, similar to the one nspawn configures when it
- # allocates its own scope unit. Make sure to keep these policies in sync if you
- # change them! #}
+{# Enforce a strict device policy, similar to the one nspawn configures (in
+ # nspawn-register.c:append_machine_properties()) when it allocates its own
+ # scope unit. Make sure to keep these policies in sync if you change them! #}
 DevicePolicy=closed
 DeviceAllow=/dev/net/tun rwm
 DeviceAllow=char-pts rw
+{# /dev/fuse gets 'm' here even though it doesn't in nspawn-register.c, since
+ # efedb6b0f3 (nspawn: refuse to bind mount device node from host when
+ # --private-users= is specified, 2024-09-05) #}
+DeviceAllow=/dev/fuse rwm
 
 # nspawn itself needs access to /dev/loop-control and /dev/loop, to implement
 # the --image= option. Add these here, too.