| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
E.g. in logs on jammy-ppc64el in https://github.com/systemd/systemd/pull/27294:
Apr 16 17:42:50 H systemd-gpt-auto-generator[300]: Failed to dissect partition table of block device /dev/sda: No message of desired type
Apr 16 17:42:50 H (sd-execu[295]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
ee0e6e476e61d4baa2a18e241d212753e75003bf made this particular condition not an
error. But for other errnos we want to print a better message too.
dissect_loop_device_and_warn() already does this, but it always prints the
error at error level. We want to suppress some of the errors, so let's make the
print helper public and do the error suppression in the caller.
|
|
|
|
|
|
|
|
| |
If we don't find a single useful partition table, refusing dissection.
(Except in systemd-dissect, when we are supposed to show DDI
information, in that case allow this to run and show general DDI
information, i.e. size, UUID and name at least)
|
|
|
|
|
|
| |
This is to dissect_image_file() what dissect_loop_device_and_warn() is
to dissect_loop_device(), i.e. it dissects the image file and logs an
error string if that fails instead of just returning an error.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we mount a DDI, let's set MS_NOSYMFOLLOW for ESP/XBOOTLDR. They are
generally untrusted territory, (i.e. outside of
encryption/authentication via dm-crypt/dm-verity). Moreover they are
generally FAT, where symlinks don't exist anyway. Let's hence disable
symlinks for them.
This slightly refactors how we put together mount options for mounts,
splitting this out into a new helper call
dissected_partition_pick_options(), which we should be able to reuse
later in gpt-auto-generator, to ensure mounts via loopback as DDI and
those on bare metal get the same options.
|
|
|
|
| |
Initially we only have one user, but following patches will add more.
|
|
|
|
|
| |
We already determine the architecture of disk images and make a choice,
and store it per partition. Let's make this accessible globally.
|
| |
|
| |
|
|
|
|
|
|
| |
Let's not leave the sector size unspecified: either set a user supplied
value, or auto-detect the right size by probing the disk image
accordingly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GPT disk image
When we operate with DDIs with sector sizes != 512 we need to configure
the loopback device to match it, otherwise the image and the kernel
block device will disagree what things are.
Let's add a prober that tries to determine the sector size of a GPT DDI.
It does this by looking for the GPT partition table header at the
various byte offsets they must be located on, given a specific sector
size. It will try sector size 512, 1024, 2048 and 4096. Of these only
the 512 and 4096 really make sense IRL I guess, but let's be thorough.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is useful to make the dissection logic at boot a bit safer, as we
can reference device nodes by diskseq.
This locks down dissection a bit, since it makes it harder to swap out
the backing device between the time we dissected and validated it, until
we actually mounted it.
This is not complete though, as /bin/mount would have to verify the
diskseq after opening the diskseq symlink again.
See: https://github.com/util-linux/util-linux/issues/1786
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we dissect images automatically, let's be a bit more conservative
with the file system types we are willing to mount: only mount common
file systems automatically.
Explicit mounts requested by admins should always be OK, but when we do
automatic mounts, let's not permit barely maintained, possibly legacy
file systems.
The list for now covers the four common writable and two common
read-only file systems. Sooner or later we might want to add more to the
list.
Also, it might make sense to eventually make this configurable via the
image dissection policy logic.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-1 was used everywhere, but -EBADF or -EBADFD started being used in various
places. Let's make things consistent in the new style.
Note that there are two candidates:
EBADF 9 Bad file descriptor
EBADFD 77 File descriptor in bad state
Since we're initializating the fd, we're just assigning a value that means
"no fd yet", so it's just a bad file descriptor, and the first errno fits
better. If instead we had a valid file descriptor that became invalid because
of some operation or state change, the other errno would fit better.
In some places, initialization is dropped if unnecessary.
|
|
|
|
|
|
|
|
| |
This function checks if the external verity data referenced in
VeritySettings covers the specified partition (indicated via
designator).
Right now, we'll use that at one place, but in a later commit in more.
|
|
|
|
|
|
|
| |
Let's store the GPT partition flags in the dissected partition info.
Right now we won't actually use them for anything yet, but later we'll
add that, when enforcing policy on dissection.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
let's make sure we can probe file systems also when unprivileged:
instead of probing the partition block devices for file system
signatures, let's go via the original "whole" fd.
libblkid makes this easy actually, as it allows us to specify the
offset/size of the area to probe. And we have the partition
offsets/sizes anyway, so it's trivial for us to make use of.
This thus enables fs probing also when lacking privs and operating on
naked regular files without loopback devices or anything like this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISSECT_IMAGE_OPEN_PARTITION_DEVICES
Curently, these two flags were implied by dissect_loop_device(), but
that's not right, because this means systemd-gpt-auto-generator will
dissect the root block device with these flags set and that's not
desirable: the generator should not cause the partition devices to be
created (we don't intend to use them right-away after all, but expect
udev to find/probe them first, and then mount them though .mount units).
And there's no point in opening the partition devices, since we do not
intend to mount them via fds either.
Hence, rework this: instead of implying the flags, specify them
explicitly.
While we are at it, let's also rename the flags to make them more
descriptive:
DISSECT_IMAGE_MANAGE_PARTITION_DEVICES becomes
DISSECT_IMAGE_ADD_PARTITION_DEVICES, since that's really all this does:
add the partition devices via BLKPG.
DISSECT_IMAGE_OPEN_PARTITION_DEVICES becomes
DISSECT_IMAGE_PIN_PARTITION_DEVICES, since we not only open the devices,
but keep the devices open continously (i.e. we "pin" them).
Also, drop the DISSECT_IMAGE_BLOCK_DEVICE combination flag, since it is
misleading, i.e. it suggests it was appropriate to specify on all
dissected blocking devices, but that's precisely not the case, see the
systemd-gpt-auto-generator case. My guess is that the confusion around
this was actually the cause for this bug we are addressing here.
Fixes: #25528
|
|
|
|
|
|
| |
Fixes a bug introduced by f7725647bb41c3398a867f139efe526efe8aa1b3.
Hopefully fixes #25348.
|
|\
| |
| | |
repart: Don't descend into directories assigned to other partitions
|
| |
| |
| |
| |
| | |
To achieve this we move the PartitionDesignator enum from
dissect-image.h to gpt.h
|
|/
|
|
|
|
|
|
| |
image UUID
systemd-repart generates this in a suitably stable fashion, hence let's
actually use it as an identifier for the image. As a first step parse
it, and show it.
|
|
|
|
|
| |
Let's complete support for DDI discovery, and also support 2nd stage
initrds.
|
|
|
|
|
|
|
| |
descriptor of device node
In dissect_loop_device(), we have opened the device node. Let's reuse
the file descriptor.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
file descriptor
If multiple services with the same encrypted image are simultaneously
starting, one may deactivate the dm device while others using it.
Or, similary, after (regular) partitions are dissected, another process
may try to remove them before we mount them.
To prevent such situations, let's keep the dissected and decrypted
partitions opened. Then, use the file descriptors when we mount the
partitions.
Fixes #24617.
|
|
|
|
|
| |
When the --force flag is used, do not insist that the extension-release
file has to match the extension image name
|
| |
|
| |
|
| |
|
|\
| |
| | |
dissect-image: make DissectedImage object take reference to DecryptedImage and LoopDevice
|
| |
| |
| |
| |
| |
| | |
To make LoopDevice object freed after DissectedImage is freed.
At least currently, this should not change anything. Preparation for
later commits.
|
| |
| |
| |
| | |
No functional changes. Preparation for later commits.
|
| | |
|
|/
|
|
|
|
| |
Currently, it is not necessary to set partno or architecture in
dissect_image_new(), but just for safety.
Preparation for later commits.
|
|
|
|
|
|
| |
Currently, dissect_image() is only called through dissect_loop_device(),
and the LoopDevice object has device name. Hence, it is not necessary to
get device name in dissect_image().
|
| |
|
|
|
|
|
|
| |
Note, currently, for each call of dissect_loop_device_and_warn(), the
specified name is equivalent to the path passed to loop_device_make_by_path().
Hence, this should not change the current behavios.
|
|
|
|
|
|
|
|
| |
image name
Follow-up for e374439f4b8def786031ddbbd7dfdae3a335d4d2 (#24322).
This also simplify the logic of generating image name from image path.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loading of an extension image from a symlink "NAME.raw" to
"NAME-VERSION.raw" failed because the release file name check worked
with the backing file of the loop device which already resolves the
symlink and thus the found name "NAME-VERSION" mismatched "NAME".
Pass the original filename and use it instead of the backing file
when available. This fixes the loading of "NAME.raw" extensions which
are a symlink to "NAME-VERSION.raw" as, e.g., may be the case when
systemd-sysupdate manages multiple versions.
Fixes https://github.com/systemd/systemd/issues/24293
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts a major chunk of 75d7e04eb4662a814c26010d447eed8a862f5ec1
Now that the loopback device code already destroys the partitions we
don't have to do this here anymore.
I am sure the right place to delete the partitions is in the loopback
code, since we really only should do that for loopback devices, see
bug #24431, and not on "real" block devices.
I am also not convinced dropping partitions the dissection logic doesn't
care about is a good idea, after all. The dissection stuff should
probably not consider itself the "owner" of the block devices it
analyzes, but take a more passive role: figure out what is what, but not
modify it.
Fixes: #24431
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When closing a loop device, the kernel will asynchronously remove
the probed partitions. This can lead to race conditions where we
try to reuse a partition device that still needs to be removed by
the kernel. To avoid such issues, let's explicitly try to remove
any partitions using BLKPG_DEL_PARTITION when we're done with an
image.
To make sure we don't try to remove partitions when we want them
to remain (e.g. systemd-dissect --mount), we add
dissected_image_relinquish() in a similar vein to loop_device_relinquish()
and decrypted_image_relinquish().
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This revisits the mess around waiting for partition block devices in
the image dissection code. It implements a nice little trick:
Instead of waiting for the kernel to probe the partition table for us
and generate the block devices from it, we'll just do that ourselves.
How can we do it? Via the BLKPG_ADD_PARTITION ioctl, that the kernel has
supported for a while. This ioctl allows creating partition block
devices off "whole" block devices from userspace, without the partitions
necessarily being present in the partition table at all.
So, whenever we want a partition to be there, we'll just issue
BLKPG_ADD_PARTITION. This can either work, in which case we know the
partition is there, and can use it. Yay. Or it can fail with EBUSY,
which the kernel returns if a partition by the selected partition index
already exists (or if an existing partition overlaps with the new one).
But if that's the case, then that's also OK, because the partition will
already exist.
So, regardless if we win or the kernel wins, for us the outcome is the
same: the partition block device will exist after invoking the ioctl.
Yay.
Net effect: we are not dependent on asynchronous uevent messages to wait
for the devices. Instead we synchronously get what we need. This makes
us independent of the (apparently less than reliable) netlink transport,
and should almost always be quicker.
Hopefully addresses #17469 even on older kernels.
Fixes: #17469
|
|
|
|
|
|
|
|
|
| |
The implementation of MountImageUnit()/systemctl mount-image was
changed to use a /proc/self/fd path as the source, but that causes
the dm-verity files autodiscovery to fail, as it looks for files
in the same directory as the image.
Use the original file path when setting up dm-verity.
|
|
|
|
|
|
|
| |
Some parts of our tree used 'Architecture' for storing architectures,
others used ints. Let's unify on the former.
Inspired by #22952's rework of the 'Virtualization' enum.
|
|\
| |
| | |
Allow dissect_image() to dissect images from architectures other than the native one
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
To allow dissecting images of architectures other than the native
(or secondary) one, we add a third designator 'OTHER' to represent
architectures other than the native or secondary one.
If no partitions of the native or secondary arch are available, we
check if a root partition of any other arch is available and use that
instead if we found one.
|
|/
|
|
|
|
|
|
|
| |
The whole point of acquiring metadata is quite often to figure out why the
image does not pass verification. Refusing to provide metadata is just being
hostile to the user.
When called from other places (e.g. image_read_metadata()), verification is
still performed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
system extension is for
This should make things a bit more robust since it ensures system
extension can only applied to the right environments. Right now three
different "scopes" are defined:
1. "system" (for regular OS systems, after the initrd transition)
2. "initrd" (for sysext images that apply to the initrd environment)
3. "portable" (for sysext images that apply to portable images)
If not specified we imply a default of "system portable", i.e. any image
where the field is not specified is implicitly OK for application to OS
images and for portable services – but not for initrds.
|