path: root/src/gpt-auto-generator
Commit message | Author | Age | Files | Lines
* gpt-auto-generator: rework/simplify logic for picking /efi or /boot | Zbigniew Jędrzejewski-Szmek | 2023-05-29 | 1 | -65/+109
|   I started looking into https://github.com/uapi-group/specifications/issues/35.
|
|   BLS says:
|   > Otherwise [no existing XBOOTLDR partition], if on GPT and an ESP is found and
|   > it is large enough (let’s say at least 1G) it should be used as $BOOT and
|   > used as primary location to place boot loader menu resources in.
|
|   > It is recommended to mount $BOOT to /boot/, and the ESP to /efi/.
|
|   DPS says:
|   > The ESP used for the current boot is automatically mounted to /efi/ (or
|   > /boot/ as fallback), unless a different partition is mounted there (possibly
|   > via /etc/fstab, or because the Extended Boot Loader Partition -- see below --
|   > exists) or the directory is non-empty on the root disk.
|
|   I don't think we want to mount the same partition in two places. If the same
|   partition is not mounted in two places, then the two specs are contradictory.
|
|   The code in gpt-auto-generator implemented the logic from the DPS. It is
|   modified to implement the logic from BLS. Effectively:
|   - if both /boot and /efi are available:
|     - if both XBOOTLDR and ESP exist: ESP on /efi, XBOOTLDR on /boot
|     - if only ESP exists: ESP on /boot
|     - if only XBOOTLDR exists: XBOOTLDR on /boot
|   - if only /boot is available:
|     - if XBOOTLDR exists: XBOOTLDR on /boot
|     - if only ESP exists: ESP on /boot
|   - if only /efi is available:
|     - if ESP exists: ESP on /efi
|
|   "Available" means that the mount point is not mounted over and does not
|   contain files. If the directory doesn't exist, it is also "available" and
|   will be created later when the mount or automount unit is started.
|
|   Thus, the generator attempts to match the partitions and mount points to the
|   extent possible. In all cases, /boot is the primary place to install kernels.
|   ESP can be found on /boot or /efi, depending on the situation.
|
|   If this patch is merged, I'll submit fixes for BLS and DPS to describe the
|   same logic.
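|   The decision table in the message above can be sketched in C roughly as
|   follows. This is only an illustration of the listed cases, not the
|   generator's actual code; BootMounts and pick_boot_mounts() are hypothetical
|   names, and the "available" checks are assumed to have been done by the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: where each partition should be mounted,
 * or NULL if it should not be mounted at all. */
typedef struct BootMounts {
        const char *esp_where;
        const char *xbootldr_where;
} BootMounts;

/* Sketch of the decision table from the commit message. "Available" means
 * the directory is not mounted over and does not contain files. */
static BootMounts pick_boot_mounts(
                bool have_esp,
                bool have_xbootldr,
                bool boot_available,
                bool efi_available) {

        BootMounts m = { NULL, NULL };

        if (boot_available) {
                if (have_xbootldr) {
                        /* XBOOTLDR always wins /boot; the ESP goes to /efi if possible. */
                        m.xbootldr_where = "/boot";
                        if (have_esp && efi_available)
                                m.esp_where = "/efi";
                } else if (have_esp)
                        /* No XBOOTLDR: the ESP doubles as $BOOT. */
                        m.esp_where = "/boot";
        } else if (efi_available && have_esp)
                /* Only /efi is usable: mount the ESP there, skip XBOOTLDR. */
                m.esp_where = "/efi";

        return m;
}
```

|   For example, pick_boot_mounts(true, false, true, true) places the ESP on
|   /boot and leaves /efi unused, matching the "only ESP exists" case.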
* gpt-auto-generator: also honor systemd.swap=no | David Tardon | 2023-05-25 | 1 | -0/+17
|
* gpt-auto-generator: "translate" errno codes into proper messages | Zbigniew Jędrzejewski-Szmek | 2023-04-18 | 1 | -5/+4
|   E.g. in logs on jammy-ppc64el in https://github.com/systemd/systemd/pull/27294:
|     Apr 16 17:42:50 H systemd-gpt-auto-generator[300]: Failed to dissect partition table of block device /dev/sda: No message of desired type
|     Apr 16 17:42:50 H (sd-execu[295]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
|
|   ee0e6e476e61d4baa2a18e241d212753e75003bf made this particular condition not
|   an error. But for other errnos we want to print a better message too.
|   dissect_loop_device_and_warn() already does this, but it always prints the
|   error at error level. We want to suppress some of the errors, so let's make
|   the print helper public and do the error suppression in the caller.
* gpt-auto: do not fail when no suitable partitions found | Yu Watanabe | 2023-04-18 | 1 | -1/+2
|   Follow-up for 598fd4da1cf9665834110583fd9133073cc12481.
* image-policy: introduce parse_image_policy_argument() helper | Yu Watanabe | 2023-04-13 | 1 | -14/+2
|   Addresses
|   https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1060130312,
|   https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1067927293, and
|   https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1067926416.
|
|   Follow-up for 84be0c710d9d562f6d2cf986cc2a8ff4c98a138b.
* tree-wide: hook up image dissection policy logic everywhere | Lennart Poettering | 2023-04-05 | 1 | -1/+21
|
* gpt-auto-generator: fix typo | Antonio Alvarez Feijoo | 2023-03-21 | 1 | -1/+1
|
* gpt-auto-generator: port to partition_pick_mount_options() too | Lennart Poettering | 2023-03-09 | 1 | -24/+59
|   This way we'll have the same mount options in place if we boot via the gpt
|   generator, or if we mount a DDI locally.
|
|   Note that this will also enable MS_NOSYMFOLLOW on ESP and XBOOTLDR now, if
|   booted via gpt-auto-generator.
* gpt-auto: Check for /boot before putting ESP there | Adrian Vovk | 2023-03-06 | 1 | -3/+8
|   We prefer /efi as a mount point for the ESP, and use /boot as a fallback if
|   /efi doesn't exist. However, when root=tmpfs, neither /efi nor /boot exist.
|   gpt-auto should mount to /efi in this case, but it mounted to /boot instead.
|   This is because gpt-auto didn't check for the existence of /boot.
|
|   Here, we correct this.
* bootctl: add new --print-root-device option | Lennart Poettering | 2023-02-21 | 1 | -32/+7
|   We already have this nice code in systemd that determines the block device
|   backing the root file system, but it's only used internally in
|   systemd-gpt-auto-generator. Let's make this more accessible and expose it
|   directly in bootctl.
|
|   It doesn't fit immediately into the topic of bootctl, but I think it's
|   close enough and behaves very similarly to the existing "bootctl
|   --print-boot-path" and "--print-esp-path" tools.
|
|   If --print-root-device (or -R) is specified once, it will show the block
|   device backing the root fs, and if specified twice (probably easier: -RR)
|   it will show the whole block device that block device belongs to in case it
|   is a partition block device.
|
|   Suggested use:
|
|       # cfdisk `bootctl -RR`
|
|   To get access to the partition table, behind the OS install, for whatever
|   it might be.
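|   As a rough illustration of what "the block device backing the root file
|   system" means, the sketch below looks up the device number of a path with
|   stat(2) and formats it as a /dev/block/MAJOR:MINOR path. It is not the code
|   bootctl uses; backing_devnum() and devnum_to_path() are hypothetical helper
|   names. It also assumes the simple case: file systems such as btrfs, overlayfs
|   or tmpfs report an anonymous device (major 0), which the sketch rejects.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Look up the device number of the file system that contains 'path'.
 * Returns 0 on success, -1 on failure (errno is set by stat()). */
static int backing_devnum(const char *path, dev_t *ret) {
        struct stat st;

        if (stat(path, &st) < 0)
                return -1;

        *ret = st.st_dev;
        return 0;
}

/* Format a device number as "/dev/block/MAJOR:MINOR".
 * Returns 0 on success, -1 if the device is anonymous (major 0)
 * or the buffer is too small. */
static int devnum_to_path(dev_t devnum, char *buf, size_t size) {
        int n;

        if (major(devnum) == 0)
                return -1; /* not backed by a single real block device */

        n = snprintf(buf, size, "/dev/block/%u:%u", major(devnum), minor(devnum));
        return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

|   Calling backing_devnum("/", &d) followed by devnum_to_path(d, ...) covers
|   the single -R case; resolving the whole disk a partition belongs to (-RR)
|   needs additional sysfs lookups that are not shown here.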
* shared/efi-loader: fix compilation with !ENABLE_EFI, improve messages | Zbigniew Jędrzejewski-Szmek | 2023-01-25 | 1 | -6/+6
|   When compiled without ENABLE_EFI, efi_stub_measured() was not defined, so
|   compilation would fail. But it's not enough to add a stub that returns
|   -EOPNOTSUPP. We call this function in various places and usually print the
|   error at warning or error level, so we'd print a confusing message. We also
|   can't add a stub that always returns 0, because then we'd print a message
|   like "Kernel stub did not measure", which would be confusing too. Adding
|   special handling for -EOPNOTSUPP in every caller is also unattractive.
|
|   So instead efi_stub_measured() is reworked to log the warning or error
|   internally, and such logging is removed from the callers, and a stub is
|   added that logs a custom message.
* tree-wide: fix typo | Yu Watanabe | 2023-01-20 | 1 | -1/+1
|
* tpm2: add common helper for checking if we are running on UKI with TPM measurements | Lennart Poettering | 2023-01-17 | 1 | -5/+5
|   Let's introduce a common implementation of a function that checks whether
|   we are booted on a kernel with systemd-stub that has TPM PCR measurements
|   enabled. Do our own userspace measurements only if we detect that.
|
|   PCRs are scarce and most likely there are projects which already make use
|   of them in other ways. Hence, instead of blindly stepping into their
|   territory let's conditionalize things so that people have to explicitly
|   buy into our PCR assignments before we start measuring things into them.
|   Specifically bind everything to a UKI that reported measurements.
|
|   This was previously already implemented in systemd-pcrphase, but with this
|   change we expand this to all tools that process PCR measurement settings.
|
|   The env var to override the check is renamed to SYSTEMD_FORCE_MEASURE, to
|   make it more generic (since we'll use it at multiple places now). This is
|   not a compat break, since the original env var for that was not included
|   in any stable release yet.
* generators: optionally, measure file systems at boot | Lennart Poettering | 2023-01-17 | 1 | -0/+6
|   If we use gpt-auto-generator, automatically measure root fs and /var.
|
|   Otherwise, add x-systemd.measure option to request this.
* gpt-auto-generator: automatically measure root/var volume keys into PCR 15 | Lennart Poettering | 2023-01-17 | 1 | -5/+31
|   Let's enable PCR 15 measurements automatically if gpt-auto discovery is
|   used and systemd-stub is also used.
* gpt-auto: harden ESP/XBOOTLDR mounts with "noexec,nosuid,nodev" | Mike Yuan | 2023-01-16 | 1 | -5/+5
|   When these partitions are probed by gpt-auto, they will always be hardened
|   with such options.
|
|   See also: https://github.com/systemd/systemd/issues/25776#issuecomment-1364115711
|
|   Closes #25776
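|   A minimal sketch of the idea, assuming mount options are handled as a
|   comma-separated string (as in fstab or the Options= setting of a mount
|   unit). harden_options() is a hypothetical helper written for this
|   illustration and is not taken from the generator.

```c
#include <stddef.h>
#include <stdio.h>

/* Append the hardening options named in the commit subject to an existing
 * comma-separated option string. 'options' may be NULL or empty.
 * Returns 0 on success, -1 if the result does not fit into 'buf'. */
static int harden_options(const char *options, char *buf, size_t size) {
        int n;

        if (options && options[0] != '\0')
                n = snprintf(buf, size, "%s,noexec,nosuid,nodev", options);
        else
                n = snprintf(buf, size, "noexec,nosuid,nodev");

        return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

|   With this helper, an input of "rw" becomes "rw,noexec,nosuid,nodev", and an
|   empty input yields just the three hardening options.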
* gpt-auto-generator: improve log messages a bit | Lennart Poettering | 2023-01-06 | 1 | -2/+2
|   Fixes: #20331
* gpt-auto-generator: enable referencing partitions via diskseq symlinks | Lennart Poettering | 2022-12-23 | 1 | -1/+2
|
* dissect-image: let's lock down fstypes a bit | Lennart Poettering | 2022-12-22 | 1 | -0/+9
|   When we dissect images automatically, let's be a bit more conservative
|   with the file system types we are willing to mount: only mount common
|   file systems automatically.
|
|   Explicit mounts requested by admins should always be OK, but when we do
|   automatic mounts, let's not permit barely maintained, possibly legacy
|   file systems.
|
|   The list for now covers the four common writable and two common
|   read-only file systems. Sooner or later we might want to add more to the
|   list.
|
|   Also, it might make sense to eventually make this configurable via the
|   image dissection policy logic.
* gpt-auto-generator: honour rootfstype= and rootflags= kernel cmdline option | Lennart Poettering | 2022-12-21 | 1 | -2/+22
|   Even if root= is not specified on the kernel cmdline, we should honour
|   the other rootXYZ= options.
|
|   Fixes: #8411
|   See: #17034
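|   To illustrate, here is a simplified sketch of picking key=value options
|   such as rootfstype= and rootflags= out of a kernel command line string.
|   cmdline_get() is a hypothetical helper; it splits on whitespace only and
|   ignores the quoting rules a real kernel command line parser has to handle.

```c
#include <stddef.h>
#include <string.h>

/* Copy the value of the last "key=value" word in 'cmdline' into 'buf'.
 * Returns 1 if the key was found, 0 if not, -1 if 'buf' is too small. */
static int cmdline_get(const char *cmdline, const char *key, char *buf, size_t size) {
        size_t kl = strlen(key);
        const char *p = cmdline;
        int found = 0;

        while (*p) {
                size_t tl;

                p += strspn(p, " \t\n");  /* skip separators */
                tl = strcspn(p, " \t\n"); /* length of the current word */

                /* The word matches if it is exactly "key=" followed by a value. */
                if (tl > kl && strncmp(p, key, kl) == 0 && p[kl] == '=') {
                        size_t vl = tl - kl - 1;

                        if (vl >= size)
                                return -1;

                        memcpy(buf, p + kl + 1, vl);
                        buf[vl] = '\0';
                        found = 1; /* keep scanning: the last occurrence wins */
                }

                p += tl;
        }

        return found;
}
```

|   Note that looking up "root" in a command line that only contains
|   "rootfstype=ext4" returns 0, which is the situation this commit is about:
|   the rootXYZ= options are present even though root= is not.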
* gpt-auto-generator: do not write "noauto" in unit options | Zbigniew Jędrzejewski-Szmek | 2022-12-05 | 1 | -5/+1
|   "auto"/"noauto" only make sense in the fstab. Putting them in Options= in
|   the generated unit has no effect and is confusing.
* dissect: rework DISSECT_IMAGE_ADD_PARTITION_DEVICES + DISSECT_IMAGE_OPEN_PARTITION_DEVICES | Lennart Poettering | 2022-12-01 | 1 | -0/+5
|   Currently, these two flags were implied by dissect_loop_device(), but
|   that's not right, because this means systemd-gpt-auto-generator will
|   dissect the root block device with these flags set and that's not
|   desirable: the generator should not cause the partition devices to be
|   created (we don't intend to use them right away after all, but expect
|   udev to find/probe them first, and then mount them through .mount units).
|   And there's no point in opening the partition devices, since we do not
|   intend to mount them via fds either.
|
|   Hence, rework this: instead of implying the flags, specify them
|   explicitly.
|
|   While we are at it, let's also rename the flags to make them more
|   descriptive:
|
|   DISSECT_IMAGE_MANAGE_PARTITION_DEVICES becomes
|   DISSECT_IMAGE_ADD_PARTITION_DEVICES, since that's really all this does:
|   add the partition devices via BLKPG.
|
|   DISSECT_IMAGE_OPEN_PARTITION_DEVICES becomes
|   DISSECT_IMAGE_PIN_PARTITION_DEVICES, since we not only open the devices,
|   but keep the devices open continuously (i.e. we "pin" them).
|
|   Also, drop the DISSECT_IMAGE_BLOCK_DEVICE combination flag, since it is
|   misleading, i.e. it suggests it was appropriate to specify on all
|   dissected block devices, but that's precisely not the case, see the
|   systemd-gpt-auto-generator case. My guess is that the confusion around
|   this was actually the cause for this bug we are addressing here.
|
|   Fixes: #25528
* basic: create new basic/initrd-util.[ch] for initrd-related functions | Zbigniew Jędrzejewski-Szmek | 2022-11-08 | 1 | -1/+1
|   I changed imports of util.h to initrd-util.h, or added an import of
|   initrd-util.h, to keep compilation working. It turns out that many files
|   didn't import util.h directly.
|
|   When viewing the patch, don't be confused by git rename detection logic:
|   a new .c file is added and two functions moved into it.
* gpt-auto: rename all functions that operate on a DissectedPartition object add_partition_xyz() | Lennart Poettering | 2022-10-17 | 1 | -12/+11
|   The function for handling regular mounts based on DissectedPartition
|   objects is called add_partition_mount(), so let's follow this scheme for
|   all other functions that handle them, too.
|
|   This nicely separates out the low-level functions (which get split up
|   args) from the high-level functions (which get a DissectedPartition
|   object): the latter are called add_partition_xyz() the former just
|   add_xyz().
|
|   This makes naming a bit more systematic. No change in behaviour.
* gpt-auto-generator: use our usual ret_xyz parameter naming | Lennart Poettering | 2022-10-17 | 1 | -6/+12
|
* gpt-auto: allow using without cryptsetup | David Seifert | 2022-10-12 | 1 | -0/+4
|   Fixes #24978
* loop-util: rename loop_device_open() -> loop_device_open_from_path() | Yu Watanabe | 2022-09-28 | 1 | -1/+1
|   No functional changes, just preparation for later commits.
* tree-wide: fix typo | Yu Watanabe | 2022-09-14 | 1 | -1/+1
|
* gpt-auto: use LoopDevice object to manage whole block disk | Yu Watanabe | 2022-09-07 | 1 | -77/+16
|
* Use original filename for extension name check | Kai Lueke | 2022-09-05 | 1 | -0/+1
|   The loading of an extension image from a symlink "NAME.raw" to
|   "NAME-VERSION.raw" failed because the release file name check worked
|   with the backing file of the loop device which already resolves the
|   symlink and thus the found name "NAME-VERSION" mismatched "NAME".
|
|   Pass the original filename and use it instead of the backing file when
|   available. This fixes the loading of "NAME.raw" extensions which are a
|   symlink to "NAME-VERSION.raw" as, e.g., may be the case when
|   systemd-sysupdate manages multiple versions.
|
|   Fixes https://github.com/systemd/systemd/issues/24293
* dissect: drop partition removal code | Lennart Poettering | 2022-09-01 | 1 | -2/+0
|   This reverts a major chunk of 75d7e04eb4662a814c26010d447eed8a862f5ec1
|
|   Now that the loopback device code already destroys the partitions we
|   don't have to do this here anymore.
|
|   I am sure the right place to delete the partitions is in the loopback
|   code, since we really only should do that for loopback devices, see bug
|   #24431, and not on "real" block devices.
|
|   I am also not convinced dropping partitions the dissection logic doesn't
|   care about is a good idea, after all. The dissection stuff should
|   probably not consider itself the "owner" of the block devices it
|   analyzes, but take a more passive role: figure out what is what, but not
|   modify it.
|
|   Fixes: #24431
* tree-wide: use path_join() instead of prefix_roota() in various cases | Lennart Poettering | 2022-08-22 | 1 | -3/+6
|   prefix_roota() is something we should stop using. It is bad for three
|   reasons:
|
|   1. As its name suggests it's supposed to be used when working relative
|      to some root directory, but given it doesn't follow symlinks (and
|      instead just stupidly joins paths) it is not a good choice for that.
|
|   2. More often than not it is currently used with inputs under control
|      of the user, and that is icky given it typically allocates memory on
|      the stack.
|
|   3. It's a redundant interface, where chase_symlinks() and path_join()
|      already exist as better, safer interfaces.
|
|   Hence, let's start moving things from prefix_roota() to path_join() for
|   the cases where that's appropriate.
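|   Point 2 above is about where the joined path lives. The following sketch
|   shows a heap-allocating join in plain C, so that the length of user
|   controlled input cannot overflow the stack. join_under_root() is a
|   hypothetical helper for illustration only; it is not systemd's path_join()
|   and, like prefix_roota(), it does not follow symlinks.

```c
#include <stdlib.h>
#include <string.h>

/* Join 'root' and 'path' with exactly one '/' in between, returning a
 * newly allocated string (or NULL on allocation failure). A NULL or empty
 * 'root' is treated as the real root directory. The caller frees the result. */
static char *join_under_root(const char *root, const char *path) {
        size_t rl, pl;
        char *out;

        if (!root)
                root = "";

        /* Drop trailing slashes from the root and leading ones from the path. */
        rl = strlen(root);
        while (rl > 0 && root[rl - 1] == '/')
                rl--;
        while (*path == '/')
                path++;

        pl = strlen(path);
        out = malloc(rl + 1 + pl + 1);
        if (!out)
                return NULL;

        memcpy(out, root, rl);
        out[rl] = '/';
        memcpy(out + rl + 1, path, pl + 1); /* includes the trailing NUL */
        return out;
}
```

|   For instance, join_under_root("/sysroot/", "/boot") yields "/sysroot/boot".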
* gpt-auto-generator: include device name in error | Zbigniew Jędrzejewski-Szmek | 2022-07-22 | 1 | -1/+2
|
* dissect-image: Explicitly remove partitions when done with image | Daan De Meyer | 2022-05-23 | 1 | -0/+2
|   When closing a loop device, the kernel will asynchronously remove the
|   probed partitions. This can lead to race conditions where we try to
|   reuse a partition device that still needs to be removed by the kernel.
|
|   To avoid such issues, let's explicitly try to remove any partitions
|   using BLKPG_DEL_PARTITION when we're done with an image. To make sure
|   we don't try to remove partitions when we want them to remain (e.g.
|   systemd-dissect --mount), we add dissected_image_relinquish() in a
|   similar vein to loop_device_relinquish() and
|   decrypted_image_relinquish().
* stat-util: fix dir_is_empty() with hidden/backup files | Lennart Poettering | 2022-05-04 | 1 | -1/+1
|   This is a follow-up for f470cb6d13558fc06131dc677d54a089a0b07359 which in
|   turn is a follow-up for a068aceafbffcba85398cce636c25d659265087a.
|
|   The latter started to honour hidden files when deciding whether a
|   directory is empty. The former reverted to the old behaviour to fix
|   issue #23220.
|
|   It introduced a bug though: when a directory contains a larger number of
|   hidden entries the getdents64() buffer will not suffice to read them,
|   since we just allocate three entries for it (which is definitely enough
|   if we just ignore the . + .. entries, but not if we ignore more).
|
|   I think it's a bit confusing that dir_is_empty() can return true even if
|   rmdir() on the dir would return ENOTEMPTY. Hence, let's rework the
|   function to make it optional whether hidden files are ignored or not.
|   After all, looking at the users of this function I am pretty sure in
|   more cases we want to honour hidden files.
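|   A sketch of an emptiness check with an explicit switch for hidden files
|   might look like the following. It uses readdir(3) rather than the
|   getdents64() buffer discussed above, only treats dot-files as "hidden"
|   (not backup files), and dir_is_empty_sketch() is a hypothetical name, so
|   it should be read as an illustration rather than the stat-util code.

```c
#include <dirent.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Returns 1 if the directory is empty, 0 if it is not, -1 on error.
 * If 'ignore_hidden' is true, entries starting with '.' do not count. */
static int dir_is_empty_sketch(const char *path, bool ignore_hidden) {
        DIR *d = opendir(path);
        int r = 1;

        if (!d)
                return -1;

        for (;;) {
                struct dirent *de;

                errno = 0;
                de = readdir(d);
                if (!de) {
                        if (errno != 0)
                                r = -1; /* read error, not end of directory */
                        break;
                }

                /* "." and ".." are present in every directory. */
                if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                        continue;

                if (ignore_hidden && de->d_name[0] == '.')
                        continue;

                r = 0; /* found an entry that counts */
                break;
        }

        closedir(d);
        return r;
}
```

|   With ignore_hidden set to false the result agrees with what rmdir() would
|   accept, which is the behaviour the message argues most callers want.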
* devnum-util: define helper macros for formatting devnum major/minor pairs | Lennart Poettering | 2022-04-13 | 1 | -2/+2
|   And port some parts over.
* basic: split out dev_t related calls into new devno-util.[ch] | Lennart Poettering | 2022-04-13 | 1 | -0/+1
|   No actual code changes, just splitting out of some dev_t handling
|   related calls from stat-util.[ch], they are quite a number already, and
|   deserve their own module now I think.
|
|   Also, try to settle on the name "devnum" as the name for the concept,
|   instead of "devno" or "dev" or "devid". "devnum" is the name exported in
|   udev APIs, hence probably best to stick to that. (this just renames a
|   few symbols to "devnum", local variables are left untouched, to make the
|   patch not too invasive)
|
|   No actual code changes.
* tree-wide: take BSD lock on loopback devices we dissect/mount/operate on | Lennart Poettering | 2022-04-10 | 1 | -0/+7
|   So here's something we should always keep in mind:
|
|   systemd-udevd actually does *two* things with BSD file locks on block
|   devices:
|
|   1. While it probes a device it takes a LOCK_SH lock. Thus everyone else
|      taking a LOCK_EX lock will temporarily block udev from probing
|      devices, which is good when making changes to it.
|
|   2. Whenever a device is closed after write (detected via inotify),
|      udevd will issue BLKRRPART (requesting the kernel to reread the
|      partition table). It does this while holding a LOCK_EX lock on the
|      block device. Thus anyone else taking LOCK_SH or LOCK_EX will
|      temporarily block udevd from issuing that ioctl. And that's quite
|      relevant, since the kernel will temporarily flush out all partitions
|      while re-reading the partition table and then create them anew.
|
|   Thus it is smart to take LOCK_SH when dissecting a block device to
|   ensure that no BLKRRPART is issued in the background, until we mounted
|   the devices.
* dissect: rework how we wait for partition block devices | Lennart Poettering | 2022-04-10 | 1 | -1/+0
|   This revisits the mess around waiting for partition block devices in the
|   image dissection code. It implements a nice little trick:
|
|   Instead of waiting for the kernel to probe the partition table for us
|   and generate the block devices from it, we'll just do that ourselves.
|   How can we do it? Via the BLKPG_ADD_PARTITION ioctl, that the kernel has
|   supported for a while. This ioctl allows creating partition block
|   devices off "whole" block devices from userspace, without the partitions
|   necessarily being present in the partition table at all.
|
|   So, whenever we want a partition to be there, we'll just issue
|   BLKPG_ADD_PARTITION. This can either work, in which case we know the
|   partition is there, and can use it. Yay. Or it can fail with EBUSY,
|   which the kernel returns if a partition by the selected partition index
|   already exists (or if an existing partition overlaps with the new one).
|   But if that's the case, then that's also OK, because the partition will
|   already exist.
|
|   So, regardless if we win or the kernel wins, for us the outcome is the
|   same: the partition block device will exist after invoking the ioctl.
|   Yay.
|
|   Net effect: we are not dependent on asynchronous uevent messages to wait
|   for the devices. Instead we synchronously get what we need. This makes
|   us independent of the (apparently less than reliable) netlink transport,
|   and should almost always be quicker.
|
|   Hopefully addresses #17469 even on older kernels.
|
|   Fixes: #17469
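|   The trick described above can be sketched with the BLKPG ioctl from
|   <linux/blkpg.h>. add_partition_sketch() is a hypothetical name and the
|   sketch is not the dissection code itself; it only shows the "success or
|   EBUSY both mean the partition exists" logic from the message. Offsets are
|   in bytes, and actually creating partitions requires a whole-disk block
|   device fd and sufficient privileges.

```c
#include <errno.h>
#include <linux/blkpg.h>
#include <sys/ioctl.h>

/* Ask the kernel to create partition number 'nr' covering 'size' bytes
 * starting at byte offset 'start' on the whole-disk device 'fd'.
 * Returns 0 if the partition can be assumed to exist afterwards,
 * or a negative errno value on any other failure. */
static int add_partition_sketch(int fd, int nr, long long start, long long size) {
        struct blkpg_partition bp = {
                .start = start,
                .length = size,
                .pno = nr,
        };
        struct blkpg_ioctl_arg ba = {
                .op = BLKPG_ADD_PARTITION,
                .data = &bp,
                .datalen = sizeof(bp),
        };

        if (ioctl(fd, BLKPG, &ba) < 0) {
                /* EBUSY: the kernel (or someone else) created it first. */
                if (errno == EBUSY)
                        return 0;

                return -errno;
        }

        return 0;
}
```

|   Either way the caller can proceed synchronously, without waiting for a
|   uevent to announce the partition device.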
* gpt-auto: properly handle case where we can't determine devno of /usr/ fs | Lennart Poettering | 2022-02-14 | 1 | -2/+6
|   get_block_device_harder() returns == 0 if the fs is valid, but it is not
|   backed by a single devno. (As opposed to returning > 0 if the devno is
|   valid). Let's catch this case and log a clear message, and don't bother
|   opening the device in that case.
|
|   This is mostly cosmetic, as either way, systemd-gpt-auto-generator
|   doesn't work in scenarios like that.
|
|   Prompted-by: #22504
* Merge pull request #20257 from bluca/seqno | Luca Boccassi | 2021-08-31 | 1 | -0/+1
|\
| |   Use new diskseq block device property
| * dissect: use DISKSEQ when waiting for block devices | Luca Boccassi | 2021-07-28 | 1 | -0/+1
| |   DISKSEQ is a reliable way to find out if we missed a uevent or not, as
| |   it's monotonically increasing. If we parse an event with a smaller or
| |   no sequence number, we know we need to wait longer. If we parse an
| |   event with a greater sequence number, we know we missed it and the
| |   device was reused.
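| |   The comparison described above amounts to a three-way check. The sketch
| |   below spells it out; DiskseqVerdict and diskseq_check() are hypothetical
| |   names for this illustration, and how the sequence number is obtained
| |   from a uevent is left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum DiskseqVerdict {
        DISKSEQ_WAIT,   /* event is older than our attachment: keep waiting */
        DISKSEQ_MATCH,  /* event belongs to our use of the device */
        DISKSEQ_MISSED, /* event is newer: we missed ours, device was reused */
} DiskseqVerdict;

/* Compare the sequence number seen in a uevent against the one we expect.
 * 'have_seq' is false if the event carried no sequence number at all. */
static DiskseqVerdict diskseq_check(uint64_t expected, bool have_seq, uint64_t seen) {
        if (!have_seq || seen < expected)
                return DISKSEQ_WAIT;

        if (seen > expected)
                return DISKSEQ_MISSED;

        return DISKSEQ_MATCH;
}
```

| |   Because the counter only ever increases, a larger value can never turn
| |   back into a match, so the caller can stop waiting in that case.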
* | gpt-auto-generator: Use volatile-root by default and automatic logic as fallback | Kristian Klausen | 2021-08-31 | 1 | -29/+24
|/
|   Previously volatile-root was only checked if "/" wasn't backed by a block
|   device, but the block device isn't necessarily the original root block
|   device (ex: if the rootfs is copied to an ext4 fs backed by zram in the
|   initramfs), so we always want volatile-root checked.
|
|   So shuffle the code around so volatile-root is checked first and fallback
|   to the automatic logic.
|
|   Fix #20557
* Mount encrypted swap partitions via gpt-auto | Hugo Osvaldo Barrera | 2021-07-08 | 1 | -8/+18
|   If the auto-discovered swap partition is LUKS encrypted, decrypt it
|   automatically.
|
|   This aligns with the Discoverable Partitions Specification, though I've
|   also updated it to explicitly mention that LUKS is now supported here.
|
|   Since systemd retries any key already in the kernel keyring, if the swap
|   partition has the same passphrase as the root partition, the user won't
|   be prompted a second time for a second passphrase.
|
|   See https://github.com/systemd/systemd/issues/20019
* tree-wide: "a" -> "an" | Yu Watanabe | 2021-06-30 | 1 | -1/+1
|
* gpt-auto-generator: pull in systemd-growfs@.service if new GPT growfs partition flag is set | Lennart Poettering | 2021-04-23 | 1 | -5/+22
* dissect: ignore udev database entries from before the loopback attachment | Lennart Poettering | 2021-04-20 | 1 | -0/+1
|   This tries to shorten the race of device reuse a bit more: let's ignore
|   udev database entries that are older than the time where we started to
|   use a loopback device.
|
|   This doesn't fix the whole loopback device raciness mess, but it makes
|   the race window a bit shorter.
* dissect: ignore old uevents when waiting for loopback partition scan | Lennart Poettering | 2021-04-20 | 1 | -0/+1
|   Let's drop all monitor uevents that were enqueued before we actually
|   started setting up the device.
|
|   This doesn't fix the race, but it makes the race window smaller: since
|   we cannot determine the uevent seqnum and the loopback attachment
|   atomically, there's a tiny window where uevents might be generated by
|   the device which we mistake for being associated with our use of the
|   loopback device.
* gpt-auto-generator: don't generate systemd-cryptsetup@.service when --Dlibcryptsetup=false | gaoyi | 2021-04-09 | 1 | -0/+4
* tree-wide: make use of DISSECT_IMAGE_USR_NO_ROOT in various tools | Lennart Poettering | 2021-03-16 | 1 | -1/+7
|   Let's make use of the new dissection in all tools where this makes
|   sense, which are all tools that dissect images, except for those which
|   inherently operate on state/configuration and thus where an image
|   without state nor configuration is useless (e.g.
|   systemd-tmpfiles/systemd-firstboot/… --image= switch).