232 files changed, 5239 insertions, 3194 deletions
diff --git a/.github/codeql-queries/UninitializedVariableWithCleanup.ql b/.github/codeql-queries/UninitializedVariableWithCleanup.ql index e514111f28..dadc6cb1b5 100644 --- a/.github/codeql-queries/UninitializedVariableWithCleanup.ql +++ b/.github/codeql-queries/UninitializedVariableWithCleanup.ql @@ -20,7 +20,7 @@ import semmle.code.cpp.controlflow.StackVariableReachability * since they don't do anything illegal even when the variable is uninitialized */ predicate cleanupFunctionDenyList(string fun) { - fun = "erase_char" + fun = "erase_char" or fun = "erase_obj" } /** @@ -1,5 +1,42 @@ systemd System and Service Manager +CHANGES WITH 253 in spe: + + Changes in sd-boot, bootctl, and the Boot Loader Specification: + + * systemd-boot now passes its random seed directly to the kernel's RNG + via the LINUX_EFI_RANDOM_SEED_TABLE_GUID configuration table, which + means the RNG gets seeded very early in boot before userspace has + started. + + * systemd-boot will pass a random seed when secure boot is enabled if + it can additionally get a random seed from EFI itself, via EFI's RNG + protocol or a prior seed in LINUX_EFI_RANDOM_SEED_TABLE_GUID from a + preceding bootloader. + + * The random seed stored in ESP is now refreshed whenever + systemd-random-seed.service is run. + + * systemd-boot handles various seed inputs using a domain- and + field-separated hashing scheme. + + * systemd-boot's 'random-seed-mode' option has been removed. A system + token is now always required to be present for random seeds to be + used. + + * systemd-stub now processes random seeds in the same way as + systemd-boot, in case a unified kernel image is being used from a + different bootloader than systemd-boot. + + * bootctl will now generate a system token on all EFI systems, even + virtualized ones, and is activated in the case that the system token + is missing from either sd-boot and sd-stub booted systems. 
+ + Changes in systemctl: + + * systemctl reboot has dropped support for accepting a positional argument + as the argument to reboot(2) syscall. Please use --reboot-argument instead. + CHANGES WITH 252 🎃: Announcements of Future Feature Removals: @@ -80,6 +80,10 @@ Janitorial Clean-ups: * get rid of basename() and replace by path_extract_filename() +* Replace our fstype_is_network() with a call to libmount's mnt_fstype_is_netfs()? + Having two lists is not nice, but maybe it's now worth making a dependency on + libmount for something so trivial. + Deprecations and removals: * Remove any support for booting without /usr pre-mounted in the initrd entirely. @@ -115,20 +119,48 @@ Deprecations and removals: * H2 2023: remove support for unmerged-usr +* Remove /dev/mem ACPI FPDT parsing when /sys/firmware/acpi/fpdt is ubiquitous. + That requires distros to enable CONFIG_ACPI_FPDT, and have kernels v5.12 for + x86 and v6.2 for arm. + Features: +* pam_systemd_home: add module parameter to control whether to accept + only password or only pkcs11/fido2 auth, and then use this to hook nicely + into two of the three PAM stacks gdm provides. + See discussion at https://github.com/authselect/authselect/pull/311 + +* sd-boot: make boot loader spec type #1 accept http urls in "linux" + lines. Then, do the uefi http dance to download kernels and boot them. This + is then useful for network boot, by embedding a cpio with type #1 snippets + in sd-boot, which reference remote kernels. + +* fix systemd-gpt-auto-generator in case a UKI is spawned from XBOOTLDR without + sd-boot. In that case LoaderDevicePartUUID will point to the XBOOTLDR, and we + should then derive the root disk from that, and then the ESP/XBOOTLDR from + that. Right now we will only mount ESP if it matches LoaderDevicePartUUID + which isn't quite the same. + +* maybe prohibit setuid() to the nobody user, to lock things down, via seccomp.
+ the nobody is not a user any code should run under, ever, as that user would + possibly get a lot of access to resources it really shouldn't be getting + access to due to the userns + nfs semantics of the user. Alternatively: use + the seccomp log action, and allow it. + * sd-boot: add a new PE section .bls or so that carries a cpio with additional boot loader entries (both type1 and type2). Then when initializing, find this section, iterate through it and populate menu with it. cpio is simple enough to make a parser for this reasonably robust. use same path structures as in the ESP. Similar add one for signature key drop-ins. +* sd-boot: also allow passing in the cpio as in the previous item via SMBIOS + * add a new EFI tool "sd-fetch" or so. It looks in a PE section ".url" for an URL, then downloads the file from it using UEFI HTTP APIs, and executes it. Usecase: provide a minimal ESP with sd-boot and a couple of these sd-fetch binaries in place of UKIs, and download them on-the-fly. -* bootctl: warn if ESP is mounted world-readable (and in particular the seed) +* bootctl: warn if ESP is mounted world-readable (and in particular the seed). * maybe: systemd-loop-generator that sets up loopback devices if requested via kernel cmdline. usecase: include encrypted/verity root fs in UKI. @@ -138,6 +170,7 @@ Features: encrypted/verity root fs in UKI. * sd-stub: add ".bootcfg" section for kernel bootconfig data (as per + https://docs.kernel.org/admin-guide/bootconfig.html) * tpm2: add (optional) support for generating a local signing key from PCR 15 state. use private key part to sign PCR 7+14 policies. stash signatures for @@ -462,18 +495,6 @@ Features: * pick up creds from EFI vars -* sd-stub/sd-boot: write RNG seed to LINUX_EFI_RANDOM_SEED_TABLE_GUID config - table as well. (and possibly drop our efi var). Current kernels will pick up - the seed from there already, if EFI_RNG_PROTOCOL is not implemented by - firmware. 
- -* sd-boot: include domain specific hash string in hash function for random seed - plus sizes of everything. also include DMI/SMBIOS blob - -* sd-stub: invoke random seed logic the same way as in sd-boot, except if - random seed EFI variable is already set. That way, the variable set will be - set in all cases: if you just use sd-stub, or just sd-boot, or both. - * sd-boot: we probably should include all BootXY EFI variable defined boot entries in our menu, and then suppress ourselves. Benefit: instant compatibility with all other OSes which register things there, in particular @@ -751,13 +772,6 @@ Features: extending the command line to enable vsock on the VM, and using fw_cfg to configure socket address. -* sd-boot: rework random seed handling following recent kernel changes: always - pass seed to kernel, but credit only if secure boot is used - -* sd-boot: also include the hyperv "vm generation id" in the random seed hash, - to cover nicely for machine clones. It's found in the ACPI tables, which - should be easily accessible from UEFI. - * sd-boot: add menu item for shutdown? or hotkey? * sd-device has an API to create an sd_device object from a device id, but has diff --git a/docs/AUTOMATIC_BOOT_ASSESSMENT.md b/docs/AUTOMATIC_BOOT_ASSESSMENT.md index c2a53f48dc..91e2c5b094 100644 --- a/docs/AUTOMATIC_BOOT_ASSESSMENT.md +++ b/docs/AUTOMATIC_BOOT_ASSESSMENT.md @@ -9,7 +9,7 @@ SPDX-License-Identifier: LGPL-2.1-or-later systemd provides support for automatically reverting back to the previous version of the OS or kernel in case the system consistently fails to boot. The -[Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md#boot-counting) +[Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification/#boot-counting) describes how to annotate boot loader entries with a counter that specifies how many attempts should be made to boot it. This document describes how systemd implements this scheme. 
@@ -28,7 +28,7 @@ Here's a brief overview of the complete set of components: * The [`systemd-boot(7)`](https://www.freedesktop.org/software/systemd/man/systemd-boot.html) boot loader optionally maintains a per-boot-loader-entry counter described by - the [Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md#boot-counting) + the [Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification/#boot-counting) that is decreased by one on each attempt to boot the entry, prioritizing entries that have non-zero counters over those which already reached a counter of zero when choosing the entry to boot. @@ -60,7 +60,8 @@ Here's a brief overview of the complete set of components: ## Details -As described in [Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md#boot-counting), +As described in the +[Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification/#boot-counting), the boot counting data is stored in the file name of the boot loader entries as a plus (`+`), followed by a number, optionally followed by `-` and another number, right before the file name suffix (`.conf` or `.efi`). diff --git a/docs/BOOT_LOADER_INTERFACE.md b/docs/BOOT_LOADER_INTERFACE.md index fc9336085b..267fcc55a0 100644 --- a/docs/BOOT_LOADER_INTERFACE.md +++ b/docs/BOOT_LOADER_INTERFACE.md @@ -80,12 +80,6 @@ variables. All EFI variables use the vendor UUID * `1 << 5` → The boot loader supports looking for boot menu entries in the Extended Boot Loader Partition. * `1 << 6` → The boot loader supports passing a random seed to the OS. -* The EFI variable `LoaderRandomSeed` contains a binary random seed if set. It - is set by the boot loader to pass an entropy seed read from the ESP to the OS. - The system manager then credits this seed to the kernel's entropy pool. It is - the responsibility of the boot loader to ensure the quality and integrity of - the random seed. 
- * The EFI variable `LoaderSystemToken` contains binary random data, persistently set by the OS installer. Boot loaders that support passing random seeds to the OS should use this data and combine it with the random @@ -107,8 +101,7 @@ that directory is empty, and only if no other file systems are mounted there. The `systemctl reboot --boot-loader-entry=…` and `systemctl reboot --boot-loader-menu=…` commands rely on the `LoaderFeatures` , `LoaderConfigTimeoutOneShot`, `LoaderEntries`, `LoaderEntryOneShot` -variables. `LoaderRandomSeed` is read by PID during early boot and credited to -the kernel's random pool. +variables. ## Boot Loader Entry Identifiers @@ -119,10 +112,11 @@ the identifiers as passed in `LoaderEntries`, `LoaderEntryDefault`, `LoaderEntryOneShot`, `LoaderEntrySelected`, and possibly show nicely localized names for them in UIs. -1. When boot loader entries are defined through - [Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md) drop-in files - the identifier should be derived directly from the drop-in snippet name, but - with the `.conf` (or `.efi` in case of Type #2 entries) suffix removed. +1. When boot loader entries are defined through the + [Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification/) + files, the identifier should be derived directly from the file name, + but with the `.conf` (Type #1 snippets) or `.efi` (Type #2 images) + suffix removed. 2. Entries automatically discovered by the boot loader (as opposed to being configured in configuration files) should generally have an identifier @@ -135,7 +129,7 @@ names for them in UIs. discovered Windows installation might have the identifier `auto-windows` or `auto-windows-10` or so.). -4. Similar, boot menu entries referring to Apple macOS installations should +4. Similarly, boot menu entries referring to Apple macOS installations should use the identifier `osx` or one that is prefixed with `osx-`. 
If such an entry is automatically discovered by the boot loader use `auto-osx` as identifier, or `auto-osx-` as prefix for the identifier, see above. @@ -150,8 +144,8 @@ names for them in UIs. ## Links -[Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md)<br> -[Discoverable Partitions Specification](DISCOVERABLE_PARTITIONS.md)<br> +[Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification)<br> +[Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification)<br> [`systemd-boot(7)`](https://www.freedesktop.org/software/systemd/man/systemd-boot.html)<br> [`bootctl(1)`](https://www.freedesktop.org/software/systemd/man/bootctl.html)<br> [`systemd-gpt-auto-generator(8)`](https://www.freedesktop.org/software/systemd/man/systemd-gpt-auto-generator.html) diff --git a/docs/BUILDING_IMAGES.md b/docs/BUILDING_IMAGES.md index 955dd90e55..1a96ed0083 100644 --- a/docs/BUILDING_IMAGES.md +++ b/docs/BUILDING_IMAGES.md @@ -67,14 +67,14 @@ boot. For that it's essential to: The [`kernel-install(8)`](https://www.freedesktop.org/software/systemd/man/kernel-install.html) logic used to generate -[Boot Loader Specification Type 1](BOOT_LOADER_SPECIFICATION.md) entries by -default uses the machine ID as stored in `/etc/machine-id` for naming boot menu -entries and the directories in the ESP to place kernel images in. This is done -in order to allow multiple installations of the same OS on the same system -without conflicts. However, this is problematic if the machine ID shall be -generated automatically on first boot: if the ID is not known before the first -boot it cannot be used to name the most basic resources required for the boot -process to complete. 
+[Boot Loader Specification Type #1](https://uapi-group.org/specifications/specs/boot_loader_specification/#type-1-boot-loader-specification-entries) +entries by default uses the machine ID as stored in `/etc/machine-id` for +naming boot menu entries and the directories in the ESP to place kernel images +in. This is done in order to allow multiple installations of the same OS on the +same system without conflicts. However, this is problematic if the machine ID +shall be generated automatically on first boot: if the ID is not known before +the first boot it cannot be used to name the most basic resources required for +the boot process to complete. Thus, for images that shall acquire their identity on first boot only, it is required to use a different identifier for naming boot menu entries. To allow @@ -203,8 +203,8 @@ it, then format it. in. The `x-systemd.growfs` mount option in `/etc/fstab` is sufficient to enable this logic for specific mounts. Alternatively appropriately set up partitions can set GPT partition flag 59 to request this behaviour, see the - [Discoverable Partitions Specification](DISCOVERABLE_PARTITIONS.md) for - details. If the file system is already grown it executes no operation. + [Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification) + for details. If the file system is already grown it executes no operation. 3. Similar, the `systemd-makefs@.service` and `systemd-makeswap@.service` services can format file systems and swap spaces before first use, if they @@ -267,8 +267,8 @@ fields. 
[`machine-id(5)`](https://www.freedesktop.org/software/systemd/man/machine-id.html)<br> [`systemd-random-seed(8)`](https://www.freedesktop.org/software/systemd/man/systemd-random-seed.service.html)<br> [`os-release(5)`](https://www.freedesktop.org/software/systemd/man/os-release.html)<br> -[Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md)<br> -[Discoverable Partitions Specification](DISCOVERABLE_PARTITIONS.md)<br> +[Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification)<br> +[Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification)<br> [`mkosi`](https://github.com/systemd/mkosi)<br> [`systemd-boot(7)`](https://www.freedesktop.org/software/systemd/man/systemd-boot.html)<br> [`systemd-repart(8)`](https://www.freedesktop.org/software/systemd/man/systemd-repart.service.html)<br> diff --git a/docs/ENVIRONMENT.md b/docs/ENVIRONMENT.md index 61ad075085..01ee065583 100644 --- a/docs/ENVIRONMENT.md +++ b/docs/ENVIRONMENT.md @@ -188,12 +188,12 @@ All tools: file may be checked for by services run during system shutdown in order to request the appropriate operation from the boot loader in an alternative fashion. Note that by default only boot loader entries which follow the - [Boot Loader Specification](BOOT_LOADER_SPECIFICATION.md) and are - placed in the ESP or the Extended Boot Loader partition may be selected this - way. However, if a directory `/run/boot-loader-entries/` exists, the entries - are loaded from there instead. The directory should contain the usual - directory hierarchy mandated by the Boot Loader Specification, i.e. the entry - drop-ins should be placed in + [Boot Loader Specification](https://uapi-group.org/specifications/specs/boot_loader_specification) + and are placed in the ESP or the Extended Boot Loader partition may be + selected this way. 
However, if a directory `/run/boot-loader-entries/` + exists, the entries are loaded from there instead. The directory should + contain the usual directory hierarchy mandated by the Boot Loader + Specification, i.e. the entry drop-ins should be placed in `/run/boot-loader-entries/loader/entries/*.conf`, and the files referenced by the drop-ins (including the kernels and initrds) somewhere else below `/run/boot-loader-entries/`. Note that all these files may be (and are @@ -274,6 +274,15 @@ All tools: it is either set to `system` or `user` depending on whether the NSS/PAM module is called by systemd in `--system` or `--user` mode. +* `$SYSTEMD_SUPPORT_DEVICE`, `$SYSTEMD_SUPPORT_MOUNT`, `$SYSTEMD_SUPPORT_SWAP` - + can be set to `0` to mark respective unit type as unsupported. Generally, + having less units saves system resources so these options might be useful + for cases where we don't need to track given unit type, e.g. `--user` manager + often doesn't need to deal with device or swap units because they are + handled by the `--system` manager (PID 1). Note that setting certain unit + type as unsupported may not prevent loading some units of that type if they + are referenced by other units of another supported type. + `systemd-remount-fs`: * `$SYSTEMD_REMOUNT_ROOT_RW=1` — if set and no entry for the root directory @@ -384,7 +393,7 @@ disk images with `--image=` or similar: to load the embedded Verity signature data. If enabled (which is the default), Verity root hash information and a suitable signature is automatically acquired from a signature partition, following the - [Discoverable Partitions Specification](DISCOVERABLE_PARTITIONS.md). + [Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification). If disabled any such partition is ignored. Note that this only disables discovery of the root hash and its signature, the Verity data partition itself is still searched in the GPT image. 
@@ -473,7 +482,12 @@ SYSTEMD_HOME_DEBUG_SUFFIX=foo \ `systemd-journald`: -* `$SYSTEMD_JOURNAL_COMPACT` - Takes a boolean. If enabled, journal files are written +* `$SYSTEMD_JOURNAL_COMPACT` – Takes a boolean. If enabled, journal files are written in a more compact format that reduces the amount of disk space required by the journal. Note that journal files in compact mode are limited to 4G to allow use of 32-bit offsets. Enabled by default. + +`systemd-pcrphase`: + +* `$SYSTEMD_PCRPHASE_STUB_VERIFY` – Takes a boolean. If false the requested + measurement is done even if no EFI stub usage was reported via EFI variables. diff --git a/docs/PORTABLE_SERVICES.md b/docs/PORTABLE_SERVICES.md index 4f02ddb477..7a9c7f512d 100644 --- a/docs/PORTABLE_SERVICES.md +++ b/docs/PORTABLE_SERVICES.md @@ -169,7 +169,7 @@ requirements are made for an image that can be attached/detached with an image with a partition table understood by the Linux kernel with only a single partition defined, or alternatively, a GPT partition table with a set of properly marked partitions following the - [Discoverable Partitions Specification](DISCOVERABLE_PARTITIONS.md). + [Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification). 3. The image must at least contain one matching unit file, with the right name prefix and suffix (see above). The unit file is searched in the usual paths, diff --git a/docs/PORTING_TO_NEW_ARCHITECTURES.md b/docs/PORTING_TO_NEW_ARCHITECTURES.md index 5c61481486..1038336010 100644 --- a/docs/PORTING_TO_NEW_ARCHITECTURES.md +++ b/docs/PORTING_TO_NEW_ARCHITECTURES.md @@ -27,8 +27,8 @@ architecture. partitions. Use `systemd-id128 new -p` to generate new suitable UUIDs you can use for this. Make sure to register your new types in the various functions in `gpt.c`. Also make sure to update the tables in - `docs/DISCOVERABLE_PARTITIONS.md` and `man/systemd-gpt-auto-generator.xml` - accordingly. 
+ [Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification) + and `man/systemd-gpt-auto-generator.xml` accordingly. 3. If your architecture supports UEFI, make sure to update the `efi_arch` variable logic in `meson.build` to be set to the right architecture string diff --git a/docs/RANDOM_SEEDS.md b/docs/RANDOM_SEEDS.md index 3dc27f5552..4cb2bb9cfa 100644 --- a/docs/RANDOM_SEEDS.md +++ b/docs/RANDOM_SEEDS.md @@ -197,45 +197,41 @@ boot, in order to ensure the entropy pool is filled up quickly. generate sufficient data), to generate a new random seed file to store in the ESP as well as a random seed to pass to the OS kernel. The new random seed file for the ESP is then written to the ESP, ensuring this is completed - before the OS is invoked. Very early during initialization PID 1 will read - the random seed provided in the EFI variable and credit it fully to the - kernel's entropy pool. - - This mechanism is able to safely provide an initialized entropy pool already - in the `initrd` and guarantees that different seeds are passed from the boot - loader to the OS on every boot (in a way that does not allow regeneration of - an old seed file from a new seed file). Moreover, when an OS image is - replicated between multiple images and the random seed is not reset, this - will still result in different random seeds being passed to the OS, as the - per-machine 'system token' is specific to the physical host, and not - included in OS disk images. If the 'system token' is properly initialized - and kept sufficiently secret it should not be possible to regenerate the - entropy pool of different machines, even if this seed is the only source of - entropy. + before the OS is invoked. 
+ + The kernel then reads the random seed that the boot loader passes to it, via + the EFI configuration table entry, `LINUX_EFI_RANDOM_SEED_TABLE_GUID` + (1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b), which is allocated with pool memory + of type `EfiACPIReclaimMemory`. Its contents have the form: + ``` + struct linux_efi_random_seed { + u32 size; // of the 'seed' array in bytes + u8 seed[]; + }; + ``` + The size field is generally set to 32 bytes, and the seed field includes a + hashed representation of any prior seed in `LINUX_EFI_RANDOM_SEED_TABLE_GUID` + together with the new seed. + + This mechanism is able to safely provide an initialized entropy pool before + userspace even starts and guarantees that different seeds are passed from + the boot loader to the OS on every boot (in a way that does not allow + regeneration of an old seed file from a new seed file). Moreover, when an OS + image is replicated between multiple images and the random seed is not + reset, this will still result in different random seeds being passed to the + OS, as the per-machine 'system token' is specific to the physical host, and + not included in OS disk images. If the 'system token' is properly + initialized and kept sufficiently secret it should not be possible to + regenerate the entropy pool of different machines, even if this seed is the + only source of entropy. Note that the writes to the ESP needed to maintain the random seed should be - minimal. The size of the random seed file is directly derived from the Linux - kernel's entropy pool size, which defaults to 512 bytes. This means updating - the random seed in the ESP should be doable safely with a single sector - write (since hard-disk sectors typically happen to be 512 bytes long, too), - which should be safe even with FAT file system drivers built into + minimal. 
Because the size of the random seed file is generally set to 32 bytes, + updating the random seed in the ESP should be doable safely with a single + sector write (since hard-disk sectors typically happen to be 512 bytes long, + too), which should be safe even with FAT file system drivers built into low-quality EFI firmwares. - As a special restriction: in virtualized environments PID 1 will refrain - from using this mechanism, for safety reasons. This is because on VM - environments the EFI variable space and the disk space is generally not - maintained physically separate (for example, `qemu` in EFI mode stores the - variables in the ESP itself). The robustness towards sloppy OS image - generation is the main purpose of maintaining the 'system token' however, - and if the EFI variable storage is not kept physically separate from the OS - image there's no point in it. That said, OS builders that know that they are - not going to replicate the built image on multiple systems may opt to turn - off the 'system token' concept by setting `random-seed-mode always` in the - ESP's - [`/loader/loader.conf`](https://www.freedesktop.org/software/systemd/man/loader.conf.html) - file. If done, `systemd-boot` will use the random seed file even if no - system token is found in EFI variables. - 4. A kernel command line option `systemd.random_seed=` may be used to pass in a base64 encoded seed to initialize the kernel's entropy pool from during early service manager initialization. 
This option is only safe in testing diff --git a/hwdb.d/60-evdev.hwdb b/hwdb.d/60-evdev.hwdb index 42e30256e3..47e06737ba 100644 --- a/hwdb.d/60-evdev.hwdb +++ b/hwdb.d/60-evdev.hwdb @@ -338,6 +338,54 @@ evdev:name:Atmel maXTouch Touch*:dmi:bvn*:bvr*:bd*:svnGOOGLE:pnSamus:* EVDEV_ABS_36=::10 ######################################### +# Granite Devices Simucube wheel bases +######################################### + +# Granite Devices Simucube 1 +evdev:input:b0003v16D0p0D5A* + EVDEV_ABS_00=:::0:0 + EVDEV_ABS_01=:::0:0 + EVDEV_ABS_02=:::0:0 + EVDEV_ABS_03=:::0:0 + EVDEV_ABS_04=:::0:0 + EVDEV_ABS_05=:::0:0 + EVDEV_ABS_06=:::0:0 + EVDEV_ABS_07=:::0:0 + +# Granite Devices Simucube 2 Sport +evdev:input:b0003v16D0p0D61* + EVDEV_ABS_00=:::0:0 + EVDEV_ABS_01=:::0:0 + EVDEV_ABS_02=:::0:0 + EVDEV_ABS_03=:::0:0 + EVDEV_ABS_04=:::0:0 + EVDEV_ABS_05=:::0:0 + EVDEV_ABS_06=:::0:0 + EVDEV_ABS_07=:::0:0 + +# Granite Devices Simucube 2 Pro +evdev:input:b0003v16D0p0D60* + EVDEV_ABS_00=:::0:0 + EVDEV_ABS_01=:::0:0 + EVDEV_ABS_02=:::0:0 + EVDEV_ABS_03=:::0:0 + EVDEV_ABS_04=:::0:0 + EVDEV_ABS_05=:::0:0 + EVDEV_ABS_06=:::0:0 + EVDEV_ABS_07=:::0:0 + +# Granite Devices Simucube 2 Ultimate +evdev:input:b0003v16D0p0D5F* + EVDEV_ABS_00=:::0:0 + EVDEV_ABS_01=:::0:0 + EVDEV_ABS_02=:::0:0 + EVDEV_ABS_03=:::0:0 + EVDEV_ABS_04=:::0:0 + EVDEV_ABS_05=:::0:0 + EVDEV_ABS_06=:::0:0 + EVDEV_ABS_07=:::0:0 + +######################################### # HP ######################################### diff --git a/hwdb.d/60-keyboard.hwdb b/hwdb.d/60-keyboard.hwdb index 60af8fde4d..498a4c5f5e 100644 --- a/hwdb.d/60-keyboard.hwdb +++ b/hwdb.d/60-keyboard.hwdb @@ -296,6 +296,9 @@ evdev:atkbd:dmi:bvn*:bvr*:bd*:svnCompaq*:pn*Evo*N*:* KEYBOARD_KEY_9e=email KEYBOARD_KEY_9f=homepage +evdev:name:AT Translated Set 2 keyboard:dmi:bvn*:bvr*:svnCompaq:pn*:pvr*:rvn*:rnN14KP6* + KEYBOARD_KEY_76=f21 # Fn+f2 toggle touchpad + evdev:input:b0003v049Fp0051* evdev:input:b0003v049Fp008D* KEYBOARD_KEY_0c0011=presentation diff --git 
a/hwdb.d/70-av-production.hwdb b/hwdb.d/70-av-production.hwdb index 5df128f07e..f89f26eb6f 100644 --- a/hwdb.d/70-av-production.hwdb +++ b/hwdb.d/70-av-production.hwdb @@ -52,6 +52,10 @@ usb:v0FD9p006D* usb:v0FD9p0080* ID_AV_PRODUCTION_CONTROLLER=1 +# Stream Deck Pedal +usb:v0FD9p0086* + ID_AV_PRODUCTION_CONTROLLER=1 + ############################# # Hercules (Guillemot Corp) ############################# diff --git a/man/bootctl.xml b/man/bootctl.xml index dfc56d6125..3083f356e8 100644 --- a/man/bootctl.xml +++ b/man/bootctl.xml @@ -86,7 +86,7 @@ <title>Boot Loader Specification Commands</title> <para>These commands are available for all boot loaders that implement the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> and/or the <ulink + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> and/or the <ulink url="https://systemd.io/BOOT_LOADER_INTERFACE">Boot Loader Interface</ulink>, such as <command>systemd-boot</command>.</para> @@ -95,7 +95,7 @@ <term><option>list</option></term> <listitem><para>Shows all available boot loader entries implementing the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, as well as any + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, as well as any other entries discovered or automatically generated by a boot loader implementing the <ulink url="https://systemd.io/BOOT_LOADER_INTERFACE">Boot Loader Interface</ulink>. JSON output may be requested with <option>--json=</option>.</para> @@ -120,7 +120,7 @@ entry for all future boots, the current default boot loader entry for the next boot, and the currently booted boot loader entry. 
These special IDs are resolved to the current values of the EFI variables <varname>LoaderEntryDefault</varname>, <varname>LoaderEntryOneShot</varname> and <varname>LoaderEntrySelected</varname>, - see <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> for details. + see <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> for details. These special IDs are primarily useful as a quick way to persistently make the currently booted boot loader entry the default choice, or to upgrade the default boot loader entry for the next boot to the default boot loader entry for all future boots, but may be used for other operations too.</para> @@ -232,7 +232,7 @@ <varlistentry> <term><option>--boot-path=</option></term> <listitem><para>Path to the Extended Boot Loader partition, as defined in the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>. If not + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>. If not specified, <filename>/boot/</filename> is checked. It is recommended to mount the Extended Boot Loader partition to <filename>/boot/</filename>, if possible.</para></listitem> </varlistentry> @@ -252,7 +252,7 @@ are applied to file system in the indicated disk image. This option is similar to <option>--root=</option>, but operates on file systems stored in disk images or block devices. The disk image should either contain just a file system or a set of file systems within a GPT partition - table, following the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + table, following the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. 
For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> @@ -318,7 +318,7 @@ <varlistentry> <term><option>--make-entry-directory=yes|no</option></term> <listitem><para>Controls creation and deletion of the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> Type #1 entry + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #1 entry directory on the file system containing resources such as kernel and initrd images during <option>install</option> and <option>remove</option>, respectively. The directory is named after the entry token, as specified with <option>--entry-token=</option> parameter described below, and is @@ -529,7 +529,7 @@ Boot Loader Entries: <title>See Also</title> <para> <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry>, - <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, + <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, <ulink url="https://systemd.io/BOOT_LOADER_INTERFACE">Boot Loader Interface</ulink>, <citerefentry><refentrytitle>systemd-boot-system-token.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> </para> diff --git a/man/coredumpctl.xml b/man/coredumpctl.xml index 8002549f7d..79632eb2d4 100644 --- a/man/coredumpctl.xml +++ b/man/coredumpctl.xml @@ -262,7 +262,7 @@ are applied to file system in the indicated disk image. This option is similar to <option>--root=</option>, but operates on file systems stored in disk images or block devices. 
The disk image should either contain just a file system or a set of file systems within a GPT partition - table, following the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + table, following the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> diff --git a/man/hostnamectl.xml b/man/hostnamectl.xml index 6933c68e38..49bad01ded 100644 --- a/man/hostnamectl.xml +++ b/man/hostnamectl.xml @@ -175,7 +175,7 @@ <listitem><para>If <command>status</command> is invoked (or no explicit command is given) and one of these switches is specified, <command>hostnamectl</command> will print out just this selected hostname.</para> - <para>If used with <command>set-hostname</command>, only the selected hostnames will be updated. When more + <para>If used with <command>hostname</command>, only the selected hostnames will be updated. When more than one of these switches are specified, all the specified hostnames will be updated. 
</para></listitem> </varlistentry> diff --git a/man/journalctl.xml b/man/journalctl.xml index 5bf895fce4..d9ee51b302 100644 --- a/man/journalctl.xml +++ b/man/journalctl.xml @@ -18,7 +18,7 @@ <refnamediv> <refname>journalctl</refname> - <refpurpose>Query the systemd journal</refpurpose> + <refpurpose>Print log entries from the systemd journal</refpurpose> </refnamediv> <refsynopsisdiv> @@ -32,13 +32,14 @@ <refsect1> <title>Description</title> - <para><command>journalctl</command> may be used to query the contents of the - <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry> journal as - written by - <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>.</para> + <para><command>journalctl</command> is used to print the log entries stored in the journal by + <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> + and + <citerefentry><refentrytitle>systemd-journal-remote.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>. + </para> - <para>If called without parameters, it will show the full contents of the journal, starting with the - oldest entry collected.</para> + <para>If called without parameters, it will show the contents of the journal accessible to the calling + user, starting with the oldest entry collected.</para> <para>If one or more match arguments are passed, the output is filtered accordingly. A match is in the format <literal>FIELD=VALUE</literal>, e.g. <literal>_SYSTEMD_UNIT=httpd.service</literal>, referring to @@ -93,6 +94,13 @@ <para>When outputting to a tty, lines are colored according to priority: lines of level ERROR and higher are colored red; lines of level NOTICE and higher are highlighted; lines of level DEBUG are colored lighter grey; other lines are displayed normally.</para> + + <para>To write entries <emphasis>to</emphasis> the journal, a few methods may be used.
In general, output + from systemd units is automatically connected to the journal, see + <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>. + In addition, + <citerefentry><refentrytitle>systemd-cat</refentrytitle><manvolnum>1</manvolnum></citerefentry> + may be used to send messages to the journal directly.</para> </refsect1> <refsect1> @@ -168,7 +176,7 @@ option is similar to <option>--root=</option>, but operates on file systems stored in disk images or block devices, thus providing an easy way to extract log data from disk images. The disk image should either contain just a file system or a set of file systems within a GPT partition table, following - the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> @@ -930,6 +938,7 @@ journalctl _SYSTEMD_CGROUP=/user.slice/user-42.slice/session-c1.scope</programli <title>See Also</title> <para> <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd-cat</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>coredumpctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>, diff --git a/man/kernel-install.xml b/man/kernel-install.xml index b8ea2b16b2..f3fdc961f4 100644 --- a/man/kernel-install.xml +++ b/man/kernel-install.xml @@ -78,7 +78,7 @@ <programlisting>add 
<replaceable>KERNEL-VERSION</replaceable> <filename>$BOOT/<replaceable>ENTRY-TOKEN</replaceable>/<replaceable>KERNEL-VERSION</replaceable>/</filename> <replaceable>KERNEL-IMAGE</replaceable> [<replaceable>INITRD-FILE</replaceable> ...]</programlisting> <para>The third argument directly refers to the path where to place kernel images, initrd - images and other resources for <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot + images and other resources for <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #1 entries (the "entry directory"). If other boot loader schemes are used the parameter may be ignored. The <replaceable>ENTRY-TOKEN</replaceable> string is typically the machine ID and is supposed to identify the local installation on the system. For @@ -101,13 +101,21 @@ If <replaceable>INITRD-FILE</replaceable>s are provided, it also copies them to <filename>$BOOT/<replaceable>ENTRY-TOKEN</replaceable>/<replaceable>KERNEL_VERSION</replaceable>/<replaceable>INITRD-FILE</replaceable></filename>. It also creates a boot loader entry according to the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> (Type #1) in + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> (Type #1) in <filename>$BOOT/loader/entries/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>.conf</filename>. 
The title of the entry is the <replaceable>PRETTY_NAME</replaceable> parameter specified in <filename>/etc/os-release</filename> or <filename>/usr/lib/os-release</filename> (if the former is missing), or "Linux <replaceable>KERNEL-VERSION</replaceable>", if unset.</para> <para>If <varname>$KERNEL_INSTALL_LAYOUT</varname> is not "bls", this plugin does nothing.</para></listitem> + + <listitem><para><filename>90-uki-copy.install</filename> copies a file + <filename>uki.efi</filename> from <varname>$KERNEL_INSTALL_STAGING_AREA</varname> or if it does + not exist the <replaceable>KERNEL-IMAGE</replaceable> argument, iff it has a + <literal>.efi</literal> extension, to + <filename>$BOOT/EFI/Linux/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>.efi</filename>.</para> + + <para>If <varname>$KERNEL_INSTALL_LAYOUT</varname> is not "uki", this plugin does nothing.</para></listitem> </itemizedlist> </listitem> </varlistentry> @@ -132,6 +140,9 @@ <listitem><para><filename>90-loaderentry.install</filename> removes the file <filename>$BOOT/loader/entries/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>.conf</filename>.</para></listitem> + + <listitem><para><filename>90-uki-copy.install</filename> removes the file + <filename>$BOOT/EFI/Linux/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>.efi</filename>.</para></listitem> </itemizedlist> </listitem> </varlistentry> @@ -150,7 +161,7 @@ <refsect1> <title>The <varname>$BOOT</varname> partition</title> - <para>The partition where the kernels and <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot + <para>The partition where the kernels and <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> snippets are located is called <varname>$BOOT</varname>. 
<command>kernel-install</command> determines the location of this partition by checking <filename>/efi/</filename>, <filename>/boot/</filename>, and <filename>/boot/efi/</filename> in turn. The @@ -213,7 +224,7 @@ (EFI System Partition) are mounted, and also conceptually referred to as <varname>$BOOT</varname>. Can be overridden by setting <varname>$BOOT_ROOT</varname> (see below).</para> - <para><varname>$KERNEL_INSTALL_LAYOUT=bls|other|...</varname> is set for the plugins to specify the + <para><varname>$KERNEL_INSTALL_LAYOUT=bls|uki|other|...</varname> is set for the plugins to specify the installation layout. Defaults to <option>bls</option> if <filename>$BOOT/<replaceable>ENTRY-TOKEN</replaceable></filename> exists, or <option>other</option> otherwise. Additional layout names may be defined by convention. If a plugin uses a special layout, @@ -225,7 +236,7 @@ <varlistentry> <term>bls</term> <listitem> - <para>Standard <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader + <para>Standard <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #1 layout, compatible with <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry>: entries in @@ -236,6 +247,18 @@ </listitem> </varlistentry> <varlistentry> + <term>uki</term> + <listitem> + <para>Standard <ulink + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader + Specification</ulink> Type #2 layout, compatible with + <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry>: + unified kernel images under <filename>$BOOT/EFI/Linux</filename> as + <filename>$BOOT/EFI/Linux/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>[+<replaceable>TRIES</replaceable>].efi</filename>.</para> + <para>Implemented by <filename>90-uki-copy.install</filename>.</para> + </listitem> + </varlistentry> + 
<varlistentry> <term>other</term> <listitem> <para>Some other layout not understood natively by <command>kernel-install</command>.</para> @@ -312,12 +335,15 @@ <filename>/etc/kernel/tries</filename> </term> <listitem> - <para>Read by <filename>90-loaderentry.install</filename>. If this file exists a numeric value is read from - it and the naming of the generated entry file is slightly altered to include it as - <filename>$BOOT/loader/entries/<replaceable>MACHINE-ID</replaceable>-<replaceable>KERNEL-VERSION</replaceable>+<replaceable>TRIES</replaceable>.conf</filename>. This + <para>Read by <filename>90-loaderentry.install</filename> and + <filename>90-uki-copy.install</filename>. If this file exists a numeric value is read from it + and the naming of the generated entry file or UKI is slightly altered to include it as + <filename>$BOOT/loader/entries/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>+<replaceable>TRIES</replaceable>.conf</filename> + or + <filename>$BOOT/EFI/Linux/<replaceable>ENTRY-TOKEN</replaceable>-<replaceable>KERNEL-VERSION</replaceable>+<replaceable>TRIES</replaceable>.efi</filename>, respectively. This is useful for boot loaders such as - <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry> which - implement boot attempt counting with a counter embedded in the entry file name. + <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry> + which implement boot attempt counting with a counter embedded in the entry file name.
<varname>$KERNEL_INSTALL_CONF_ROOT</varname> may be used to override the path.</para> </listitem> </varlistentry> @@ -385,7 +411,7 @@ <citerefentry><refentrytitle>os-release</refentrytitle><manvolnum>5</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>depmod</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-boot</refentrytitle><manvolnum>7</manvolnum></citerefentry>, - <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> + <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> </para> </refsect1> diff --git a/man/loader.conf.xml b/man/loader.conf.xml index 7f173aec61..245f4c4536 100644 --- a/man/loader.conf.xml +++ b/man/loader.conf.xml @@ -36,7 +36,7 @@ <literal>.conf</literal> extension under <filename><replaceable>ESP</replaceable>/loader/entries/</filename> on the EFI system partition (ESP), and <filename><replaceable>XBOOTLDR</replaceable>/loader/entries/</filename> on the extended boot loader - partition (XBOOTLDR) as defined by <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader + partition (XBOOTLDR) as defined by <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>. 
</para> @@ -57,7 +57,7 @@ <para>The configuration options supported by <filename><replaceable>ESP</replaceable>/loader/entries/*.conf</filename> and <filename><replaceable>XBOOTLDR</replaceable>/loader/entries/*.conf</filename> files are defined as part - of the <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader + of the <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>.</para> <para>The following configuration are supported by the <filename>loader.conf</filename> configuration @@ -309,25 +309,6 @@ sign-efi-sig-list -c KEK.crt -k KEK.key db db.esl db.auth encrypted drive to change. If PCR 4 is not measured, this setting can be disabled to speed up booting into Windows.</para></listitem> </varlistentry> - - <varlistentry> - <term>random-seed-mode</term> - - <listitem><para>Takes one of <literal>off</literal>, <literal>with-system-token</literal> and - <literal>always</literal>. If <literal>off</literal> no random seed data is read off the ESP, nor - passed to the OS. If <literal>with-system-token</literal> (the default) - <command>systemd-boot</command> will read a random seed from the ESP (from the file - <filename>/loader/random-seed</filename>) only if the <varname>LoaderSystemToken</varname> EFI - variable is set, and then derive the random seed to pass to the OS from the combination. If - <literal>always</literal> the boot loader will do so even if <varname>LoaderSystemToken</varname> is - not set. This mode is useful in environments where protection against OS image reuse is not a - concern, and the random seed shall be used even with no further setup in place. 
Use <command>bootctl - random-seed</command> to initialize both the random seed file in the ESP and the system token EFI - variable.</para> - - <para>See <ulink url="https://systemd.io/RANDOM_SEEDS">Random Seeds</ulink> for further - information.</para></listitem> - </varlistentry> </variablelist> </refsect1> diff --git a/man/org.freedesktop.systemd1.xml b/man/org.freedesktop.systemd1.xml index 7dbf98defd..5ebb093082 100644 --- a/man/org.freedesktop.systemd1.xml +++ b/man/org.freedesktop.systemd1.xml @@ -2707,6 +2707,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice { @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -3278,6 +3280,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -3858,6 +3862,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> @@ -4595,6 +4601,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket { @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t 
MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -5190,6 +5198,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -5764,6 +5774,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2esocket { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> @@ -6390,6 +6402,8 @@ node /org/freedesktop/systemd1/unit/home_2emount { @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -6913,6 +6927,8 @@ node /org/freedesktop/systemd1/unit/home_2emount { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -7405,6 +7421,8 @@ node /org/freedesktop/systemd1/unit/home_2emount { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> @@ -8158,6 +8176,8 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap { 
@org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -8667,6 +8687,8 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -9145,6 +9167,8 @@ node /org/freedesktop/systemd1/unit/dev_2dsda3_2eswap { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> @@ -9757,6 +9781,8 @@ node /org/freedesktop/systemd1/unit/system_2eslice { @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -9908,6 +9934,8 @@ node /org/freedesktop/systemd1/unit/system_2eslice { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -10066,6 +10094,8 @@ node /org/freedesktop/systemd1/unit/system_2eslice { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" 
extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> @@ -10248,6 +10278,8 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope { @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemorySwapMax = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") + readonly t MemoryZSwapMax = ...; + @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly t MemoryLimit = ...; @org.freedesktop.DBus.Property.EmitsChangedSignal("false") readonly s DevicePolicy = '...'; @@ -10419,6 +10451,8 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope { <!--property MemorySwapMax is not documented!--> + <!--property MemoryZSwapMax is not documented!--> + <!--property MemoryLimit is not documented!--> <!--property DevicePolicy is not documented!--> @@ -10607,6 +10641,8 @@ node /org/freedesktop/systemd1/unit/session_2d1_2escope { <variablelist class="dbus-property" generated="True" extra-ref="MemorySwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryZSwapMax"/> + <variablelist class="dbus-property" generated="True" extra-ref="MemoryLimit"/> <variablelist class="dbus-property" generated="True" extra-ref="DevicePolicy"/> diff --git a/man/repart.d.xml b/man/repart.d.xml index ebbb31cc20..7e19ab7e0c 100644 --- a/man/repart.d.xml +++ b/man/repart.d.xml @@ -237,7 +237,7 @@ <para>This setting defaults to <constant>linux-generic</constant>.</para> <para>Most of the partition type UUIDs listed above are defined in the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para></listitem> </varlistentry> @@ -542,7 +542,7 @@ <listitem><para>Configures the No-Auto, Read-Only and Grow-File-System partition flags (bit 63, 60 and 
59) of the partition table entry, as defined by the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>. Only + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. Only available for partition types supported by the specification. This option is a friendly way to set bits 63, 60 and 59 of the partition flags value without setting any of the other bits, and may be set via <varname>Flags=</varname> too, see above.</para> @@ -581,6 +581,17 @@ below. Defaults to <literal>%t</literal>. To disable split artifact generation for a partition, set <varname>SplitName=</varname> to <literal>-</literal>.</para></listitem> </varlistentry> + + <varlistentry> + <term><varname>Minimize=</varname></term> + + <listitem><para>Takes a boolean. Disabled by default. If enabled, the partition is created at least + as big as required for the minimal file system of the type specified by <varname>Format=</varname>, + taking into account the sources configured with <varname>CopyFiles=</varname>. Note that unless the + filesystem is a read-only filesystem, <command>systemd-repart</command> will have to populate the + filesystem twice, so enabling this option might slow down repart when populating large partitions. + </para></listitem> + </varlistentry> </variablelist> </refsect1> diff --git a/man/resolvectl.xml b/man/resolvectl.xml index 2cb855c360..c966ca67bd 100644 --- a/man/resolvectl.xml +++ b/man/resolvectl.xml @@ -323,11 +323,12 @@ <listitem><para>Takes a boolean parameter; used in conjunction with <command>query</command>. If true (the default), select domains are resolved on the local system, among them - <literal>localhost</literal>, <literal>_gateway</literal> and <literal>_outbound</literal>, or - entries from <filename>/etc/hosts</filename>. 
If false these domains are not resolved locally, and - either fail (in case of <literal>localhost</literal>, <literal>_gateway</literal> or - <literal>_outbound</literal> and suchlike) or go to the network via regular DNS/mDNS/LLMNR lookups - (in case of <filename>/etc/hosts</filename> entries).</para></listitem> + <literal>localhost</literal>, <literal>_gateway</literal>, <literal>_outbound</literal>, + <literal>_localdnsstub</literal> and <literal>_localdnsproxy</literal> or entries from + <filename>/etc/hosts</filename>. If false these domains are not resolved locally, and either fail (in + case of <literal>localhost</literal>, <literal>_gateway</literal> or <literal>_outbound</literal> and + suchlike) or go to the network via regular DNS/mDNS/LLMNR lookups (in case of + <filename>/etc/hosts</filename> entries).</para></listitem> </varlistentry> <varlistentry> diff --git a/man/rules/meson.build b/man/rules/meson.build index bb7799036d..ac4196e548 100644 --- a/man/rules/meson.build +++ b/man/rules/meson.build @@ -248,6 +248,8 @@ manpages = [ 'sd_bus_emit_object_removed', 'sd_bus_emit_properties_changed', 'sd_bus_emit_properties_changed_strv', + 'sd_bus_emit_signal_to', + 'sd_bus_emit_signal_tov', 'sd_bus_emit_signalv'], ''], ['sd_bus_enqueue_for_read', '3', [], ''], @@ -348,7 +350,7 @@ manpages = [ 'sd_bus_message_new_method_errnof', 'sd_bus_message_new_method_errorf'], ''], - ['sd_bus_message_new_signal', '3', [], ''], + ['sd_bus_message_new_signal', '3', ['sd_bus_message_new_signal_to'], ''], ['sd_bus_message_open_container', '3', ['sd_bus_message_close_container', diff --git a/man/sd_bus_default.xml b/man/sd_bus_default.xml index f4b1d6a791..48d9c9a108 100644 --- a/man/sd_bus_default.xml +++ b/man/sd_bus_default.xml @@ -182,7 +182,7 @@ <para><function>sd_bus_open_system_remote()</function> connects to the system bus on the specified host using - <citerefentry project='die-net'><refentrytitle>ssh</refentrytitle><manvolnum>1</manvolnum></citerefentry>. 
+ <citerefentry project='man-pages'><refentrytitle>ssh</refentrytitle><manvolnum>1</manvolnum></citerefentry>. <parameter>host</parameter> consists of an optional user name followed by the <literal>@</literal> symbol, and the hostname, optionally followed by a <literal>:</literal> and a port, optionally followed by a @@ -339,7 +339,7 @@ <citerefentry><refentrytitle>sd_bus_ref</refentrytitle><manvolnum>3</manvolnum></citerefentry>, <citerefentry><refentrytitle>sd_bus_unref</refentrytitle><manvolnum>3</manvolnum></citerefentry>, <citerefentry><refentrytitle>sd_bus_close</refentrytitle><manvolnum>3</manvolnum></citerefentry>, - <citerefentry project='die-net'><refentrytitle>ssh</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry project='man-pages'><refentrytitle>ssh</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-machined.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry> </para> diff --git a/man/systemctl.xml b/man/systemctl.xml index 997925892d..1d91c8a726 100644 --- a/man/systemctl.xml +++ b/man/systemctl.xml @@ -2208,7 +2208,7 @@ Jan 12 10:46:45 example.com bluetoothd[8900]: gatt-time-server: Input/output err are applied to file system in the indicated disk image. This option is similar to <option>--root=</option>, but operates on file systems stored in disk images or block devices. The disk image should either contain just a file system or a set of file systems within a GPT partition - table, following the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + table, following the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. 
For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> diff --git a/man/systemd-bless-boot.service.xml b/man/systemd-bless-boot.service.xml index 53a58b3a1c..484f072352 100644 --- a/man/systemd-bless-boot.service.xml +++ b/man/systemd-bless-boot.service.xml @@ -39,7 +39,7 @@ <para>Internally, the service operates based on the <varname>LoaderBootCountPath</varname> EFI variable (of the vendor UUID <constant>4a67b082-0a4c-41cf-b6c7-440b29bb8c4</constant>), which is passed from the boot loader to the OS. It contains a file system path (relative to the EFI system partition) of the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> compliant boot loader entry + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> compliant boot loader entry file or unified kernel image file that was used to boot up the system. <command>systemd-bless-boot.service</command> removes the two 'tries done' and 'tries left' numeric boot counters from the filename, which indicates to future invocations of the boot loader that the entry has completed diff --git a/man/systemd-boot.xml b/man/systemd-boot.xml index 0eee532f90..6d99520036 100644 --- a/man/systemd-boot.xml +++ b/man/systemd-boot.xml @@ -39,12 +39,12 @@ <itemizedlist> <listitem><para>Boot entries defined with <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> Type #1 + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #1 description files located in <filename>/loader/entries/</filename> on the ESP and the Extended Boot Loader Partition. 
These usually describe Linux kernel images with associated initrd images, but alternatively may also describe other arbitrary EFI executables.</para></listitem> - <listitem><para>Unified kernel images, <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot + <listitem><para>Unified kernel images, <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #2, which are executable EFI binaries in <filename>/EFI/Linux/</filename> on the ESP and the Extended Boot Loader Partition.</para></listitem> @@ -304,11 +304,11 @@ <citerefentry><refentrytitle>loader.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> <para>Boot entry description files following the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> are read from + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> are read from <filename>/loader/entries/</filename> on the ESP and the Extended Boot Loader partition.</para> <para>Unified kernel boot entries following the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink> are read from + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> are read from <filename>/EFI/Linux/</filename> on the ESP and the Extended Boot Loader partition.</para> <para>Optionally, a random seed for early boot entropy pool provisioning is stored in @@ -436,28 +436,6 @@ </varlistentry> <varlistentry> - <term><varname>LoaderRandomSeed</varname></term> - - <listitem><para>A binary random seed <command>systemd-boot</command> may optionally pass to the - OS. 
This is a volatile EFI variable that is hashed at boot from the combination of a random seed - stored in the ESP (in <filename>/loader/random-seed</filename>) and a "system token" persistently - stored in the EFI variable <varname>LoaderSystemToken</varname> (see below). During early OS boot the - system manager reads this variable and passes it to the OS kernel's random pool, crediting the full - entropy it contains. This is an efficient way to ensure the system starts up with a fully initialized - kernel random pool — as early as the initrd phase. <command>systemd-boot</command> reads - the random seed from the ESP, combines it with the "system token", and both derives a new random seed - to update in-place the seed stored in the ESP, and the random seed to pass to the OS from it via - SHA256 hashing in counter mode. This ensures that different physical systems that boot the same - "golden" OS image — i.e. containing the same random seed file in the ESP — will still pass a - different random seed to the OS. 
It is made sure the random seed stored in the ESP is fully - overwritten before the OS is booted, to ensure different random seed data is used between subsequent - boots.</para> - - <para>See <ulink url="https://systemd.io/RANDOM_SEEDS">Random Seeds</ulink> for - further information.</para></listitem> - </varlistentry> - - <varlistentry> <term><varname>LoaderSystemToken</varname></term> <listitem><para>A binary random data field, that is used for generating the random seed to pass to @@ -474,7 +452,7 @@ <title>Boot Counting</title> <para><command>systemd-boot</command> implements a simple boot counting mechanism on top of the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, for automatic and unattended + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, for automatic and unattended fallback to older kernel versions/boot loader entries when a specific entry continuously fails. Any boot loader entry file and unified kernel image file that contains a <literal>+</literal> followed by one or two numbers (if two they need to be separated by a <literal>-</literal>), before the <filename>.conf</filename> or @@ -526,6 +504,23 @@ </refsect1> <refsect1> + <title>Using systemd-boot in virtual machines.</title> + + <para>When using qemu with OVMF (UEFI Firmware for virtual machines) the <option>-kernel</option> switch + works not only for linux kernels, but for any EFI binary, including sd-boot and unified linux + kernels. Example command line for loading sd-boot on x64:</para> + + <para> + <command>qemu-system-x86_64 <replaceable>[ ... 
]</replaceable> + -kernel /usr/lib/systemd/boot/efi/systemd-bootx64.efi</command> + </para> + + <para>systemd-boot will detect that it was started directly instead of being loaded from ESP and will + search for the ESP in that case, taking into account boot order information from the hypervisor (if + available).</para> + </refsect1> + + <refsect1> <title>See Also</title> <para> <citerefentry><refentrytitle>bootctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>, @@ -534,7 +529,7 @@ <citerefentry><refentrytitle>systemd-boot-system-token.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry><refentrytitle>kernel-install</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-stub</refentrytitle><manvolnum>7</manvolnum></citerefentry>, - <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, + <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, <ulink url="https://systemd.io/BOOT_LOADER_INTERFACE">Boot Loader Interface</ulink> </para> </refsect1> diff --git a/man/systemd-dissect.xml b/man/systemd-dissect.xml index b940408267..2eb8972fee 100644 --- a/man/systemd-dissect.xml +++ b/man/systemd-dissect.xml @@ -74,7 +74,7 @@ <orderedlist> <listitem><para>OS disk images containing a GPT partition table envelope, with partitions marked - according to the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + according to the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para></listitem> <listitem><para>OS disk images containing just a plain file-system without an enveloping partition @@ -115,7 +115,7 @@ <listitem><para>Mount the specified OS image to the specified directory. 
This will dissect the image, determine the OS root file system — as well as possibly other partitions — and mount them to the specified directory. If the OS image contains multiple partitions marked with the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink> + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink> multiple nested mounts are established. This command expects two arguments: a path to an image file and a path to a directory where to mount the image.</para> @@ -270,7 +270,7 @@ <option>--mount</option> or <option>--copy-to</option>) the file systems contained in the OS image are automatically grown to their partition sizes, if bit 59 in the GPT partition flags is set for partition types that are defined by the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>. This + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. This behavior may be switched off using <option>--growfs=no</option>. File systems are grown automatically on access if all of the following conditions are met:</para> <orderedlist> @@ -322,7 +322,7 @@ <option>--verity-data=</option> specifies a path to a file with the Verity data to use for the OS image, in case it is stored in a detached file. It is recommended to embed the Verity data directly in the image, using the Verity mechanisms in the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>. + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. 
</para></listitem> </varlistentry> @@ -356,7 +356,7 @@ <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry>, - <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>, + <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>, <citerefentry project='man-pages'><refentrytitle>umount</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>fdisk</refentrytitle><manvolnum>8</manvolnum></citerefentry> </para> diff --git a/man/systemd-firstboot.xml b/man/systemd-firstboot.xml index 66d829941b..3f01836ddd 100644 --- a/man/systemd-firstboot.xml +++ b/man/systemd-firstboot.xml @@ -104,7 +104,7 @@ are applied to file system in the indicated disk image. This is similar to <option>--root=</option> but operates on file systems stored in disk images or block devices. The disk image should either contain just a file system or a set of file systems within a GPT partition table, following the - <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. 
For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> diff --git a/man/systemd-gpt-auto-generator.xml b/man/systemd-gpt-auto-generator.xml index 8ad249ec5d..3b166b87f9 100644 --- a/man/systemd-gpt-auto-generator.xml +++ b/man/systemd-gpt-auto-generator.xml @@ -34,7 +34,7 @@ <filename>/var/tmp/</filename>, the EFI System Partition, the Extended Boot Loader Partition and swap partitions and creates mount and swap units for them, based on the partition type GUIDs of GUID partition tables (GPT), see <ulink url="https://uefi.org/specifications">UEFI Specification</ulink>, chapter 5. It - implements the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + implements the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. Note that this generator has no effect on non-GPT systems, and on specific mount points that are directories already containing files. Also, on systems where the units are explicitly configured (for example, listed in <citerefentry @@ -97,7 +97,7 @@ </entry> <entry>root partitions for other architectures</entry> <entry><filename>/</filename></entry> - <entry>The first partition with the type UUID matching the architecture, located on the same disk as the ESP, is used as the root file system <filename>/</filename>. For the full list and constant values, see <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>.</entry> + <entry>The first partition with the type UUID matching the architecture, located on the same disk as the ESP, is used as the root file system <filename>/</filename>. 
For the full list and constant values, see <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</entry> </row> <row> <entry><constant>SD_GPT_HOME</constant> <constant>933ac7e1-2eb4-4f13-b844-0e14e2aef915</constant></entry> @@ -211,7 +211,7 @@ are generated.</para> <para>If the disk contains an Extended Boot Loader partition, as defined in the <ulink - url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, it is made + url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, it is made available at <filename>/boot/</filename> (by means of an automount point, similar to the ESP, see above). If both an EFI System Partition and an Extended Boot Loader partition exist the latter is preferably mounted to <filename>/boot/</filename>. Make sure to create both <filename>/efi/</filename> diff --git a/man/systemd-nspawn.xml b/man/systemd-nspawn.xml index 16e2286ed0..053efdb807 100644 --- a/man/systemd-nspawn.xml +++ b/man/systemd-nspawn.xml @@ -288,7 +288,7 @@ a server data partition which are mounted to the appropriate places in the container. 
All these partitions must be identified by the partition types defined by the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para></listitem> <listitem><para>No partition table, and a single file system spanning the whole image.</para></listitem> diff --git a/man/systemd-repart.xml b/man/systemd-repart.xml index 3585cbf107..2c74afbe0f 100644 --- a/man/systemd-repart.xml +++ b/man/systemd-repart.xml @@ -363,6 +363,30 @@ if <option>--split</option> is enabled.</para></listitem> </varlistentry> + <varlistentry> + <term><option>--include-partitions=</option><arg rep="repeat">PARTITION</arg></term> + <term><option>--exclude-partitions=</option><arg rep="repeat">PARTITION</arg></term> + + <listitem><para>These options specify which partition types <command>systemd-repart</command> should + operate on. If <option>--include-partitions=</option> is used, all partitions that aren't specified + are excluded. If <option>--exclude-partitions=</option> is used, all partitions that are specified + are excluded. Both options take a comma separated list of GPT partition type UUIDs or identifiers + (see <varname>Type=</varname> in + <citerefentry><refentrytitle>repart.d</refentrytitle><manvolnum>5</manvolnum></citerefentry>). + </para></listitem> + </varlistentry> + + <varlistentry> + <term><option>--skip-partitions=</option><arg rep="repeat">PARTITION</arg></term> + + <listitem><para>This option specifies which partition types <command>systemd-repart</command> should + skip. All partitions that are skipped using this option are still taken into account when calculating + the sizes and offsets of other partitions, but aren't actually written to the disk image. 
The net + effect of this option is that if you run <command>systemd-repart</command> again without these + options, the missing partitions will be added as if they had not been skipped the first time + <command>systemd-repart</command> was executed.</para></listitem> + </varlistentry> + <xi:include href="standard-options.xml" xpointer="help" /> <xi:include href="standard-options.xml" xpointer="version" /> <xi:include href="standard-options.xml" xpointer="no-pager" /> diff --git a/man/systemd-resolved.service.xml b/man/systemd-resolved.service.xml index 7f30fa6536..c006c03b53 100644 --- a/man/systemd-resolved.service.xml +++ b/man/systemd-resolved.service.xml @@ -118,6 +118,12 @@ local default gateway configured. This assigns a stable hostname to the local outbound IP addresses, useful for referencing them independently of the current network configuration state.</para></listitem> + <listitem><para>The hostname <literal>_localdnsstub</literal> is resolved to the IP address 127.0.0.53, + i.e. the address the local DNS stub (see above) is listening on.</para></listitem> + + <listitem><para>The hostname <literal>_localdnsproxy</literal> is resolved to the IP address 127.0.0.54, + i.e. the address the local DNS proxy (see above) is listening on.</para></listitem> + <listitem><para>The mappings defined in <filename>/etc/hosts</filename> are resolved to their configured addresses and back, but they will not affect lookups for non-address types (like MX). 
Support for <filename>/etc/hosts</filename> may be disabled with <varname>ReadEtcHosts=no</varname>, diff --git a/man/systemd-stub.xml b/man/systemd-stub.xml index 415d663f53..fcb0c24ce8 100644 --- a/man/systemd-stub.xml +++ b/man/systemd-stub.xml @@ -430,7 +430,7 @@ <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-creds</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry><refentrytitle>systemd-sysext</refentrytitle><manvolnum>8</manvolnum></citerefentry>, - <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader Specification</ulink>, + <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink>, <ulink url="https://systemd.io/BOOT_LOADER_INTERFACE">Boot Loader Interface</ulink>, <citerefentry project='man-pages'><refentrytitle>objcopy</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <citerefentry project='archlinux'><refentrytitle>sbsign</refentrytitle><manvolnum>1</manvolnum></citerefentry>, diff --git a/man/systemd-sysext.xml b/man/systemd-sysext.xml index aa0d42d83c..1de1627850 100644 --- a/man/systemd-sysext.xml +++ b/man/systemd-sysext.xml @@ -72,7 +72,7 @@ <orderedlist> <listitem><para>Plain directories or btrfs subvolumes containing the OS tree</para></listitem> <listitem><para>Disk images with a GPT disk label, following the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink></para></listitem> + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink></para></listitem> <listitem><para>Disk images lacking a partition table, with a naked Linux file system (e.g. 
squashfs or ext4)</para></listitem> </orderedlist> diff --git a/man/systemd-sysusers.xml b/man/systemd-sysusers.xml index b399b3b04c..aba275024f 100644 --- a/man/systemd-sysusers.xml +++ b/man/systemd-sysusers.xml @@ -74,7 +74,7 @@ are applied to file system in the indicated disk image. This is similar to <option>--root=</option> but operates on file systems stored in disk images or block devices. The disk image should either contain just a file system or a set of file systems within a GPT partition table, following the - <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para></listitem> diff --git a/man/systemd-tmpfiles.xml b/man/systemd-tmpfiles.xml index 92ab322ba0..c2e32f9f3d 100644 --- a/man/systemd-tmpfiles.xml +++ b/man/systemd-tmpfiles.xml @@ -192,7 +192,7 @@ are applied to file system in the indicated disk image. This is similar to <option>--root=</option> but operates on file systems stored in disk images or block devices. The disk image should either contain just a file system or a set of file systems within a GPT partition table, following the - <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>. 
For further information on supported disk images, see <citerefentry><refentrytitle>systemd-nspawn</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s switch of the same name.</para> diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml index 29666b102b..d003ab1838 100644 --- a/man/systemd.exec.xml +++ b/man/systemd.exec.xml @@ -156,7 +156,7 @@ or loopback file instead of a directory. The device node or file system image file needs to contain a file system without a partition table, or a file system within an MBR/MS-DOS or GPT partition table with only a single Linux-compatible partition, or a set of file systems within a GPT partition table - that follows the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + that follows the <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para> <para>When <varname>DevicePolicy=</varname> is set to <literal>closed</literal> or @@ -188,7 +188,7 @@ </para> <para>Valid partition names follow the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>: + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>: <constant>root</constant>, <constant>usr</constant>, <constant>home</constant>, <constant>srv</constant>, <constant>esp</constant>, <constant>xbootldr</constant>, <constant>tmp</constant>, <constant>var</constant>.</para> @@ -255,7 +255,7 @@ <para>This option is supported only for disk images that contain a single file system, without an enveloping partition table. 
Images that contain a GPT partition table should instead include both root file system and matching Verity data in the same image, implementing the <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>.</para> + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para> <xi:include href="system-only.xml" xpointer="singular"/></listitem> </varlistentry> diff --git a/man/systemd.network.xml b/man/systemd.network.xml index 9c11c5c3dd..6a7ab696a3 100644 --- a/man/systemd.network.xml +++ b/man/systemd.network.xml @@ -4026,6 +4026,27 @@ Token=prefixstable:2002:da8:1::</programlisting></para> </listitem> </varlistentry> + <varlistentry> + <term><varname>RTTSec=</varname></term> + <listitem> + <para>Specifies the RTT for the filter. Takes a timespan. Typical values are e.g. 100us for + extremely high-performance 10GigE+ networks like datacentre, 1ms for non-WiFi LAN connections, + 100ms for typical internet connections. Defaults to unset, and the kernel's default will be used. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>AckFilter=</varname></term> + <listitem> + <para>Takes a boolean value, or special value <literal>aggressive</literal>. If enabled, ACKs in + each flow are queued and redundant ACKs to the upstream are dropped. If yes, the filter will always + keep at least two redundant ACKs in the queue, while in <literal>aggressive</literal> mode, it will + filter down to a single ACK. This may improve download throughput on links with very asymmetrical + rate limits. 
Defaults to unset, and the kernel's default will be used.</para> + </listitem> + </varlistentry> + </variablelist> </refsect1> diff --git a/man/systemd.resource-control.xml b/man/systemd.resource-control.xml index 2a0e40a17d..fe875a81c3 100644 --- a/man/systemd.resource-control.xml +++ b/man/systemd.resource-control.xml @@ -331,13 +331,31 @@ <para>Takes a swap size in bytes. If the value is suffixed with K, M, G or T, the specified swap size is parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. If assigned the - special value <literal>infinity</literal>, no swap limit is applied. This controls the + special value <literal>infinity</literal>, no swap limit is applied. These settings control the <literal>memory.swap.max</literal> control group attribute. For details about this control group attribute, see <ulink url="https://docs.kernel.org/admin-guide/cgroup-v2.html#memory-interface-files">Memory Interface Files</ulink>.</para> </listitem> </varlistentry> <varlistentry> + <term><varname>MemoryZSwapMax=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the absolute limit on zswap usage of the processes in this unit. Zswap is a lightweight compressed + cache for swap pages. It takes pages that are in the process of being swapped out and attempts to compress them into a + dynamically allocated RAM-based memory pool. If the limit specified is hit, no entries from this unit will be + stored in the pool until existing entries are faulted back or written out to disk. See the kernel's + <ulink url="https://www.kernel.org/doc/html/latest/admin-guide/mm/zswap.html">Zswap</ulink> documentation for more details.</para> + + <para>Takes a size in bytes. If the value is suffixed with K, M, G or T, the specified size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. If assigned the + special value <literal>infinity</literal>, no limit is applied. 
These settings control the + <literal>memory.zswap.max</literal> control group attribute. For details about this control group attribute, + see <ulink url="https://docs.kernel.org/admin-guide/cgroup-v2.html#memory-interface-files">Memory Interface Files</ulink>.</para> + </listitem> + </varlistentry> + + <varlistentry> <term><varname>TasksAccounting=</varname></term> <listitem> diff --git a/man/sysupdate.d.xml b/man/sysupdate.d.xml index d57fbf0442..3540b44176 100644 --- a/man/sysupdate.d.xml +++ b/man/sysupdate.d.xml @@ -71,7 +71,7 @@ <listitem><para>A file <literal>https://download.example.com/foobarOS_47.root.xz</literal> should be downloaded, decompressed and written to a previously unused partition with GPT partition type UUID 4f68bce3-e8cd-4db1-96e7-fbcaf984b709 for x86-64, as per <ulink - url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>.</para></listitem> <listitem><para>Similarly, a file <literal>https://download.example.com/foobarOS_47.verity.xz</literal> @@ -80,7 +80,7 @@ for x86-64 root file systems).</para></listitem> <listitem><para>Finally, a file <literal>https://download.example.com/foobarOS_47.efi.xz</literal> (a - unified kernel, as per <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot Loader + unified kernel, as per <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #2) should be downloaded, decompressed and written to the ESP file system, i.e. 
to <filename>EFI/Linux/foobarOS_47.efi</filename> in the ESP.</para></listitem> </orderedlist> @@ -355,21 +355,21 @@ <entry><literal>@a</literal></entry> <entry>GPT partition flag NoAuto</entry> <entry>Either <literal>0</literal> or <literal>1</literal></entry> - <entry>Controls NoAuto bit of the GPT partition flags, as per <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>; only relevant if target resource type chosen as <constant>partition</constant></entry> + <entry>Controls NoAuto bit of the GPT partition flags, as per <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>; only relevant if target resource type chosen as <constant>partition</constant></entry> </row> <row> <entry><literal>@g</literal></entry> <entry>GPT partition flag GrowFileSystem</entry> <entry>Either <literal>0</literal> or <literal>1</literal></entry> - <entry>Controls GrowFileSystem bit of the GPT partition flags, as per <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>; only relevant if target resource type chosen as <constant>partition</constant></entry> + <entry>Controls GrowFileSystem bit of the GPT partition flags, as per <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>; only relevant if target resource type chosen as <constant>partition</constant></entry> </row> <row> <entry><literal>@r</literal></entry> <entry>Read-only flag</entry> <entry>Either <literal>0</literal> or <literal>1</literal></entry> - <entry>Controls ReadOnly bit of the GPT partition flags, as per <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink> and other output read-only flags, see <varname>ReadOnly=</varname> below</entry> + <entry>Controls ReadOnly bit of the GPT partition flags, as per 
<ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink> and other output read-only flags, see <varname>ReadOnly=</varname> below</entry> </row> <row> @@ -610,7 +610,7 @@ overall <varname>PartitionFlags=</varname> flags setting and the individual flag settings <varname>PartitionNoAuto=</varname> and <varname>PartitionGrowFileSystem=</varname> are used (or the wildcards for them), then the latter override the former, i.e. the individual flag bit overrides the - overall flags value. See <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable + overall flags value. See <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink> for details about these flags.</para> <para>Note that these settings are not used for matching, they only have effect on newly written @@ -622,7 +622,7 @@ <listitem><para>Controls whether to mark the resulting file, subvolume or partition read-only. If the target type is <constant>partition</constant> this controls the ReadOnly partition flag, as per - <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + <ulink url="https://uapi-group.org/specifications/specs/discoverable_partitions_specification">Discoverable Partitions Specification</ulink>, similar to the <varname>PartitionNoAuto=</varname> and <varname>PartitionGrowFileSystem=</varname> flags described above. If the target type is <constant>regular-file</constant>, the writable bit is removed from the access mode. 
If the the @@ -829,7 +829,7 @@ TriesDone=0 InstancesMax=2</programlisting></para> <para>The above installs a unified kernel image into the ESP (which is mounted to - <filename>/efi/</filename>), as per <ulink url="https://systemd.io/BOOT_LOADER_SPECIFICATION">Boot + <filename>/efi/</filename>), as per <ulink url="https://uapi-group.org/specifications/specs/boot_loader_specification">Boot Loader Specification</ulink> Type #2. This defines three possible patterns for the names of the kernel images, as per <ulink url="https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT">Automatic Boot Assessment</ulink>, and ensures when installing new kernels, they are set up with 3 tries left. No diff --git a/meson.build b/meson.build index 2d41ff8799..5706c776ff 100644 --- a/meson.build +++ b/meson.build @@ -599,6 +599,10 @@ foreach ident : [ #include <unistd.h> #include <signal.h> #include <sys/wait.h>'''], + ['rt_tgsigqueueinfo', '''#include <stdlib.h> + #include <unistd.h> + #include <signal.h> + #include <sys/wait.h>'''], ['mallinfo', '''#include <malloc.h>'''], ['mallinfo2', '''#include <malloc.h>'''], ['execveat', '''#include <unistd.h>'''], @@ -1325,7 +1329,10 @@ if want_libcryptsetup != 'false' and not skip_deps foreach ident : ['crypt_set_metadata_size', 'crypt_activate_by_signed_key', - 'crypt_token_max'] + 'crypt_token_max', + 'crypt_reencrypt_init_by_passphrase', + 'crypt_reencrypt', + 'crypt_set_data_offset'] have_ident = have and cc.has_function( ident, prefix : '#include <libcryptsetup.h>', @@ -1487,11 +1494,14 @@ if want_tpm2 != 'false' and not skip_deps tpm2 = dependency('tss2-esys tss2-rc tss2-mu', required : want_tpm2 == 'true') have = tpm2.found() + have_esys3 = tpm2.version().version_compare('>= 3.0.0') else have = false + have_esys3 = false tpm2 = [] endif conf.set10('HAVE_TPM2', have) +conf.set10('HAVE_TSS2_ESYS3', have_esys3) want_elfutils = get_option('elfutils') if want_elfutils != 'false' and not skip_deps @@ -2742,7 +2752,8 @@ if conf.get('ENABLE_HOMED') == 1 
'systemd-homework', systemd_homework_sources, include_directories : includes, - link_with : [libshared], + link_with : [libshared, + libshared_fdisk], dependencies : [threads, libblkid, libcrypt, @@ -3321,7 +3332,8 @@ if conf.get('ENABLE_SYSUPDATE') == 1 'systemd-sysupdate', systemd_sysupdate_sources, include_directories : includes, - link_with : [libshared], + link_with : [libshared, + libshared_fdisk], dependencies : [threads, libblkid, libfdisk, @@ -3797,7 +3809,8 @@ if conf.get('ENABLE_REPART') == 1 'systemd-repart', systemd_repart_sources, include_directories : includes, - link_with : [libshared], + link_with : [libshared, + libshared_fdisk], dependencies : [threads, libblkid, libfdisk, @@ -3817,7 +3830,8 @@ if conf.get('ENABLE_REPART') == 1 link_with : [libshared_static, libbasic, libbasic_gcrypt, - libsystemd_static], + libsystemd_static, + libshared_fdisk], dependencies : [threads, libblkid, libfdisk, @@ -3967,7 +3981,7 @@ exe = custom_target( install_dir : bindir) public_programs += exe -if want_tests != 'false' +if want_tests != 'false' and want_kernel_install test('test-kernel-install', test_kernel_install_sh, args : [exe.full_path(), loaderentry_install]) @@ -4160,19 +4174,19 @@ alias_target('fuzzers', fuzzer_exes) ############################################################ +subdir('docs/sysvinit') +subdir('docs/var-log') +subdir('hwdb.d') +subdir('man') subdir('modprobe.d') +subdir('network') +subdir('presets') +subdir('shell-completion/bash') +subdir('shell-completion/zsh') subdir('sysctl.d') subdir('sysusers.d') subdir('tmpfiles.d') -subdir('hwdb.d') subdir('units') -subdir('presets') -subdir('network') -subdir('man') -subdir('shell-completion/bash') -subdir('shell-completion/zsh') -subdir('docs/sysvinit') -subdir('docs/var-log') install_subdir('factory/etc', install_dir : factorydir) diff --git a/mkosi.conf.d/10-systemd.conf b/mkosi.conf.d/10-systemd.conf index 5bc13f919a..f9e4d08616 100644 --- a/mkosi.conf.d/10-systemd.conf +++ 
b/mkosi.conf.d/10-systemd.conf @@ -21,6 +21,8 @@ Packages= coreutils diffutils dnsmasq + dosfstools + e2fsprogs findutils gcc # For sanitizer libraries gdb @@ -29,6 +31,7 @@ Packages= kexec-tools kmod less + mtools nano nftables openssl @@ -40,6 +43,7 @@ Packages= util-linux valgrind wireguard-tools + xfsprogs zsh BuildPackages= diff --git a/mkosi.conf.d/arch/10-mkosi.arch b/mkosi.conf.d/arch/10-mkosi.arch index 883dc1fcd5..993e3dd344 100644 --- a/mkosi.conf.d/arch/10-mkosi.arch +++ b/mkosi.conf.d/arch/10-mkosi.arch @@ -11,8 +11,10 @@ Distribution=arch [Content] Packages= alsa-lib + btrfs-progs compsize dhcp + f2fs-tools fuse2 gnutls iproute diff --git a/mkosi.conf.d/debian/10-mkosi.debian b/mkosi.conf.d/debian/10-mkosi.debian index b2da9b6232..7443f7db53 100644 --- a/mkosi.conf.d/debian/10-mkosi.debian +++ b/mkosi.conf.d/debian/10-mkosi.debian @@ -9,14 +9,16 @@ Release=testing [Content] Packages= + btrfs-progs cryptsetup-bin + f2fs-tools fdisk fuse gcc # Provides libasan/libubsan iproute2 isc-dhcp-server libasound2 - libbpf0 + libbpf1 libc6-i386 libcap-ng0 libfido2-1 diff --git a/mkosi.conf.d/fedora/10-mkosi.fedora b/mkosi.conf.d/fedora/10-mkosi.fedora index c76f479956..5f92aab95c 100644 --- a/mkosi.conf.d/fedora/10-mkosi.fedora +++ b/mkosi.conf.d/fedora/10-mkosi.fedora @@ -10,9 +10,11 @@ Release=37 [Content] Packages= alsa-lib + btrfs-progs compsize cryptsetup dhcp-server + f2fs-tools fuse glib2 glibc-minimal-langpack diff --git a/mkosi.conf.d/opensuse/10-mkosi.opensuse b/mkosi.conf.d/opensuse/10-mkosi.opensuse index 7a212237f2..417827f7c0 100644 --- a/mkosi.conf.d/opensuse/10-mkosi.opensuse +++ b/mkosi.conf.d/opensuse/10-mkosi.opensuse @@ -9,7 +9,9 @@ Release=tumbleweed [Content] Packages= + btrfs-progs dbus-1 + f2fs-tools fuse gcc # Provides libasan/libubsan glibc-32bit diff --git a/mkosi.conf.d/ubuntu/10-mkosi.ubuntu b/mkosi.conf.d/ubuntu/10-mkosi.ubuntu index c7badf5742..346b129e52 100644 --- a/mkosi.conf.d/ubuntu/10-mkosi.ubuntu +++ 
b/mkosi.conf.d/ubuntu/10-mkosi.ubuntu @@ -10,7 +10,9 @@ Repositories=main,universe [Content] Packages= + btrfs-progs cryptsetup-bin + f2fs-tools fdisk fuse gcc # Provides libasan/libubsan @@ -1,12 +1,12 @@ # SPDX-License-Identifier: LGPL-2.1-or-later # # Indonesian translation for systemd. -# Andika Triwidada <andika@gmail.com>, 2014, 2021. +# Andika Triwidada <andika@gmail.com>, 2014, 2021, 2022. msgid "" msgstr "" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2022-10-20 10:35+0200\n" -"PO-Revision-Date: 2021-09-24 11:05+0000\n" +"PO-Revision-Date: 2022-11-25 08:19+0000\n" "Last-Translator: Andika Triwidada <andika@gmail.com>\n" "Language-Team: Indonesian <https://translate.fedoraproject.org/projects/" "systemd/master/id/>\n" @@ -15,7 +15,7 @@ msgstr "" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=1; plural=0;\n" -"X-Generator: Weblate 4.8\n" +"X-Generator: Weblate 4.14.2\n" #: src/core/org.freedesktop.systemd1.policy.in:22 msgid "Send passphrase back to system" @@ -157,22 +157,19 @@ msgstr "Otentikasi diperlukan untuk mendapatkan UUID produk." #: src/hostname/org.freedesktop.hostname1.policy:61 msgid "Get hardware serial number" -msgstr "" +msgstr "Dapatkan nomor seri perangkat keras" #: src/hostname/org.freedesktop.hostname1.policy:62 -#, fuzzy msgid "Authentication is required to get hardware serial number." -msgstr "Otentikasi diperlukan untuk menyetel waktu sistem." +msgstr "Otentikasi diperlukan untuk mendapatkan nomor seri perangkat keras." #: src/hostname/org.freedesktop.hostname1.policy:71 -#, fuzzy msgid "Get system description" -msgstr "Setel zona waktu sistem" +msgstr "Dapatkan deskripsi sistem" #: src/hostname/org.freedesktop.hostname1.policy:72 -#, fuzzy msgid "Authentication is required to get system description." -msgstr "Otentikasi diperlukan untuk menyetel zona waktu sistem." +msgstr "Otentikasi diperlukan untuk mendapatkan deskripsi sistem." 
#: src/import/org.freedesktop.import1.policy:22 msgid "Import a VM or container image" @@ -495,7 +492,7 @@ msgstr "Otentikasi diperlukan untuk menghibernasi sistem." #: src/login/org.freedesktop.login1.policy:310 msgid "Hibernate the system while other users are logged in" -msgstr "Hibernasikan sistem ketika pengguna lain sedang log masuk." +msgstr "Hibernasikan sistem ketika pengguna lain sedang log masuk" #: src/login/org.freedesktop.login1.policy:311 msgid "" @@ -507,7 +504,7 @@ msgstr "" #: src/login/org.freedesktop.login1.policy:321 msgid "Hibernate the system while an application is inhibiting this" -msgstr "Hibernasikan sistem ketika sebuah aplikasi meminta untuk mencegahnya." +msgstr "Hibernasikan sistem ketika sebuah aplikasi meminta untuk mencegahnya" #: src/login/org.freedesktop.login1.policy:322 msgid "" @@ -2,12 +2,13 @@ # # Dutch translation of systemd. # Pjotr Vertaalt <pjotrvertaalt@gmail.com>, 2021. +# Richard E. van der Luit <fedoraproject@veneax.nl>, 2022. msgid "" msgstr "" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2022-10-20 10:35+0200\n" -"PO-Revision-Date: 2021-03-24 09:16+0000\n" -"Last-Translator: Pjotr Vertaalt <pjotrvertaalt@gmail.com>\n" +"PO-Revision-Date: 2022-11-20 15:19+0000\n" +"Last-Translator: Richard E. 
van der Luit <fedoraproject@veneax.nl>\n" "Language-Team: Dutch <https://translate.fedoraproject.org/projects/systemd/" "master/nl/>\n" "Language: nl\n" @@ -15,7 +16,7 @@ msgstr "" "Content-Type: text/plain; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" "Plural-Forms: nplurals=2; plural=n != 1;\n" -"X-Generator: Weblate 4.5.1\n" +"X-Generator: Weblate 4.14.2\n" #: src/core/org.freedesktop.systemd1.policy.in:22 msgid "Send passphrase back to system" @@ -171,23 +172,22 @@ msgstr "" #: src/hostname/org.freedesktop.hostname1.policy:61 msgid "Get hardware serial number" -msgstr "" +msgstr "Serienummer hardware verkrijgen" #: src/hostname/org.freedesktop.hostname1.policy:62 -#, fuzzy msgid "Authentication is required to get hardware serial number." -msgstr "Authenticatie is vereist voor het instellen van de systeemtijd." +msgstr "" +"Authenticatie is vereist voor het verkrijgen van het serienummer van de " +"hardware." #: src/hostname/org.freedesktop.hostname1.policy:71 -#, fuzzy msgid "Get system description" -msgstr "Stel de tijdzone van het systeem in" +msgstr "Systeembeschrijving verkrijgen" #: src/hostname/org.freedesktop.hostname1.policy:72 -#, fuzzy msgid "Authentication is required to get system description." msgstr "" -"Authenticatie is vereist voor het instellen van de tijdzone van het systeem." +"Authenticatie is vereist voor het verkrijgen van de systeembeschrijving." #: src/import/org.freedesktop.import1.policy:22 msgid "Import a VM or container image" diff --git a/po/zh_CN.po b/po/zh_CN.po index 8b8b7aac61..8751aa7f46 100644 --- a/po/zh_CN.po +++ b/po/zh_CN.po @@ -882,5 +882,3 @@ msgid "" "Authentication is required to freeze or thaw the processes of '$(unit)' unit." msgstr "冻结或解冻 '$(unit)' 单元进程需要认证。" -#~ msgid "Authentication is required to kill '$(unit)'." 
-#~ msgstr "杀死“$(unit)”需要认证。" diff --git a/src/analyze/analyze-inspect-elf.c b/src/analyze/analyze-inspect-elf.c index da2c64565a..cb6692e277 100644 --- a/src/analyze/analyze-inspect-elf.c +++ b/src/analyze/analyze-inspect-elf.c @@ -38,10 +38,6 @@ static int analyze_elf(char **filenames, JsonFormatFlags json_flags) { if (!t) return log_oom(); - r = table_set_align_percent(t, TABLE_HEADER_CELL(0), 100); - if (r < 0) - return table_log_add_error(r); - r = table_add_many( t, TABLE_FIELD, "path", diff --git a/src/basic/dirent-util.c b/src/basic/dirent-util.c index 2eea228c20..17df6a24c9 100644 --- a/src/basic/dirent-util.c +++ b/src/basic/dirent-util.c @@ -8,11 +8,11 @@ #include "stat-util.h" #include "string-util.h" -static int dirent_ensure_type(DIR *d, struct dirent *de) { +int dirent_ensure_type(int dir_fd, struct dirent *de) { STRUCT_STATX_DEFINE(sx); int r; - assert(d); + assert(dir_fd >= 0); assert(de); if (de->d_type != DT_UNKNOWN) @@ -24,7 +24,7 @@ static int dirent_ensure_type(DIR *d, struct dirent *de) { } /* Let's ask only for the type, nothing else. 
*/ - r = statx_fallback(dirfd(d), de->d_name, AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_TYPE, &sx); + r = statx_fallback(dir_fd, de->d_name, AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_TYPE, &sx); if (r < 0) return r; @@ -80,7 +80,7 @@ struct dirent *readdir_ensure_type(DIR *d) { if (!de) return NULL; - r = dirent_ensure_type(d, de); + r = dirent_ensure_type(dirfd(d), de); if (r >= 0) return de; if (r != -ENOENT) { diff --git a/src/basic/dirent-util.h b/src/basic/dirent-util.h index 5fde9043a3..0f1fb23119 100644 --- a/src/basic/dirent-util.h +++ b/src/basic/dirent-util.h @@ -10,6 +10,7 @@ bool dirent_is_file(const struct dirent *de) _pure_; bool dirent_is_file_with_suffix(const struct dirent *de, const char *suffix) _pure_; +int dirent_ensure_type(int dir_fd, struct dirent *de); struct dirent *readdir_ensure_type(DIR *d); struct dirent *readdir_no_dot(DIR *dirp); diff --git a/src/basic/efivars.c b/src/basic/efivars.c index 847b6da1ee..17e0fb895e 100644 --- a/src/basic/efivars.c +++ b/src/basic/efivars.c @@ -354,7 +354,7 @@ static int read_efi_options_variable(char **ret) { int r; /* In SecureBoot mode this is probably not what you want. As your cmdline is cryptographically signed - * like when using Type #2 EFI Unified Kernel Images (https://systemd.io/BOOT_LOADER_SPECIFICATION) + * like when using Type #2 EFI Unified Kernel Images (https://uapi-group.org/specifications/specs/boot_loader_specification) * The user's intention is then that the cmdline should not be modified. You want to make sure that * the system starts up as exactly specified in the signed artifact. 
* diff --git a/src/basic/fs-util.c b/src/basic/fs-util.c index 4d24cd59de..33b4d1f07b 100644 --- a/src/basic/fs-util.c +++ b/src/basic/fs-util.c @@ -903,7 +903,7 @@ int posix_fallocate_loop(int fd, uint64_t offset, uint64_t size) { /* On EINTR try a couple of times more, but protect against busy looping * (not more than 16 times per 10s) */ - rl = (RateLimit) { 10 * USEC_PER_SEC, 16 }; + rl = (const RateLimit) { 10 * USEC_PER_SEC, 16 }; while (ratelimit_below(&rl)) { r = posix_fallocate(fd, offset, size); if (r != EINTR) diff --git a/src/basic/hostname-util.h b/src/basic/hostname-util.h index a00b852395..bcac3d9fb0 100644 --- a/src/basic/hostname-util.h +++ b/src/basic/hostname-util.h @@ -60,4 +60,12 @@ static inline bool is_outbound_hostname(const char *hostname) { return STRCASE_IN_SET(hostname, "_outbound", "_outbound."); } +static inline bool is_dns_stub_hostname(const char *hostname) { + return STRCASE_IN_SET(hostname, "_localdnsstub", "_localdnsstub."); +} + +static inline bool is_dns_proxy_stub_hostname(const char *hostname) { + return STRCASE_IN_SET(hostname, "_localdnsproxy", "_localdnsproxy."); +} + int get_pretty_hostname(char **ret); diff --git a/src/basic/io-util.c b/src/basic/io-util.c index cdad939aa6..f642beca3a 100644 --- a/src/basic/io-util.c +++ b/src/basic/io-util.c @@ -161,6 +161,21 @@ int ppoll_usec(struct pollfd *fds, size_t nfds, usec_t timeout) { assert(fds || nfds == 0); + /* This is a wrapper around ppoll() that does primarily two things: + * + * ✅ Takes a usec_t instead of a struct timespec + * + * ✅ Guarantees that if an invalid fd is specified we return EBADF (i.e. converts POLLNVAL to + * EBADF). This is done because EBADF is a programming error usually, and hence should bubble up + * as error, and not be eaten up as non-error POLLNVAL event. + * + * ⚠️ ⚠️ ⚠️ Note that this function does not add any special handling for EINTR. 
Don't forget + * poll()/ppoll() will return with EINTR on any received signal always, there is no automatic + * restarting via SA_RESTART available. Thus, typically you want to handle EINTR not as an error, + * but just as reason to restart things, under the assumption you use a more appropriate mechanism + * to handle signals, such as signalfd() or signal handlers. ⚠️ ⚠️ ⚠️ + */ + if (nfds == 0) return 0; @@ -188,6 +203,9 @@ int fd_wait_for_event(int fd, int event, usec_t timeout) { }; int r; + /* ⚠️ ⚠️ ⚠️ Keep in mind you almost certainly want to handle -EINTR gracefully in the caller, see + * ppoll_usec() above! ⚠️ ⚠️ ⚠️ */ + r = ppoll_usec(&pollfd, 1, timeout); if (r <= 0) return r; diff --git a/src/basic/log.h b/src/basic/log.h index c51941c141..2b1ac5f8c6 100644 --- a/src/basic/log.h +++ b/src/basic/log.h @@ -375,15 +375,12 @@ typedef struct LogRateLimit { RateLimit ratelimit; } LogRateLimit; -#define log_ratelimit_internal(_level, _error, _format, _file, _line, _func, ...) \ +#define log_ratelimit_internal(_level, _error, _ratelimit, _format, _file, _line, _func, ...) \ ({ \ int _log_ratelimit_error = (_error); \ int _log_ratelimit_level = (_level); \ static LogRateLimit _log_ratelimit = { \ - .ratelimit = { \ - .interval = 1 * USEC_PER_SEC, \ - .burst = 1, \ - }, \ + .ratelimit = (_ratelimit), \ }; \ unsigned _num_dropped_errors = ratelimit_num_dropped(&_log_ratelimit.ratelimit); \ if (_log_ratelimit_error != _log_ratelimit.error || _log_ratelimit_level != _log_ratelimit.level) { \ @@ -391,18 +388,35 @@ typedef struct LogRateLimit { _log_ratelimit.error = _log_ratelimit_error; \ _log_ratelimit.level = _log_ratelimit_level; \ } \ - if (ratelimit_below(&_log_ratelimit.ratelimit)) \ + if (log_get_max_level() == LOG_DEBUG || ratelimit_below(&_log_ratelimit.ratelimit)) \ _log_ratelimit_error = _num_dropped_errors > 0 \ - ? 
log_internal(_log_ratelimit_level, _log_ratelimit_error, _file, _line, _func, _format " (Dropped %u similar message(s))", __VA_ARGS__, _num_dropped_errors) \ - : log_internal(_log_ratelimit_level, _log_ratelimit_error, _file, _line, _func, _format, __VA_ARGS__); \ + ? log_internal(_log_ratelimit_level, _log_ratelimit_error, _file, _line, _func, _format " (Dropped %u similar message(s))", ##__VA_ARGS__, _num_dropped_errors) \ + : log_internal(_log_ratelimit_level, _log_ratelimit_error, _file, _line, _func, _format, ##__VA_ARGS__); \ _log_ratelimit_error; \ }) -#define log_ratelimit_full_errno(level, error, format, ...) \ +#define log_ratelimit_full_errno(level, error, _ratelimit, format, ...) \ ({ \ int _level = (level), _e = (error); \ _e = (log_get_max_level() >= LOG_PRI(_level)) \ - ? log_ratelimit_internal(_level, _e, format, PROJECT_FILE, __LINE__, __func__, __VA_ARGS__) \ + ? log_ratelimit_internal(_level, _e, _ratelimit, format, PROJECT_FILE, __LINE__, __func__, ##__VA_ARGS__) \ : -ERRNO_VALUE(_e); \ _e < 0 ? _e : -ESTRPIPE; \ }) + +#define log_ratelimit_full(level, _ratelimit, format, ...) \ + log_ratelimit_full_errno(level, 0, _ratelimit, format, ##__VA_ARGS__) + +/* Normal logging */ +#define log_ratelimit_info(...) log_ratelimit_full(LOG_INFO, __VA_ARGS__) +#define log_ratelimit_notice(...) log_ratelimit_full(LOG_NOTICE, __VA_ARGS__) +#define log_ratelimit_warning(...) log_ratelimit_full(LOG_WARNING, __VA_ARGS__) +#define log_ratelimit_error(...) log_ratelimit_full(LOG_ERR, __VA_ARGS__) +#define log_ratelimit_emergency(...) log_ratelimit_full(log_emergency_level(), __VA_ARGS__) + +/* Logging triggered by an errno-like error */ +#define log_ratelimit_info_errno(error, ...) log_ratelimit_full_errno(LOG_INFO, error, __VA_ARGS__) +#define log_ratelimit_notice_errno(error, ...) log_ratelimit_full_errno(LOG_NOTICE, error, __VA_ARGS__) +#define log_ratelimit_warning_errno(error, ...) 
log_ratelimit_full_errno(LOG_WARNING, error, __VA_ARGS__) +#define log_ratelimit_error_errno(error, ...) log_ratelimit_full_errno(LOG_ERR, error, __VA_ARGS__) +#define log_ratelimit_emergency_errno(error, ...) log_ratelimit_full_errno(log_emergency_level(), error, __VA_ARGS__) diff --git a/src/basic/missing_syscall.h b/src/basic/missing_syscall.h index d54e59fdf9..98cd037962 100644 --- a/src/basic/missing_syscall.h +++ b/src/basic/missing_syscall.h @@ -363,6 +363,20 @@ static inline int missing_rt_sigqueueinfo(pid_t tgid, int sig, siginfo_t *info) /* ======================================================================= */ +#if !HAVE_RT_TGSIGQUEUEINFO +static inline int missing_rt_tgsigqueueinfo(pid_t tgid, pid_t tid, int sig, siginfo_t *info) { +# if defined __NR_rt_tgsigqueueinfo && __NR_rt_tgsigqueueinfo >= 0 + return syscall(__NR_rt_tgsigqueueinfo, tgid, tid, sig, info); +# else +# error "__NR_rt_tgsigqueueinfo not defined" +# endif +} + +# define rt_tgsigqueueinfo missing_rt_tgsigqueueinfo +#endif + +/* ======================================================================= */ + #if !HAVE_EXECVEAT static inline int missing_execveat(int dirfd, const char *pathname, char *const argv[], char *const envp[], @@ -412,44 +426,6 @@ static inline int missing_close_range(int first_fd, int end_fd, unsigned flags) /* ======================================================================= */ -#if !HAVE_EPOLL_PWAIT2 - -/* Defined to be equivalent to the kernel's _NSIG_WORDS, i.e. the size of the array of longs that is - * encapsulated by sigset_t. 
*/ -#define KERNEL_NSIG_WORDS (64 / (sizeof(long) * 8)) -#define KERNEL_NSIG_BYTES (KERNEL_NSIG_WORDS * sizeof(long)) - -struct epoll_event; - -static inline int missing_epoll_pwait2( - int fd, - struct epoll_event *events, - int maxevents, - const struct timespec *timeout, - const sigset_t *sigset) { - -# if defined(__NR_epoll_pwait2) && HAVE_LINUX_TIME_TYPES_H - if (timeout) { - /* Convert from userspace timespec to kernel timespec */ - struct __kernel_timespec ts = { - .tv_sec = timeout->tv_sec, - .tv_nsec = timeout->tv_nsec, - }; - - return syscall(__NR_epoll_pwait2, fd, events, maxevents, &ts, sigset, sigset ? KERNEL_NSIG_BYTES : 0); - } else - return syscall(__NR_epoll_pwait2, fd, events, maxevents, NULL, sigset, sigset ? KERNEL_NSIG_BYTES : 0); -# else - errno = ENOSYS; - return -1; -# endif -} - -# define epoll_pwait2 missing_epoll_pwait2 -#endif - -/* ======================================================================= */ - #if !HAVE_MOUNT_SETATTR #if !HAVE_STRUCT_MOUNT_ATTR diff --git a/src/basic/missing_syscall_def.h b/src/basic/missing_syscall_def.h index 67cae7098d..402fdd00dc 100644 --- a/src/basic/missing_syscall_def.h +++ b/src/basic/missing_syscall_def.h @@ -246,74 +246,6 @@ assert_cc(__NR_copy_file_range == systemd_NR_copy_file_range); # endif #endif -#ifndef __IGNORE_epoll_pwait2 -# if defined(__aarch64__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__alpha__) -# define systemd_NR_epoll_pwait2 551 -# elif defined(__arc__) || defined(__tilegx__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__arm__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__i386__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__ia64__) -# define systemd_NR_epoll_pwait2 1465 -# elif defined(__loongarch64) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__m68k__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(_MIPS_SIM) -# if _MIPS_SIM == _MIPS_SIM_ABI32 -# define systemd_NR_epoll_pwait2 4441 -# elif _MIPS_SIM == 
_MIPS_SIM_NABI32 -# define systemd_NR_epoll_pwait2 6441 -# elif _MIPS_SIM == _MIPS_SIM_ABI64 -# define systemd_NR_epoll_pwait2 5441 -# else -# error "Unknown MIPS ABI" -# endif -# elif defined(__hppa__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__powerpc__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__riscv) -# if __riscv_xlen == 32 -# define systemd_NR_epoll_pwait2 441 -# elif __riscv_xlen == 64 -# define systemd_NR_epoll_pwait2 441 -# else -# error "Unknown RISC-V ABI" -# endif -# elif defined(__s390__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__sparc__) -# define systemd_NR_epoll_pwait2 441 -# elif defined(__x86_64__) -# if defined(__ILP32__) -# define systemd_NR_epoll_pwait2 (441 | /* __X32_SYSCALL_BIT */ 0x40000000) -# else -# define systemd_NR_epoll_pwait2 441 -# endif -# elif !defined(missing_arch_template) -# warning "epoll_pwait2() syscall number is unknown for your architecture" -# endif - -/* may be an (invalid) negative number due to libseccomp, see PR 13319 */ -# if defined __NR_epoll_pwait2 && __NR_epoll_pwait2 >= 0 -# if defined systemd_NR_epoll_pwait2 -assert_cc(__NR_epoll_pwait2 == systemd_NR_epoll_pwait2); -# endif -# else -# if defined __NR_epoll_pwait2 -# undef __NR_epoll_pwait2 -# endif -# if defined systemd_NR_epoll_pwait2 && systemd_NR_epoll_pwait2 >= 0 -# define __NR_epoll_pwait2 systemd_NR_epoll_pwait2 -# endif -# endif -#endif - #ifndef __IGNORE_getrandom # if defined(__aarch64__) # define systemd_NR_getrandom 278 diff --git a/src/basic/missing_syscalls.py b/src/basic/missing_syscalls.py index 642d4d985d..5ccf02adec 100644 --- a/src/basic/missing_syscalls.py +++ b/src/basic/missing_syscalls.py @@ -9,7 +9,6 @@ SYSCALLS = [ 'bpf', 'close_range', 'copy_file_range', - 'epoll_pwait2', 'getrandom', 'memfd_create', 'mount_setattr', diff --git a/src/basic/random-util.h b/src/basic/random-util.h index 2d99807272..b1a4d10971 100644 --- a/src/basic/random-util.h +++ b/src/basic/random-util.h @@ -23,6 +23,7 @@ 
static inline uint32_t random_u32(void) { /* Some limits on the pool sizes when we deal with the kernel random pool */ #define RANDOM_POOL_SIZE_MIN 32U #define RANDOM_POOL_SIZE_MAX (10U*1024U*1024U) +#define RANDOM_EFI_SEED_SIZE 32U size_t random_pool_size(void); diff --git a/src/basic/recurse-dir.c b/src/basic/recurse-dir.c index 908e833501..fe18b98d5b 100644 --- a/src/basic/recurse-dir.c +++ b/src/basic/recurse-dir.c @@ -33,6 +33,7 @@ int readdir_all(int dir_fd, struct dirent *entry; DirectoryEntries *nde; size_t add, sz, j; + int r; assert(dir_fd >= 0); @@ -84,6 +85,15 @@ int readdir_all(int dir_fd, if (ignore_dirent(entry, flags)) continue; + if (FLAGS_SET(flags, RECURSE_DIR_ENSURE_TYPE)) { + r = dirent_ensure_type(dir_fd, entry); + if (r == -ENOENT) + /* dentry gone by now? no problem, let's just suppress it */ + continue; + if (r < 0) + return r; + } + de->n_entries++; } @@ -104,8 +114,14 @@ int readdir_all(int dir_fd, if (ignore_dirent(entry, flags)) continue; + /* If d_type == DT_UNKNOWN that means we failed to ensure the type in the earlier loop and + * didn't include the dentry in de->n_entries and as such should skip it here as well. */ + if (FLAGS_SET(flags, RECURSE_DIR_ENSURE_TYPE) && entry->d_type == DT_UNKNOWN) + continue; + de->entries[j++] = entry; } + assert(j == de->n_entries); if (FLAGS_SET(flags, RECURSE_DIR_SORT)) typesafe_qsort(de->entries, de->n_entries, sort_func); @@ -160,7 +176,8 @@ int recurse_dir( return r; } - r = readdir_all(dir_fd, flags, &de); + /* Mask out RECURSE_DIR_ENSURE_TYPE so we can do it ourselves and avoid an extra statx() call. 
*/ + r = readdir_all(dir_fd, flags & ~RECURSE_DIR_ENSURE_TYPE, &de); if (r < 0) return r; diff --git a/src/basic/sigbus.c b/src/basic/sigbus.c index d570b1df47..7e5a493f6b 100644 --- a/src/basic/sigbus.c +++ b/src/basic/sigbus.c @@ -10,6 +10,7 @@ #include "missing_syscall.h" #include "process-util.h" #include "sigbus.h" +#include "signal-util.h" #define SIGBUS_QUEUE_MAX 64 @@ -102,7 +103,7 @@ static void sigbus_handler(int sn, siginfo_t *si, void *data) { if (si->si_code != BUS_ADRERR || !si->si_addr) { assert_se(sigaction(SIGBUS, &old_sigaction, NULL) == 0); - rt_sigqueueinfo(getpid_cached(), SIGBUS, si); + propagate_signal(sn, si); return; } diff --git a/src/basic/signal-util.c b/src/basic/signal-util.c index b61c18b2de..7875ca69bb 100644 --- a/src/basic/signal-util.c +++ b/src/basic/signal-util.c @@ -5,6 +5,7 @@ #include "errno-util.h" #include "macro.h" +#include "missing_syscall.h" #include "parse-util.h" #include "signal-util.h" #include "stdio-util.h" @@ -282,3 +283,20 @@ int pop_pending_signal_internal(int sig, ...) { return r; /* Returns the signal popped */ } + +void propagate_signal(int sig, siginfo_t *siginfo) { + pid_t p; + + /* To be called from a signal handler. Will raise the same signal again, in our process + in our threads. + * + * Note that we use raw_getpid() instead of getpid_cached(). We might have forked with raw_clone() + * earlier (see PID 1), and hence let's go to the raw syscall here. In particular as this is not + * performance sensitive code. + * + * Note that we use kill() rather than raise() as fallback, for similar reasons. */ + + p = raw_getpid(); + + if (rt_tgsigqueueinfo(p, gettid(), sig, siginfo) < 0) + assert_se(kill(p, sig) >= 0); +} diff --git a/src/basic/signal-util.h b/src/basic/signal-util.h index 36372c19bd..ad2ba841c6 100644 --- a/src/basic/signal-util.h +++ b/src/basic/signal-util.h @@ -65,3 +65,5 @@ int signal_is_blocked(int sig); int pop_pending_signal_internal(int sig, ...); #define pop_pending_signal(...) 
pop_pending_signal_internal(__VA_ARGS__, -1) + +void propagate_signal(int sig, siginfo_t *siginfo); diff --git a/src/basic/strv.h b/src/basic/strv.h index 87a7038a54..f82c76589d 100644 --- a/src/basic/strv.h +++ b/src/basic/strv.h @@ -45,7 +45,7 @@ static inline int strv_extend(char ***l, const char *value) { return strv_extend_with_size(l, NULL, value); } -int strv_extendf(char ***l, const char *format, ...) _printf_(2,0); +int strv_extendf(char ***l, const char *format, ...) _printf_(2,3); int strv_extend_front(char ***l, const char *value); int strv_push_with_size(char ***l, size_t *n, char *value); diff --git a/src/basic/tmpfile-util.c b/src/basic/tmpfile-util.c index 909057429d..dbbd54027e 100644 --- a/src/basic/tmpfile-util.c +++ b/src/basic/tmpfile-util.c @@ -19,31 +19,15 @@ #include "tmpfile-util.h" #include "umask-util.h" -int fopen_temporary_at(int dir_fd, const char *path, FILE **ret_file, char **ret_temp_path) { +static int fopen_temporary_internal(int dir_fd, const char *path, FILE **ret_file) { _cleanup_fclose_ FILE *f = NULL; - _cleanup_free_ char *t = NULL; _cleanup_close_ int fd = -1; int r; assert(dir_fd >= 0 || dir_fd == AT_FDCWD); + assert(path); - if (path) { - r = tempfn_random(path, NULL, &t); - if (r < 0) - return r; - } else { - const char *d; - - r = tmp_dir(&d); - if (r < 0) - return r; - - r = tempfn_random_child(d, NULL, &t); - if (r < 0) - return r; - } - - fd = openat(dir_fd, t, O_CLOEXEC|O_NOCTTY|O_RDWR|O_CREAT|O_EXCL, 0600); + fd = openat(dir_fd, path, O_CLOEXEC|O_NOCTTY|O_RDWR|O_CREAT|O_EXCL, 0600); if (fd < 0) return -errno; @@ -52,15 +36,59 @@ int fopen_temporary_at(int dir_fd, const char *path, FILE **ret_file, char **ret r = take_fdopen_unlocked(&fd, "w", &f); if (r < 0) { - (void) unlinkat(dir_fd, t, 0); + (void) unlinkat(dir_fd, path, 0); return r; } if (ret_file) *ret_file = TAKE_PTR(f); - if (ret_temp_path) - *ret_temp_path = TAKE_PTR(t); + return 0; +} + +int fopen_temporary_at(int dir_fd, const char *path, FILE 
**ret_file, char **ret_path) { + _cleanup_free_ char *t = NULL; + int r; + + assert(dir_fd >= 0 || dir_fd == AT_FDCWD); + assert(path); + + r = tempfn_random(path, NULL, &t); + if (r < 0) + return r; + + r = fopen_temporary_internal(dir_fd, t, ret_file); + if (r < 0) + return r; + + if (ret_path) + *ret_path = TAKE_PTR(t); + + return 0; +} + +int fopen_temporary_child_at(int dir_fd, const char *path, FILE **ret_file, char **ret_path) { + _cleanup_free_ char *t = NULL; + int r; + + assert(dir_fd >= 0 || dir_fd == AT_FDCWD); + + if (!path) { + r = tmp_dir(&path); + if (r < 0) + return r; + } + + r = tempfn_random_child(path, NULL, &t); + if (r < 0) + return r; + + r = fopen_temporary_internal(dir_fd, t, ret_file); + if (r < 0) + return r; + + if (ret_path) + *ret_path = TAKE_PTR(t); return 0; } diff --git a/src/basic/tmpfile-util.h b/src/basic/tmpfile-util.h index 4af28b9da3..e5b7709e3f 100644 --- a/src/basic/tmpfile-util.h +++ b/src/basic/tmpfile-util.h @@ -8,6 +8,12 @@ int fopen_temporary_at(int dir_fd, const char *path, FILE **ret_file, char **ret static inline int fopen_temporary(const char *path, FILE **ret_file, char **ret_path) { return fopen_temporary_at(AT_FDCWD, path, ret_file, ret_path); } + +int fopen_temporary_child_at(int dir_fd, const char *path, FILE **ret_file, char **ret_path); +static inline int fopen_temporary_child(const char *path, FILE **ret_file, char **ret_path) { + return fopen_temporary_child_at(AT_FDCWD, path, ret_file, ret_path); +} + int mkostemp_safe(char *pattern); int fmkostemp_safe(char *pattern, const char *mode, FILE**_f); diff --git a/src/boot/bootctl.c b/src/boot/bootctl.c index 430887fe67..0df456827c 100644 --- a/src/boot/bootctl.c +++ b/src/boot/bootctl.c @@ -585,8 +585,12 @@ static int print_efi_option(uint16_t id, int *n_printed, bool in_order) { assert(n_printed); r = efi_get_boot_option(id, &title, &partition, &path, &active); + if (r == -ENOENT) { + log_debug_errno(r, "Boot option 0x%04X referenced but missing, ignoring: 
%m", id); + return 0; + } if (r < 0) - return log_debug_errno(r, "Failed to read boot option 0x%04X: %m", id); + return log_error_errno(r, "Failed to read boot option 0x%04X: %m", id); /* print only configured entries with partition information */ if (!path || sd_id128_is_null(partition)) { @@ -1804,6 +1808,7 @@ static int verb_status(int argc, char *argv[], void *userdata) { { EFI_STUB_FEATURE_PICK_UP_CREDENTIALS, "Picks up credentials from boot partition" }, { EFI_STUB_FEATURE_PICK_UP_SYSEXTS, "Picks up system extension images from boot partition" }, { EFI_STUB_FEATURE_THREE_PCRS, "Measures kernel+command line+sysexts" }, + { EFI_STUB_FEATURE_RANDOM_SEED, "Support for passing random seed to OS" }, }; _cleanup_free_ char *fw_type = NULL, *fw_info = NULL, *loader = NULL, *loader_path = NULL, *stub = NULL; sd_id128_t loader_part_uuid = SD_ID128_NULL; @@ -1886,8 +1891,6 @@ static int verb_status(int argc, char *argv[], void *userdata) { printf("\n"); printf("%sRandom Seed:%s\n", ansi_underline(), ansi_normal()); - have = access(EFIVAR_PATH(EFI_LOADER_VARIABLE(LoaderRandomSeed)), F_OK) >= 0; - printf(" Passed to OS: %s\n", yes_no(have)); have = access(EFIVAR_PATH(EFI_LOADER_VARIABLE(LoaderSystemToken)), F_OK) >= 0; printf(" System Token: %s\n", have ? 
"set" : "not set"); @@ -1977,10 +1980,10 @@ static int verb_list(int argc, char *argv[], void *userdata) { static int install_random_seed(const char *esp) { _cleanup_(unlink_and_freep) char *tmp = NULL; - _cleanup_free_ void *buffer = NULL; + uint8_t buffer[RANDOM_EFI_SEED_SIZE]; _cleanup_free_ char *path = NULL; _cleanup_close_ int fd = -1; - size_t sz, token_size; + size_t token_size; ssize_t n; int r; @@ -1990,13 +1993,7 @@ static int install_random_seed(const char *esp) { if (!path) return log_oom(); - sz = random_pool_size(); - - buffer = malloc(sz); - if (!buffer) - return log_oom(); - - r = crypto_random_bytes(buffer, sz); + r = crypto_random_bytes(buffer, sizeof(buffer)); if (r < 0) return log_error_errno(r, "Failed to acquire random seed: %m"); @@ -2017,10 +2014,10 @@ static int install_random_seed(const char *esp) { return log_error_errno(fd, "Failed to open random seed file for writing: %m"); } - n = write(fd, buffer, sz); + n = write(fd, buffer, sizeof(buffer)); if (n < 0) return log_error_errno(errno, "Failed to write random seed file: %m"); - if ((size_t) n != sz) + if ((size_t) n != sizeof(buffer)) return log_error_errno(SYNTHETIC_ERRNO(EIO), "Short write while writing random seed file."); if (rename(tmp, path) < 0) @@ -2028,7 +2025,7 @@ static int install_random_seed(const char *esp) { tmp = mfree(tmp); - log_info("Random seed file %s successfully written (%zu bytes).", path, sz); + log_info("Random seed file %s successfully written (%zu bytes).", path, sizeof(buffer)); if (!arg_touch_variables) return 0; @@ -2048,26 +2045,6 @@ static int install_random_seed(const char *esp) { if (r < 0) { if (r != -ENXIO) log_warning_errno(r, "Failed to parse $SYSTEMD_WRITE_SYSTEM_TOKEN, ignoring."); - - if (detect_vm() > 0) { - /* Let's not write a system token if we detect we are running in a VM - * environment. Why? 
Our default security model for the random seed uses the system - * token as a mechanism to ensure we are not vulnerable to golden master sloppiness - * issues, i.e. that people initialize the random seed file, then copy the image to - * many systems and end up with the same random seed in each that is assumed to be - * valid but in reality is the same for all machines. By storing a system token in - * the EFI variable space we can make sure that even though the random seeds on disk - * are all the same they will be different on each system under the assumption that - * the EFI variable space is maintained separate from the random seed storage. That - * is generally the case on physical systems, as the ESP is stored on persistent - * storage, and the EFI variables in NVRAM. However in virtualized environments this - * is generally not true: the EFI variable set is typically stored along with the - * disk image itself. For example, using the OVMF EFI firmware the EFI variables are - * stored in a file in the ESP itself. */ - - log_notice("Not installing system token, since we are running in a virtualized environment."); - return 0; - } } else if (r == 0) { log_notice("Not writing system token, because $SYSTEMD_WRITE_SYSTEM_TOKEN is set to false."); return 0; @@ -2080,16 +2057,16 @@ static int install_random_seed(const char *esp) { if (r != -ENOENT) return log_error_errno(r, "Failed to test system token validity: %m"); } else { - if (token_size >= sz) { + if (token_size >= sizeof(buffer)) { /* Let's avoid writes if we can, and initialize this only once. 
*/ log_debug("System token already written, not updating."); return 0; } - log_debug("Existing system token size (%zu) does not match our expectations (%zu), replacing.", token_size, sz); + log_debug("Existing system token size (%zu) does not match our expectations (%zu), replacing.", token_size, sizeof(buffer)); } - r = crypto_random_bytes(buffer, sz); + r = crypto_random_bytes(buffer, sizeof(buffer)); if (r < 0) return log_error_errno(r, "Failed to acquire random seed: %m"); @@ -2097,7 +2074,7 @@ static int install_random_seed(const char *esp) { * and possibly get identification information or too much insight into the kernel's entropy pool * state. */ RUN_WITH_UMASK(0077) { - r = efi_set_variable(EFI_LOADER_VARIABLE(LoaderSystemToken), buffer, sz); + r = efi_set_variable(EFI_LOADER_VARIABLE(LoaderSystemToken), buffer, sizeof(buffer)); if (r < 0) { if (!arg_graceful) return log_error_errno(r, "Failed to write 'LoaderSystemToken' EFI variable: %m"); @@ -2107,7 +2084,7 @@ static int install_random_seed(const char *esp) { else log_warning_errno(r, "Unable to write 'LoaderSystemToken' EFI variable, ignoring: %m"); } else - log_info("Successfully initialized system token in EFI variable with %zu bytes.", sz); + log_info("Successfully initialized system token in EFI variable with %zu bytes.", sizeof(buffer)); } return 0; diff --git a/src/boot/efi/boot.c b/src/boot/efi/boot.c index 84f4cc11a3..9123c9a84c 100644 --- a/src/boot/efi/boot.c +++ b/src/boot/efi/boot.c @@ -15,13 +15,14 @@ #include "initrd.h" #include "linux.h" #include "measure.h" +#include "part-discovery.h" #include "pe.h" +#include "vmm.h" #include "random-seed.h" #include "secure-boot.h" #include "shim.h" #include "ticks.h" #include "util.h" -#include "xbootldr.h" #ifndef GNU_EFI_USE_MS_ABI /* We do not use uefi_call_wrapper() in systemd-boot. 
As such, we rely on the @@ -96,7 +97,6 @@ typedef struct { bool beep; int64_t console_mode; int64_t console_mode_efivar; - RandomSeedMode random_seed_mode; } Config; /* These values have been chosen so that the transitions the user sees could @@ -471,7 +471,6 @@ static void print_status(Config *config, char16_t *loaded_image_path) { _cleanup_free_ char16_t *device_part_uuid = NULL; assert(config); - assert(loaded_image_path); clear_screen(COLOR_NORMAL); console_query_mode(&x_max, &y_max); @@ -529,7 +528,6 @@ static void print_status(Config *config, char16_t *loaded_image_path) { ps_bool(L" auto-firmware: %s\n", config->auto_firmware); ps_bool(L" beep: %s\n", config->beep); ps_bool(L" reboot-for-bitlocker: %s\n", config->reboot_for_bitlocker); - ps_string(L" random-seed-mode: %s\n", random_seed_modes_table[config->random_seed_mode]); switch (config->secure_boot_enroll) { case ENROLL_OFF: @@ -619,7 +617,6 @@ static bool menu_run( assert(config); assert(chosen_entry); - assert(loaded_image_path); EFI_STATUS err; UINTN visible_max = 0; @@ -1207,7 +1204,7 @@ static void config_defaults_load_from_file(Config *config, char *content) { continue; } free(config->entry_default_config); - config->entry_default_config = xstra_to_str(value); + config->entry_default_config = xstr8_to_16(value); continue; } @@ -1274,27 +1271,6 @@ static void config_defaults_load_from_file(Config *config, char *content) { } continue; } - - if (streq8(key, "random-seed-mode")) { - if (streq8(value, "off")) - config->random_seed_mode = RANDOM_SEED_OFF; - else if (streq8(value, "with-system-token")) - config->random_seed_mode = RANDOM_SEED_WITH_SYSTEM_TOKEN; - else if (streq8(value, "always")) - config->random_seed_mode = RANDOM_SEED_ALWAYS; - else { - bool on; - - err = parse_boolean(value, &on); - if (err != EFI_SUCCESS) { - log_error_stall(L"Error parsing 'random-seed-mode' config option: %a", value); - continue; - } - - config->random_seed_mode = on ? 
RANDOM_SEED_ALWAYS : RANDOM_SEED_OFF; - } - continue; - } } } @@ -1442,32 +1418,32 @@ static void config_entry_add_type1( while ((line = line_get_key_value(content, " \t", &pos, &key, &value))) { if (streq8(key, "title")) { free(entry->title); - entry->title = xstra_to_str(value); + entry->title = xstr8_to_16(value); continue; } if (streq8(key, "sort-key")) { free(entry->sort_key); - entry->sort_key = xstra_to_str(value); + entry->sort_key = xstr8_to_16(value); continue; } if (streq8(key, "version")) { free(entry->version); - entry->version = xstra_to_str(value); + entry->version = xstr8_to_16(value); continue; } if (streq8(key, "machine-id")) { free(entry->machine_id); - entry->machine_id = xstra_to_str(value); + entry->machine_id = xstr8_to_16(value); continue; } if (streq8(key, "linux")) { free(entry->loader); entry->type = LOADER_LINUX; - entry->loader = xstra_to_path(value); + entry->loader = xstr8_to_path(value); entry->key = 'l'; continue; } @@ -1475,10 +1451,10 @@ static void config_entry_add_type1( if (streq8(key, "efi")) { entry->type = LOADER_EFI; free(entry->loader); - entry->loader = xstra_to_path(value); + entry->loader = xstr8_to_path(value); /* do not add an entry for ourselves */ - if (loaded_image_path && strcaseeq16(entry->loader, loaded_image_path)) { + if (strcaseeq16(entry->loader, loaded_image_path)) { entry->type = LOADER_UNDEFINED; break; } @@ -1496,7 +1472,7 @@ static void config_entry_add_type1( if (streq8(key, "devicetree")) { free(entry->devicetree); - entry->devicetree = xstra_to_path(value); + entry->devicetree = xstr8_to_path(value); continue; } @@ -1505,7 +1481,7 @@ static void config_entry_add_type1( entry->initrd, n_initrd == 0 ? 
0 : (n_initrd + 1) * sizeof(uint16_t *), (n_initrd + 2) * sizeof(uint16_t *)); - entry->initrd[n_initrd++] = xstra_to_path(value); + entry->initrd[n_initrd++] = xstr8_to_path(value); entry->initrd[n_initrd] = NULL; continue; } @@ -1513,7 +1489,7 @@ static void config_entry_add_type1( if (streq8(key, "options")) { _cleanup_free_ char16_t *new = NULL; - new = xstra_to_str(value); + new = xstr8_to_16(value); if (entry->options) { char16_t *s = xpool_print(L"%s %s", entry->options, new); free(entry->options); @@ -1585,7 +1561,6 @@ static void config_load_defaults(Config *config, EFI_FILE *root_dir) { .auto_firmware = true, .reboot_for_bitlocker = false, .secure_boot_enroll = ENROLL_MANUAL, - .random_seed_mode = RANDOM_SEED_WITH_SYSTEM_TOKEN, .idx_default_efivar = IDX_INVALID, .console_mode = CONSOLE_MODE_KEEP, .console_mode_efivar = CONSOLE_MODE_KEEP, @@ -1908,12 +1883,11 @@ static ConfigEntry *config_entry_add_loader_auto( assert(root_dir); assert(id); assert(title); - assert(loader || loaded_image_path); if (!config->auto_entries) return NULL; - if (loaded_image_path) { + if (!loader) { loader = L"\\EFI\\BOOT\\BOOT" EFI_MACHINE_TYPE_NAME ".efi"; /* We are trying to add the default EFI loader here, @@ -2160,49 +2134,49 @@ static void config_entry_add_unified( while ((line = line_get_key_value(content, "=", &pos, &key, &value))) { if (streq8(key, "PRETTY_NAME")) { free(os_pretty_name); - os_pretty_name = xstra_to_str(value); + os_pretty_name = xstr8_to_16(value); continue; } if (streq8(key, "IMAGE_ID")) { free(os_image_id); - os_image_id = xstra_to_str(value); + os_image_id = xstr8_to_16(value); continue; } if (streq8(key, "NAME")) { free(os_name); - os_name = xstra_to_str(value); + os_name = xstr8_to_16(value); continue; } if (streq8(key, "ID")) { free(os_id); - os_id = xstra_to_str(value); + os_id = xstr8_to_16(value); continue; } if (streq8(key, "IMAGE_VERSION")) { free(os_image_version); - os_image_version = xstra_to_str(value); + os_image_version = 
xstr8_to_16(value); continue; } if (streq8(key, "VERSION")) { free(os_version); - os_version = xstra_to_str(value); + os_version = xstr8_to_16(value); continue; } if (streq8(key, "VERSION_ID")) { free(os_version_id); - os_version_id = xstra_to_str(value); + os_version_id = xstr8_to_16(value); continue; } if (streq8(key, "BUILD_ID")) { free(os_build_id); - os_build_id = xstra_to_str(value); + os_build_id = xstr8_to_16(value); continue; } } @@ -2245,13 +2219,11 @@ static void config_entry_add_unified( content = mfree(content); /* read the embedded cmdline file */ - err = file_read(linux_dir, f->FileName, offs[SECTION_CMDLINE], szs[SECTION_CMDLINE], &content, NULL); + size_t cmdline_len; + err = file_read(linux_dir, f->FileName, offs[SECTION_CMDLINE], szs[SECTION_CMDLINE], &content, &cmdline_len); if (err == EFI_SUCCESS) { - /* chomp the newline */ - if (content[szs[SECTION_CMDLINE] - 1] == '\n') - content[szs[SECTION_CMDLINE] - 1] = '\0'; - - entry->options = xstra_to_str(content); + entry->options = xstrn8_to_16(content, cmdline_len); + mangle_stub_cmdline(entry->options); } } } @@ -2267,7 +2239,7 @@ static void config_load_xbootldr( assert(config); assert(device); - err = xbootldr_open(device, &new_device, &root_dir); + err = partition_open(XBOOTLDR_GUID, device, &new_device, &root_dir); if (err != EFI_SUCCESS) return; @@ -2562,7 +2534,6 @@ static void export_variables( char16_t uuid[37]; assert(loaded_image); - assert(loaded_image_path); efivar_set_time_usec(LOADER_GUID, L"LoaderTimeInitUSec", init_usec); efivar_set(LOADER_GUID, L"LoaderInfo", L"systemd-boot " GIT_VERSION, 0); @@ -2591,7 +2562,6 @@ static void config_load_all_entries( assert(config); assert(loaded_image); - assert(loaded_image_path); assert(root_dir); config_load_defaults(config, root_dir); @@ -2646,11 +2616,18 @@ static void config_load_all_entries( config_default_entry_select(config); } +static EFI_STATUS discover_root_dir(EFI_LOADED_IMAGE_PROTOCOL *loaded_image, EFI_FILE **ret_dir) { + if 
(is_direct_boot(loaded_image->DeviceHandle)) + return vmm_open(&loaded_image->DeviceHandle, ret_dir); + else + return open_volume(loaded_image->DeviceHandle, ret_dir); +} + EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { EFI_LOADED_IMAGE_PROTOCOL *loaded_image; _cleanup_(file_closep) EFI_FILE *root_dir = NULL; _cleanup_(config_free) Config config = {}; - char16_t *loaded_image_path; + _cleanup_free_ char16_t *loaded_image_path = NULL; EFI_STATUS err; uint64_t init_usec; bool menu = false; @@ -2676,13 +2653,11 @@ EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Error getting a LoadedImageProtocol handle: %r", err); - err = device_path_to_str(loaded_image->FilePath, &loaded_image_path); - if (err != EFI_SUCCESS) - return log_error_status_stall(err, L"Error getting loaded image path: %r", err); + (void) device_path_to_str(loaded_image->FilePath, &loaded_image_path); export_variables(loaded_image, loaded_image_path, init_usec); - err = open_volume(loaded_image->DeviceHandle, &root_dir); + err = discover_root_dir(loaded_image, &root_dir); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Unable to open root directory: %r", err); @@ -2742,7 +2717,7 @@ EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { save_selected_entry(&config, entry); /* Optionally, read a random seed off the ESP and pass it to the OS */ - (void) process_random_seed(root_dir, config.random_seed_mode); + (void) process_random_seed(root_dir); err = image_start(image, entry); if (err != EFI_SUCCESS) diff --git a/src/boot/efi/cpio.c b/src/boot/efi/cpio.c index 648f9f000f..76e2cd7f4e 100644 --- a/src/boot/efi/cpio.c +++ b/src/boot/efi/cpio.c @@ -359,24 +359,7 @@ static char16_t *get_dropin_dir(const EFI_DEVICE_PATH *file_path) { if (device_path_to_str(file_path, &file_path_str) != EFI_SUCCESS) return NULL; - for (char16_t *i = file_path_str, *fixed = i;; i++) { - if (*i == 
'\0') { - *fixed = '\0'; - break; - } - - /* Fix device path node separator. */ - if (*i == '/') - *i = '\\'; - - /* Double '\' is not allowed in EFI file paths. */ - if (fixed != file_path_str && fixed[-1] == '\\' && *i == '\\') - continue; - - *fixed = *i; - fixed++; - } - + convert_efi_path(file_path_str); return xpool_print(u"%s.extra.d", file_path_str); } diff --git a/src/boot/efi/efi-string.c b/src/boot/efi/efi-string.c index b877c6f224..2ba15673c9 100644 --- a/src/boot/efi/efi-string.c +++ b/src/boot/efi/efi-string.c @@ -9,7 +9,8 @@ # include "util.h" #else # include <stdlib.h> -# include "macro.h" +# include "alloc-util.h" +# define xnew(t, n) ASSERT_SE_PTR(new(t, n)) # define xmalloc(n) ASSERT_SE_PTR(malloc(n)) #endif @@ -138,6 +139,81 @@ DEFINE_STRCHR(char16_t, strchr16); DEFINE_STRNDUP(char, xstrndup8, strnlen8); DEFINE_STRNDUP(char16_t, xstrndup16, strnlen16); +static unsigned utf8_to_unichar(const char *utf8, size_t n, char32_t *c) { + char32_t unichar; + unsigned len; + + assert(utf8); + assert(c); + + if (!(utf8[0] & 0x80)) { + *c = utf8[0]; + return 1; + } else if ((utf8[0] & 0xe0) == 0xc0) { + len = 2; + unichar = utf8[0] & 0x1f; + } else if ((utf8[0] & 0xf0) == 0xe0) { + len = 3; + unichar = utf8[0] & 0x0f; + } else if ((utf8[0] & 0xf8) == 0xf0) { + len = 4; + unichar = utf8[0] & 0x07; + } else if ((utf8[0] & 0xfc) == 0xf8) { + len = 5; + unichar = utf8[0] & 0x03; + } else if ((utf8[0] & 0xfe) == 0xfc) { + len = 6; + unichar = utf8[0] & 0x01; + } else { + *c = UINT32_MAX; + return 1; + } + + if (len > n) { + *c = UINT32_MAX; + return len; + } + + for (unsigned i = 1; i < len; i++) { + if ((utf8[i] & 0xc0) != 0x80) { + *c = UINT32_MAX; + return len; + } + unichar <<= 6; + unichar |= utf8[i] & 0x3f; + } + + *c = unichar; + return len; +} + +/* Convert UTF-8 to UCS-2, skipping any invalid or short byte sequences. 
*/ +char16_t *xstrn8_to_16(const char *str8, size_t n) { + if (!str8 || n == 0) + return NULL; + + size_t i = 0; + char16_t *str16 = xnew(char16_t, n + 1); + + while (n > 0 && *str8 != '\0') { + char32_t unichar; + + size_t utf8len = utf8_to_unichar(str8, n, &unichar); + str8 += utf8len; + n = LESS_BY(n, utf8len); + + switch (unichar) { + case 0 ... 0xd7ffU: + case 0xe000U ... 0xffffU: + str16[i++] = unichar; + break; + } + } + + str16[i] = '\0'; + return str16; +} + static bool efi_fnmatch_prefix(const char16_t *p, const char16_t *h, const char16_t **ret_p, const char16_t **ret_h) { assert(p); assert(h); diff --git a/src/boot/efi/efi-string.h b/src/boot/efi/efi-string.h index 1ebd5fd6b7..e12add0b19 100644 --- a/src/boot/efi/efi-string.h +++ b/src/boot/efi/efi-string.h @@ -99,6 +99,11 @@ static inline char16_t *xstrdup16(const char16_t *s) { return xstrndup16(s, SIZE_MAX); } +char16_t *xstrn8_to_16(const char *str8, size_t n); +static inline char16_t *xstr8_to_16(const char *str8) { + return xstrn8_to_16(str8, strlen8(str8)); +} + bool efi_fnmatch(const char16_t *pattern, const char16_t *haystack); bool parse_number8(const char *s, uint64_t *ret_u, const char **ret_tail); @@ -119,6 +124,13 @@ static inline void *mempcpy(void * restrict dest, const void * restrict src, siz memcpy(dest, src, n); return (uint8_t *) dest + n; } + +static inline void explicit_bzero_safe(void *bytes, size_t len) { + if (!bytes || len == 0) + return; + memset(bytes, 0, len); + __asm__ __volatile__("": :"r"(bytes) :"memory"); +} #else /* For unit testing. 
*/ int efi_memcmp(const void *p1, const void *p2, size_t n); diff --git a/src/boot/efi/linux.c b/src/boot/efi/linux.c index 75b9507709..48801f9dd8 100644 --- a/src/boot/efi/linux.c +++ b/src/boot/efi/linux.c @@ -20,35 +20,26 @@ #define STUB_PAYLOAD_GUID \ { 0x55c5d1f8, 0x04cd, 0x46b5, { 0x8a, 0x20, 0xe5, 0x6c, 0xbb, 0x30, 0x52, 0xd0 } } -static EFIAPI EFI_STATUS security_hook( - const SecurityOverride *this, uint32_t authentication_status, const EFI_DEVICE_PATH *file) { +typedef struct { + const void *addr; + size_t len; + const EFI_DEVICE_PATH *device_path; +} ValidationContext; - assert(this); - assert(this->hook == security_hook); +static bool validate_payload( + const void *ctx, const EFI_DEVICE_PATH *device_path, const void *file_buffer, size_t file_size) { - if (file == this->payload_device_path) - return EFI_SUCCESS; + const ValidationContext *payload = ASSERT_PTR(ctx); - return this->original_security->FileAuthenticationState( - this->original_security, authentication_status, file); -} - -static EFIAPI EFI_STATUS security2_hook( - const SecurityOverride *this, - const EFI_DEVICE_PATH *device_path, - void *file_buffer, - size_t file_size, - BOOLEAN boot_policy) { - - assert(this); - assert(this->hook == security2_hook); + if (device_path != payload->device_path) + return false; - if (file_buffer == this->payload && file_size == this->payload_len && - device_path == this->payload_device_path) - return EFI_SUCCESS; + /* Security arch (1) protocol does not provide a file buffer. Instead we are supposed to fetch the payload + * ourselves, which is not needed as we already have everything in memory and the device paths match. 
*/ + if (file_buffer && (file_buffer != payload->addr || file_size != payload->len)) + return false; - return this->original_security2->FileAuthentication( - this->original_security2, device_path, file_buffer, file_size, boot_policy); + return true; } static EFI_STATUS load_image(EFI_HANDLE parent, const void *source, size_t len, EFI_HANDLE *ret_image) { @@ -79,19 +70,13 @@ static EFI_STATUS load_image(EFI_HANDLE parent, const void *source, size_t len, /* We want to support unsigned kernel images as payload, which is safe to do under secure boot * because it is embedded in this stub loader (and since it is already running it must be trusted). */ - SecurityOverride security_override = { - .hook = security_hook, - .payload = source, - .payload_len = len, - .payload_device_path = &payload_device_path.payload.Header, - }, security2_override = { - .hook = security2_hook, - .payload = source, - .payload_len = len, - .payload_device_path = &payload_device_path.payload.Header, - }; - - install_security_override(&security_override, &security2_override); + install_security_override( + validate_payload, + &(ValidationContext) { + .addr = source, + .len = len, + .device_path = &payload_device_path.payload.Header, + }); EFI_STATUS ret = BS->LoadImage( /*BootPolicy=*/false, @@ -101,22 +86,23 @@ static EFI_STATUS load_image(EFI_HANDLE parent, const void *source, size_t len, len, ret_image); - uninstall_security_override(&security_override, &security2_override); + uninstall_security_override(); return ret; } EFI_STATUS linux_exec( EFI_HANDLE parent, - const char *cmdline, UINTN cmdline_len, - const void *linux_buffer, UINTN linux_length, - const void *initrd_buffer, UINTN initrd_length) { + const char16_t *cmdline, + const void *linux_buffer, + size_t linux_length, + const void *initrd_buffer, + size_t initrd_length) { uint32_t compat_address; EFI_STATUS err; assert(parent); - assert(cmdline || cmdline_len == 0); assert(linux_buffer && linux_length > 0); assert(initrd_buffer || 
initrd_length == 0); @@ -128,7 +114,6 @@ EFI_STATUS linux_exec( return linux_exec_efi_handover( parent, cmdline, - cmdline_len, linux_buffer, linux_length, initrd_buffer, @@ -148,7 +133,7 @@ EFI_STATUS linux_exec( return log_error_status_stall(err, u"Error getting kernel loaded image protocol: %r", err); if (cmdline) { - loaded_image->LoadOptions = xstra_to_str(cmdline); + loaded_image->LoadOptions = (void *) cmdline; loaded_image->LoadOptionsSize = strsize16(loaded_image->LoadOptions); } diff --git a/src/boot/efi/linux.h b/src/boot/efi/linux.h index 19e5f5c4a8..f0a6a37ed1 100644 --- a/src/boot/efi/linux.h +++ b/src/boot/efi/linux.h @@ -2,14 +2,19 @@ #pragma once #include <efi.h> +#include <uchar.h> EFI_STATUS linux_exec( EFI_HANDLE parent, - const char *cmdline, UINTN cmdline_len, - const void *linux_buffer, UINTN linux_length, - const void *initrd_buffer, UINTN initrd_length); + const char16_t *cmdline, + const void *linux_buffer, + size_t linux_length, + const void *initrd_buffer, + size_t initrd_length); EFI_STATUS linux_exec_efi_handover( EFI_HANDLE parent, - const char *cmdline, UINTN cmdline_len, - const void *linux_buffer, UINTN linux_length, - const void *initrd_buffer, UINTN initrd_length); + const char16_t *cmdline, + const void *linux_buffer, + size_t linux_length, + const void *initrd_buffer, + size_t initrd_length); diff --git a/src/boot/efi/linux_x86.c b/src/boot/efi/linux_x86.c index 64336ce348..6a5e431107 100644 --- a/src/boot/efi/linux_x86.c +++ b/src/boot/efi/linux_x86.c @@ -126,12 +126,13 @@ static void linux_efi_handover(EFI_HANDLE parent, uintptr_t kernel, BootParams * EFI_STATUS linux_exec_efi_handover( EFI_HANDLE parent, - const char *cmdline, UINTN cmdline_len, - const void *linux_buffer, UINTN linux_length, - const void *initrd_buffer, UINTN initrd_length) { + const char16_t *cmdline, + const void *linux_buffer, + size_t linux_length, + const void *initrd_buffer, + size_t initrd_length) { assert(parent); - assert(cmdline || cmdline_len == 
0); assert(linux_buffer); assert(initrd_buffer || initrd_length == 0); @@ -185,14 +186,20 @@ EFI_STATUS linux_exec_efi_handover( _cleanup_pages_ Pages cmdline_pages = {}; if (cmdline) { + size_t len = MIN(strlen16(cmdline), image_params->hdr.cmdline_size); + cmdline_pages = xmalloc_pages( can_4g ? AllocateAnyPages : AllocateMaxAddress, EfiLoaderData, - EFI_SIZE_TO_PAGES(cmdline_len + 1), + EFI_SIZE_TO_PAGES(len + 1), CMDLINE_PTR_MAX); - memcpy(PHYSICAL_ADDRESS_TO_POINTER(cmdline_pages.addr), cmdline, cmdline_len); - ((char *) PHYSICAL_ADDRESS_TO_POINTER(cmdline_pages.addr))[cmdline_len] = 0; + /* Convert cmdline to ASCII. */ + char *cmdline8 = PHYSICAL_ADDRESS_TO_POINTER(cmdline_pages.addr); + for (size_t i = 0; i < len; i++) + cmdline8[i] = cmdline[i] <= 0x7E ? cmdline[i] : ' '; + cmdline8[len] = '\0'; + boot_params->hdr.cmd_line_ptr = (uint32_t) cmdline_pages.addr; boot_params->ext_cmd_line_ptr = cmdline_pages.addr >> 32; assert(can_4g || cmdline_pages.addr <= CMDLINE_PTR_MAX); diff --git a/src/boot/efi/measure.c b/src/boot/efi/measure.c index 9a16920787..6da07d917e 100644 --- a/src/boot/efi/measure.c +++ b/src/boot/efi/measure.c @@ -187,7 +187,7 @@ EFI_STATUS tpm_log_event_ascii(uint32_t pcrindex, EFI_PHYSICAL_ADDRESS buffer, U _cleanup_free_ char16_t *c = NULL; if (description) - c = xstra_to_str(description); + c = xstr8_to_16(description); return tpm_log_event(pcrindex, buffer, buffer_size, c, ret_measured); } diff --git a/src/boot/efi/meson.build b/src/boot/efi/meson.build index 395386d3ed..2a7e457df3 100644 --- a/src/boot/efi/meson.build +++ b/src/boot/efi/meson.build @@ -360,6 +360,7 @@ efi_headers = files( 'linux.h', 'measure.h', 'missing_efi.h', + 'part-discovery.h', 'pe.h', 'random-seed.h', 'secure-boot.h', @@ -367,7 +368,6 @@ efi_headers = files( 'splash.h', 'ticks.h', 'util.h', - 'xbootldr.h', ) common_sources = files( @@ -379,7 +379,9 @@ common_sources = files( 'graphics.c', 'initrd.c', 'measure.c', + 'part-discovery.c', 'pe.c', + 'random-seed.c', 
'secure-boot.c', 'ticks.c', 'util.c', @@ -388,9 +390,8 @@ common_sources = files( systemd_boot_sources = files( 'boot.c', 'drivers.c', - 'random-seed.c', 'shim.c', - 'xbootldr.c', + 'vmm.c', ) stub_sources = files( diff --git a/src/boot/efi/missing_efi.h b/src/boot/efi/missing_efi.h index f9169248ec..250c84c248 100644 --- a/src/boot/efi/missing_efi.h +++ b/src/boot/efi/missing_efi.h @@ -385,3 +385,16 @@ typedef struct _EFI_CONSOLE_CONTROL_PROTOCOL { { 0xd719b2cb, 0x3d3a, 0x4596, {0xa3, 0xbc, 0xda, 0xd0, 0xe, 0x67, 0x65, 0x6f }} #endif + +#ifndef EFI_SHELL_PARAMETERS_PROTOCOL_GUID +# define EFI_SHELL_PARAMETERS_PROTOCOL_GUID \ + { 0x752f3136, 0x4e16, 0x4fdc, { 0xa2, 0x2a, 0xe5, 0xf4, 0x68, 0x12, 0xf4, 0xca } } + +typedef struct { + CHAR16 **Argv; + UINTN Argc; + void *StdIn; + void *StdOut; + void *StdErr; +} EFI_SHELL_PARAMETERS_PROTOCOL; +#endif diff --git a/src/boot/efi/xbootldr.c b/src/boot/efi/part-discovery.c index e5b9ca7268..de6d6112a1 100644 --- a/src/boot/efi/xbootldr.c +++ b/src/boot/efi/part-discovery.c @@ -4,8 +4,8 @@ #include <efigpt.h> #include <efilib.h> +#include "part-discovery.h" #include "util.h" -#include "xbootldr.h" union GptHeaderBuffer { EFI_PARTITION_TABLE_HEADER gpt_header; @@ -81,6 +81,7 @@ static bool verify_gpt(union GptHeaderBuffer *gpt_header_buffer, EFI_LBA lba_exp } static EFI_STATUS try_gpt( + const EFI_GUID *type, EFI_BLOCK_IO_PROTOCOL *block_io, EFI_LBA lba, EFI_LBA *ret_backup_lba, /* May be changed even on error! */ @@ -133,7 +134,7 @@ static EFI_STATUS try_gpt( EFI_PARTITION_ENTRY *entry = (EFI_PARTITION_ENTRY *) ((uint8_t *) entries + gpt.gpt_header.SizeOfPartitionEntry * i); - if (memcmp(&entry->PartitionTypeGUID, XBOOTLDR_GUID, sizeof(entry->PartitionTypeGUID)) != 0) + if (memcmp(&entry->PartitionTypeGUID, type, sizeof(entry->PartitionTypeGUID)) != 0) continue; if (entry->EndingLBA < entry->StartingLBA) /* Bogus? 
*/ @@ -165,7 +166,7 @@ static EFI_STATUS try_gpt( return EFI_NOT_FOUND; } -static EFI_STATUS find_device(EFI_HANDLE *device, EFI_DEVICE_PATH **ret_device_path) { +static EFI_STATUS find_device(const EFI_GUID *type, EFI_HANDLE *device, EFI_DEVICE_PATH **ret_device_path) { EFI_STATUS err; assert(device); @@ -231,8 +232,7 @@ static EFI_STATUS find_device(EFI_HANDLE *device, EFI_DEVICE_PATH **ret_device_p continue; HARDDRIVE_DEVICE_PATH hd; - err = try_gpt( - block_io, lba, + err = try_gpt(type, block_io, lba, nr == 0 ? &backup_lba : NULL, /* Only get backup LBA location from first GPT header. */ &hd); if (err != EFI_SUCCESS) { @@ -252,17 +252,18 @@ static EFI_STATUS find_device(EFI_HANDLE *device, EFI_DEVICE_PATH **ret_device_p return EFI_NOT_FOUND; } -EFI_STATUS xbootldr_open(EFI_HANDLE *device, EFI_HANDLE *ret_device, EFI_FILE **ret_root_dir) { +EFI_STATUS partition_open(const EFI_GUID *type, EFI_HANDLE *device, EFI_HANDLE *ret_device, + EFI_FILE **ret_root_dir) { _cleanup_free_ EFI_DEVICE_PATH *partition_path = NULL; EFI_HANDLE new_device; EFI_FILE *root_dir; EFI_STATUS err; + assert(type); assert(device); - assert(ret_device); assert(ret_root_dir); - err = find_device(device, &partition_path); + err = find_device(type, device, &partition_path); if (err != EFI_SUCCESS) return err; @@ -275,7 +276,8 @@ EFI_STATUS xbootldr_open(EFI_HANDLE *device, EFI_HANDLE *ret_device, EFI_FILE ** if (err != EFI_SUCCESS) return err; - *ret_device = new_device; + if (ret_device) + *ret_device = new_device; *ret_root_dir = root_dir; return EFI_SUCCESS; } diff --git a/src/boot/efi/part-discovery.h b/src/boot/efi/part-discovery.h new file mode 100644 index 0000000000..5cc17f6b3b --- /dev/null +++ b/src/boot/efi/part-discovery.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ +#pragma once + +#include <efi.h> + +#define XBOOTLDR_GUID \ + &(const EFI_GUID) { 0xbc13c2ff, 0x59e6, 0x4262, { 0xa3, 0x52, 0xb2, 0x75, 0xfd, 0x6f, 0x71, 0x72 } } +#define ESP_GUID \ + &(const 
EFI_GUID) { 0xc12a7328, 0xf81f, 0x11d2, { 0xba, 0x4b, 0x00, 0xa0, 0xc9, 0x3e, 0xc9, 0x3b } } + +EFI_STATUS partition_open(const EFI_GUID *type, EFI_HANDLE *device, EFI_HANDLE *ret_device, EFI_FILE **ret_root_dir); diff --git a/src/boot/efi/random-seed.c b/src/boot/efi/random-seed.c index aea4f7e532..22ba1c5a30 100644 --- a/src/boot/efi/random-seed.c +++ b/src/boot/efi/random-seed.c @@ -14,11 +14,24 @@ #define EFI_RNG_GUID &(const EFI_GUID) EFI_RNG_PROTOCOL_GUID +struct linux_efi_random_seed { + uint32_t size; + uint8_t seed[]; +}; + +#define LINUX_EFI_RANDOM_SEED_TABLE_GUID \ + { 0x1ce1e5bc, 0x7ceb, 0x42f2, { 0x81, 0xe5, 0x8a, 0xad, 0xf1, 0x80, 0xf5, 0x7b } } + /* SHA256 gives us 256/8=32 bytes */ #define HASH_VALUE_SIZE 32 -static EFI_STATUS acquire_rng(UINTN size, void **ret) { - _cleanup_free_ void *data = NULL; +/* Linux's RNG is 256 bits, so let's provide this much */ +#define DESIRED_SEED_SIZE 32 + +/* Some basic domain separation in case somebody uses this data elsewhere */ +#define HASH_LABEL "systemd-boot random seed label v1" + +static EFI_STATUS acquire_rng(void *ret, UINTN size) { EFI_RNG_PROTOCOL *rng; EFI_STATUS err; @@ -32,126 +45,9 @@ static EFI_STATUS acquire_rng(UINTN size, void **ret) { if (!rng) return EFI_UNSUPPORTED; - data = xmalloc(size); - - err = rng->GetRNG(rng, NULL, size, data); + err = rng->GetRNG(rng, NULL, size, ret); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Failed to acquire RNG data: %r", err); - - *ret = TAKE_PTR(data); - return EFI_SUCCESS; -} - -static void hash_once( - const void *old_seed, - const void *rng, - UINTN size, - const void *system_token, - UINTN system_token_size, - uint64_t uefi_monotonic_counter, - UINTN counter, - uint8_t ret[static HASH_VALUE_SIZE]) { - - /* This hashes together: - * - * 1. The contents of the old seed file - * 2. Some random data acquired from the UEFI RNG (optional) - * 3. Some 'system token' the installer installed as EFI variable (optional) - * 4. 
The UEFI "monotonic counter" that increases with each boot - * 5. A supplied counter value - * - * And writes the result to the specified buffer. - */ - - struct sha256_ctx hash; - - assert(old_seed); - assert(system_token_size == 0 || system_token); - - sha256_init_ctx(&hash); - sha256_process_bytes(old_seed, size, &hash); - if (rng) - sha256_process_bytes(rng, size, &hash); - if (system_token_size > 0) - sha256_process_bytes(system_token, system_token_size, &hash); - sha256_process_bytes(&uefi_monotonic_counter, sizeof(uefi_monotonic_counter), &hash); - sha256_process_bytes(&counter, sizeof(counter), &hash); - sha256_finish_ctx(&hash, ret); -} - -static EFI_STATUS hash_many( - const void *old_seed, - const void *rng, - UINTN size, - const void *system_token, - UINTN system_token_size, - uint64_t uefi_monotonic_counter, - UINTN counter_start, - UINTN n, - void **ret) { - - _cleanup_free_ void *output = NULL; - - assert(old_seed); - assert(system_token_size == 0 || system_token); - assert(ret); - - /* Hashes the specified parameters in counter mode, generating n hash values, with the counter in the - * range counter_start…counter_start+n-1. 
*/ - - output = xmalloc_multiply(HASH_VALUE_SIZE, n); - - for (UINTN i = 0; i < n; i++) - hash_once(old_seed, rng, size, - system_token, system_token_size, - uefi_monotonic_counter, - counter_start + i, - (uint8_t*) output + (i * HASH_VALUE_SIZE)); - - *ret = TAKE_PTR(output); - return EFI_SUCCESS; -} - -static EFI_STATUS mangle_random_seed( - const void *old_seed, - const void *rng, - UINTN size, - const void *system_token, - UINTN system_token_size, - uint64_t uefi_monotonic_counter, - void **ret_new_seed, - void **ret_for_kernel) { - - _cleanup_free_ void *new_seed = NULL, *for_kernel = NULL; - EFI_STATUS err; - UINTN n; - - assert(old_seed); - assert(system_token_size == 0 || system_token); - assert(ret_new_seed); - assert(ret_for_kernel); - - /* This takes the old seed file contents, an (optional) random number acquired from the UEFI RNG, an - * (optional) system 'token' installed once by the OS installer in an EFI variable, and hashes them - * together in counter mode, generating a new seed (to replace the file on disk) and the seed for the - * kernel. To keep things simple, the new seed and kernel data have the same size as the old seed and - * RNG data. 
*/ - - n = (size + HASH_VALUE_SIZE - 1) / HASH_VALUE_SIZE; - - /* Begin hashing in counter mode at counter 0 for the new seed for the disk */ - err = hash_many(old_seed, rng, size, system_token, system_token_size, uefi_monotonic_counter, 0, n, &new_seed); - if (err != EFI_SUCCESS) - return err; - - /* Continue counting at 'n' for the seed for the kernel */ - err = hash_many(old_seed, rng, size, system_token, system_token_size, uefi_monotonic_counter, n, n, &for_kernel); - if (err != EFI_SUCCESS) - return err; - - *ret_new_seed = TAKE_PTR(new_seed); - *ret_for_kernel = TAKE_PTR(for_kernel); - return EFI_SUCCESS; } @@ -220,32 +116,82 @@ static void validate_sha256(void) { #endif } -EFI_STATUS process_random_seed(EFI_FILE *root_dir, RandomSeedMode mode) { - _cleanup_free_ void *seed = NULL, *new_seed = NULL, *rng = NULL, *for_kernel = NULL, *system_token = NULL; +EFI_STATUS process_random_seed(EFI_FILE *root_dir) { + _cleanup_erase_ uint8_t random_bytes[DESIRED_SEED_SIZE], hash_key[HASH_VALUE_SIZE]; + _cleanup_free_ struct linux_efi_random_seed *new_seed_table = NULL; + struct linux_efi_random_seed *previous_seed_table = NULL; + _cleanup_free_ void *seed = NULL, *system_token = NULL; _cleanup_(file_closep) EFI_FILE *handle = NULL; - UINTN size, rsize, wsize, system_token_size = 0; _cleanup_free_ EFI_FILE_INFO *info = NULL; + _cleanup_erase_ struct sha256_ctx hash; uint64_t uefi_monotonic_counter = 0; + size_t size, rsize, wsize; + bool seeded_by_efi = false; EFI_STATUS err; + EFI_TIME now; assert(root_dir); + assert_cc(DESIRED_SEED_SIZE == HASH_VALUE_SIZE); validate_sha256(); - if (mode == RANDOM_SEED_OFF) - return EFI_NOT_FOUND; + /* hash = LABEL || sizeof(input1) || input1 || ... 
|| sizeof(inputN) || inputN */ + sha256_init_ctx(&hash); + + /* Some basic domain separation in case somebody uses this data elsewhere */ + sha256_process_bytes(HASH_LABEL, sizeof(HASH_LABEL) - 1, &hash); + + for (size_t i = 0; i < ST->NumberOfTableEntries; ++i) + if (memcmp(&(const EFI_GUID)LINUX_EFI_RANDOM_SEED_TABLE_GUID, + &ST->ConfigurationTable[i].VendorGuid, sizeof(EFI_GUID)) == 0) { + previous_seed_table = ST->ConfigurationTable[i].VendorTable; + break; + } + if (!previous_seed_table) { + size = 0; + sha256_process_bytes(&size, sizeof(size), &hash); + } else { + size = previous_seed_table->size; + seeded_by_efi = size >= DESIRED_SEED_SIZE; + sha256_process_bytes(&size, sizeof(size), &hash); + sha256_process_bytes(previous_seed_table->seed, size, &hash); + + /* Zero and free the previous seed table only at the end after we've managed to install a new + * one, so that in case this function fails or aborts, Linux still receives whatever the + * previous bootloader chain set. So, the next line of this block is not an explicit_bzero() + * call. */ + } - /* Let's better be safe than sorry, and for now disable this logic in SecureBoot mode, so that we - * don't credit a random seed that is not authenticated. */ - if (secure_boot_enabled()) - return EFI_NOT_FOUND; + /* Request some random data from the UEFI RNG. We don't need this to work safely, but it's a good + * idea to use it because it helps us for cases where users mistakenly include a random seed in + * golden master images that are replicated many times. */ + err = acquire_rng(random_bytes, sizeof(random_bytes)); + if (err != EFI_SUCCESS) { + size = 0; + /* If we can't get any randomness from EFI itself, then we'll only be relying on what's in + * ESP. But ESP is mutable, so if secure boot is enabled, we probably shouldn't trust that + * alone, in which case we bail out early. 
*/ + if (!seeded_by_efi && secure_boot_enabled()) + return EFI_NOT_FOUND; + } else { + seeded_by_efi = true; + size = sizeof(random_bytes); + } + sha256_process_bytes(&size, sizeof(size), &hash); + sha256_process_bytes(random_bytes, size, &hash); /* Get some system specific seed that the installer might have placed in an EFI variable. We include * it in our hash. This is protection against golden master image sloppiness, and it remains on the * system, even when disk images are duplicated or swapped out. */ - err = acquire_system_token(&system_token, &system_token_size); - if (mode != RANDOM_SEED_ALWAYS && err != EFI_SUCCESS) + size = 0; + err = acquire_system_token(&system_token, &size); + if ((err != EFI_SUCCESS || size < DESIRED_SEED_SIZE) && !seeded_by_efi) return err; + sha256_process_bytes(&size, sizeof(size), &hash); + if (system_token) { + sha256_process_bytes(system_token, size, &hash); + explicit_bzero_safe(system_token, size); + } err = root_dir->Open( root_dir, @@ -261,7 +207,7 @@ EFI_STATUS process_random_seed(EFI_FILE *root_dir, RandomSeedMode mode) { err = get_file_info_harder(handle, &info, NULL); if (err != EFI_SUCCESS) - return log_error_status_stall(err, L"Failed to get file info for random seed: %r"); + return log_error_status_stall(err, L"Failed to get file info for random seed: %r", err); size = info->FileSize; if (size < RANDOM_MAX_SIZE_MIN) @@ -271,51 +217,114 @@ EFI_STATUS process_random_seed(EFI_FILE *root_dir, RandomSeedMode mode) { return log_error_status_stall(EFI_INVALID_PARAMETER, L"Random seed file is too large."); seed = xmalloc(size); - rsize = size; err = handle->Read(handle, &rsize, seed); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Failed to read random seed file: %r", err); - if (rsize != size) + if (rsize != size) { + explicit_bzero_safe(seed, rsize); return log_error_status_stall(EFI_PROTOCOL_ERROR, L"Short read on random seed file."); + } + + sha256_process_bytes(&size, sizeof(size), &hash); + 
sha256_process_bytes(seed, size, &hash); + explicit_bzero_safe(seed, size); err = handle->SetPosition(handle, 0); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Failed to seek to beginning of random seed file: %r", err); - /* Request some random data from the UEFI RNG. We don't need this to work safely, but it's a good - * idea to use it because it helps us for cases where users mistakenly include a random seed in - * golden master images that are replicated many times. */ - (void) acquire_rng(size, &rng); /* It's fine if this fails */ - /* Let's also include the UEFI monotonic counter (which is supposedly increasing on every single * boot) in the hash, so that even if the changes to the ESP for some reason should not be * persistent, the random seed we generate will still be different on every single boot. */ err = BS->GetNextMonotonicCount(&uefi_monotonic_counter); - if (err != EFI_SUCCESS) + if (err != EFI_SUCCESS && !seeded_by_efi) return log_error_status_stall(err, L"Failed to acquire UEFI monotonic counter: %r", err); + size = sizeof(uefi_monotonic_counter); + sha256_process_bytes(&size, sizeof(size), &hash); + sha256_process_bytes(&uefi_monotonic_counter, size, &hash); - /* Calculate new random seed for the disk and what to pass to the kernel */ - err = mangle_random_seed(seed, rng, size, system_token, system_token_size, uefi_monotonic_counter, &new_seed, &for_kernel); - if (err != EFI_SUCCESS) - return err; + err = RT->GetTime(&now, NULL); + size = err == EFI_SUCCESS ? sizeof(now) : 0; /* Known to be flaky, so don't bark on error. 
*/ + sha256_process_bytes(&size, sizeof(size), &hash); + sha256_process_bytes(&now, size, &hash); + /* hash_key = HASH(hash) */ + sha256_finish_ctx(&hash, hash_key); + + /* hash = hash_key || 0 */ + sha256_init_ctx(&hash); + sha256_process_bytes(hash_key, sizeof(hash_key), &hash); + sha256_process_bytes(&(const uint8_t){ 0 }, sizeof(uint8_t), &hash); + /* random_bytes = HASH(hash) */ + sha256_finish_ctx(&hash, random_bytes); + + size = sizeof(random_bytes); + /* If the file size is too large, zero out the remaining bytes on disk. */ + if (size < info->FileSize) { + err = handle->SetPosition(handle, size); + if (err != EFI_SUCCESS) + return log_error_status_stall(err, L"Failed to seek to offset of random seed file: %r", err); + wsize = info->FileSize - size; + err = handle->Write(handle, &wsize, seed /* All zeros now */); + if (err != EFI_SUCCESS) + return log_error_status_stall(err, L"Failed to write random seed file: %r", err); + if (wsize != info->FileSize - size) + return log_error_status_stall(EFI_PROTOCOL_ERROR, L"Short write on random seed file."); + err = handle->Flush(handle); + if (err != EFI_SUCCESS) + return log_error_status_stall(err, L"Failed to flush random seed file: %r", err); + err = handle->SetPosition(handle, 0); + if (err != EFI_SUCCESS) + return log_error_status_stall(err, L"Failed to seek to beginning of random seed file: %r", err); + + /* We could truncate the file here with something like: + * + * info->FileSize = size; + * err = handle->SetInfo(handle, &GenericFileInfo, info->Size, info); + * if (err != EFI_SUCCESS) + * return log_error_status_stall(err, L"Failed to truncate random seed file: %r", err); + * + * But this is considered slightly risky, because EFI filesystem drivers are a little bit + * flimsy. So instead we rely on userspace eventually truncating this when it writes a new + * seed. For now the best we do is zero it. 
*/ + } /* Update the random seed on disk before we use it */ wsize = size; - err = handle->Write(handle, &wsize, new_seed); + err = handle->Write(handle, &wsize, random_bytes); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Failed to write random seed file: %r", err); if (wsize != size) return log_error_status_stall(EFI_PROTOCOL_ERROR, L"Short write on random seed file."); - err = handle->Flush(handle); if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Failed to flush random seed file: %r", err); - /* We are good to go */ - err = efivar_set_raw(LOADER_GUID, L"LoaderRandomSeed", for_kernel, size, 0); + err = BS->AllocatePool(EfiACPIReclaimMemory, + offsetof(struct linux_efi_random_seed, seed) + DESIRED_SEED_SIZE, + (void **) &new_seed_table); if (err != EFI_SUCCESS) - return log_error_status_stall(err, L"Failed to write random seed to EFI variable: %r", err); + return log_error_status_stall(err, L"Failed to allocate EFI table for random seed: %r", err); + new_seed_table->size = DESIRED_SEED_SIZE; + + /* hash = hash_key || 1 */ + sha256_init_ctx(&hash); + sha256_process_bytes(hash_key, sizeof(hash_key), &hash); + sha256_process_bytes(&(const uint8_t){ 1 }, sizeof(uint8_t), &hash); + /* new_seed_table->seed = HASH(hash) */ + sha256_finish_ctx(&hash, new_seed_table->seed); + + err = BS->InstallConfigurationTable(&(EFI_GUID)LINUX_EFI_RANDOM_SEED_TABLE_GUID, new_seed_table); + if (err != EFI_SUCCESS) + return log_error_status_stall(err, L"Failed to install EFI table for random seed: %r", err); + TAKE_PTR(new_seed_table); + + if (previous_seed_table) { + /* Now that we've succeeded in installing the new table, we can safely nuke the old one. 
*/ + explicit_bzero_safe(previous_seed_table->seed, previous_seed_table->size); + explicit_bzero_safe(previous_seed_table, sizeof(*previous_seed_table)); + free(previous_seed_table); + } return EFI_SUCCESS; } diff --git a/src/boot/efi/random-seed.h b/src/boot/efi/random-seed.h index 6aa1cc5288..40aaf85860 100644 --- a/src/boot/efi/random-seed.h +++ b/src/boot/efi/random-seed.h @@ -2,21 +2,5 @@ #pragma once #include <efi.h> -#include <errno.h> -#include <uchar.h> -typedef enum RandomSeedMode { - RANDOM_SEED_OFF, - RANDOM_SEED_WITH_SYSTEM_TOKEN, - RANDOM_SEED_ALWAYS, - _RANDOM_SEED_MODE_MAX, - _RANDOM_SEED_MODE_INVALID = -EINVAL, -} RandomSeedMode; - -static const char16_t * const random_seed_modes_table[_RANDOM_SEED_MODE_MAX] = { - [RANDOM_SEED_OFF] = L"off", - [RANDOM_SEED_WITH_SYSTEM_TOKEN] = L"with-system-token", - [RANDOM_SEED_ALWAYS] = L"always", -}; - -EFI_STATUS process_random_seed(EFI_FILE *root_dir, RandomSeedMode mode); +EFI_STATUS process_random_seed(EFI_FILE *root_dir); diff --git a/src/boot/efi/secure-boot.c b/src/boot/efi/secure-boot.c index 171b2c96b3..65457bf423 100644 --- a/src/boot/efi/secure-boot.c +++ b/src/boot/efi/secure-boot.c @@ -127,69 +127,91 @@ out_deallocate: return err; } -static EFI_STATUS install_security_override_one(EFI_GUID guid, SecurityOverride *override) { - EFI_STATUS err; - - assert(override); - - _cleanup_free_ EFI_HANDLE *handles = NULL; - size_t n_handles = 0; +static struct SecurityOverride { + EFI_SECURITY_ARCH_PROTOCOL *security; + EFI_SECURITY2_ARCH_PROTOCOL *security2; + EFI_SECURITY_FILE_AUTHENTICATION_STATE original_hook; + EFI_SECURITY2_FILE_AUTHENTICATION original_hook2; + + security_validator_t validator; + const void *validator_ctx; +} security_override; + +static EFIAPI EFI_STATUS security_hook( + const EFI_SECURITY_ARCH_PROTOCOL *this, + uint32_t authentication_status, + const EFI_DEVICE_PATH *file) { + + assert(security_override.validator); + assert(security_override.security); + 
assert(security_override.original_hook); + + if (security_override.validator(security_override.validator_ctx, file, NULL, 0)) + return EFI_SUCCESS; - err = BS->LocateHandleBuffer(ByProtocol, &guid, NULL, &n_handles, &handles); - if (err != EFI_SUCCESS) - /* No security arch protocol around? */ - return err; + return security_override.original_hook(security_override.security, authentication_status, file); +} - /* There should only ever be one security arch protocol instance, but let's be paranoid here. */ - assert(n_handles == 1); +static EFIAPI EFI_STATUS security2_hook( + const EFI_SECURITY2_ARCH_PROTOCOL *this, + const EFI_DEVICE_PATH *device_path, + void *file_buffer, + size_t file_size, + BOOLEAN boot_policy) { - void *security = NULL; - err = BS->LocateProtocol(&guid, NULL, &security); - if (err != EFI_SUCCESS) - return log_error_status_stall(err, u"Error getting security arch protocol: %r", err); + assert(security_override.validator); + assert(security_override.security2); + assert(security_override.original_hook2); - err = BS->ReinstallProtocolInterface(handles[0], &guid, security, override); - if (err != EFI_SUCCESS) - return log_error_status_stall(err, u"Error overriding security arch protocol: %r", err); + if (security_override.validator(security_override.validator_ctx, device_path, file_buffer, file_size)) + return EFI_SUCCESS; - override->original = security; - override->original_handle = handles[0]; - return EFI_SUCCESS; + return security_override.original_hook2( + security_override.security2, device_path, file_buffer, file_size, boot_policy); } -/* This replaces the platform provided security arch protocols (defined in the UEFI Platform Initialization - * Specification) with the provided override instances. If not running in secure boot or the protocols are - * not available nothing happens. The override instances are provided with the necessary info to undo this - * in uninstall_security_override(). 
*/ -void install_security_override(SecurityOverride *override, SecurityOverride *override2) { - assert(override); - assert(override2); +/* This replaces the platform provided security arch protocols hooks (defined in the UEFI Platform + * Initialization Specification) with our own that uses the given validator to decide if a image is to be + * trusted. If not running in secure boot or the protocols are not available nothing happens. The override + * must be removed with uninstall_security_override() after LoadImage() has been called. + * + * This is a hack as we do not own the security protocol instances and modifying them is not an official part + * of their spec. But there is little else we can do to circumvent secure boot short of implementing our own + * PE loader. We could replace the firmware instances with our own instance using + * ReinstallProtocolInterface(), but some firmware will still use the old ones. */ +void install_security_override(security_validator_t validator, const void *validator_ctx) { + EFI_STATUS err; + + assert(validator); if (!secure_boot_enabled()) return; - (void) install_security_override_one((EFI_GUID) EFI_SECURITY_ARCH_PROTOCOL_GUID, override); - (void) install_security_override_one((EFI_GUID) EFI_SECURITY2_ARCH_PROTOCOL_GUID, override2); + security_override = (struct SecurityOverride) { + .validator = validator, + .validator_ctx = validator_ctx, + }; + + EFI_SECURITY_ARCH_PROTOCOL *security = NULL; + err = BS->LocateProtocol(&(EFI_GUID) EFI_SECURITY_ARCH_PROTOCOL_GUID, NULL, (void **) &security); + if (err == EFI_SUCCESS) { + security_override.security = security; + security_override.original_hook = security->FileAuthenticationState; + security->FileAuthenticationState = security_hook; + } + + EFI_SECURITY2_ARCH_PROTOCOL *security2 = NULL; + err = BS->LocateProtocol(&(EFI_GUID) EFI_SECURITY2_ARCH_PROTOCOL_GUID, NULL, (void **) &security2); + if (err == EFI_SUCCESS) { + security_override.security2 = security2; + 
security_override.original_hook2 = security2->FileAuthentication; + security2->FileAuthentication = security2_hook; + } } -void uninstall_security_override(SecurityOverride *override, SecurityOverride *override2) { - assert(override); - assert(override2); - - /* We use assert_se here to guarantee the system is not in a weird state in the unlikely case of an - * error restoring the original protocols. */ - - if (override->original_handle) - assert_se(BS->ReinstallProtocolInterface( - override->original_handle, - &(EFI_GUID) EFI_SECURITY_ARCH_PROTOCOL_GUID, - override, - override->original) == EFI_SUCCESS); - - if (override2->original_handle) - assert_se(BS->ReinstallProtocolInterface( - override2->original_handle, - &(EFI_GUID) EFI_SECURITY2_ARCH_PROTOCOL_GUID, - override2, - override2->original) == EFI_SUCCESS); +void uninstall_security_override(void) { + if (security_override.original_hook) + security_override.security->FileAuthenticationState = security_override.original_hook; + if (security_override.original_hook2) + security_override.security2->FileAuthentication = security_override.original_hook2; } diff --git a/src/boot/efi/secure-boot.h b/src/boot/efi/secure-boot.h index 91b6770edb..e98de81c2a 100644 --- a/src/boot/efi/secure-boot.h +++ b/src/boot/efi/secure-boot.h @@ -17,23 +17,11 @@ SecureBootMode secure_boot_mode(void); EFI_STATUS secure_boot_enroll_at(EFI_FILE *root_dir, const char16_t *path); -typedef struct { - void *hook; - - /* End of EFI_SECURITY_ARCH(2)_PROTOCOL. The rest is our own protocol instance data. */ - - EFI_HANDLE original_handle; - union { - void *original; - EFI_SECURITY_ARCH_PROTOCOL *original_security; - EFI_SECURITY2_ARCH_PROTOCOL *original_security2; - }; - - /* Used by the stub to identify the embedded image. 
*/ - const void *payload; - size_t payload_len; - const EFI_DEVICE_PATH *payload_device_path; -} SecurityOverride; - -void install_security_override(SecurityOverride *override, SecurityOverride *override2); -void uninstall_security_override(SecurityOverride *override, SecurityOverride *override2); +typedef bool (*security_validator_t)( + const void *ctx, + const EFI_DEVICE_PATH *device_path, + const void *file_buffer, + size_t file_size); + +void install_security_override(security_validator_t validator, const void *validator_ctx); +void uninstall_security_override(void); diff --git a/src/boot/efi/shim.c b/src/boot/efi/shim.c index 3ae058cb84..ac224336bc 100644 --- a/src/boot/efi/shim.c +++ b/src/boot/efi/shim.c @@ -23,7 +23,7 @@ #endif struct ShimLock { - EFI_STATUS __sysv_abi__ (*shim_verify) (void *buffer, uint32_t size); + EFI_STATUS __sysv_abi__ (*shim_verify) (const void *buffer, uint32_t size); /* context is actually a struct for the PE header, but it isn't needed so void is sufficient just do define the interface * see shim.c/shim.h and PeHeader.h in the github shim repo */ @@ -41,79 +41,45 @@ bool shim_loaded(void) { return BS->LocateProtocol((EFI_GUID*) SHIM_LOCK_GUID, NULL, (void**) &shim_lock) == EFI_SUCCESS; } -static bool shim_validate(void *data, uint32_t size) { - struct ShimLock *shim_lock; - - if (!data) - return false; - - if (BS->LocateProtocol((EFI_GUID*) SHIM_LOCK_GUID, NULL, (void**) &shim_lock) != EFI_SUCCESS) - return false; - - if (!shim_lock) - return false; - - return shim_lock->shim_verify(data, size) == EFI_SUCCESS; -} - -static EFIAPI EFI_STATUS security2_hook( - const SecurityOverride *this, - const EFI_DEVICE_PATH *device_path, - void *file_buffer, - UINTN file_size, - BOOLEAN boot_policy) { - - assert(this); - assert(this->hook == security2_hook); - - if (shim_validate(file_buffer, file_size)) - return EFI_SUCCESS; - - return this->original_security2->FileAuthentication( - this->original_security2, device_path, file_buffer, 
file_size, boot_policy); -} - -static EFIAPI EFI_STATUS security_hook( - const SecurityOverride *this, - uint32_t authentication_status, - const EFI_DEVICE_PATH *device_path) { +static bool shim_validate( + const void *ctx, const EFI_DEVICE_PATH *device_path, const void *file_buffer, size_t file_size) { EFI_STATUS err; + _cleanup_free_ char *file_buffer_owned = NULL; - assert(this); - assert(this->hook == security_hook); + if (!file_buffer) { + if (!device_path) + return false; - if (!device_path) - return this->original_security->FileAuthenticationState( - this->original_security, authentication_status, device_path); + EFI_HANDLE device_handle; + EFI_DEVICE_PATH *file_dp = (EFI_DEVICE_PATH *) device_path; + err = BS->LocateDevicePath(&FileSystemProtocol, &file_dp, &device_handle); + if (err != EFI_SUCCESS) + return false; - EFI_HANDLE device_handle; - EFI_DEVICE_PATH *dp = (EFI_DEVICE_PATH *) device_path; - err = BS->LocateDevicePath(&FileSystemProtocol, &dp, &device_handle); - if (err != EFI_SUCCESS) - return err; + _cleanup_(file_closep) EFI_FILE *root = NULL; + err = open_volume(device_handle, &root); + if (err != EFI_SUCCESS) + return false; - _cleanup_(file_closep) EFI_FILE *root = NULL; - err = open_volume(device_handle, &root); - if (err != EFI_SUCCESS) - return err; + _cleanup_free_ char16_t *dp_str = NULL; + err = device_path_to_str(file_dp, &dp_str); + if (err != EFI_SUCCESS) + return false; - _cleanup_free_ char16_t *dp_str = NULL; - err = device_path_to_str(dp, &dp_str); - if (err != EFI_SUCCESS) - return err; + err = file_read(root, dp_str, 0, 0, &file_buffer_owned, &file_size); + if (err != EFI_SUCCESS) + return false; - char *file_buffer; - size_t file_size; - err = file_read(root, dp_str, 0, 0, &file_buffer, &file_size); - if (err != EFI_SUCCESS) - return err; + file_buffer = file_buffer_owned; + } - if (shim_validate(file_buffer, file_size)) - return EFI_SUCCESS; + struct ShimLock *shim_lock; + err = BS->LocateProtocol((EFI_GUID *) SHIM_LOCK_GUID, 
NULL, (void **) &shim_lock); + if (err != EFI_SUCCESS) + return false; - return this->original_security->FileAuthenticationState( - this->original_security, authentication_status, device_path); + return shim_lock->shim_verify(file_buffer, file_size) == EFI_SUCCESS; } EFI_STATUS shim_load_image(EFI_HANDLE parent, const EFI_DEVICE_PATH *device_path, EFI_HANDLE *ret_image) { @@ -122,20 +88,14 @@ EFI_STATUS shim_load_image(EFI_HANDLE parent, const EFI_DEVICE_PATH *device_path bool have_shim = shim_loaded(); - SecurityOverride security_override = { - .hook = security_hook, - }, security2_override = { - .hook = security2_hook, - }; - if (have_shim) - install_security_override(&security_override, &security2_override); + install_security_override(shim_validate, NULL); EFI_STATUS ret = BS->LoadImage( /*BootPolicy=*/false, parent, (EFI_DEVICE_PATH *) device_path, NULL, 0, ret_image); if (have_shim) - uninstall_security_override(&security_override, &security2_override); + uninstall_security_override(); return ret; } diff --git a/src/boot/efi/splash.c b/src/boot/efi/splash.c index 5bc1084e62..25df97eb21 100644 --- a/src/boot/efi/splash.c +++ b/src/boot/efi/splash.c @@ -39,16 +39,11 @@ struct bmp_map { static EFI_STATUS bmp_parse_header( const uint8_t *bmp, - UINTN size, + size_t size, struct bmp_dib **ret_dib, struct bmp_map **ret_map, const uint8_t **pixmap) { - struct bmp_file *file; - struct bmp_dib *dib; - struct bmp_map *map; - UINTN row_size; - assert(bmp); assert(ret_dib); assert(ret_map); @@ -58,7 +53,7 @@ static EFI_STATUS bmp_parse_header( return EFI_INVALID_PARAMETER; /* check file header */ - file = (struct bmp_file *)bmp; + struct bmp_file *file = (struct bmp_file *) bmp; if (file->signature[0] != 'B' || file->signature[1] != 'M') return EFI_INVALID_PARAMETER; if (file->size != size) @@ -67,7 +62,7 @@ static EFI_STATUS bmp_parse_header( return EFI_INVALID_PARAMETER; /* check device-independent bitmap */ - dib = (struct bmp_dib *)(bmp + sizeof(struct bmp_file)); + 
struct bmp_dib *dib = (struct bmp_dib *) (bmp + sizeof(struct bmp_file)); if (dib->size < sizeof(struct bmp_dib)) return EFI_UNSUPPORTED; @@ -92,38 +87,26 @@ static EFI_STATUS bmp_parse_header( return EFI_UNSUPPORTED; } - row_size = ((UINTN) dib->depth * dib->x + 31) / 32 * 4; + size_t row_size = ((size_t) dib->depth * dib->x + 31) / 32 * 4; if (file->size - file->offset < dib->y * row_size) return EFI_INVALID_PARAMETER; if (row_size * dib->y > 64 * 1024 * 1024) return EFI_INVALID_PARAMETER; /* check color table */ - map = (struct bmp_map *)(bmp + sizeof(struct bmp_file) + dib->size); + struct bmp_map *map = (struct bmp_map *) (bmp + sizeof(struct bmp_file) + dib->size); if (file->offset < sizeof(struct bmp_file) + dib->size) return EFI_INVALID_PARAMETER; if (file->offset > sizeof(struct bmp_file) + dib->size) { - uint32_t map_count; - UINTN map_size; + uint32_t map_count = 0; if (dib->colors_used) map_count = dib->colors_used; - else { - switch (dib->depth) { - case 1: - case 4: - case 8: - map_count = 1 << dib->depth; - break; + else if (IN_SET(dib->depth, 1, 4, 8)) + map_count = 1 << dib->depth; - default: - map_count = 0; - break; - } - } - - map_size = file->offset - (sizeof(struct bmp_file) + dib->size); + size_t map_size = file->offset - (sizeof(struct bmp_file) + dib->size); if (map_size != sizeof(struct bmp_map) * map_count) return EFI_INVALID_PARAMETER; } @@ -135,28 +118,51 @@ static EFI_STATUS bmp_parse_header( return EFI_SUCCESS; } -static void pixel_blend(uint32_t *dst, const uint32_t source) { - uint32_t alpha, src, src_rb, src_g, dst_rb, dst_g, rb, g; - - assert(dst); - - alpha = (source & 0xff); - - /* convert src from RGBA to XRGB */ - src = source >> 8; +enum Channels { R, G, B, A, _CHANNELS_MAX }; +static void read_channel_maks( + const struct bmp_dib *dib, + uint32_t channel_mask[static _CHANNELS_MAX], + uint8_t channel_shift[static _CHANNELS_MAX], + uint8_t channel_scale[static _CHANNELS_MAX]) { - /* decompose into RB and G components */ - 
src_rb = (src & 0xff00ff); - src_g = (src & 0x00ff00); - - dst_rb = (*dst & 0xff00ff); - dst_g = (*dst & 0x00ff00); - - /* blend */ - rb = ((((src_rb - dst_rb) * alpha + 0x800080) >> 8) + dst_rb) & 0xff00ff; - g = ((((src_g - dst_g) * alpha + 0x008000) >> 8) + dst_g) & 0x00ff00; + assert(dib); - *dst = (rb | g); + if (IN_SET(dib->depth, 16, 32) && dib->size >= sizeof(*dib) + 3 * sizeof(uint32_t)) { + uint32_t *mask = (uint32_t *) ((uint8_t *) dib + sizeof(*dib)); + channel_mask[R] = mask[R]; + channel_mask[G] = mask[G]; + channel_mask[B] = mask[B]; + channel_shift[R] = __builtin_ctz(mask[R]); + channel_shift[G] = __builtin_ctz(mask[G]); + channel_shift[B] = __builtin_ctz(mask[B]); + channel_scale[R] = 0xff / ((1 << __builtin_popcount(mask[R])) - 1); + channel_scale[G] = 0xff / ((1 << __builtin_popcount(mask[G])) - 1); + channel_scale[B] = 0xff / ((1 << __builtin_popcount(mask[B])) - 1); + + if (dib->size >= sizeof(*dib) + 4 * sizeof(uint32_t) && mask[A] != 0) { + channel_mask[A] = mask[A]; + channel_shift[A] = __builtin_ctz(mask[A]); + channel_scale[A] = 0xff / ((1 << __builtin_popcount(mask[A])) - 1); + } else { + channel_mask[A] = 0; + channel_shift[A] = 0; + channel_scale[A] = 0; + } + } else { + bool bpp16 = dib->depth == 16; + channel_mask[R] = bpp16 ? 0x7C00 : 0xFF0000; + channel_mask[G] = bpp16 ? 0x03E0 : 0x00FF00; + channel_mask[B] = bpp16 ? 0x001F : 0x0000FF; + channel_mask[A] = bpp16 ? 0x0000 : 0x000000; + channel_shift[R] = bpp16 ? 0xA : 0x10; + channel_shift[G] = bpp16 ? 0x5 : 0x08; + channel_shift[B] = bpp16 ? 0x0 : 0x00; + channel_shift[A] = bpp16 ? 0x0 : 0x00; + channel_scale[R] = bpp16 ? 0x08 : 0x1; + channel_scale[G] = bpp16 ? 0x08 : 0x1; + channel_scale[B] = bpp16 ? 0x08 : 0x1; + channel_scale[A] = bpp16 ? 
0x00 : 0x0; + } } static EFI_STATUS bmp_to_blt( @@ -172,17 +178,19 @@ static EFI_STATUS bmp_to_blt( assert(map); assert(pixmap); + uint32_t channel_mask[_CHANNELS_MAX]; + uint8_t channel_shift[_CHANNELS_MAX], channel_scale[_CHANNELS_MAX]; + read_channel_maks(dib, channel_mask, channel_shift, channel_scale); + /* transform and copy pixels */ in = pixmap; - for (UINTN y = 0; y < dib->y; y++) { - EFI_GRAPHICS_OUTPUT_BLT_PIXEL *out; - UINTN row_size; + for (uint32_t y = 0; y < dib->y; y++) { + EFI_GRAPHICS_OUTPUT_BLT_PIXEL *out = &buf[(dib->y - y - 1) * dib->x]; - out = &buf[(dib->y - y - 1) * dib->x]; - for (UINTN x = 0; x < dib->x; x++, in++, out++) { + for (uint32_t x = 0; x < dib->x; x++, in++, out++) { switch (dib->depth) { case 1: { - for (UINTN i = 0; i < 8 && x < dib->x; i++) { + for (unsigned i = 0; i < 8 && x < dib->x; i++) { out->Red = map[((*in) >> (7 - i)) & 1].red; out->Green = map[((*in) >> (7 - i)) & 1].green; out->Blue = map[((*in) >> (7 - i)) & 1].blue; @@ -195,9 +203,7 @@ static EFI_STATUS bmp_to_blt( } case 4: { - UINTN i; - - i = (*in) >> 4; + unsigned i = (*in) >> 4; out->Red = map[i].red; out->Green = map[i].green; out->Blue = map[i].blue; @@ -218,16 +224,6 @@ static EFI_STATUS bmp_to_blt( out->Blue = map[*in].blue; break; - case 16: { - uint16_t i = *(uint16_t *) in; - - out->Red = (i & 0x7c00) >> 7; - out->Green = (i & 0x3e0) >> 2; - out->Blue = (i & 0x1f) << 3; - in += 1; - break; - } - case 24: out->Red = in[2]; out->Green = in[1]; @@ -235,34 +231,42 @@ static EFI_STATUS bmp_to_blt( in += 2; break; + case 16: case 32: { - uint32_t i = *(uint32_t *) in; + uint32_t i = dib->depth == 16 ? 
*(uint16_t *) in : *(uint32_t *) in; + + uint8_t r = ((i & channel_mask[R]) >> channel_shift[R]) * channel_scale[R], + g = ((i & channel_mask[G]) >> channel_shift[G]) * channel_scale[G], + b = ((i & channel_mask[B]) >> channel_shift[B]) * channel_scale[B], + a = 0xFFu; + if (channel_mask[A] != 0) + a = ((i & channel_mask[A]) >> channel_shift[A]) * channel_scale[A]; - pixel_blend((uint32_t *)out, i); + out->Red = (out->Red * (0xFFu - a) + r * a) >> 8; + out->Green = (out->Green * (0xFFu - a) + g * a) >> 8; + out->Blue = (out->Blue * (0xFFu - a) + b * a) >> 8; - in += 3; + in += dib->depth == 16 ? 1 : 3; break; } } } /* add row padding; new lines always start at 32 bit boundary */ - row_size = in - pixmap; + size_t row_size = in - pixmap; in += ((row_size + 3) & ~3) - row_size; } return EFI_SUCCESS; } -EFI_STATUS graphics_splash(const uint8_t *content, UINTN len) { +EFI_STATUS graphics_splash(const uint8_t *content, size_t len) { EFI_GRAPHICS_OUTPUT_BLT_PIXEL background = {}; EFI_GRAPHICS_OUTPUT_PROTOCOL *GraphicsOutput = NULL; struct bmp_dib *dib; struct bmp_map *map; const uint8_t *pixmap; - _cleanup_free_ void *blt = NULL; - UINTN x_pos = 0; - UINTN y_pos = 0; + size_t x_pos = 0, y_pos = 0; EFI_STATUS err; if (len == 0) @@ -297,9 +301,9 @@ EFI_STATUS graphics_splash(const uint8_t *content, UINTN len) { if (err != EFI_SUCCESS) return err; - /* EFI buffer */ - blt = xnew(EFI_GRAPHICS_OUTPUT_BLT_PIXEL, dib->x * dib->y); - + /* Read in current screen content to perform proper alpha blending. 
*/ + _cleanup_free_ EFI_GRAPHICS_OUTPUT_BLT_PIXEL *blt = xnew( + EFI_GRAPHICS_OUTPUT_BLT_PIXEL, dib->x * dib->y); err = GraphicsOutput->Blt( GraphicsOutput, blt, EfiBltVideoToBltBuffer, x_pos, y_pos, 0, 0, diff --git a/src/boot/efi/stub.c b/src/boot/efi/stub.c index a842c5c679..023f8ae255 100644 --- a/src/boot/efi/stub.c +++ b/src/boot/efi/stub.c @@ -9,7 +9,9 @@ #include "graphics.h" #include "linux.h" #include "measure.h" +#include "part-discovery.h" #include "pe.h" +#include "random-seed.h" #include "secure-boot.h" #include "splash.h" #include "tpm-pcr.h" @@ -84,6 +86,7 @@ static void export_variables(EFI_LOADED_IMAGE_PROTOCOL *loaded_image) { EFI_STUB_FEATURE_PICK_UP_CREDENTIALS | /* We pick up credentials from the boot partition */ EFI_STUB_FEATURE_PICK_UP_SYSEXTS | /* We pick up system extensions from the boot partition */ EFI_STUB_FEATURE_THREE_PCRS | /* We can measure kernel image, parameters and sysext */ + EFI_STUB_FEATURE_RANDOM_SEED | /* We pass a random seed to the kernel */ 0; char16_t uuid[37]; @@ -130,18 +133,65 @@ static void export_variables(EFI_LOADED_IMAGE_PROTOCOL *loaded_image) { (void) efivar_set_uint64_le(LOADER_GUID, L"StubFeatures", stub_features, 0); } +static bool use_load_options( + EFI_HANDLE stub_image, + EFI_LOADED_IMAGE_PROTOCOL *loaded_image, + bool have_cmdline, + char16_t **ret) { + + assert(stub_image); + assert(loaded_image); + assert(ret); + + /* We only allow custom command lines if we aren't in secure boot or if no cmdline was baked into + * the stub image. */ + if (secure_boot_enabled() && have_cmdline) + return false; + + /* We also do a superficial check whether first character of passed command line + * is printable character (for compat with some Dell systems which fill in garbage?). */ + if (loaded_image->LoadOptionsSize < sizeof(char16_t) || ((char16_t *) loaded_image->LoadOptions)[0] <= 0x1F) + return false; + + /* The UEFI shell registers EFI_SHELL_PARAMETERS_PROTOCOL onto images it runs. 
This lets us know that + * LoadOptions starts with the stub binary path which we want to strip off. */ + EFI_SHELL_PARAMETERS_PROTOCOL *shell; + if (BS->HandleProtocol(stub_image, &(EFI_GUID) EFI_SHELL_PARAMETERS_PROTOCOL_GUID, (void **) &shell) + != EFI_SUCCESS) { + /* Not running from EFI shell, use entire LoadOptions. Note that LoadOptions is a void*, so + * it could be anything! */ + *ret = xstrndup16(loaded_image->LoadOptions, loaded_image->LoadOptionsSize / sizeof(char16_t)); + mangle_stub_cmdline(*ret); + return true; + } + + if (shell->Argc < 2) + /* No arguments were provided? Then we fall back to built-in cmdline. */ + return false; + + /* Assemble the command line ourselves without our stub path. */ + *ret = xstrdup16(shell->Argv[1]); + for (size_t i = 2; i < shell->Argc; i++) { + _cleanup_free_ char16_t *old = *ret; + *ret = xpool_print(u"%s %s", old, shell->Argv[i]); + } + + mangle_stub_cmdline(*ret); + return true; +} + EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { _cleanup_free_ void *credential_initrd = NULL, *global_credential_initrd = NULL, *sysext_initrd = NULL, *pcrsig_initrd = NULL, *pcrpkey_initrd = NULL; - UINTN credential_initrd_size = 0, global_credential_initrd_size = 0, sysext_initrd_size = 0, pcrsig_initrd_size = 0, pcrpkey_initrd_size = 0; - UINTN cmdline_len = 0, linux_size, initrd_size, dt_size; + size_t credential_initrd_size = 0, global_credential_initrd_size = 0, sysext_initrd_size = 0, pcrsig_initrd_size = 0, pcrpkey_initrd_size = 0; + size_t linux_size, initrd_size, dt_size; EFI_PHYSICAL_ADDRESS linux_base, initrd_base, dt_base; _cleanup_(devicetree_cleanup) struct devicetree_state dt_state = {}; EFI_LOADED_IMAGE_PROTOCOL *loaded_image; - UINTN addrs[_UNIFIED_SECTION_MAX] = {}, szs[_UNIFIED_SECTION_MAX] = {}; - char *cmdline = NULL; - _cleanup_free_ char *cmdline_owned = NULL; + size_t addrs[_UNIFIED_SECTION_MAX] = {}, szs[_UNIFIED_SECTION_MAX] = {}; + _cleanup_free_ char16_t *cmdline = NULL; int 
sections_measured = -1, parameters_measured = -1; bool sysext_measured = false, m; + uint64_t loader_features = 0; EFI_STATUS err; InitializeLib(image, sys_table); @@ -159,6 +209,15 @@ EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { if (err != EFI_SUCCESS) return log_error_status_stall(err, L"Error getting a LoadedImageProtocol handle: %r", err); + if (efivar_get_uint64_le(LOADER_GUID, L"LoaderFeatures", &loader_features) != EFI_SUCCESS || + !FLAGS_SET(loader_features, EFI_LOADER_FEATURE_RANDOM_SEED)) { + _cleanup_(file_closep) EFI_FILE *esp_dir = NULL; + + err = partition_open(ESP_GUID, loaded_image->DeviceHandle, NULL, &esp_dir); + if (err == EFI_SUCCESS) /* Non-fatal on failure, so that we still boot without it. */ + (void) process_random_seed(esp_dir); + } + err = pe_memory_locate_sections(loaded_image->ImageBase, unified_sections, addrs, szs); if (err != EFI_SUCCESS || szs[UNIFIED_SECTION_LINUX] == 0) { if (err == EFI_SUCCESS) @@ -208,32 +267,19 @@ EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { /* Show splash screen as early as possible */ graphics_splash((const uint8_t*) loaded_image->ImageBase + addrs[UNIFIED_SECTION_SPLASH], szs[UNIFIED_SECTION_SPLASH]); - if (szs[UNIFIED_SECTION_CMDLINE] > 0) { - cmdline = (char *) loaded_image->ImageBase + addrs[UNIFIED_SECTION_CMDLINE]; - cmdline_len = szs[UNIFIED_SECTION_CMDLINE]; - } - - /* if we are not in secure boot mode, or none was provided, accept a custom command line and replace - * the built-in one. We also do a superficial check whether first character of passed command line - * is printable character (for compat with some Dell systems which fill in garbage?). 
*/ - if ((!secure_boot_enabled() || cmdline_len == 0) && - loaded_image->LoadOptionsSize > 0 && - ((char16_t *) loaded_image->LoadOptions)[0] > 0x1F) { - cmdline_len = (loaded_image->LoadOptionsSize / sizeof(char16_t)) * sizeof(char); - cmdline = cmdline_owned = xnew(char, cmdline_len); - - for (UINTN i = 0; i < cmdline_len; i++) { - char16_t c = ((char16_t *) loaded_image->LoadOptions)[i]; - cmdline[i] = c > 0x1F && c < 0x7F ? c : ' '; /* convert non-printable and non_ASCII characters to spaces. */ - } - + if (use_load_options(image, loaded_image, szs[UNIFIED_SECTION_CMDLINE] > 0, &cmdline)) { /* Let's measure the passed kernel command line into the TPM. Note that this possibly * duplicates what we already did in the boot menu, if that was already used. However, since * we want the boot menu to support an EFI binary, and want to this stub to be usable from * any boot menu, let's measure things anyway. */ m = false; - (void) tpm_log_load_options(loaded_image->LoadOptions, &m); + (void) tpm_log_load_options(cmdline, &m); parameters_measured = m; + } else if (szs[UNIFIED_SECTION_CMDLINE] > 0) { + cmdline = xstrn8_to_16( + (char *) loaded_image->ImageBase + addrs[UNIFIED_SECTION_CMDLINE], + szs[UNIFIED_SECTION_CMDLINE]); + mangle_stub_cmdline(cmdline); } export_variables(loaded_image); @@ -374,7 +420,7 @@ EFI_STATUS efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *sys_table) { log_error_stall(L"Error loading embedded devicetree: %r", err); } - err = linux_exec(image, cmdline, cmdline_len, + err = linux_exec(image, cmdline, PHYSICAL_ADDRESS_TO_POINTER(linux_base), linux_size, PHYSICAL_ADDRESS_TO_POINTER(initrd_base), initrd_size); graphics_mode(false); diff --git a/src/boot/efi/test-efi-string.c b/src/boot/efi/test-efi-string.c index 2b2359fe5c..7b43e1d629 100644 --- a/src/boot/efi/test-efi-string.c +++ b/src/boot/efi/test-efi-string.c @@ -324,6 +324,33 @@ TEST(xstrdup16) { free(s); } +TEST(xstrn8_to_16) { + char16_t *s = NULL; + + assert_se(xstrn8_to_16(NULL, 1) == NULL); 
+ assert_se(xstrn8_to_16("a", 0) == NULL); + + assert_se(s = xstrn8_to_16("", 1)); + assert_se(streq16(s, u"")); + free(s); + + assert_se(s = xstrn8_to_16("1", 1)); + assert_se(streq16(s, u"1")); + free(s); + + assert_se(s = xstr8_to_16("abcxyzABCXYZ09 .,-_#*!\"§$%&/()=?`~")); + assert_se(streq16(s, u"abcxyzABCXYZ09 .,-_#*!\"§$%&/()=?`~")); + free(s); + + assert_se(s = xstr8_to_16("ÿⱿ𝇉 😺")); + assert_se(streq16(s, u"ÿⱿ ")); + free(s); + + assert_se(s = xstrn8_to_16("¶¶", 3)); + assert_se(streq16(s, u"¶")); + free(s); +} + #define TEST_FNMATCH_ONE(pattern, haystack, expect) \ ({ \ assert_se(fnmatch(pattern, haystack, 0) == (expect ? 0 : FNM_NOMATCH)); \ diff --git a/src/boot/efi/util.c b/src/boot/efi/util.c index 5547d288de..f9aeeb4833 100644 --- a/src/boot/efi/util.c +++ b/src/boot/efi/util.c @@ -244,127 +244,36 @@ void efivar_set_time_usec(const EFI_GUID *vendor, const char16_t *name, uint64_t efivar_set(vendor, name, str, 0); } -static int utf8_to_16(const char *stra, char16_t *c) { - char16_t unichar; - UINTN len; - - assert(stra); - assert(c); - - if (!(stra[0] & 0x80)) - len = 1; - else if ((stra[0] & 0xe0) == 0xc0) - len = 2; - else if ((stra[0] & 0xf0) == 0xe0) - len = 3; - else if ((stra[0] & 0xf8) == 0xf0) - len = 4; - else if ((stra[0] & 0xfc) == 0xf8) - len = 5; - else if ((stra[0] & 0xfe) == 0xfc) - len = 6; - else - return -1; - - switch (len) { - case 1: - unichar = stra[0]; - break; - case 2: - unichar = stra[0] & 0x1f; - break; - case 3: - unichar = stra[0] & 0x0f; - break; - case 4: - unichar = stra[0] & 0x07; - break; - case 5: - unichar = stra[0] & 0x03; - break; - case 6: - unichar = stra[0] & 0x01; - break; - } - - for (UINTN i = 1; i < len; i++) { - if ((stra[i] & 0xc0) != 0x80) - return -1; - unichar <<= 6; - unichar |= stra[i] & 0x3f; - } - - *c = unichar; - return len; -} - -char16_t *xstra_to_str(const char *stra) { - UINTN strlen; - UINTN len; - UINTN i; - char16_t *str; - - assert(stra); +void convert_efi_path(char16_t *path) { + 
assert(path); - len = strlen8(stra); - str = xnew(char16_t, len + 1); + for (size_t i = 0, fixed = 0;; i++) { + /* Fix device path node separator. */ + path[fixed] = (path[i] == '/') ? '\\' : path[i]; - strlen = 0; - i = 0; - while (i < len) { - int utf8len; - - utf8len = utf8_to_16(stra + i, str + strlen); - if (utf8len <= 0) { - /* invalid utf8 sequence, skip the garbage */ - i++; + /* Double '\' is not allowed in EFI file paths. */ + if (fixed > 0 && path[fixed - 1] == '\\' && path[fixed] == '\\') continue; - } - strlen++; - i += utf8len; + if (path[i] == '\0') + break; + + fixed++; } - str[strlen] = '\0'; - return str; } -char16_t *xstra_to_path(const char *stra) { - char16_t *str; - UINTN strlen; - UINTN len; - UINTN i; - - assert(stra); - - len = strlen8(stra); - str = xnew(char16_t, len + 2); - - str[0] = '\\'; - strlen = 1; - i = 0; - while (i < len) { - int utf8len; - - utf8len = utf8_to_16(stra + i, str + strlen); - if (utf8len <= 0) { - /* invalid utf8 sequence, skip the garbage */ - i++; - continue; - } - - if (str[strlen] == '/') - str[strlen] = '\\'; - if (str[strlen] == '\\' && str[strlen-1] == '\\') { - /* skip double slashes */ - i += utf8len; - continue; - } +char16_t *xstr8_to_path(const char *str8) { + assert(str8); + char16_t *path = xstr8_to_16(str8); + convert_efi_path(path); + return path; +} - strlen++; - i += utf8len; - } - str[strlen] = '\0'; - return str; +void mangle_stub_cmdline(char16_t *cmdline) { + for (; *cmdline != '\0'; cmdline++) + /* Convert ASCII control characters to spaces. 
*/ + if (*cmdline <= 0x1F) + *cmdline = ' '; } EFI_STATUS file_read(EFI_FILE *dir, const char16_t *name, UINTN off, UINTN size, char **ret, UINTN *ret_size) { @@ -772,19 +681,51 @@ EFI_STATUS make_file_device_path(EFI_HANDLE device, const char16_t *file, EFI_DE EFI_STATUS device_path_to_str(const EFI_DEVICE_PATH *dp, char16_t **ret) { EFI_DEVICE_PATH_TO_TEXT_PROTOCOL *dp_to_text; EFI_STATUS err; + _cleanup_free_ char16_t *str = NULL; assert(dp); assert(ret); err = BS->LocateProtocol(&(EFI_GUID) EFI_DEVICE_PATH_TO_TEXT_PROTOCOL_GUID, NULL, (void **) &dp_to_text); - if (err != EFI_SUCCESS) - return err; + if (err != EFI_SUCCESS) { + /* If the device path to text protocol is not available we can still do a best-effort attempt + * to convert it ourselves if we are given filepath-only device path. */ + + size_t size = 0; + for (const EFI_DEVICE_PATH *node = dp; !IsDevicePathEnd(node); + node = NextDevicePathNode(node)) { + + if (DevicePathType(node) != MEDIA_DEVICE_PATH || + DevicePathSubType(node) != MEDIA_FILEPATH_DP) + return err; + + size_t path_size = DevicePathNodeLength(node); + if (path_size <= offsetof(FILEPATH_DEVICE_PATH, PathName) || path_size % sizeof(char16_t)) + return EFI_INVALID_PARAMETER; + path_size -= offsetof(FILEPATH_DEVICE_PATH, PathName); + + _cleanup_free_ char16_t *old = str; + str = xmalloc(size + path_size); + if (old) { + memcpy(str, old, size); + str[size / sizeof(char16_t) - 1] = '\\'; + } + + memcpy(str + (size / sizeof(char16_t)), + ((uint8_t *) node) + offsetof(FILEPATH_DEVICE_PATH, PathName), + path_size); + size += path_size; + } + + *ret = TAKE_PTR(str); + return EFI_SUCCESS; + } - char16_t *str = dp_to_text->ConvertDevicePathToText(dp, false, false); + str = dp_to_text->ConvertDevicePathToText(dp, false, false); if (!str) return EFI_OUT_OF_RESOURCES; - *ret = str; + *ret = TAKE_PTR(str); return EFI_SUCCESS; } diff --git a/src/boot/efi/util.h b/src/boot/efi/util.h index b33c50f9fc..08f732f484 100644 --- a/src/boot/efi/util.h +++ 
b/src/boot/efi/util.h @@ -10,6 +10,17 @@ #define UINTN_MAX (~(UINTN)0) #define INTN_MAX ((INTN)(UINTN_MAX>>1)) +#ifndef __has_attribute +#define __has_attribute(x) 0 +#endif +#if __has_attribute(__error__) +__attribute__((noreturn)) extern void __assert_cl_failure__(void) __attribute__((__error__("compile-time assertion failed"))); +#else +__attribute__((noreturn)) extern void __assert_cl_failure__(void); +#endif +/* assert_cl generates a later-stage compile-time assertion when constant folding occurs. */ +#define assert_cl(condition) ({ if (!(condition)) __assert_cl_failure__(); }) + /* gnu-efi format specifiers for integers are fixed to either 64bit with 'l' and 32bit without a size prefix. * We rely on %u/%d/%x to format regular ints, so ensure the size is what we expect. At the same time, we also * need specifiers for (U)INTN which are native (pointer) sized. */ @@ -43,6 +54,20 @@ static inline void freep(void *p) { #define _cleanup_free_ _cleanup_(freep) +static __always_inline void erase_obj(void *p) { +#ifdef __OPTIMIZE__ + size_t l; + assert_cl(p); + l = __builtin_object_size(p, 0); + assert_cl(l != (size_t) -1); + explicit_bzero_safe(p, l); +#else +#warning "Object will not be erased with -O0; do not release to production." 
+#endif +} + +#define _cleanup_erase_ _cleanup_(erase_obj) + _malloc_ _alloc_(1) _returns_nonnull_ _warn_unused_result_ static inline void *xmalloc(size_t size) { void *p; @@ -112,8 +137,9 @@ EFI_STATUS efivar_get_uint32_le(const EFI_GUID *vendor, const char16_t *name, ui EFI_STATUS efivar_get_uint64_le(const EFI_GUID *vendor, const char16_t *name, uint64_t *ret); EFI_STATUS efivar_get_boolean_u8(const EFI_GUID *vendor, const char16_t *name, bool *ret); -char16_t *xstra_to_path(const char *stra); -char16_t *xstra_to_str(const char *stra); +void convert_efi_path(char16_t *path); +char16_t *xstr8_to_path(const char *stra); +void mangle_stub_cmdline(char16_t *cmdline); EFI_STATUS file_read(EFI_FILE *dir, const char16_t *name, UINTN off, UINTN size, char **content, UINTN *content_size); diff --git a/src/boot/efi/vmm.c b/src/boot/efi/vmm.c new file mode 100644 index 0000000000..b1bfd778fc --- /dev/null +++ b/src/boot/efi/vmm.c @@ -0,0 +1,130 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ + +#include <efi.h> +#include <efilib.h> +#include <stdbool.h> + +#include "drivers.h" +#include "efi-string.h" +#include "string-util-fundamental.h" +#include "util.h" + +#define QEMU_KERNEL_LOADER_FS_MEDIA_GUID \ + { 0x1428f772, 0xb64a, 0x441e, {0xb8, 0xc3, 0x9e, 0xbd, 0xd7, 0xf8, 0x93, 0xc7 }} + +#define VMM_BOOT_ORDER_GUID \ + { 0x668f4529, 0x63d0, 0x4bb5, {0xb6, 0x5d, 0x6f, 0xbb, 0x9d, 0x36, 0xa4, 0x4a }} + +/* detect direct boot */ +bool is_direct_boot(EFI_HANDLE device) { + EFI_STATUS err; + VENDOR_DEVICE_PATH *dp; + + err = BS->HandleProtocol(device, &DevicePathProtocol, (void **) &dp); + if (err != EFI_SUCCESS) + return false; + + /* 'qemu -kernel systemd-bootx64.efi' */ + if (dp->Header.Type == MEDIA_DEVICE_PATH && + dp->Header.SubType == MEDIA_VENDOR_DP && + memcmp(&dp->Guid, &(EFI_GUID)QEMU_KERNEL_LOADER_FS_MEDIA_GUID, sizeof(EFI_GUID)) == 0) + return true; + + /* loaded from firmware volume (sd-boot added to ovmf) */ + if (dp->Header.Type == MEDIA_DEVICE_PATH && + 
dp->Header.SubType == MEDIA_PIWG_FW_VOL_DP)
+                return true;
+
+        return false;
+}
+
+static bool device_path_startswith(const EFI_DEVICE_PATH *dp, const EFI_DEVICE_PATH *start) {
+        if (!start)
+                return true;
+        if (!dp)
+                return false;
+        for (;;) {
+                if (IsDevicePathEnd(start))
+                        return true;
+                if (IsDevicePathEnd(dp))
+                        return false;
+                size_t l1 = DevicePathNodeLength(start);
+                size_t l2 = DevicePathNodeLength(dp);
+                if (l1 != l2)
+                        return false;
+                if (memcmp(dp, start, l1) != 0)
+                        return false;
+                start = NextDevicePathNode(start);
+                dp = NextDevicePathNode(dp);
+        }
+}
+
+/*
+ * Try find ESP when not loaded from ESP
+ *
+ * Inspect all filesystems known to the firmware, try find the ESP. In case VMMBootOrderNNNN variables are
+ * present they are used to inspect the filesystems in the specified order. When nothing was found or the
+ * variables are not present the function will do one final search pass over all filesystems.
+ *
+ * Recent OVMF builds store the qemu boot order (as specified using the bootindex property on the qemu
+ * command line) in VMMBootOrderNNNN. The variables contain a device path.
+ *
+ * Example qemu command line:
+ *   qemu -virtio-scsi-pci,addr=14.0 -device scsi-cd,scsi-id=4,bootindex=1
+ *
+ * Resulting variable:
+ *   VMMBootOrder0000 = PciRoot(0x0)/Pci(0x14,0x0)/Scsi(0x4,0x0)
+ */
+EFI_STATUS vmm_open(EFI_HANDLE *ret_vmm_dev, EFI_FILE **ret_vmm_dir) {
+        _cleanup_free_ EFI_HANDLE *handles = NULL;
+        size_t n_handles;
+        EFI_STATUS err, dp_err;
+
+        assert(ret_vmm_dev);
+        assert(ret_vmm_dir);
+
+        /* find all file system handles */
+        err = BS->LocateHandleBuffer(ByProtocol, &FileSystemProtocol, NULL, &n_handles, &handles);
+        if (err != EFI_SUCCESS)
+                return err;
+
+        for (size_t order = 0;; order++) {
+                _cleanup_free_ EFI_DEVICE_PATH *dp = NULL;
+                char16_t order_str[STRLEN("VMMBootOrder") + 4 + 1];
+
+                SPrint(order_str, sizeof(order_str), u"VMMBootOrder%04x", order);
+                dp_err = efivar_get_raw(&(EFI_GUID)VMM_BOOT_ORDER_GUID, order_str, (char**)&dp, NULL);
+
+                for (size_t i = 0; i < n_handles; i++) {
+                        _cleanup_(file_closep) EFI_FILE *root_dir = NULL, *efi_dir = NULL;
+                        EFI_DEVICE_PATH *fs;
+
+                        err = BS->HandleProtocol(handles[i], &DevicePathProtocol, (void **) &fs);
+                        if (err != EFI_SUCCESS)
+                                return err;
+
+                        /* check against VMMBootOrderNNNN (if set) */
+                        if (dp_err == EFI_SUCCESS && !device_path_startswith(fs, dp))
+                                continue;
+
+                        err = open_volume(handles[i], &root_dir);
+                        if (err != EFI_SUCCESS)
+                                continue;
+
+                        /* simple ESP check */
+                        err = root_dir->Open(root_dir, &efi_dir, (char16_t*) u"\\EFI",
+                                             EFI_FILE_MODE_READ,
+                                             EFI_FILE_READ_ONLY | EFI_FILE_DIRECTORY);
+                        if (err != EFI_SUCCESS)
+                                continue;
+
+                        *ret_vmm_dev = handles[i];
+                        *ret_vmm_dir = TAKE_PTR(root_dir);
+                        return EFI_SUCCESS;
+                }
+
+                if (dp_err != EFI_SUCCESS)
+                        return EFI_NOT_FOUND;
+        }
+        assert_not_reached();
+}
diff --git a/src/boot/efi/vmm.h b/src/boot/efi/vmm.h
new file mode 100644
index 0000000000..7bac1a324a
--- /dev/null
+++ b/src/boot/efi/vmm.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: LGPL-2.1-or-later */
+#pragma once
+
+#include <efi.h>
+#include <efilib.h>
+
+bool
is_direct_boot(EFI_HANDLE device); +EFI_STATUS vmm_open(EFI_HANDLE *ret_qemu_dev, EFI_FILE **ret_qemu_dir); diff --git a/src/boot/efi/xbootldr.h b/src/boot/efi/xbootldr.h deleted file mode 100644 index 205ce71edf..0000000000 --- a/src/boot/efi/xbootldr.h +++ /dev/null @@ -1,9 +0,0 @@ -/* SPDX-License-Identifier: LGPL-2.1-or-later */ -#pragma once - -#include <efi.h> - -#define XBOOTLDR_GUID \ - &(const EFI_GUID) { 0xbc13c2ff, 0x59e6, 0x4262, { 0xa3, 0x52, 0xb2, 0x75, 0xfd, 0x6f, 0x71, 0x72 } } - -EFI_STATUS xbootldr_open(EFI_HANDLE *device, EFI_HANDLE *ret_device, EFI_FILE **ret_root_dir); diff --git a/src/boot/measure.c b/src/boot/measure.c index b9cd2853b6..913cf18ee6 100644 --- a/src/boot/measure.c +++ b/src/boot/measure.c @@ -898,7 +898,7 @@ static int verb_sign(int argc, char *argv[], void *userdata) { } _cleanup_free_ void *sig = malloc(ss); - if (!ss) { + if (!sig) { r = log_oom(); goto finish; } diff --git a/src/boot/pcrphase.c b/src/boot/pcrphase.c index a77a85fb2e..9ae1709253 100644 --- a/src/boot/pcrphase.c +++ b/src/boot/pcrphase.c @@ -6,6 +6,7 @@ #include "build.h" #include "efivars.h" +#include "env-util.h" #include "main-func.h" #include "openssl-util.h" #include "parse-util.h" @@ -175,21 +176,33 @@ static int run(int argc, char *argv[]) { length = strlen(word); + int b = getenv_bool("SYSTEMD_PCRPHASE_STUB_VERIFY"); + if (b < 0 && b != -ENXIO) + log_warning_errno(b, "Unable to parse $SYSTEMD_PCRPHASE_STUB_VERIFY value, ignoring."); + /* Skip logic if sd-stub is not used, after all PCR 11 might have a very different purpose then. 
*/ r = efi_get_variable_string(EFI_LOADER_VARIABLE(StubPcrKernelImage), &pcr_string); if (r == -ENOENT) { - log_info("Kernel stub did not measure kernel image into PCR %u, skipping measurement.", TPM_PCR_INDEX_KERNEL_IMAGE); - return EXIT_SUCCESS; - } - if (r < 0) + if (b != 0) { + log_info("Kernel stub did not measure kernel image into PCR %u, skipping measurement.", TPM_PCR_INDEX_KERNEL_IMAGE); + return EXIT_SUCCESS; + } else + log_notice("Kernel stub did not measure kernel image into PCR %u, but told to measure anyway, hence proceeding.", TPM_PCR_INDEX_KERNEL_IMAGE); + } else if (r < 0) return log_error_errno(r, "Failed to read StubPcrKernelImage EFI variable: %m"); - - /* Let's validate that the stub announced PCR 11 as we expected. */ - r = safe_atou(pcr_string, &pcr_nr); - if (r < 0) - return log_error_errno(r, "Failed to parse StubPcrKernelImage EFI variable: %s", pcr_string); - if (pcr_nr != TPM_PCR_INDEX_KERNEL_IMAGE) - return log_error_errno(SYNTHETIC_ERRNO(EREMOTE), "Kernel stub measured kernel image into PCR %u, which is different than expected %u.", pcr_nr, TPM_PCR_INDEX_KERNEL_IMAGE); + else { + /* Let's validate that the stub announced PCR 11 as we expected. 
*/ + r = safe_atou(pcr_string, &pcr_nr); + if (r < 0) + return log_error_errno(r, "Failed to parse StubPcrKernelImage EFI variable: %s", pcr_string); + if (pcr_nr != TPM_PCR_INDEX_KERNEL_IMAGE) { + if (b != 0) + return log_error_errno(SYNTHETIC_ERRNO(EREMOTE), "Kernel stub measured kernel image into PCR %u, which is different than expected %u.", pcr_nr, TPM_PCR_INDEX_KERNEL_IMAGE); + else + log_notice("Kernel stub measured kernel image into PCR %u, which is different than expected %u, but told to measure anyway, hence proceeding.", pcr_nr, TPM_PCR_INDEX_KERNEL_IMAGE); + } else + log_debug("Kernel stub reported same PCR %u as we want to use, proceeding.", TPM_PCR_INDEX_KERNEL_IMAGE); + } r = dlopen_tpm2(); if (r < 0) diff --git a/src/cgtop/cgtop.c b/src/cgtop/cgtop.c index cf51024dcb..cef5b654e7 100644 --- a/src/cgtop/cgtop.c +++ b/src/cgtop/cgtop.c @@ -56,6 +56,12 @@ typedef struct Group { uint64_t io_input_bps, io_output_bps; } Group; +typedef enum PidsCount { + COUNT_USERSPACE_PROCESSES, + COUNT_ALL_PROCESSES, + COUNT_PIDS, +} PidsCount; + static unsigned arg_depth = 3; static unsigned arg_iterations = UINT_MAX; static bool arg_batch = false; @@ -66,11 +72,7 @@ static char* arg_root = NULL; static bool arg_recursive = true; static bool arg_recursive_unset = false; -static enum { - COUNT_PIDS, - COUNT_USERSPACE_PROCESSES, - COUNT_ALL_PROCESSES, -} arg_count = COUNT_PIDS; +static PidsCount arg_count = COUNT_PIDS; static enum { ORDER_PATH, @@ -916,6 +918,7 @@ static int run(int argc, char *argv[]) { usec_t last_refresh = 0; bool quit = false, immediate_refresh = false; _cleanup_free_ char *root = NULL; + PidsCount possible_count; CGroupMask mask; int r; @@ -929,7 +932,8 @@ static int run(int argc, char *argv[]) { if (r < 0) return log_error_errno(r, "Failed to determine supported controllers: %m"); - arg_count = (mask & CGROUP_MASK_PIDS) ? COUNT_PIDS : COUNT_USERSPACE_PROCESSES; + possible_count = (mask & CGROUP_MASK_PIDS) ? 
COUNT_PIDS : COUNT_ALL_PROCESSES; + arg_count = MIN(possible_count, arg_count); if (arg_recursive_unset && arg_count == COUNT_PIDS) return log_error_errno(SYNTHETIC_ERRNO(EINVAL), diff --git a/src/core/cgroup.c b/src/core/cgroup.c index c44966839c..1e9cb758de 100644 --- a/src/core/cgroup.c +++ b/src/core/cgroup.c @@ -151,6 +151,7 @@ void cgroup_context_init(CGroupContext *c) { .memory_high = CGROUP_LIMIT_MAX, .memory_max = CGROUP_LIMIT_MAX, .memory_swap_max = CGROUP_LIMIT_MAX, + .memory_zswap_max = CGROUP_LIMIT_MAX, .memory_limit = CGROUP_LIMIT_MAX, @@ -354,6 +355,9 @@ static int unit_compare_memory_limit(Unit *u, const char *property_name, uint64_ } else if (streq(property_name, "MemorySwapMax")) { unit_value = c->memory_swap_max; file = "memory.swap.max"; + } else if (streq(property_name, "MemoryZSwapMax")) { + unit_value = c->memory_zswap_max; + file = "memory.zswap.max"; } else return -EINVAL; @@ -396,9 +400,10 @@ static char *format_cgroup_memory_limit_comparison(char *buf, size_t l, Unit *u, /* memory.swap.max is special in that it relies on CONFIG_MEMCG_SWAP (and the default swapaccount=1). * In the absence of reliably being able to detect whether memcg swap support is available or not, - * only complain if the error is not ENOENT. */ + * only complain if the error is not ENOENT. This is similarly the case for memory.zswap.max relying + * on CONFIG_ZSWAP. 
*/ if (r > 0 || IN_SET(r, -ENODATA, -EOWNERDEAD) || - (r == -ENOENT && streq(property_name, "MemorySwapMax"))) + (r == -ENOENT && STR_IN_SET(property_name, "MemorySwapMax", "MemoryZSwapMax"))) buf[0] = 0; else if (r < 0) { errno = -r; @@ -462,6 +467,7 @@ void cgroup_context_dump(Unit *u, FILE* f, const char *prefix) { "%sMemoryHigh: %" PRIu64 "%s\n" "%sMemoryMax: %" PRIu64 "%s\n" "%sMemorySwapMax: %" PRIu64 "%s\n" + "%sMemoryZSwapMax: %" PRIu64 "%s\n" "%sMemoryLimit: %" PRIu64 "\n" "%sTasksMax: %" PRIu64 "\n" "%sDevicePolicy: %s\n" @@ -498,6 +504,7 @@ void cgroup_context_dump(Unit *u, FILE* f, const char *prefix) { prefix, c->memory_high, format_cgroup_memory_limit_comparison(cdc, sizeof(cdc), u, "MemoryHigh"), prefix, c->memory_max, format_cgroup_memory_limit_comparison(cdd, sizeof(cdd), u, "MemoryMax"), prefix, c->memory_swap_max, format_cgroup_memory_limit_comparison(cde, sizeof(cde), u, "MemorySwapMax"), + prefix, c->memory_zswap_max, format_cgroup_memory_limit_comparison(cde, sizeof(cde), u, "MemoryZSwapMax"), prefix, c->memory_limit, prefix, tasks_max_resolve(&c->tasks_max), prefix, cgroup_device_policy_to_string(c->device_policy), @@ -1209,7 +1216,7 @@ static bool unit_has_unified_memory_config(Unit *u) { return unit_get_ancestor_memory_min(u) > 0 || unit_get_ancestor_memory_low(u) > 0 || c->memory_high != CGROUP_LIMIT_MAX || c->memory_max != CGROUP_LIMIT_MAX || - c->memory_swap_max != CGROUP_LIMIT_MAX; + c->memory_swap_max != CGROUP_LIMIT_MAX || c->memory_zswap_max != CGROUP_LIMIT_MAX; } static void cgroup_apply_unified_memory_limit(Unit *u, const char *file, uint64_t v) { @@ -1569,11 +1576,12 @@ static void cgroup_context_apply( if ((apply_mask & CGROUP_MASK_MEMORY) && !is_local_root) { if (cg_all_unified() > 0) { - uint64_t max, swap_max = CGROUP_LIMIT_MAX; + uint64_t max, swap_max = CGROUP_LIMIT_MAX, zswap_max = CGROUP_LIMIT_MAX; if (unit_has_unified_memory_config(u)) { max = c->memory_max; swap_max = c->memory_swap_max; + zswap_max = 
c->memory_zswap_max; } else { max = c->memory_limit; @@ -1586,6 +1594,7 @@ static void cgroup_context_apply( cgroup_apply_unified_memory_limit(u, "memory.high", c->memory_high); cgroup_apply_unified_memory_limit(u, "memory.max", max); cgroup_apply_unified_memory_limit(u, "memory.swap.max", swap_max); + cgroup_apply_unified_memory_limit(u, "memory.zswap.max", zswap_max); (void) set_attribute_and_warn(u, "memory", "memory.oom.group", one_zero(c->memory_oom_group)); diff --git a/src/core/cgroup.h b/src/core/cgroup.h index 4413eeaaa0..09352bafc6 100644 --- a/src/core/cgroup.h +++ b/src/core/cgroup.h @@ -149,6 +149,7 @@ struct CGroupContext { uint64_t memory_high; uint64_t memory_max; uint64_t memory_swap_max; + uint64_t memory_zswap_max; bool default_memory_min_set:1; bool default_memory_low_set:1; diff --git a/src/core/crash-handler.c b/src/core/crash-handler.c index 561b7fc19c..6983f2e2b7 100644 --- a/src/core/crash-handler.c +++ b/src/core/crash-handler.c @@ -49,7 +49,7 @@ _noreturn_ static void crash(int sig, siginfo_t *siginfo, void *context) { if (getpid_cached() != 1) /* Pass this on immediately, if this is not PID 1 */ - (void) raise(sig); + propagate_signal(sig, siginfo); else if (!arg_dump_core) log_emergency("Caught <%s>, not dumping core.", signal_to_string(sig)); else { @@ -79,9 +79,7 @@ _noreturn_ static void crash(int sig, siginfo_t *siginfo, void *context) { (void) chdir("/"); /* Raise the signal again */ - pid = raw_getpid(); - (void) kill(pid, sig); /* raise() would kill the parent */ - + propagate_signal(sig, siginfo); assert_not_reached(); _exit(EXIT_EXCEPTION); } else { diff --git a/src/core/dbus-cgroup.c b/src/core/dbus-cgroup.c index cbadb5bc44..b5484eda78 100644 --- a/src/core/dbus-cgroup.c +++ b/src/core/dbus-cgroup.c @@ -468,6 +468,7 @@ const sd_bus_vtable bus_cgroup_vtable[] = { SD_BUS_PROPERTY("MemoryHigh", "t", NULL, offsetof(CGroupContext, memory_high), 0), SD_BUS_PROPERTY("MemoryMax", "t", NULL, offsetof(CGroupContext, memory_max), 0), 
SD_BUS_PROPERTY("MemorySwapMax", "t", NULL, offsetof(CGroupContext, memory_swap_max), 0), + SD_BUS_PROPERTY("MemoryZSwapMax", "t", NULL, offsetof(CGroupContext, memory_zswap_max), 0), SD_BUS_PROPERTY("MemoryLimit", "t", NULL, offsetof(CGroupContext, memory_limit), 0), SD_BUS_PROPERTY("DevicePolicy", "s", property_get_cgroup_device_policy, offsetof(CGroupContext, device_policy), 0), SD_BUS_PROPERTY("DeviceAllow", "a(ss)", property_get_device_allow, 0, 0), @@ -887,6 +888,7 @@ BUS_DEFINE_SET_CGROUP_WEIGHT(blockio_weight, CGROUP_MASK_BLKIO, CGROUP_BLKIO_WEI BUS_DEFINE_SET_CGROUP_LIMIT(memory, CGROUP_MASK_MEMORY, physical_memory_scale, 1); BUS_DEFINE_SET_CGROUP_LIMIT(memory_protection, CGROUP_MASK_MEMORY, physical_memory_scale, 0); BUS_DEFINE_SET_CGROUP_LIMIT(swap, CGROUP_MASK_MEMORY, physical_memory_scale, 0); +BUS_DEFINE_SET_CGROUP_LIMIT(zswap, CGROUP_MASK_MEMORY, physical_memory_scale, 0); REENABLE_WARNING; static int bus_cgroup_set_cpu_weight( @@ -1075,6 +1077,9 @@ int bus_cgroup_set_property( if (streq(name, "MemorySwapMax")) return bus_cgroup_set_swap(u, name, &c->memory_swap_max, message, flags, error); + if (streq(name, "MemoryZSwapMax")) + return bus_cgroup_set_zswap(u, name, &c->memory_zswap_max, message, flags, error); + if (streq(name, "MemoryMax")) return bus_cgroup_set_memory(u, name, &c->memory_max, message, flags, error); @@ -1115,6 +1120,9 @@ int bus_cgroup_set_property( if (streq(name, "MemorySwapMaxScale")) return bus_cgroup_set_swap_scale(u, name, &c->memory_swap_max, message, flags, error); + if (streq(name, "MemoryZSwapMaxScale")) + return bus_cgroup_set_zswap_scale(u, name, &c->memory_zswap_max, message, flags, error); + if (streq(name, "MemoryMaxScale")) return bus_cgroup_set_memory_scale(u, name, &c->memory_max, message, flags, error); diff --git a/src/core/dbus-manager.c b/src/core/dbus-manager.c index 88f098ec86..ab2617153a 100644 --- a/src/core/dbus-manager.c +++ b/src/core/dbus-manager.c @@ -694,31 +694,31 @@ static int 
method_start_unit_generic(sd_bus_message *message, Manager *m, JobTyp } static int method_start_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_START, false, error); + return method_start_unit_generic(message, userdata, JOB_START, /* reload_if_possible = */ false, error); } static int method_stop_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_STOP, false, error); + return method_start_unit_generic(message, userdata, JOB_STOP, /* reload_if_possible = */ false, error); } static int method_reload_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_RELOAD, false, error); + return method_start_unit_generic(message, userdata, JOB_RELOAD, /* reload_if_possible = */ false, error); } static int method_restart_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_RESTART, false, error); + return method_start_unit_generic(message, userdata, JOB_RESTART, /* reload_if_possible = */ false, error); } static int method_try_restart_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_TRY_RESTART, false, error); + return method_start_unit_generic(message, userdata, JOB_TRY_RESTART, /* reload_if_possible = */ false, error); } static int method_reload_or_restart_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_RESTART, true, error); + return method_start_unit_generic(message, userdata, JOB_RESTART, /* reload_if_possible = */ true, error); } static int method_reload_or_try_restart_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_start_unit_generic(message, userdata, JOB_TRY_RESTART, true, error); + 
return method_start_unit_generic(message, userdata, JOB_TRY_RESTART, /* reload_if_possible = */ true, error); } typedef enum GenericUnitOperationFlags { @@ -786,7 +786,7 @@ static int method_start_unit_replace(sd_bus_message *message, void *userdata, sd if (!u->job || u->job->type != JOB_START) return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_JOB, "No job queued for unit %s", old_name); - return method_start_unit_generic(message, m, JOB_START, false, error); + return method_start_unit_generic(message, m, JOB_START, /* reload_if_possible = */ false, error); } static int method_kill_unit(sd_bus_message *message, void *userdata, sd_bus_error *error) { @@ -2350,19 +2350,19 @@ static int method_enable_unit_files_generic( } static int method_enable_unit_files_with_flags(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_enable, true, error); + return method_enable_unit_files_generic(message, userdata, unit_file_enable, /* carries_install_info = */ true, error); } static int method_enable_unit_files(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_enable, true, error); + return method_enable_unit_files_generic(message, userdata, unit_file_enable, /* carries_install_info = */ true, error); } static int method_reenable_unit_files(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_reenable, true, error); + return method_enable_unit_files_generic(message, userdata, unit_file_reenable, /* carries_install_info = */ true, error); } static int method_link_unit_files(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_link, false, error); + return method_enable_unit_files_generic(message, userdata, unit_file_link, /* carries_install_info = */ false, 
error); } static int unit_file_preset_without_mode(LookupScope scope, UnitFileFlags flags, const char *root_dir, char **files, InstallChange **changes, size_t *n_changes) { @@ -2370,11 +2370,11 @@ static int unit_file_preset_without_mode(LookupScope scope, UnitFileFlags flags, } static int method_preset_unit_files(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_preset_without_mode, true, error); + return method_enable_unit_files_generic(message, userdata, unit_file_preset_without_mode, /* carries_install_info = */ true, error); } static int method_mask_unit_files(sd_bus_message *message, void *userdata, sd_bus_error *error) { - return method_enable_unit_files_generic(message, userdata, unit_file_mask, false, error); + return method_enable_unit_files_generic(message, userdata, unit_file_mask, /* carries_install_info = */ false, error); } static int method_preset_unit_files_with_mode(sd_bus_message *message, void *userdata, sd_bus_error *error) { diff --git a/src/core/device.c b/src/core/device.c index 224fc90835..6e07f2745b 100644 --- a/src/core/device.c +++ b/src/core/device.c @@ -135,6 +135,7 @@ static void device_done(Unit *u) { assert(d); device_unset_sysfs(d); + d->deserialized_sysfs = mfree(d->deserialized_sysfs); d->wants_property = strv_free(d->wants_property); d->path = mfree(d->path); } @@ -267,7 +268,7 @@ static int device_coldplug(Unit *u) { * 1. MANAGER_IS_RUNNING() == false * 2. enumerate devices: manager_enumerate() -> device_enumerate() * Device.enumerated_found is set. - * 3. deserialize devices: manager_deserialize() -> device_deserialize() + * 3. deserialize devices: manager_deserialize() -> device_deserialize_item() * Device.deserialize_state and Device.deserialized_found are set. * 4. coldplug devices: manager_coldplug() -> device_coldplug() * deserialized properties are copied to the main properties. 
@@ -282,23 +283,41 @@ static int device_coldplug(Unit *u) { * * - On switch-root, the udev database may be cleared, except for devices with sticky bit, i.e. * OPTIONS="db_persist". Hence, almost no devices are enumerated in the step 2. However, in - * general, we have several serialized devices. So, DEVICE_FOUND_UDEV bit in the deserialized_found - * must be ignored, as udev rules in initrd and the main system are often different. If the - * deserialized state is DEVICE_PLUGGED, we need to downgrade it to DEVICE_TENTATIVE. Unlike the - * other starting mode, MANAGER_IS_SWITCHING_ROOT() is true when device_coldplug() and - * device_catchup() are called. Hence, let's conditionalize the operations by using the - * flag. After switch-root, systemd-udevd will (re-)process all devices, and the Device.found and - * Device.state will be adjusted. + * general, we have several serialized devices. So, DEVICE_FOUND_UDEV bit in the + * Device.deserialized_found must be ignored, as udev rules in initrd and the main system are often + * different. If the deserialized state is DEVICE_PLUGGED, we need to downgrade it to + * DEVICE_TENTATIVE. Unlike the other starting mode, MANAGER_IS_SWITCHING_ROOT() is true when + * device_coldplug() and device_catchup() are called. Hence, let's conditionalize the operations by + * using the flag. After switch-root, systemd-udevd will (re-)process all devices, and the + * Device.found and Device.state will be adjusted. * - * - On reload or reexecute, we can trust enumerated_found, deserialized_found, and deserialized_state. - * Of course, deserialized parameters may be outdated, but the unit state can be adjusted later by - * device_catchup() or uevents. */ + * - On reload or reexecute, we can trust Device.enumerated_found, Device.deserialized_found, and + * Device.deserialized_state. Of course, deserialized parameters may be outdated, but the unit + * state can be adjusted later by device_catchup() or uevents. 
*/ if (MANAGER_IS_SWITCHING_ROOT(m) && !FLAGS_SET(d->enumerated_found, DEVICE_FOUND_UDEV)) { - found &= ~DEVICE_FOUND_UDEV; /* ignore DEVICE_FOUND_UDEV bit */ + + /* The device has not been enumerated. On switching-root, such situation is natural. See the + * above comment. To prevent problematic state transition active → dead → active, let's + * drop the DEVICE_FOUND_UDEV flag and downgrade state to DEVICE_TENTATIVE(activating). See + * issue #12953 and #23208. */ + found &= ~DEVICE_FOUND_UDEV; if (state == DEVICE_PLUGGED) - state = DEVICE_TENTATIVE; /* downgrade state */ + state = DEVICE_TENTATIVE; + + /* Also check the validity of the device syspath. Without this check, if the device was + * removed while switching root, it would never go to inactive state, as both Device.found + * and Device.enumerated_found do not have the DEVICE_FOUND_UDEV flag, so device_catchup() in + * device_update_found_one() does nothing in most cases. See issue #25106. Note that the + * syspath field is only serialized when systemd is sufficiently new and the device has been + * already processed by udevd. 
*/ + if (d->deserialized_sysfs) { + _cleanup_(sd_device_unrefp) sd_device *dev = NULL; + + if (sd_device_new_from_syspath(&dev, d->deserialized_sysfs) < 0) + state = DEVICE_DEAD; + } } if (d->found == found && d->state == state) @@ -387,6 +406,9 @@ static int device_serialize(Unit *u, FILE *f, FDSet *fds) { assert(f); assert(fds); + if (d->sysfs) + (void) serialize_item(f, "sysfs", d->sysfs); + if (d->path) (void) serialize_item(f, "path", d->path); @@ -408,7 +430,14 @@ static int device_deserialize_item(Unit *u, const char *key, const char *value, assert(value); assert(fds); - if (streq(key, "path")) { + if (streq(key, "sysfs")) { + if (!d->deserialized_sysfs) { + d->deserialized_sysfs = strdup(value); + if (!d->deserialized_sysfs) + log_oom_debug(); + } + + } else if (streq(key, "path")) { if (!d->path) { d->path = strdup(value); if (!d->path) diff --git a/src/core/device.h b/src/core/device.h index 7584bc70c4..9dd6fb57c2 100644 --- a/src/core/device.h +++ b/src/core/device.h @@ -20,7 +20,7 @@ typedef enum DeviceFound { struct Device { Unit meta; - char *sysfs; + char *sysfs, *deserialized_sysfs; char *path; /* syspath, device node, alias, or devlink */ /* In order to be able to distinguish dependencies on different device nodes we might end up creating multiple diff --git a/src/core/efi-random.c b/src/core/efi-random.c index 4086b12739..61516775fc 100644 --- a/src/core/efi-random.c +++ b/src/core/efi-random.c @@ -12,79 +12,23 @@ #include "random-util.h" #include "strv.h" -/* If a random seed was passed by the boot loader in the LoaderRandomSeed EFI variable, let's credit it to - * the kernel's random pool, but only once per boot. If this is run very early during initialization we can - * instantly boot up with a filled random pool. - * - * This makes no judgement on the entropy passed, it's the job of the boot loader to only pass us a seed that - * is suitably validated. 
*/ - -static void lock_down_efi_variables(void) { +void lock_down_efi_variables(void) { + _cleanup_close_ int fd = -1; int r; + fd = open(EFIVAR_PATH(EFI_LOADER_VARIABLE(LoaderSystemToken)), O_RDONLY|O_CLOEXEC); + if (fd < 0) { + if (errno != ENOENT) + log_warning_errno(errno, "Unable to open LoaderSystemToken EFI variable, ignoring: %m"); + return; + } + /* Paranoia: let's restrict access modes of these a bit, so that unprivileged users can't use them to * identify the system or gain too much insight into what we might have credited to the entropy * pool. */ - FOREACH_STRING(path, - EFIVAR_PATH(EFI_LOADER_VARIABLE(LoaderRandomSeed)), - EFIVAR_PATH(EFI_LOADER_VARIABLE(LoaderSystemToken))) { - - r = chattr_path(path, 0, FS_IMMUTABLE_FL, NULL); - if (r == -ENOENT) - continue; - if (r < 0) - log_warning_errno(r, "Failed to drop FS_IMMUTABLE_FL from %s, ignoring: %m", path); - - if (chmod(path, 0600) < 0) - log_warning_errno(errno, "Failed to reduce access mode of %s, ignoring: %m", path); - } -} - -int efi_take_random_seed(void) { - _cleanup_free_ void *value = NULL; - size_t size; - int r; - - /* Paranoia comes first. */ - lock_down_efi_variables(); - - if (access("/run/systemd/efi-random-seed-taken", F_OK) < 0) { - if (errno != ENOENT) { - log_warning_errno(errno, "Failed to determine whether we already used the random seed token, not using it."); - return 0; - } - - /* ENOENT means we haven't used it yet. 
*/ - } else { - log_debug("EFI random seed already used, not using again."); - return 0; - } - - r = efi_get_variable(EFI_LOADER_VARIABLE(LoaderRandomSeed), NULL, &value, &size); - if (r == -EOPNOTSUPP) { - log_debug_errno(r, "System lacks EFI support, not initializing random seed from EFI variable."); - return 0; - } - if (r == -ENOENT) { - log_debug_errno(r, "Boot loader did not pass LoaderRandomSeed EFI variable, not crediting any entropy."); - return 0; - } + r = chattr_fd(fd, 0, FS_IMMUTABLE_FL, NULL); if (r < 0) - return log_warning_errno(r, "Failed to read LoaderRandomSeed EFI variable, ignoring: %m"); - - if (size == 0) - return log_warning_errno(SYNTHETIC_ERRNO(EINVAL), "Random seed passed from boot loader has zero size? Ignoring."); - - /* Before we use the seed, let's mark it as used, so that we never credit it twice. Also, it's a nice - * way to let users known that we successfully acquired entropy from the boot loader. */ - r = touch("/run/systemd/efi-random-seed-taken"); - if (r < 0) - return log_warning_errno(r, "Unable to mark EFI random seed as used, not using it: %m"); - - r = random_write_entropy(-1, value, size, true); - if (r < 0) - return log_warning_errno(errno, "Failed to credit entropy, ignoring: %m"); - - log_info("Successfully credited entropy passed from boot loader."); - return 1; + log_warning_errno(r, "Failed to drop FS_IMMUTABLE_FL from LoaderSystemToken EFI variable, ignoring: %m"); + if (fchmod(fd, 0600) < 0) + log_warning_errno(errno, "Failed to reduce access mode of LoaderSystemToken EFI variable, ignoring: %m"); } diff --git a/src/core/efi-random.h b/src/core/efi-random.h index 7d20fff57d..87166c9e3f 100644 --- a/src/core/efi-random.h +++ b/src/core/efi-random.h @@ -1,4 +1,4 @@ /* SPDX-License-Identifier: LGPL-2.1-or-later */ #pragma once -int efi_take_random_seed(void); +void lock_down_efi_variables(void); diff --git a/src/core/kmod-setup.c b/src/core/kmod-setup.c index 966631d44e..dcbc28205f 100644 --- a/src/core/kmod-setup.c 
+++ b/src/core/kmod-setup.c @@ -5,6 +5,7 @@ #include "alloc-util.h" #include "bus-util.h" #include "capability-util.h" +#include "efi-api.h" #include "fileio.h" #include "kmod-setup.h" #include "macro.h" @@ -99,27 +100,32 @@ int kmod_setup(void) { } kmod_table[] = { /* This one we need to load explicitly, since auto-loading on use doesn't work * before udev created the ghost device nodes, and we need it earlier than that. */ - { "autofs4", "/sys/class/misc/autofs", true, false, NULL }, + { "autofs4", "/sys/class/misc/autofs", true, false, NULL }, /* This one we need to load explicitly, since auto-loading of IPv6 is not done when * we try to configure ::1 on the loopback device. */ - { "ipv6", "/sys/module/ipv6", false, true, NULL }, + { "ipv6", "/sys/module/ipv6", false, true, NULL }, /* This should never be a module */ - { "unix", "/proc/net/unix", true, true, NULL }, + { "unix", "/proc/net/unix", true, true, NULL }, #if HAVE_LIBIPTC /* netfilter is needed by networkd, nspawn among others, and cannot be autoloaded */ - { "ip_tables", "/proc/net/ip_tables_names", false, false, NULL }, + { "ip_tables", "/proc/net/ip_tables_names", false, false, NULL }, #endif /* virtio_rng would be loaded by udev later, but real entropy might be needed very early */ - { "virtio_rng", NULL, false, false, has_virtio_rng }, + { "virtio_rng", NULL, false, false, has_virtio_rng }, /* qemu_fw_cfg would be loaded by udev later, but we want to import credentials from it super early */ - { "qemu_fw_cfg", "/sys/firmware/qemu_fw_cfg", false, false, in_qemu }, + { "qemu_fw_cfg", "/sys/firmware/qemu_fw_cfg", false, false, in_qemu }, /* dmi-sysfs is needed to import credentials from it super early */ - { "dmi-sysfs", "/sys/firmware/dmi/entries", false, false, NULL }, + { "dmi-sysfs", "/sys/firmware/dmi/entries", false, false, NULL }, + +#if HAVE_TPM2 + /* Make sure the tpm subsystem is available which ConditionSecurity=tpm2 depends on. 
*/ + { "tpm", "/sys/class/tpmrm", false, false, efi_has_tpm2 }, +#endif }; _cleanup_(kmod_unrefp) struct kmod_ctx *ctx = NULL; unsigned i; diff --git a/src/core/load-fragment-gperf.gperf.in b/src/core/load-fragment-gperf.gperf.in index 7675b7bb2e..bba6666a52 100644 --- a/src/core/load-fragment-gperf.gperf.in +++ b/src/core/load-fragment-gperf.gperf.in @@ -205,6 +205,7 @@ {{type}}.MemoryHigh, config_parse_memory_limit, 0, offsetof({{type}}, cgroup_context) {{type}}.MemoryMax, config_parse_memory_limit, 0, offsetof({{type}}, cgroup_context) {{type}}.MemorySwapMax, config_parse_memory_limit, 0, offsetof({{type}}, cgroup_context) +{{type}}.MemoryZSwapMax, config_parse_memory_limit, 0, offsetof({{type}}, cgroup_context) {{type}}.MemoryLimit, config_parse_memory_limit, 0, offsetof({{type}}, cgroup_context) {{type}}.DeviceAllow, config_parse_device_allow, 0, offsetof({{type}}, cgroup_context) {{type}}.DevicePolicy, config_parse_device_policy, 0, offsetof({{type}}, cgroup_context.device_policy) diff --git a/src/core/load-fragment.c b/src/core/load-fragment.c index 49d3c03591..734a5941cc 100644 --- a/src/core/load-fragment.c +++ b/src/core/load-fragment.c @@ -3826,7 +3826,7 @@ int config_parse_memory_limit( bytes = physical_memory_scale(r, 10000U); if (bytes >= UINT64_MAX || - (bytes <= 0 && !STR_IN_SET(lvalue, "MemorySwapMax", "MemoryLow", "MemoryMin", "DefaultMemoryLow", "DefaultMemoryMin"))) { + (bytes <= 0 && !STR_IN_SET(lvalue, "MemorySwapMax", "MemoryZSwapMax", "MemoryLow", "MemoryMin", "DefaultMemoryLow", "DefaultMemoryMin"))) { log_syntax(unit, LOG_WARNING, filename, line, 0, "Memory limit '%s' out of range, ignoring.", rvalue); return 0; } @@ -3850,6 +3850,8 @@ int config_parse_memory_limit( c->memory_max = bytes; else if (streq(lvalue, "MemorySwapMax")) c->memory_swap_max = bytes; + else if (streq(lvalue, "MemoryZSwapMax")) + c->memory_zswap_max = bytes; else if (streq(lvalue, "MemoryLimit")) { log_syntax(unit, LOG_WARNING, filename, line, 0, "Unit uses 
MemoryLimit=; please use MemoryMax= instead. Support for MemoryLimit= will be removed soon."); diff --git a/src/core/main.c b/src/core/main.c index cc725e6c42..119c518664 100644 --- a/src/core/main.c +++ b/src/core/main.c @@ -2831,8 +2831,8 @@ int main(int argc, char *argv[]) { goto finish; } - /* The efivarfs is now mounted, let's read the random seed off it */ - (void) efi_take_random_seed(); + /* The efivarfs is now mounted, let's lock down the system token. */ + lock_down_efi_variables(); /* Cache command-line options passed from EFI variables */ if (!skip_setup) diff --git a/src/core/manager.c b/src/core/manager.c index ffaa9fa595..598604d694 100644 --- a/src/core/manager.c +++ b/src/core/manager.c @@ -890,7 +890,7 @@ int manager_new(LookupScope scope, ManagerTestRunFlags test_run_flags, Manager * } /* Reboot immediately if the user hits C-A-D more often than 7x per 2s */ - m->ctrl_alt_del_ratelimit = (RateLimit) { .interval = 2 * USEC_PER_SEC, .burst = 7 }; + m->ctrl_alt_del_ratelimit = (const RateLimit) { .interval = 2 * USEC_PER_SEC, .burst = 7 }; r = manager_default_environment(m); if (r < 0) diff --git a/src/core/namespace.c b/src/core/namespace.c index 7752e48fb0..c0d0cc9715 100644 --- a/src/core/namespace.c +++ b/src/core/namespace.c @@ -2486,7 +2486,7 @@ int setup_namespace( goto finish; /* MS_MOVE does not work on MS_SHARED so the remount MS_SHARED will be done later */ - r = mount_move_root(root); + r = mount_pivot_root(root); if (r == -EINVAL && root_directory) { /* If we are using root_directory and we don't have privileges (ie: user manager in a user * namespace) and the root_directory is already a mount point in the parent namespace, @@ -2496,7 +2496,7 @@ int setup_namespace( r = mount_nofollow_verbose(LOG_DEBUG, root, root, NULL, MS_BIND|MS_REC, NULL); if (r < 0) goto finish; - r = mount_move_root(root); + r = mount_pivot_root(root); } if (r < 0) { log_debug_errno(r, "Failed to mount root with MS_MOVE: %m"); diff --git a/src/core/timer.c 
b/src/core/timer.c index 8bd430b931..b6810c8599 100644 --- a/src/core/timer.c +++ b/src/core/timer.c @@ -948,11 +948,11 @@ static int activation_details_timer_append_env(ActivationDetails *details, char if (!dual_timestamp_is_set(&t->last_trigger)) return 0; - r = strv_extendf(strv, "TRIGGER_TIMER_REALTIME_USEC=%" USEC_FMT, t->last_trigger.realtime); + r = strv_extendf(strv, "TRIGGER_TIMER_REALTIME_USEC=" USEC_FMT, t->last_trigger.realtime); if (r < 0) return r; - r = strv_extendf(strv, "TRIGGER_TIMER_MONOTONIC_USEC=%" USEC_FMT, t->last_trigger.monotonic); + r = strv_extendf(strv, "TRIGGER_TIMER_MONOTONIC_USEC=" USEC_FMT, t->last_trigger.monotonic); if (r < 0) return r; @@ -974,7 +974,7 @@ static int activation_details_timer_append_pair(ActivationDetails *details, char if (r < 0) return r; - r = strv_extendf(strv, "%" USEC_FMT, t->last_trigger.realtime); + r = strv_extendf(strv, USEC_FMT, t->last_trigger.realtime); if (r < 0) return r; @@ -982,7 +982,7 @@ static int activation_details_timer_append_pair(ActivationDetails *details, char if (r < 0) return r; - r = strv_extendf(strv, "%" USEC_FMT, t->last_trigger.monotonic); + r = strv_extendf(strv, USEC_FMT, t->last_trigger.monotonic); if (r < 0) return r; diff --git a/src/core/unit.c b/src/core/unit.c index d08c73613b..29b07a6e7a 100644 --- a/src/core/unit.c +++ b/src/core/unit.c @@ -22,6 +22,7 @@ #include "dbus-unit.h" #include "dbus.h" #include "dropin.h" +#include "env-util.h" #include "escape.h" #include "execute.h" #include "fd-util.h" @@ -127,7 +128,7 @@ Unit* unit_new(Manager *m, size_t size) { u->last_section_private = -1; u->start_ratelimit = (RateLimit) { m->default_start_limit_interval, m->default_start_limit_burst }; - u->auto_start_stop_ratelimit = (RateLimit) { 10 * USEC_PER_SEC, 16 }; + u->auto_start_stop_ratelimit = (const RateLimit) { 10 * USEC_PER_SEC, 16 }; return u; } @@ -4781,11 +4782,28 @@ int unit_setup_dynamic_creds(Unit *u) { } bool unit_type_supported(UnitType t) { + static int8_t 
cache[_UNIT_TYPE_MAX] = {}; /* -1: disabled, 1: enabled: 0: don't know */ + int r; + if (_unlikely_(t < 0)) return false; if (_unlikely_(t >= _UNIT_TYPE_MAX)) return false; + if (cache[t] == 0) { + char *e; + + e = strjoina("SYSTEMD_SUPPORT_", unit_type_to_string(t)); + + r = getenv_bool(ascii_strupper(e)); + if (r < 0 && r != -ENXIO) + log_debug_errno(r, "Failed to parse $%s, ignoring: %m", e); + + cache[t] = r == 0 ? -1 : 1; + } + if (cache[t] < 0) + return false; + if (!unit_vtable[t]->supported) return true; diff --git a/src/fundamental/efivars-fundamental.h b/src/fundamental/efivars-fundamental.h index fe34e6c714..cf785f8b7d 100644 --- a/src/fundamental/efivars-fundamental.h +++ b/src/fundamental/efivars-fundamental.h @@ -22,6 +22,7 @@ #define EFI_STUB_FEATURE_PICK_UP_CREDENTIALS (UINT64_C(1) << 1) #define EFI_STUB_FEATURE_PICK_UP_SYSEXTS (UINT64_C(1) << 2) #define EFI_STUB_FEATURE_THREE_PCRS (UINT64_C(1) << 3) +#define EFI_STUB_FEATURE_RANDOM_SEED (UINT64_C(1) << 4) typedef enum SecureBootMode { SECURE_BOOT_UNSUPPORTED, diff --git a/src/fuzz/fuzz-compress.c b/src/fuzz/fuzz-compress.c index 712ab3ffa9..10956cc548 100644 --- a/src/fuzz/fuzz-compress.c +++ b/src/fuzz/fuzz-compress.c @@ -55,7 +55,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) { size_t sw_alloc = MAX(h->sw_alloc, 1u); buf2 = malloc(sw_alloc); - if (!buf) { + if (!buf2) { log_oom(); return 0; } diff --git a/src/home/homework-cifs.c b/src/home/homework-cifs.c index e79def3dae..8ad1da6303 100644 --- a/src/home/homework-cifs.c +++ b/src/home/homework-cifs.c @@ -64,7 +64,7 @@ int home_setup_cifs( pid_t mount_pid; int exit_status; - r = fopen_temporary(NULL, &f, &p); + r = fopen_temporary_child(NULL, &f, &p); if (r < 0) return log_error_errno(r, "Failed to create temporary credentials file: %m"); diff --git a/src/home/homework-luks.c b/src/home/homework-luks.c index 97fb5a1051..48e8cd1808 100644 --- a/src/home/homework-luks.c +++ b/src/home/homework-luks.c @@ -1837,7 +1837,7 @@ static 
int make_partition_table( _cleanup_(fdisk_unref_partitionp) struct fdisk_partition *p = NULL, *q = NULL; _cleanup_(fdisk_unref_parttypep) struct fdisk_parttype *t = NULL; _cleanup_(fdisk_unref_contextp) struct fdisk_context *c = NULL; - _cleanup_free_ char *path = NULL, *disk_uuid_as_string = NULL; + _cleanup_free_ char *disk_uuid_as_string = NULL; uint64_t offset, size, first_lba, start, last_lba, end; sd_id128_t disk_uuid; int r; @@ -1855,14 +1855,7 @@ static int make_partition_table( if (r < 0) return log_error_errno(r, "Failed to initialize partition type: %m"); - c = fdisk_new_context(); - if (!c) - return log_oom(); - - if (asprintf(&path, "/proc/self/fd/%i", fd) < 0) - return log_oom(); - - r = fdisk_assign_device(c, path, 0); + r = fdisk_new_context_fd(fd, /* read_only= */ false, &c); if (r < 0) return log_error_errno(r, "Failed to open device: %m"); @@ -2017,9 +2010,12 @@ static int wait_for_devlink(const char *path) { if (w >= until) return log_error_errno(SYNTHETIC_ERRNO(ETIMEDOUT), "Device link %s still hasn't shown up, giving up.", path); - r = fd_wait_for_event(inotify_fd, POLLIN, usec_sub_unsigned(until, w)); - if (r < 0) + r = fd_wait_for_event(inotify_fd, POLLIN, until - w); + if (r < 0) { + if (ERRNO_IS_TRANSIENT(r)) + continue; return log_error_errno(r, "Failed to watch inotify: %m"); + } (void) flush_fd(inotify_fd); } @@ -2642,7 +2638,7 @@ static int prepare_resize_partition( _cleanup_(fdisk_unref_contextp) struct fdisk_context *c = NULL; _cleanup_(fdisk_unref_tablep) struct fdisk_table *t = NULL; - _cleanup_free_ char *path = NULL, *disk_uuid_as_string = NULL; + _cleanup_free_ char *disk_uuid_as_string = NULL; struct fdisk_partition *found = NULL; sd_id128_t disk_uuid; size_t n_partitions; @@ -2665,14 +2661,7 @@ static int prepare_resize_partition( return 0; } - c = fdisk_new_context(); - if (!c) - return log_oom(); - - if (asprintf(&path, "/proc/self/fd/%i", fd) < 0) - return log_oom(); - - r = fdisk_assign_device(c, path, 0); + r = 
fdisk_new_context_fd(fd, /* read_only= */ false, &c); if (r < 0) return log_error_errno(r, "Failed to open device: %m"); @@ -2756,7 +2745,6 @@ static int apply_resize_partition( _cleanup_(fdisk_unref_contextp) struct fdisk_context *c = NULL; _cleanup_free_ void *two_zero_lbas = NULL; - _cleanup_free_ char *path = NULL; ssize_t n; int r; @@ -2788,14 +2776,7 @@ static int apply_resize_partition( if (n != 1024) return log_error_errno(SYNTHETIC_ERRNO(EIO), "Short write while wiping partition table."); - c = fdisk_new_context(); - if (!c) - return log_oom(); - - if (asprintf(&path, "/proc/self/fd/%i", fd) < 0) - return log_oom(); - - r = fdisk_assign_device(c, path, 0); + r = fdisk_new_context_fd(fd, /* read_only= */ false, &c); if (r < 0) return log_error_errno(r, "Failed to open device: %m"); diff --git a/src/id128/id128.c b/src/id128/id128.c index af88e315bb..53a24348d5 100644 --- a/src/id128/id128.c +++ b/src/id128/id128.c @@ -123,10 +123,13 @@ static int verb_show(int argc, char **argv, void *userdata) { if (have_uuid) id = gpt_partition_type_uuid_to_string(uuid) ?: "XYZ"; else { - r = gpt_partition_type_uuid_from_string(*p, &uuid); + GptPartitionType type; + + r = gpt_partition_type_from_string(*p, &type); if (r < 0) return log_error_errno(r, "Unknown identifier \"%s\".", *p); + uuid = type.uuid; id = *p; } diff --git a/src/import/import-fs.c b/src/import/import-fs.c index 4e7250c02e..ca5d33c008 100644 --- a/src/import/import-fs.c +++ b/src/import/import-fs.c @@ -188,7 +188,7 @@ static int import_fs(int argc, char *argv[], void *userdata) { (void) mkdir_parents_label(dest, 0700); - progress.limit = (RateLimit) { 200*USEC_PER_MSEC, 1 }; + progress.limit = (const RateLimit) { 200*USEC_PER_MSEC, 1 }; { BLOCK_SIGNALS(SIGINT, SIGTERM); diff --git a/src/journal/journald-audit.c b/src/journal/journald-audit.c index 3e87a93a9e..d301d28966 100644 --- a/src/journal/journald-audit.c +++ b/src/journal/journald-audit.c @@ -441,7 +441,7 @@ void server_process_audit_message( } 
if (!NLMSG_OK(nl, buffer_size)) { - log_error("Audit netlink message truncated."); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, "Audit netlink message truncated."); return; } diff --git a/src/journal/journald-context.c b/src/journal/journald-context.c index 27608ff089..6d58422ddd 100644 --- a/src/journal/journald-context.c +++ b/src/journal/journald-context.c @@ -771,7 +771,8 @@ void client_context_acquire_default(Server *s) { r = client_context_acquire(s, ucred.pid, &ucred, NULL, 0, NULL, &s->my_context); if (r < 0) - log_warning_errno(r, "Failed to acquire our own context, ignoring: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to acquire our own context, ignoring: %m"); } if (!s->namespace && !s->pid1_context) { @@ -780,7 +781,8 @@ void client_context_acquire_default(Server *s) { r = client_context_acquire(s, 1, NULL, NULL, 0, NULL, &s->pid1_context); if (r < 0) - log_warning_errno(r, "Failed to acquire PID1's context, ignoring: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to acquire PID1's context, ignoring: %m"); } } diff --git a/src/journal/journald-kmsg.c b/src/journal/journald-kmsg.c index 8ae7a23d56..10faf2dd06 100644 --- a/src/journal/journald-kmsg.c +++ b/src/journal/journald-kmsg.c @@ -320,7 +320,7 @@ static int server_read_dev_kmsg(Server *s) { if (l < 0) { /* Old kernels who don't allow reading from /dev/kmsg * return EINVAL when we try. So handle this cleanly, - * but don' try to ever read from it again. */ + * but don't try to ever read from it again. 
*/ if (errno == EINVAL) { s->dev_kmsg_event_source = sd_event_source_unref(s->dev_kmsg_event_source); return 0; @@ -329,7 +329,7 @@ static int server_read_dev_kmsg(Server *s) { if (ERRNO_IS_TRANSIENT(errno) || errno == EPIPE) return 0; - return log_error_errno(errno, "Failed to read from /dev/kmsg: %m"); + return log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, "Failed to read from /dev/kmsg: %m"); } dev_kmsg_record(s, buffer, l); @@ -368,7 +368,8 @@ static int dispatch_dev_kmsg(sd_event_source *es, int fd, uint32_t revents, void assert(fd == s->dev_kmsg_fd); if (revents & EPOLLERR) - log_warning("/dev/kmsg buffer overrun, some messages lost."); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "/dev/kmsg buffer overrun, some messages lost."); if (!(revents & EPOLLIN)) log_error("Got invalid event from epoll for /dev/kmsg: %"PRIx32, revents); diff --git a/src/journal/journald-native.c b/src/journal/journald-native.c index 032578822d..21e20db2d4 100644 --- a/src/journal/journald-native.c +++ b/src/journal/journald-native.c @@ -309,7 +309,9 @@ void server_process_native_message( if (ucred && pid_is_valid(ucred->pid)) { r = client_context_get(s, ucred->pid, ucred, label, label_len, NULL, &context); if (r < 0) - log_warning_errno(r, "Failed to retrieve credentials for PID " PID_FMT ", ignoring: %m", ucred->pid); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to retrieve credentials for PID " PID_FMT ", ignoring: %m", + ucred->pid); } do { @@ -348,29 +350,34 @@ void server_process_native_file( r = fd_get_path(fd, &k); if (r < 0) { - log_error_errno(r, "readlink(/proc/self/fd/%i) failed: %m", fd); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "readlink(/proc/self/fd/%i) failed: %m", fd); return; } e = PATH_STARTSWITH_SET(k, "/dev/shm/", "/tmp/", "/var/tmp/"); if (!e) { - log_error("Received file outside of allowed directories. Refusing."); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, + "Received file outside of allowed directories. 
Refusing."); return; } if (!filename_is_valid(e)) { - log_error("Received file in subdirectory of allowed directories. Refusing."); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, + "Received file in subdirectory of allowed directories. Refusing."); return; } } if (fstat(fd, &st) < 0) { - log_error_errno(errno, "Failed to stat passed file, ignoring: %m"); + log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to stat passed file, ignoring: %m"); return; } if (!S_ISREG(st.st_mode)) { - log_error("File passed is not regular. Ignoring."); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, + "File passed is not regular. Ignoring."); return; } @@ -380,7 +387,9 @@ void server_process_native_file( /* When !sealed, set a lower memory limit. We have to read the file, * effectively doubling memory use. */ if (st.st_size > ENTRY_SIZE_MAX / (sealed ? 1 : 2)) { - log_error("File passed too large (%"PRIu64" bytes). Ignoring.", (uint64_t) st.st_size); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, + "File passed too large (%"PRIu64" bytes). 
Ignoring.", + (uint64_t) st.st_size); return; } @@ -393,7 +402,8 @@ void server_process_native_file( ps = PAGE_ALIGN(st.st_size); p = mmap(NULL, ps, PROT_READ, MAP_PRIVATE, fd, 0); if (p == MAP_FAILED) { - log_error_errno(errno, "Failed to map memfd, ignoring: %m"); + log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to map memfd, ignoring: %m"); return; } @@ -405,7 +415,8 @@ void server_process_native_file( ssize_t n; if (fstatvfs(fd, &vfs) < 0) { - log_error_errno(errno, "Failed to stat file system of passed file, not processing it: %m"); + log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to stat file system of passed file, not processing it: %m"); return; } @@ -415,7 +426,8 @@ void server_process_native_file( * https://github.com/systemd/systemd/issues/1822 */ if (vfs.f_flag & ST_MANDLOCK) { - log_error("Received file descriptor from file system with mandatory locking enabled, not processing it."); + log_ratelimit_error(JOURNALD_LOG_RATELIMIT, + "Received file descriptor from file system with mandatory locking enabled, not processing it."); return; } @@ -428,7 +440,8 @@ void server_process_native_file( * and so is SMB. 
*/ r = fd_nonblock(fd, true); if (r < 0) { - log_error_errno(r, "Failed to make fd non-blocking, not processing it: %m"); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to make fd non-blocking, not processing it: %m"); return; } @@ -444,7 +457,8 @@ void server_process_native_file( n = pread(fd, p, st.st_size, 0); if (n < 0) - log_error_errno(errno, "Failed to read file, ignoring: %m"); + log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to read file, ignoring: %m"); else if (n > 0) server_process_native_message(s, p, n, ucred, tv, label, label_len); } diff --git a/src/journal/journald-server.c b/src/journal/journald-server.c index c02d73bdc2..cb94a037d5 100644 --- a/src/journal/journald-server.c +++ b/src/journal/journald-server.c @@ -83,6 +83,8 @@ #define IDLE_TIMEOUT_USEC (30*USEC_PER_SEC) +#define FAILED_TO_WRITE_ENTRY_RATELIMIT ((const RateLimit) { .interval = 1 * USEC_PER_SEC, .burst = 1 }) + static int determine_path_usage( Server *s, const char *path, @@ -99,11 +101,12 @@ static int determine_path_usage( d = opendir(path); if (!d) - return log_full_errno(errno == ENOENT ? LOG_DEBUG : LOG_ERR, - errno, "Failed to open %s: %m", path); + return log_ratelimit_full_errno(errno == ENOENT ? 
LOG_DEBUG : LOG_ERR, + errno, JOURNALD_LOG_RATELIMIT, "Failed to open %s: %m", path); if (fstatvfs(dirfd(d), &ss) < 0) - return log_error_errno(errno, "Failed to fstatvfs(%s): %m", path); + return log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to fstatvfs(%s): %m", path); *ret_free = ss.f_bsize * ss.f_bavail; *ret_used = 0; @@ -253,7 +256,8 @@ static void server_add_acls(ManagedJournalFile *f, uid_t uid) { r = fd_add_uid_acl_permission(f->file->fd, uid, ACL_READ); if (r < 0) - log_warning_errno(r, "Failed to set ACL on %s, ignoring: %m", f->file->path); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to set ACL on %s, ignoring: %m", f->file->path); #endif } @@ -353,7 +357,8 @@ static int system_journal_open(Server *s, bool flush_requested, bool relinquish_ patch_min_use(&s->system_storage); } else { if (!IN_SET(r, -ENOENT, -EROFS)) - log_warning_errno(r, "Failed to open system journal: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to open system journal: %m"); r = 0; } @@ -382,7 +387,8 @@ static int system_journal_open(Server *s, bool flush_requested, bool relinquish_ r = open_journal(s, false, fn, O_RDWR, false, &s->runtime_storage.metrics, &s->runtime_journal); if (r < 0) { if (r != -ENOENT) - log_warning_errno(r, "Failed to open runtime journal: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to open runtime journal: %m"); r = 0; } @@ -396,7 +402,8 @@ static int system_journal_open(Server *s, bool flush_requested, bool relinquish_ r = open_journal(s, true, fn, O_RDWR|O_CREAT, false, &s->runtime_storage.metrics, &s->runtime_journal); if (r < 0) - return log_error_errno(r, "Failed to open runtime journal: %m"); + return log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to open runtime journal: %m"); } if (s->runtime_journal) { @@ -493,9 +500,11 @@ static int do_rotate( r = managed_journal_file_rotate(f, s->mmap, file_flags, s->compress.threshold_bytes, 
s->deferred_closes); if (r < 0) { if (*f) - return log_error_errno(r, "Failed to rotate %s: %m", (*f)->file->path); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to rotate %s: %m", (*f)->file->path); else - return log_error_errno(r, "Failed to create new %s journal: %m", name); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to create new %s journal: %m", name); } server_add_acls(*f, uid); @@ -545,7 +554,8 @@ static int vacuum_offline_user_journals(Server *s) { if (errno == ENOENT) return 0; - return log_error_errno(errno, "Failed to open %s: %m", s->system_storage.path); + return log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to open %s: %m", s->system_storage.path); } for (;;) { @@ -560,7 +570,9 @@ static int vacuum_offline_user_journals(Server *s) { de = readdir_no_dot(d); if (!de) { if (errno != 0) - log_warning_errno(errno, "Failed to enumerate %s, ignoring: %m", s->system_storage.path); + log_ratelimit_warning_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to enumerate %s, ignoring: %m", + s->system_storage.path); break; } @@ -592,8 +604,9 @@ static int vacuum_offline_user_journals(Server *s) { fd = openat(dirfd(d), de->d_name, O_RDWR|O_CLOEXEC|O_NOCTTY|O_NOFOLLOW|O_NONBLOCK); if (fd < 0) { - log_full_errno(IN_SET(errno, ELOOP, ENOENT) ? LOG_DEBUG : LOG_WARNING, errno, - "Failed to open journal file '%s' for rotation: %m", full); + log_ratelimit_full_errno(IN_SET(errno, ELOOP, ENOENT) ? 
LOG_DEBUG : LOG_WARNING, + errno, JOURNALD_LOG_RATELIMIT, + "Failed to open journal file '%s' for rotation: %m", full); continue; } @@ -615,11 +628,15 @@ static int vacuum_offline_user_journals(Server *s) { NULL, &f); if (r < 0) { - log_warning_errno(r, "Failed to read journal file %s for rotation, trying to move it out of the way: %m", full); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to read journal file %s for rotation, trying to move it out of the way: %m", + full); r = journal_file_dispose(dirfd(d), de->d_name); if (r < 0) - log_warning_errno(r, "Failed to move %s out of the way, ignoring: %m", full); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to move %s out of the way, ignoring: %m", + full); else log_debug("Successfully moved %s out of the way.", full); @@ -675,19 +692,22 @@ void server_sync(Server *s) { if (s->system_journal) { r = managed_journal_file_set_offline(s->system_journal, false); if (r < 0) - log_warning_errno(r, "Failed to sync system journal, ignoring: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to sync system journal, ignoring: %m"); } ORDERED_HASHMAP_FOREACH(f, s->user_journals) { r = managed_journal_file_set_offline(f, false); if (r < 0) - log_warning_errno(r, "Failed to sync user journal, ignoring: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to sync user journal, ignoring: %m"); } if (s->sync_event_source) { r = sd_event_source_set_enabled(s->sync_event_source, SD_EVENT_OFF); if (r < 0) - log_error_errno(r, "Failed to disable sync timer source: %m"); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to disable sync timer source: %m"); } s->sync_scheduled = false; @@ -709,7 +729,8 @@ static void do_vacuum(Server *s, JournalStorage *storage, bool verbose) { storage->metrics.n_max_files, s->max_retention_usec, &s->oldest_file_usec, verbose); if (r < 0 && r != -ENOENT) - log_warning_errno(r, "Failed to vacuum %s, ignoring: 
%m", storage->path); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to vacuum %s, ignoring: %m", storage->path); cache_space_invalidate(&storage->space); } @@ -781,37 +802,39 @@ static bool shall_try_append_again(JournalFile *f, int r) { return true; case -EIO: /* I/O error of some kind (mmap) */ - log_warning("%s: IO error, rotating.", f->path); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, "%s: IO error, rotating.", f->path); return true; case -EHOSTDOWN: /* Other machine */ - log_info("%s: Journal file from other machine, rotating.", f->path); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, "%s: Journal file from other machine, rotating.", f->path); return true; case -EBUSY: /* Unclean shutdown */ - log_info("%s: Unclean shutdown, rotating.", f->path); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, "%s: Unclean shutdown, rotating.", f->path); return true; case -EPROTONOSUPPORT: /* Unsupported feature */ - log_info("%s: Unsupported feature, rotating.", f->path); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, "%s: Unsupported feature, rotating.", f->path); return true; case -EBADMSG: /* Corrupted */ case -ENODATA: /* Truncated */ case -ESHUTDOWN: /* Already archived */ - log_warning("%s: Journal file corrupted, rotating.", f->path); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, "%s: Journal file corrupted, rotating.", f->path); return true; case -EIDRM: /* Journal file has been deleted */ - log_warning("%s: Journal file has been deleted, rotating.", f->path); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, "%s: Journal file has been deleted, rotating.", f->path); return true; case -ETXTBSY: /* Journal file is from the future */ - log_warning("%s: Journal file is from the future, rotating.", f->path); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, "%s: Journal file is from the future, rotating.", f->path); return true; case -EAFNOSUPPORT: - log_warning("%s: underlying file system does not support memory mapping or another required file system 
feature.", f->path); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "%s: underlying file system does not support memory mapping or another required file system feature.", + f->path); return false; default: @@ -841,7 +864,7 @@ static void write_to_journal(Server *s, uid_t uid, struct iovec *iovec, size_t n * to ensure that the entries in the journal files are strictly ordered by time, in order to ensure * bisection works correctly. */ - log_info("Time jumped backwards, rotating."); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, "Time jumped backwards, rotating."); rotate = true; } else { @@ -850,7 +873,9 @@ static void write_to_journal(Server *s, uid_t uid, struct iovec *iovec, size_t n return; if (journal_file_rotate_suggested(f->file, s->max_file_usec, LOG_INFO)) { - log_info("%s: Journal header limits reached or header out-of-date, rotating.", f->file->path); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, + "%s: Journal header limits reached or header out-of-date, rotating.", + f->file->path); rotate = true; } } @@ -874,14 +899,18 @@ static void write_to_journal(Server *s, uid_t uid, struct iovec *iovec, size_t n } if (vacuumed || !shall_try_append_again(f->file, r)) { - log_ratelimit_full_errno(LOG_ERR, r, "Failed to write entry (%zu items, %zu bytes), ignoring: %m", n, IOVEC_TOTAL_SIZE(iovec, n)); + log_ratelimit_error_errno(r, FAILED_TO_WRITE_ENTRY_RATELIMIT, + "Failed to write entry (%zu items, %zu bytes), ignoring: %m", + n, IOVEC_TOTAL_SIZE(iovec, n)); return; } if (r == -E2BIG) log_debug("Journal file %s is full, rotating to a new file", f->file->path); else - log_ratelimit_full_errno(LOG_INFO, r, "Failed to write entry to %s (%zu items, %zu bytes), rotating before retrying: %m", f->file->path, n, IOVEC_TOTAL_SIZE(iovec, n)); + log_ratelimit_info_errno(r, FAILED_TO_WRITE_ENTRY_RATELIMIT, + "Failed to write entry to %s (%zu items, %zu bytes), rotating before retrying: %m", + f->file->path, n, IOVEC_TOTAL_SIZE(iovec, n)); server_rotate(s); server_vacuum(s, 
false); @@ -890,10 +919,12 @@ static void write_to_journal(Server *s, uid_t uid, struct iovec *iovec, size_t n if (!f) return; - log_debug("Retrying write."); + log_debug_errno(r, "Retrying write."); r = journal_file_append_entry(f->file, &ts, NULL, iovec, n, &s->seqnum, NULL, NULL); if (r < 0) - log_ratelimit_full_errno(LOG_ERR, r, "Failed to write entry to %s (%zu items, %zu bytes) despite vacuuming, ignoring: %m", f->file->path, n, IOVEC_TOTAL_SIZE(iovec, n)); + log_ratelimit_error_errno(r, FAILED_TO_WRITE_ENTRY_RATELIMIT, + "Failed to write entry to %s (%zu items, %zu bytes) despite vacuuming, ignoring: %m", + f->file->path, n, IOVEC_TOTAL_SIZE(iovec, n)); else server_schedule_sync(s, priority); } @@ -1181,7 +1212,8 @@ int server_flush_to_var(Server *s, bool require_flag_file) { r = sd_journal_open(&j, SD_JOURNAL_RUNTIME_ONLY); if (r < 0) - return log_error_errno(r, "Failed to read runtime journal: %m"); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to read runtime journal: %m"); sd_journal_set_data_threshold(j, 0); @@ -1196,7 +1228,7 @@ int server_flush_to_var(Server *s, bool require_flag_file) { r = journal_file_move_to_object(f, OBJECT_ENTRY, f->current_offset, &o); if (r < 0) { - log_error_errno(r, "Can't read entry: %m"); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Can't read entry: %m"); goto finish; } @@ -1205,17 +1237,18 @@ int server_flush_to_var(Server *s, bool require_flag_file) { continue; if (!shall_try_append_again(s->system_journal->file, r)) { - log_error_errno(r, "Can't write entry: %m"); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Can't write entry: %m"); goto finish; } - log_info("Rotating system journal."); + log_ratelimit_info(JOURNALD_LOG_RATELIMIT, "Rotating system journal."); server_rotate(s); server_vacuum(s, false); if (!s->system_journal) { - log_notice("Didn't flush runtime journal since rotation of system journal wasn't successful."); + log_ratelimit_notice(JOURNALD_LOG_RATELIMIT, + 
"Didn't flush runtime journal since rotation of system journal wasn't successful."); r = -EIO; goto finish; } @@ -1223,7 +1256,7 @@ int server_flush_to_var(Server *s, bool require_flag_file) { log_debug("Retrying write."); r = journal_file_copy_entry(f, s->system_journal->file, o, f->current_offset); if (r < 0) { - log_error_errno(r, "Can't write entry: %m"); + log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Can't write entry: %m"); goto finish; } } @@ -1251,7 +1284,8 @@ finish: fn = strjoina(s->runtime_directory, "/flushed"); k = touch(fn); if (k < 0) - log_warning_errno(k, "Failed to touch %s, ignoring: %m", fn); + log_ratelimit_warning_errno(k, JOURNALD_LOG_RATELIMIT, + "Failed to touch %s, ignoring: %m", fn); server_refresh_idle_timer(s); return r; @@ -1280,7 +1314,8 @@ static int server_relinquish_var(Server *s) { fn = strjoina(s->runtime_directory, "/flushed"); if (unlink(fn) < 0 && errno != ENOENT) - log_warning_errno(errno, "Failed to unlink %s, ignoring: %m", fn); + log_ratelimit_warning_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to unlink %s, ignoring: %m", fn); server_refresh_idle_timer(s); return 0; @@ -1352,10 +1387,11 @@ int server_process_datagram( if (ERRNO_IS_TRANSIENT(n)) return 0; if (n == -EXFULL) { - log_warning("Got message with truncated control data (too many fds sent?), ignoring."); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "Got message with truncated control data (too many fds sent?), ignoring."); return 0; } - return log_error_errno(n, "recvmsg() failed: %m"); + return log_ratelimit_error_errno(n, JOURNALD_LOG_RATELIMIT, "recvmsg() failed: %m"); } CMSG_FOREACH(cmsg, &msghdr) @@ -1388,7 +1424,8 @@ int server_process_datagram( if (n > 0 && n_fds == 0) server_process_syslog_message(s, s->buffer, n, ucred, tv, label, label_len); else if (n_fds > 0) - log_warning("Got file descriptors via syslog socket. Ignoring."); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "Got file descriptors via syslog socket. 
Ignoring."); } else if (fd == s->native_fd) { if (n > 0 && n_fds == 0) @@ -1396,7 +1433,8 @@ int server_process_datagram( else if (n == 0 && n_fds == 1) server_process_native_file(s, fds[0], ucred, tv, label, label_len); else if (n_fds > 0) - log_warning("Got too many file descriptors via native socket. Ignoring."); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "Got too many file descriptors via native socket. Ignoring."); } else { assert(fd == s->audit_fd); @@ -1404,7 +1442,8 @@ int server_process_datagram( if (n > 0 && n_fds == 0) server_process_audit_message(s, s->buffer, n, ucred, &sa, msghdr.msg_namelen); else if (n_fds > 0) - log_warning("Got file descriptors via audit socket. Ignoring."); + log_ratelimit_warning(JOURNALD_LOG_RATELIMIT, + "Got file descriptors via audit socket. Ignoring."); } close_many(fds, n_fds); @@ -1457,7 +1496,8 @@ static void server_full_rotate(Server *s) { fn = strjoina(s->runtime_directory, "/rotated"); r = write_timestamp_file_atomic(fn, now(CLOCK_MONOTONIC)); if (r < 0) - log_warning_errno(r, "Failed to write %s, ignoring: %m", fn); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to write %s, ignoring: %m", fn); } static int dispatch_sigusr2(sd_event_source *es, const struct signalfd_siginfo *si, void *userdata) { @@ -1560,7 +1600,8 @@ static void server_full_sync(Server *s) { fn = strjoina(s->runtime_directory, "/synced"); r = write_timestamp_file_atomic(fn, now(CLOCK_MONOTONIC)); if (r < 0) - log_warning_errno(r, "Failed to write %s, ignoring: %m", fn); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to write %s, ignoring: %m", fn); return; } diff --git a/src/journal/journald-server.h b/src/journal/journald-server.h index ee8f374190..fb512dcfeb 100644 --- a/src/journal/journald-server.h +++ b/src/journal/journald-server.h @@ -20,6 +20,8 @@ typedef struct Server Server; #include "time-util.h" #include "varlink.h" +#define JOURNALD_LOG_RATELIMIT ((const RateLimit) { .interval = 60 * 
USEC_PER_SEC, .burst = 3 }) + typedef enum Storage { STORAGE_AUTO, STORAGE_VOLATILE, diff --git a/src/journal/journald-stream.c b/src/journal/journald-stream.c index 8bdcd8c2ae..abfd046837 100644 --- a/src/journal/journald-stream.c +++ b/src/journal/journald-stream.c @@ -160,7 +160,8 @@ static int stdout_stream_save(StdoutStream *s) { r = fstat(s->fd, &st); if (r < 0) - return log_warning_errno(errno, "Failed to stat connected stream: %m"); + return log_ratelimit_warning_errno(errno, JOURNALD_LOG_RATELIMIT, + "Failed to stat connected stream: %m"); /* We use device and inode numbers as identifier for the stream */ r = asprintf(&s->state_file, "%s/streams/%lu:%lu", s->server->runtime_directory, (unsigned long) st.st_dev, (unsigned long) st.st_ino); @@ -231,7 +232,7 @@ static int stdout_stream_save(StdoutStream *s) { if (s->server->notify_event_source) { r = sd_event_source_set_enabled(s->server->notify_event_source, SD_EVENT_ON); if (r < 0) - log_warning_errno(r, "Failed to enable notify event source: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to enable notify event source: %m"); } } @@ -239,7 +240,8 @@ static int stdout_stream_save(StdoutStream *s) { fail: (void) unlink(s->state_file); - return log_error_errno(r, "Failed to save stream data %s: %m", s->state_file); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to save stream data %s: %m", s->state_file); } static int stdout_stream_log( @@ -266,7 +268,8 @@ static int stdout_stream_log( else if (pid_is_valid(s->ucred.pid)) { r = client_context_acquire(s->server, s->ucred.pid, &s->ucred, s->label, strlen_ptr(s->label), s->unit_id, &s->context); if (r < 0) - log_warning_errno(r, "Failed to acquire client context, ignoring: %m"); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to acquire client context, ignoring: %m"); } priority = s->priority; @@ -363,8 +366,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { /* 
line breaks by NUL, line max length or EOF are not permissible during the negotiation part of the protocol */ if (line_break != LINE_BREAK_NEWLINE && s->state != STDOUT_STREAM_RUNNING) - return log_warning_errno(SYNTHETIC_ERRNO(EINVAL), - "Control protocol line not properly terminated."); + return log_ratelimit_warning_errno(SYNTHETIC_ERRNO(EINVAL), JOURNALD_LOG_RATELIMIT, + "Control protocol line not properly terminated."); switch (s->state) { @@ -395,7 +398,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { priority = syslog_parse_priority_and_facility(p); if (priority < 0) - return log_warning_errno(priority, "Failed to parse log priority line: %m"); + return log_ratelimit_warning_errno(priority, JOURNALD_LOG_RATELIMIT, + "Failed to parse log priority line: %m"); s->priority = priority; s->state = STDOUT_STREAM_LEVEL_PREFIX; @@ -405,7 +409,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { case STDOUT_STREAM_LEVEL_PREFIX: r = parse_boolean(p); if (r < 0) - return log_warning_errno(r, "Failed to parse level prefix line: %m"); + return log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to parse level prefix line: %m"); s->level_prefix = r; s->state = STDOUT_STREAM_FORWARD_TO_SYSLOG; @@ -414,7 +419,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { case STDOUT_STREAM_FORWARD_TO_SYSLOG: r = parse_boolean(p); if (r < 0) - return log_warning_errno(r, "Failed to parse forward to syslog line: %m"); + return log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to parse forward to syslog line: %m"); s->forward_to_syslog = r; s->state = STDOUT_STREAM_FORWARD_TO_KMSG; @@ -423,7 +429,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { case STDOUT_STREAM_FORWARD_TO_KMSG: r = parse_boolean(p); if (r < 0) - return log_warning_errno(r, "Failed to parse copy to kmsg line: %m"); + return log_ratelimit_warning_errno(r, 
JOURNALD_LOG_RATELIMIT, + "Failed to parse copy to kmsg line: %m"); s->forward_to_kmsg = r; s->state = STDOUT_STREAM_FORWARD_TO_CONSOLE; @@ -432,7 +439,8 @@ static int stdout_stream_line(StdoutStream *s, char *p, LineBreak line_break) { case STDOUT_STREAM_FORWARD_TO_CONSOLE: r = parse_boolean(p); if (r < 0) - return log_warning_errno(r, "Failed to parse copy to console line."); + return log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to parse copy to console line."); s->forward_to_console = r; s->state = STDOUT_STREAM_RUNNING; @@ -589,7 +597,7 @@ static int stdout_stream_process(sd_event_source *es, int fd, uint32_t revents, if (ERRNO_IS_TRANSIENT(errno)) return 0; - log_warning_errno(errno, "Failed to read from stream: %m"); + log_ratelimit_warning_errno(errno, JOURNALD_LOG_RATELIMIT, "Failed to read from stream: %m"); goto terminate; } cmsg_close_all(&msghdr); @@ -648,7 +656,7 @@ int stdout_stream_install(Server *s, int fd, StdoutStream **ret) { r = sd_id128_randomize(&id); if (r < 0) - return log_error_errno(r, "Failed to generate stream ID: %m"); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to generate stream ID: %m"); stream = new(StdoutStream, 1); if (!stream) @@ -664,7 +672,7 @@ int stdout_stream_install(Server *s, int fd, StdoutStream **ret) { r = getpeercred(fd, &stream->ucred); if (r < 0) - return log_error_errno(r, "Failed to determine peer credentials: %m"); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to determine peer credentials: %m"); r = setsockopt_int(fd, SOL_SOCKET, SO_PASSCRED, true); if (r < 0) @@ -673,18 +681,18 @@ int stdout_stream_install(Server *s, int fd, StdoutStream **ret) { if (mac_selinux_use()) { r = getpeersec(fd, &stream->label); if (r < 0 && r != -EOPNOTSUPP) - (void) log_warning_errno(r, "Failed to determine peer security context: %m"); + (void) log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to determine peer security context: %m"); } (void) 
shutdown(fd, SHUT_WR); r = sd_event_add_io(s->event, &stream->event_source, fd, EPOLLIN, stdout_stream_process, stream); if (r < 0) - return log_error_errno(r, "Failed to add stream to event loop: %m"); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to add stream to event loop: %m"); r = sd_event_source_set_priority(stream->event_source, SD_EVENT_PRIORITY_NORMAL+5); if (r < 0) - return log_error_errno(r, "Failed to adjust stdout event source priority: %m"); + return log_ratelimit_error_errno(r, JOURNALD_LOG_RATELIMIT, "Failed to adjust stdout event source priority: %m"); stream->fd = fd; @@ -716,7 +724,7 @@ static int stdout_stream_new(sd_event_source *es, int listen_fd, uint32_t revent if (ERRNO_IS_ACCEPT_AGAIN(errno)) return 0; - return log_error_errno(errno, "Failed to accept stdout connection: %m"); + return log_ratelimit_error_errno(errno, JOURNALD_LOG_RATELIMIT, "Failed to accept stdout connection: %m"); } if (s->n_stdout_streams >= STDOUT_STREAMS_MAX) { diff --git a/src/journal/journald-syslog.c b/src/journal/journald-syslog.c index ce02378675..6394adfdfd 100644 --- a/src/journal/journald-syslog.c +++ b/src/journal/journald-syslog.c @@ -334,7 +334,9 @@ void server_process_syslog_message( if (ucred && pid_is_valid(ucred->pid)) { r = client_context_get(s, ucred->pid, ucred, label, label_len, NULL, &context); if (r < 0) - log_warning_errno(r, "Failed to retrieve credentials for PID " PID_FMT ", ignoring: %m", ucred->pid); + log_ratelimit_warning_errno(r, JOURNALD_LOG_RATELIMIT, + "Failed to retrieve credentials for PID " PID_FMT ", ignoring: %m", + ucred->pid); } /* We are creating a copy of the message because we want to forward the original message diff --git a/src/journal/test-journal-interleaving.c b/src/journal/test-journal-interleaving.c index c5288000af..7939574c37 100644 --- a/src/journal/test-journal-interleaving.c +++ b/src/journal/test-journal-interleaving.c @@ -157,7 +157,6 @@ static void test_skip_one(void (*setup)(void)) { 
          */
         assert_ret(sd_journal_open_directory(&j, t, 0));
         assert_ret(sd_journal_seek_head(j));
-        assert_ret(sd_journal_previous(j) == 0);
         assert_ret(sd_journal_next(j));
         test_check_numbers_down(j, 4);
         sd_journal_close(j);
@@ -166,7 +165,6 @@ static void test_skip_one(void (*setup)(void)) {
          */
         assert_ret(sd_journal_open_directory(&j, t, 0));
         assert_ret(sd_journal_seek_tail(j));
-        assert_ret(sd_journal_next(j) == 0);
         assert_ret(sd_journal_previous(j));
         test_check_numbers_up(j, 4);
         sd_journal_close(j);
@@ -175,7 +173,6 @@ static void test_skip_one(void (*setup)(void)) {
          */
         assert_ret(sd_journal_open_directory(&j, t, 0));
         assert_ret(sd_journal_seek_tail(j));
-        assert_ret(sd_journal_next(j) == 0);
         assert_ret(r = sd_journal_previous_skip(j, 4));
         assert_se(r == 4);
         test_check_numbers_down(j, 4);
@@ -185,7 +182,6 @@ static void test_skip_one(void (*setup)(void)) {
          */
         assert_ret(sd_journal_open_directory(&j, t, 0));
         assert_ret(sd_journal_seek_head(j));
-        assert_ret(sd_journal_previous(j) == 0);
         assert_ret(r = sd_journal_next_skip(j, 4));
         assert_se(r == 4);
         test_check_numbers_up(j, 4);
diff --git a/src/kernel-install/90-loaderentry.install b/src/kernel-install/90-loaderentry.install.in
index 41a05534b9..4e936d95f4 100755
--- a/src/kernel-install/90-loaderentry.install
+++ b/src/kernel-install/90-loaderentry.install.in
@@ -138,6 +138,8 @@ mkdir -p "${LOADER_ENTRY%/*}" || {
 
 [ "$KERNEL_INSTALL_VERBOSE" -gt 0 ] && echo "Creating $LOADER_ENTRY"
 {
+    echo "# Boot Loader Specification type#1 entry"
+    echo "# File created by $0 (systemd {{GIT_VERSION}})"
     echo "title $PRETTY_NAME"
     echo "version $KERNEL_VERSION"
     if [ "$ENTRY_TOKEN" = "$MACHINE_ID" ]; then
diff --git a/src/kernel-install/90-uki-copy.install b/src/kernel-install/90-uki-copy.install
new file mode 100755
index 0000000000..d6e3deb723
--- /dev/null
+++ b/src/kernel-install/90-uki-copy.install
@@ -0,0 +1,97 @@
+#!/bin/sh
+# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
+# ex: ts=8 sw=4 sts=4 et filetype=sh
+# SPDX-License-Identifier: LGPL-2.1-or-later
+#
+# This file is part of systemd.
+#
+# systemd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU Lesser General Public License as published by
+# the Free Software Foundation; either version 2.1 of the License, or
+# (at your option) any later version.
+#
+# systemd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public License
+# along with systemd; If not, see <https://www.gnu.org/licenses/>.
+
+set -e
+
+COMMAND="${1:?}"
+KERNEL_VERSION="${2:?}"
+# shellcheck disable=SC2034
+ENTRY_DIR_ABS="$3"
+KERNEL_IMAGE="$4"
+
+[ "$KERNEL_INSTALL_LAYOUT" = "uki" ] || exit 0
+
+ENTRY_TOKEN="$KERNEL_INSTALL_ENTRY_TOKEN"
+BOOT_ROOT="$KERNEL_INSTALL_BOOT_ROOT"
+
+UKI_DIR="$BOOT_ROOT/EFI/Linux"
+
+case "$COMMAND" in
+    remove)
+        [ "$KERNEL_INSTALL_VERBOSE" -gt 0 ] && \
+            echo "Removing $UKI_DIR/$ENTRY_TOKEN-$KERNEL_VERSION*.efi"
+        exec rm -f \
+            "$UKI_DIR/$ENTRY_TOKEN-$KERNEL_VERSION.efi" \
+            "$UKI_DIR/$ENTRY_TOKEN-$KERNEL_VERSION+"*".efi"
+        ;;
+    add)
+        ;;
+    *)
+        exit 0
+        ;;
+esac
+
+if ! [ -d "$UKI_DIR" ]; then
+    echo "Error: entry directory '$UKI_DIR' does not exist" >&2
+    exit 1
+fi
+
+TRIES_FILE="${KERNEL_INSTALL_CONF_ROOT:-/etc/kernel}/tries"
+
+if [ -f "$TRIES_FILE" ]; then
+    read -r TRIES <"$TRIES_FILE"
+    if ! echo "$TRIES" | grep -q '^[0-9][0-9]*$'; then
+        echo "$TRIES_FILE does not contain an integer." >&2
+        exit 1
+    fi
+    UKI_FILE="$UKI_DIR/$ENTRY_TOKEN-$KERNEL_VERSION+$TRIES.efi"
+else
+    UKI_FILE="$UKI_DIR/$ENTRY_TOKEN-$KERNEL_VERSION.efi"
+fi
+
+# If there is a UKI named uki.efi on the staging area use that, if not use what
+# was passed in as $KERNEL_IMAGE but insist it has a .efi extension
+if [ -f "$KERNEL_INSTALL_STAGING_AREA/uki.efi" ]; then
+    [ "$KERNEL_INSTALL_VERBOSE" -gt 0 ] && echo "Installing $KERNEL_INSTALL_STAGING_AREA/uki.efi"
+    install -m 0644 "$KERNEL_INSTALL_STAGING_AREA/uki.efi" "$UKI_FILE" || {
+        echo "Error: could not copy '$KERNEL_INSTALL_STAGING_AREA/uki.efi' to '$UKI_FILE'." >&2
+        exit 1
+    }
+elif [ -n "$KERNEL_IMAGE" ]; then
+    [ -f "$KERNEL_IMAGE" ] || {
+        echo "Error: UKI '$KERNEL_IMAGE' not a file." >&2
+        exit 1
+    }
+    [ "$KERNEL_IMAGE" != "${KERNEL_IMAGE%*.efi}.efi" ] && {
+        echo "Error: $KERNEL_IMAGE is missing .efi suffix." >&2
+        exit 1
+    }
+    [ "$KERNEL_INSTALL_VERBOSE" -gt 0 ] && echo "Installing $KERNEL_IMAGE"
+    install -m 0644 "$KERNEL_IMAGE" "$UKI_FILE" || {
+        echo "Error: could not copy '$KERNEL_IMAGE' to '$UKI_FILE'." >&2
+        exit 1
+    }
+else
+    [ "$KERNEL_INSTALL_VERBOSE" -gt 0 ] && echo "No UKI available. Nothing to do."
+    exit 0
+fi
+chown root:root "$UKI_FILE" || :
+
+exit 0
diff --git a/src/kernel-install/kernel-install.in b/src/kernel-install/kernel-install.in
index 22eb4d2be1..fa2c0d5276 100755
--- a/src/kernel-install/kernel-install.in
+++ b/src/kernel-install/kernel-install.in
@@ -158,8 +158,9 @@ if [ -z "$MACHINE_ID" ] && [ -f /etc/machine-info ]; then
     [ -n "$MACHINE_ID" ] && \
         log_verbose "machine-id $MACHINE_ID acquired from /etc/machine-info"
 fi
-if [ -z "$MACHINE_ID" ] && [ -f /etc/machine-id ]; then
+if [ -z "$MACHINE_ID" ] && [ -s /etc/machine-id ]; then
     read -r MACHINE_ID </etc/machine-id
+    [ "$MACHINE_ID" = "uninitialized" ] && unset MACHINE_ID
     [ -n "$MACHINE_ID" ] && \
         log_verbose "machine-id $MACHINE_ID acquired from /etc/machine-id"
 fi
diff --git a/src/kernel-install/meson.build b/src/kernel-install/meson.build
index 90a0e3ae49..b0b6c27ede 100644
--- a/src/kernel-install/meson.build
+++ b/src/kernel-install/meson.build
@@ -1,11 +1,21 @@
 # SPDX-License-Identifier: LGPL-2.1-or-later
 
 kernel_install_in = files('kernel-install.in')
-loaderentry_install = files('90-loaderentry.install')
+loaderentry_install_in = files('90-loaderentry.install.in')
+
+loaderentry_install = custom_target(
+        '90-loaderentry.install',
+        input : loaderentry_install_in,
+        output : '90-loaderentry.install',
+        command : [jinja2_cmdline, '@INPUT@', '@OUTPUT@'],
+        install : want_kernel_install,
+        install_mode : 'rwxr-xr-x',
+        install_dir : kernelinstalldir)
+
+uki_copy_install = files('90-uki-copy.install')
 
 if want_kernel_install
         install_data('50-depmod.install',
-                     loaderentry_install,
                      install_mode : 'rwxr-xr-x',
                      install_dir : kernelinstalldir)
 
diff --git a/src/libsystemd/sd-bus/bus-socket.c b/src/libsystemd/sd-bus/bus-socket.c
index c94befef73..253f41c636 100644
--- a/src/libsystemd/sd-bus/bus-socket.c
+++ b/src/libsystemd/sd-bus/bus-socket.c
@@ -1308,8 +1308,11 @@ int bus_socket_process_opening(sd_bus *b) {
         assert(b->state == BUS_OPENING);
 
         events = fd_wait_for_event(b->output_fd, POLLOUT, 0);
-        if (events < 0)
+        if (events < 0) {
+                if (ERRNO_IS_TRANSIENT(events))
+                        return 0;
                 return events;
+        }
 
         if (!(events & (POLLOUT|POLLERR|POLLHUP)))
                 return 0;
diff --git a/src/libsystemd/sd-bus/sd-bus.c b/src/libsystemd/sd-bus/sd-bus.c
index ba5ef7de00..c75276f4ba 100644
--- a/src/libsystemd/sd-bus/sd-bus.c
+++ b/src/libsystemd/sd-bus/sd-bus.c
@@ -2465,8 +2465,11 @@ _public_ int sd_bus_call(
                         left = UINT64_MAX;
 
                 r = bus_poll(bus, true, left);
-                if (r < 0)
+                if (r < 0) {
+                        if (ERRNO_IS_TRANSIENT(r))
+                                continue;
                         goto fail;
+                }
                 if (r == 0) {
                         r = -ETIMEDOUT;
                         goto fail;
@@ -3321,6 +3324,7 @@ static int bus_poll(sd_bus *bus, bool need_more, uint64_t timeout_usec) {
 }
 
 _public_ int sd_bus_wait(sd_bus *bus, uint64_t timeout_usec) {
+        int r;
 
         assert_return(bus, -EINVAL);
         assert_return(bus = bus_resolve(bus), -ENOPKG);
@@ -3335,7 +3339,11 @@ _public_ int sd_bus_wait(sd_bus *bus, uint64_t timeout_usec) {
         if (bus->rqueue_size > 0)
                 return 0;
 
-        return bus_poll(bus, false, timeout_usec);
+        r = bus_poll(bus, false, timeout_usec);
+        if (r < 0 && ERRNO_IS_TRANSIENT(r))
+                return 1; /* treat EINTR as success, but let's exit, so that the caller will call back into us soon. */
+
+        return r;
 }
 
 _public_ int sd_bus_flush(sd_bus *bus) {
@@ -3377,8 +3385,12 @@ _public_ int sd_bus_flush(sd_bus *bus) {
                         return 0;
 
                 r = bus_poll(bus, false, UINT64_MAX);
-                if (r < 0)
+                if (r < 0) {
+                        if (ERRNO_IS_TRANSIENT(r))
+                                continue;
+
                         return r;
+                }
         }
 }
 
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 27908a9abc..299a6a2c8c 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -3177,7 +3177,7 @@ static int event_arm_timer(
         assert_se(d->fd >= 0);
 
         if (t == 0) {
-                /* We don' want to disarm here, just mean some time looooong ago. */
+                /* We don't want to disarm here, just mean some time looooong ago.
*/ its.it_value.tv_sec = 0; its.it_value.tv_nsec = 1; } else @@ -3969,15 +3969,18 @@ static int epoll_wait_usec( usec_t timeout) { int msec; -#if 0 + /* A wrapper that uses epoll_pwait2() if available, and falls back to epoll_wait() if not. */ + +#if HAVE_EPOLL_PWAIT2 static bool epoll_pwait2_absent = false; int r; - /* A wrapper that uses epoll_pwait2() if available, and falls back to epoll_wait() if not. - * - * FIXME: this is temporarily disabled until epoll_pwait2() becomes more widely available. - * See https://github.com/systemd/systemd/pull/18973 and - * https://github.com/systemd/systemd/issues/19052. */ + /* epoll_pwait2() was added to Linux 5.11 (2021-02-14) and to glibc in 2.35 (2022-02-03). In contrast + * to other syscalls we don't bother with our own fallback syscall wrappers on old libcs, since this + * is not that obvious to implement given the libc and kernel definitions differ in the last + * argument. Moreover, the only reason to use it is the more accurate time-outs (which is not a + * biggie), let's hence rely on glibc's definitions, and fallback to epoll_pwait() when that's + * missing. */ if (!epoll_pwait2_absent && timeout != USEC_INFINITY) { r = epoll_pwait2(fd, diff --git a/src/libsystemd/sd-journal/journal-file.c b/src/libsystemd/sd-journal/journal-file.c index 3f0dcaebf1..9084da41e3 100644 --- a/src/libsystemd/sd-journal/journal-file.c +++ b/src/libsystemd/sd-journal/journal-file.c @@ -4161,6 +4161,10 @@ int journal_file_get_cutoff_monotonic_usec(JournalFile *f, sd_id128_t boot_id, u return 1; } +/* Ideally this would be a function parameter but initializers for static fields have to be compile + * time constants so we hardcode the interval instead. 
*/ +#define LOG_RATELIMIT ((const RateLimit) { .interval = 60 * USEC_PER_SEC, .burst = 3 }) + bool journal_file_rotate_suggested(JournalFile *f, usec_t max_file_usec, int log_level) { assert(f); assert(f->header); @@ -4178,25 +4182,27 @@ bool journal_file_rotate_suggested(JournalFile *f, usec_t max_file_usec, int log if (JOURNAL_HEADER_CONTAINS(f->header, n_data)) if (le64toh(f->header->n_data) * 4ULL > (le64toh(f->header->data_hash_table_size) / sizeof(HashItem)) * 3ULL) { - log_full(log_level, - "Data hash table of %s has a fill level at %.1f (%"PRIu64" of %"PRIu64" items, %llu file size, %"PRIu64" bytes per hash table item), suggesting rotation.", - f->path, - 100.0 * (double) le64toh(f->header->n_data) / ((double) (le64toh(f->header->data_hash_table_size) / sizeof(HashItem))), - le64toh(f->header->n_data), - le64toh(f->header->data_hash_table_size) / sizeof(HashItem), - (unsigned long long) f->last_stat.st_size, - f->last_stat.st_size / le64toh(f->header->n_data)); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Data hash table of %s has a fill level at %.1f (%"PRIu64" of %"PRIu64" items, %llu file size, %"PRIu64" bytes per hash table item), suggesting rotation.", + f->path, + 100.0 * (double) le64toh(f->header->n_data) / ((double) (le64toh(f->header->data_hash_table_size) / sizeof(HashItem))), + le64toh(f->header->n_data), + le64toh(f->header->data_hash_table_size) / sizeof(HashItem), + (unsigned long long) f->last_stat.st_size, + f->last_stat.st_size / le64toh(f->header->n_data)); return true; } if (JOURNAL_HEADER_CONTAINS(f->header, n_fields)) if (le64toh(f->header->n_fields) * 4ULL > (le64toh(f->header->field_hash_table_size) / sizeof(HashItem)) * 3ULL) { - log_full(log_level, - "Field hash table of %s has a fill level at %.1f (%"PRIu64" of %"PRIu64" items), suggesting rotation.", - f->path, - 100.0 * (double) le64toh(f->header->n_fields) / ((double) (le64toh(f->header->field_hash_table_size) / sizeof(HashItem))), - le64toh(f->header->n_fields), - 
le64toh(f->header->field_hash_table_size) / sizeof(HashItem)); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Field hash table of %s has a fill level at %.1f (%"PRIu64" of %"PRIu64" items), suggesting rotation.", + f->path, + 100.0 * (double) le64toh(f->header->n_fields) / ((double) (le64toh(f->header->field_hash_table_size) / sizeof(HashItem))), + le64toh(f->header->n_fields), + le64toh(f->header->field_hash_table_size) / sizeof(HashItem)); return true; } @@ -4204,17 +4210,19 @@ bool journal_file_rotate_suggested(JournalFile *f, usec_t max_file_usec, int log * longest chain is longer than some threshold, let's suggest rotation. */ if (JOURNAL_HEADER_CONTAINS(f->header, data_hash_chain_depth) && le64toh(f->header->data_hash_chain_depth) > HASH_CHAIN_DEPTH_MAX) { - log_full(log_level, - "Data hash table of %s has deepest hash chain of length %" PRIu64 ", suggesting rotation.", - f->path, le64toh(f->header->data_hash_chain_depth)); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Data hash table of %s has deepest hash chain of length %" PRIu64 ", suggesting rotation.", + f->path, le64toh(f->header->data_hash_chain_depth)); return true; } if (JOURNAL_HEADER_CONTAINS(f->header, field_hash_chain_depth) && le64toh(f->header->field_hash_chain_depth) > HASH_CHAIN_DEPTH_MAX) { - log_full(log_level, - "Field hash table of %s has deepest hash chain of length at %" PRIu64 ", suggesting rotation.", - f->path, le64toh(f->header->field_hash_chain_depth)); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Field hash table of %s has deepest hash chain of length at %" PRIu64 ", suggesting rotation.", + f->path, le64toh(f->header->field_hash_chain_depth)); return true; } @@ -4223,9 +4231,10 @@ bool journal_file_rotate_suggested(JournalFile *f, usec_t max_file_usec, int log JOURNAL_HEADER_CONTAINS(f->header, n_fields) && le64toh(f->header->n_data) > 0 && le64toh(f->header->n_fields) == 0) { - log_full(log_level, - "Data objects of %s are not indexed by field objects, 
suggesting rotation.", - f->path); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Data objects of %s are not indexed by field objects, suggesting rotation.", + f->path); return true; } @@ -4236,9 +4245,10 @@ bool journal_file_rotate_suggested(JournalFile *f, usec_t max_file_usec, int log t = now(CLOCK_REALTIME); if (h > 0 && t > h + max_file_usec) { - log_full(log_level, - "Oldest entry in %s is older than the configured file retention duration (%s), suggesting rotation.", - f->path, FORMAT_TIMESPAN(max_file_usec, USEC_PER_SEC)); + log_ratelimit_full( + log_level, LOG_RATELIMIT, + "Oldest entry in %s is older than the configured file retention duration (%s), suggesting rotation.", + f->path, FORMAT_TIMESPAN(max_file_usec, USEC_PER_SEC)); return true; } } diff --git a/src/libsystemd/sd-journal/sd-journal.c b/src/libsystemd/sd-journal/sd-journal.c index aa90d873a5..0f3376823b 100644 --- a/src/libsystemd/sd-journal/sd-journal.c +++ b/src/libsystemd/sd-journal/sd-journal.c @@ -606,9 +606,9 @@ static int find_location_for_match( /* FIXME: missing: find by monotonic */ if (j->current_location.type == LOCATION_HEAD) - return direction == DIRECTION_DOWN ? journal_file_next_entry_for_data(f, d, DIRECTION_DOWN, ret, offset) : 0; + return journal_file_next_entry_for_data(f, d, DIRECTION_DOWN, ret, offset); if (j->current_location.type == LOCATION_TAIL) - return direction == DIRECTION_UP ? journal_file_next_entry_for_data(f, d, DIRECTION_UP, ret, offset) : 0; + return journal_file_next_entry_for_data(f, d, DIRECTION_UP, ret, offset); if (j->current_location.seqnum_set && sd_id128_equal(j->current_location.seqnum_id, f->header->seqnum_id)) return journal_file_move_to_entry_by_seqnum_for_data(f, d, j->current_location.seqnum, direction, ret, offset); if (j->current_location.monotonic_set) { @@ -701,9 +701,9 @@ static int find_location_with_matches( /* No matches is simple */ if (j->current_location.type == LOCATION_HEAD) - return direction == DIRECTION_DOWN ? 
journal_file_next_entry(f, 0, DIRECTION_DOWN, ret, offset) : 0; + return journal_file_next_entry(f, 0, DIRECTION_DOWN, ret, offset); if (j->current_location.type == LOCATION_TAIL) - return direction == DIRECTION_UP ? journal_file_next_entry(f, 0, DIRECTION_UP, ret, offset) : 0; + return journal_file_next_entry(f, 0, DIRECTION_UP, ret, offset); if (j->current_location.seqnum_set && sd_id128_equal(j->current_location.seqnum_id, f->header->seqnum_id)) return journal_file_move_to_entry_by_seqnum(f, j->current_location.seqnum, direction, ret, offset); if (j->current_location.monotonic_set) { diff --git a/src/libsystemd/sd-netlink/netlink-internal.h b/src/libsystemd/sd-netlink/netlink-internal.h index 514f22511c..bca13bce57 100644 --- a/src/libsystemd/sd-netlink/netlink-internal.h +++ b/src/libsystemd/sd-netlink/netlink-internal.h @@ -7,6 +7,7 @@ #include "list.h" #include "netlink-types.h" +#include "ordered-set.h" #include "prioq.h" #include "time-util.h" @@ -72,11 +73,9 @@ struct sd_netlink { Hashmap *broadcast_group_refs; bool broadcast_group_dont_leave:1; /* until we can rely on 4.2 */ - sd_netlink_message **rqueue; - unsigned rqueue_size; - - sd_netlink_message **rqueue_partial; - unsigned rqueue_partial_size; + OrderedSet *rqueue; + Hashmap *rqueue_by_serial; + Hashmap *rqueue_partial_by_serial; struct nlmsghdr *rbuffer; @@ -148,8 +147,6 @@ void message_seal(sd_netlink_message *m); int netlink_open_family(sd_netlink **ret, int family); bool netlink_pid_changed(sd_netlink *nl); -int netlink_rqueue_make_room(sd_netlink *nl); -int netlink_rqueue_partial_make_room(sd_netlink *nl); int socket_bind(sd_netlink *nl); int socket_broadcast_group_ref(sd_netlink *nl, unsigned group); diff --git a/src/libsystemd/sd-netlink/netlink-socket.c b/src/libsystemd/sd-netlink/netlink-socket.c index 1da459c014..96162963a7 100644 --- a/src/libsystemd/sd-netlink/netlink-socket.c +++ b/src/libsystemd/sd-netlink/netlink-socket.c @@ -180,11 +180,12 @@ int socket_write_message(sd_netlink *nl, 
sd_netlink_message *m) { return k; } -static int socket_recv_message(int fd, struct iovec *iov, uint32_t *ret_mcast_group, bool peek) { +static int socket_recv_message(int fd, void *buf, size_t buf_size, uint32_t *ret_mcast_group, bool peek) { + struct iovec iov = IOVEC_MAKE(buf, buf_size); union sockaddr_union sender; CMSG_BUFFER_TYPE(CMSG_SPACE(sizeof(struct nl_pktinfo))) control; struct msghdr msg = { - .msg_iov = iov, + .msg_iov = &iov, .msg_iovlen = 1, .msg_name = &sender, .msg_namelen = sizeof(sender), @@ -194,14 +195,17 @@ static int socket_recv_message(int fd, struct iovec *iov, uint32_t *ret_mcast_gr ssize_t n; assert(fd >= 0); - assert(iov); + assert(peek || (buf && buf_size > 0)); n = recvmsg_safe(fd, &msg, MSG_TRUNC | (peek ? MSG_PEEK : 0)); if (n < 0) { if (n == -ENOBUFS) return log_debug_errno(n, "sd-netlink: kernel receive buffer overrun"); - if (ERRNO_IS_TRANSIENT(n)) + if (ERRNO_IS_TRANSIENT(n)) { + if (ret_mcast_group) + *ret_mcast_group = 0; return 0; + } return (int) n; } @@ -216,9 +220,14 @@ static int socket_recv_message(int fd, struct iovec *iov, uint32_t *ret_mcast_gr return (int) n; } + if (ret_mcast_group) + *ret_mcast_group = 0; return 0; } + if (!peek && (size_t) n > buf_size) /* message did not fit in read buffer */ + return -EIO; + if (ret_mcast_group) { struct nl_pktinfo *pi; @@ -232,151 +241,221 @@ static int socket_recv_message(int fd, struct iovec *iov, uint32_t *ret_mcast_gr return (int) n; } -/* On success, the number of bytes received is returned and *ret points to the received message - * which has a valid header and the correct size. - * If nothing useful was received 0 is returned. - * On failure, a negative error code is returned. 
- */ -int socket_read_message(sd_netlink *nl) { - _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *first = NULL; - bool multi_part = false, done = false; - size_t len, allocated; - struct iovec iov = {}; - uint32_t group = 0; - unsigned i = 0; +DEFINE_PRIVATE_HASH_OPS_WITH_VALUE_DESTRUCTOR( + netlink_message_hash_ops, + void, trivial_hash_func, trivial_compare_func, + sd_netlink_message, sd_netlink_message_unref); + +static int netlink_queue_received_message(sd_netlink *nl, sd_netlink_message *m) { + uint32_t serial; int r; assert(nl); - assert(nl->rbuffer); + assert(m); - /* read nothing, just get the pending message size */ - r = socket_recv_message(nl->fd, &iov, NULL, true); - if (r <= 0) + if (ordered_set_size(nl->rqueue) >= NETLINK_RQUEUE_MAX) + return log_debug_errno(SYNTHETIC_ERRNO(ENOBUFS), + "sd-netlink: exhausted the read queue size (%d)", NETLINK_RQUEUE_MAX); + + r = ordered_set_ensure_put(&nl->rqueue, &netlink_message_hash_ops, m); + if (r < 0) return r; - else - len = (size_t) r; - /* make room for the pending message */ - if (!greedy_realloc((void**) &nl->rbuffer, len, sizeof(uint8_t))) - return -ENOMEM; + sd_netlink_message_ref(m); - allocated = MALLOC_SIZEOF_SAFE(nl->rbuffer); - iov = IOVEC_MAKE(nl->rbuffer, allocated); + if (sd_netlink_message_is_broadcast(m)) + return 0; - /* read the pending message */ - r = socket_recv_message(nl->fd, &iov, &group, false); - if (r <= 0) - return r; - else - len = (size_t) r; + serial = message_get_serial(m); + if (serial == 0) + return 0; - if (len > allocated) - /* message did not fit in read buffer */ - return -EIO; + if (sd_netlink_message_get_errno(m) < 0) { + _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *old = NULL; - if (NLMSG_OK(nl->rbuffer, len) && nl->rbuffer->nlmsg_flags & NLM_F_MULTI) { - multi_part = true; + old = hashmap_remove(nl->rqueue_by_serial, UINT32_TO_PTR(serial)); + if (old) + log_debug("sd-netlink: received error message with serial %"PRIu32", but another message with " 
+ "the same serial is already stored in the read queue, replacing.", serial); + } - for (i = 0; i < nl->rqueue_partial_size; i++) - if (message_get_serial(nl->rqueue_partial[i]) == - nl->rbuffer->nlmsg_seq) { - first = nl->rqueue_partial[i]; - break; - } + r = hashmap_ensure_put(&nl->rqueue_by_serial, &netlink_message_hash_ops, UINT32_TO_PTR(serial), m); + if (r == -EEXIST) { + if (!sd_netlink_message_is_error(m)) + log_debug("sd-netlink: received message with serial %"PRIu32", but another message with " + "the same serial is already stored in the read queue, ignoring.", serial); + return 0; + } + if (r < 0) { + sd_netlink_message_unref(ordered_set_remove(nl->rqueue, m)); + return r; } - for (struct nlmsghdr *new_msg = nl->rbuffer; NLMSG_OK(new_msg, len) && !done; new_msg = NLMSG_NEXT(new_msg, len)) { - _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *m = NULL; - size_t size; + sd_netlink_message_ref(m); + return 0; +} - if (group == 0 && new_msg->nlmsg_pid != nl->sockaddr.nl.nl_pid) - /* not broadcast and not for us */ - continue; +static int netlink_queue_partially_received_message(sd_netlink *nl, sd_netlink_message *m) { + uint32_t serial; + int r; - if (new_msg->nlmsg_type == NLMSG_NOOP) - /* silently drop noop messages */ - continue; + assert(nl); + assert(m); + assert(m->hdr->nlmsg_flags & NLM_F_MULTI); - if (new_msg->nlmsg_type == NLMSG_DONE) { - /* finished reading multi-part message */ - done = true; + if (hashmap_size(nl->rqueue_partial_by_serial) >= NETLINK_RQUEUE_MAX) + return log_debug_errno(SYNTHETIC_ERRNO(ENOBUFS), + "sd-netlink: exhausted the partial read queue size (%d)", NETLINK_RQUEUE_MAX); - /* if first is not defined, put NLMSG_DONE into the receive queue. 
*/ - if (first) - continue; - } + serial = message_get_serial(m); + r = hashmap_ensure_put(&nl->rqueue_partial_by_serial, &netlink_message_hash_ops, UINT32_TO_PTR(serial), m); + if (r < 0) + return r; - /* check that we support this message type */ - r = netlink_get_policy_set_and_header_size(nl, new_msg->nlmsg_type, NULL, &size); - if (r < 0) { - if (r == -EOPNOTSUPP) - log_debug("sd-netlink: ignored message with unknown type: %i", - new_msg->nlmsg_type); + sd_netlink_message_ref(m); + return 0; +} - continue; - } +static int parse_message_one(sd_netlink *nl, uint32_t group, const struct nlmsghdr *hdr, sd_netlink_message **ret) { + _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *m = NULL; + size_t size; + int r; - /* check that the size matches the message type */ - if (new_msg->nlmsg_len < NLMSG_LENGTH(size)) { - log_debug("sd-netlink: message is shorter than expected, dropping"); - continue; - } + assert(nl); + assert(hdr); + assert(ret); + + /* not broadcast and not for us */ + if (group == 0 && hdr->nlmsg_pid != nl->sockaddr.nl.nl_pid) + goto finalize; + + /* silently drop noop messages */ + if (hdr->nlmsg_type == NLMSG_NOOP) + goto finalize; + + /* check that we support this message type */ + r = netlink_get_policy_set_and_header_size(nl, hdr->nlmsg_type, NULL, &size); + if (r == -EOPNOTSUPP) { + log_debug("sd-netlink: ignored message with unknown type: %i", hdr->nlmsg_type); + goto finalize; + } + if (r < 0) + return r; - r = message_new_empty(nl, &m); - if (r < 0) - return r; + /* check that the size matches the message type */ + if (hdr->nlmsg_len < NLMSG_LENGTH(size)) { + log_debug("sd-netlink: message is shorter than expected, dropping."); + goto finalize; + } - m->multicast_group = group; - m->hdr = memdup(new_msg, new_msg->nlmsg_len); - if (!m->hdr) - return -ENOMEM; + r = message_new_empty(nl, &m); + if (r < 0) + return r; - /* seal and parse the top-level message */ - r = sd_netlink_message_rewind(m, nl); - if (r < 0) - return r; + 
m->multicast_group = group; + m->hdr = memdup(hdr, hdr->nlmsg_len); + if (!m->hdr) + return -ENOMEM; - /* push the message onto the multi-part message stack */ - if (first) - m->next = first; - first = TAKE_PTR(m); - } + /* seal and parse the top-level message */ + r = sd_netlink_message_rewind(m, nl); + if (r < 0) + return r; - if (len > 0) - log_debug("sd-netlink: discarding %zu bytes of incoming message", len); + *ret = TAKE_PTR(m); + return 1; - if (!first) +finalize: + *ret = NULL; + return 0; +} + +/* On success, 1 is returned if at least one complete message was pushed onto the read queue of + * the netlink object; each queued message has a valid header and the correct size. + * If nothing useful was received, or only part of a multi-part message, 0 is returned. + * On failure, a negative error code is returned. + */ +int socket_read_message(sd_netlink *nl) { + bool done = false; + uint32_t group; + size_t len; + int r; + + assert(nl); + + /* read nothing, just get the pending message size */ + r = socket_recv_message(nl->fd, NULL, 0, NULL, true); + if (r <= 0) + return r; + len = (size_t) r; + + /* make room for the pending message */ + if (!greedy_realloc((void**) &nl->rbuffer, len, sizeof(uint8_t))) + return -ENOMEM; + + /* read the pending message */ + r = socket_recv_message(nl->fd, nl->rbuffer, MALLOC_SIZEOF_SAFE(nl->rbuffer), &group, false); + if (r <= 0) + return r; + len = (size_t) r; + + if (!NLMSG_OK(nl->rbuffer, len)) { + log_debug("sd-netlink: received invalid message, discarding %zu bytes of incoming message", len); return 0; + } + + for (struct nlmsghdr *hdr = nl->rbuffer; NLMSG_OK(hdr, len); hdr = NLMSG_NEXT(hdr, len)) { + _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *m = NULL; - if (!multi_part || done) { - /* we got a complete message, push it on the read queue */ - r = netlink_rqueue_make_room(nl); + r = parse_message_one(nl, group, hdr, &m); if (r < 0) return r; + if (r == 0) + continue; - nl->rqueue[nl->rqueue_size++] = TAKE_PTR(first); + if (hdr->nlmsg_flags & NLM_F_MULTI) { + if (hdr->nlmsg_type
== NLMSG_DONE) { + _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *existing = NULL; - if (multi_part && (i < nl->rqueue_partial_size)) { - /* remove the message form the partial read queue */ - memmove(nl->rqueue_partial + i, nl->rqueue_partial + i + 1, - sizeof(sd_netlink_message*) * (nl->rqueue_partial_size - i - 1)); - nl->rqueue_partial_size--; - } + /* finished reading multi-part message */ + existing = hashmap_remove(nl->rqueue_partial_by_serial, UINT32_TO_PTR(hdr->nlmsg_seq)); + + /* if we receive only NLMSG_DONE, put it into the receive queue. */ + r = netlink_queue_received_message(nl, existing ?: m); + if (r < 0) + return r; + + done = true; + } else { + sd_netlink_message *existing; + + existing = hashmap_get(nl->rqueue_partial_by_serial, UINT32_TO_PTR(hdr->nlmsg_seq)); + if (existing) { + /* This is the continuation of the previously read messages. + * Let's append this message at the end. */ + while (existing->next) + existing = existing->next; + existing->next = TAKE_PTR(m); + } else { + /* This is the first message. Put it into the queue for partially + * received messages. 
*/ + r = netlink_queue_partially_received_message(nl, m); + if (r < 0) + return r; + } + } - return 1; - } else { - /* we only got a partial multi-part message, push it on the - partial read queue */ - if (i < nl->rqueue_partial_size) - nl->rqueue_partial[i] = TAKE_PTR(first); - else { - r = netlink_rqueue_partial_make_room(nl); + } else { + r = netlink_queue_received_message(nl, m); if (r < 0) return r; - nl->rqueue_partial[nl->rqueue_partial_size++] = TAKE_PTR(first); + done = true; } - - return 0; } + + if (len > 0) + log_debug("sd-netlink: discarding trailing %zu bytes of incoming message", len); + + return done; } diff --git a/src/libsystemd/sd-netlink/netlink-util.c b/src/libsystemd/sd-netlink/netlink-util.c index 12cdc99ff2..c6091542d2 100644 --- a/src/libsystemd/sd-netlink/netlink-util.c +++ b/src/libsystemd/sd-netlink/netlink-util.c @@ -673,6 +673,15 @@ int netlink_open_family(sd_netlink **ret, int family) { return 0; } +static bool serial_used(sd_netlink *nl, uint32_t serial) { + assert(nl); + + return + hashmap_contains(nl->reply_callbacks, UINT32_TO_PTR(serial)) || + hashmap_contains(nl->rqueue_by_serial, UINT32_TO_PTR(serial)) || + hashmap_contains(nl->rqueue_partial_by_serial, UINT32_TO_PTR(serial)); +} + void netlink_seal_message(sd_netlink *nl, sd_netlink_message *m) { uint32_t picked; @@ -689,7 +698,7 @@ void netlink_seal_message(sd_netlink *nl, sd_netlink_message *m) { such messages */ nl->serial = nl->serial == UINT32_MAX ? 
1 : nl->serial + 1; - } while (hashmap_contains(nl->reply_callbacks, UINT32_TO_PTR(picked))); + } while (serial_used(nl, picked)); m->hdr->nlmsg_seq = picked; message_seal(m); diff --git a/src/libsystemd/sd-netlink/sd-netlink.c b/src/libsystemd/sd-netlink/sd-netlink.c index feb751a848..6eb7f659ae 100644 --- a/src/libsystemd/sd-netlink/sd-netlink.c +++ b/src/libsystemd/sd-netlink/sd-netlink.c @@ -61,10 +61,6 @@ static int netlink_new(sd_netlink **ret) { .serial = (uint32_t) (now(CLOCK_MONOTONIC) % UINT32_MAX) + 1, }; - /* We guarantee that the read buffer has at least space for a message header */ - if (!greedy_realloc((void**) &nl->rbuffer, sizeof(struct nlmsghdr), sizeof(uint8_t))) - return -ENOMEM; - *ret = TAKE_PTR(nl); return 0; } @@ -120,18 +116,12 @@ int sd_netlink_increase_rxbuf(sd_netlink *nl, size_t size) { static sd_netlink *netlink_free(sd_netlink *nl) { sd_netlink_slot *s; - unsigned i; assert(nl); - for (i = 0; i < nl->rqueue_size; i++) - sd_netlink_message_unref(nl->rqueue[i]); - free(nl->rqueue); - - for (i = 0; i < nl->rqueue_partial_size; i++) - sd_netlink_message_unref(nl->rqueue_partial[i]); - free(nl->rqueue_partial); - + ordered_set_free(nl->rqueue); + hashmap_free(nl->rqueue_by_serial); + hashmap_free(nl->rqueue_partial_by_serial); free(nl->rbuffer); while ((s = nl->slots)) { @@ -179,57 +169,27 @@ int sd_netlink_send( return 1; } -int netlink_rqueue_make_room(sd_netlink *nl) { - assert(nl); - - if (nl->rqueue_size >= NETLINK_RQUEUE_MAX) - return log_debug_errno(SYNTHETIC_ERRNO(ENOBUFS), - "sd-netlink: exhausted the read queue size (%d)", - NETLINK_RQUEUE_MAX); - - if (!GREEDY_REALLOC(nl->rqueue, nl->rqueue_size + 1)) - return -ENOMEM; - - return 0; -} - -int netlink_rqueue_partial_make_room(sd_netlink *nl) { - assert(nl); - - if (nl->rqueue_partial_size >= NETLINK_RQUEUE_MAX) - return log_debug_errno(SYNTHETIC_ERRNO(ENOBUFS), - "sd-netlink: exhausted the partial read queue size (%d)", - NETLINK_RQUEUE_MAX); - - if 
(!GREEDY_REALLOC(nl->rqueue_partial, nl->rqueue_partial_size + 1)) - return -ENOMEM; - - return 0; -} - -static int dispatch_rqueue(sd_netlink *nl, sd_netlink_message **message) { +static int dispatch_rqueue(sd_netlink *nl, sd_netlink_message **ret) { + sd_netlink_message *m; int r; assert(nl); - assert(message); + assert(ret); - if (nl->rqueue_size <= 0) { + if (ordered_set_size(nl->rqueue) <= 0) { /* Try to read a new message */ r = socket_read_message(nl); - if (r == -ENOBUFS) { /* FIXME: ignore buffer overruns for now */ + if (r == -ENOBUFS) /* FIXME: ignore buffer overruns for now */ log_debug_errno(r, "sd-netlink: Got ENOBUFS from netlink socket, ignoring."); - return 1; - } - if (r <= 0) + else if (r < 0) return r; } /* Dispatch a queued message */ - *message = nl->rqueue[0]; - nl->rqueue_size--; - memmove(nl->rqueue, nl->rqueue + 1, sizeof(sd_netlink_message*) * nl->rqueue_size); - - return 1; + m = ordered_set_steal_first(nl->rqueue); + sd_netlink_message_unref(hashmap_remove_value(nl->rqueue_by_serial, UINT32_TO_PTR(message_get_serial(m)), m)); + *ret = m; + return !!m; } static int process_timeout(sd_netlink *nl) { @@ -464,13 +424,18 @@ static int netlink_poll(sd_netlink *nl, bool need_more, usec_t timeout_usec) { } int sd_netlink_wait(sd_netlink *nl, uint64_t timeout_usec) { + int r; + assert_return(nl, -EINVAL); assert_return(!netlink_pid_changed(nl), -ECHILD); - if (nl->rqueue_size > 0) + if (ordered_set_size(nl->rqueue) > 0) return 0; - return netlink_poll(nl, false, timeout_usec); + r = netlink_poll(nl, false, timeout_usec); + if (r < 0 && ERRNO_IS_TRANSIENT(r)) /* Convert EINTR to "something happened" and give user a chance to run some code before calling back into us */ + return 1; + return r; } static int timeout_compare(const void *a, const void *b) { @@ -565,39 +530,32 @@ int sd_netlink_read( timeout = calc_elapse(usec); for (;;) { + _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *m = NULL; usec_t left; - for (unsigned i = 0; i < 
nl->rqueue_size; i++) { - _cleanup_(sd_netlink_message_unrefp) sd_netlink_message *incoming = NULL; - uint32_t received_serial; + m = hashmap_remove(nl->rqueue_by_serial, UINT32_TO_PTR(serial)); + if (m) { uint16_t type; - received_serial = message_get_serial(nl->rqueue[i]); - if (received_serial != serial) - continue; - - incoming = nl->rqueue[i]; - /* found a match, remove from rqueue and return it */ - memmove(nl->rqueue + i, nl->rqueue + i + 1, - sizeof(sd_netlink_message*) * (nl->rqueue_size - i - 1)); - nl->rqueue_size--; + sd_netlink_message_unref(ordered_set_remove(nl->rqueue, m)); - r = sd_netlink_message_get_errno(incoming); + r = sd_netlink_message_get_errno(m); if (r < 0) return r; - r = sd_netlink_message_get_type(incoming, &type); + r = sd_netlink_message_get_type(m, &type); if (r < 0) return r; if (type == NLMSG_DONE) { - *ret = NULL; + if (ret) + *ret = NULL; return 0; } if (ret) - *ret = TAKE_PTR(incoming); + *ret = TAKE_PTR(m); return 1; } @@ -651,7 +609,7 @@ int sd_netlink_get_events(sd_netlink *nl) { assert_return(nl, -EINVAL); assert_return(!netlink_pid_changed(nl), -ECHILD); - return nl->rqueue_size == 0 ? POLLIN : 0; + return ordered_set_size(nl->rqueue) == 0 ? 
POLLIN : 0; } int sd_netlink_get_timeout(sd_netlink *nl, uint64_t *timeout_usec) { @@ -661,7 +619,7 @@ int sd_netlink_get_timeout(sd_netlink *nl, uint64_t *timeout_usec) { assert_return(timeout_usec, -EINVAL); assert_return(!netlink_pid_changed(nl), -ECHILD); - if (nl->rqueue_size > 0) { + if (ordered_set_size(nl->rqueue) > 0) { *timeout_usec = 0; return 1; } @@ -673,7 +631,6 @@ int sd_netlink_get_timeout(sd_netlink *nl, uint64_t *timeout_usec) { } *timeout_usec = c->timeout; - return 1; } diff --git a/src/locale/localectl.c b/src/locale/localectl.c index 9a4e4fb59b..966f07d083 100644 --- a/src/locale/localectl.c +++ b/src/locale/localectl.c @@ -84,7 +84,7 @@ static int print_status_info(StatusInfo *i) { if (!strv_isempty(kernel_locale)) { log_warning("Warning: Settings on kernel command line override system locale settings in /etc/locale.conf."); r = table_add_many(table, - TABLE_STRING, "Command Line:", + TABLE_FIELD, "Command Line", TABLE_SET_COLOR, ansi_highlight_yellow(), TABLE_STRV, kernel_locale, TABLE_SET_COLOR, ansi_highlight_yellow()); diff --git a/src/login/logind.c b/src/login/logind.c index cc153fd6bf..a564f94bfe 100644 --- a/src/login/logind.c +++ b/src/login/logind.c @@ -18,6 +18,7 @@ #include "daemon-util.h" #include "device-util.h" #include "dirent-util.h" +#include "escape.h" #include "fd-util.h" #include "format-util.h" #include "fs-util.h" @@ -299,11 +300,16 @@ static int manager_enumerate_linger_users(Manager *m) { FOREACH_DIRENT(de, d, return -errno) { int k; + _cleanup_free_ char *n = NULL; if (!dirent_is_file(de)) continue; - - k = manager_add_user_by_name(m, de->d_name, NULL); + k = cunescape(de->d_name, 0, &n); + if (k < 0) { + r = log_warning_errno(k, "Failed to unescape username '%s', ignoring: %m", de->d_name); + continue; + } + k = manager_add_user_by_name(m, n, NULL); if (k < 0) r = log_warning_errno(k, "Couldn't add lingering user %s, ignoring: %m", de->d_name); } diff --git a/src/mount/mount-tool.c b/src/mount/mount-tool.c index 
95dcf1bab0..f7dd705c5e 100644 --- a/src/mount/mount-tool.c +++ b/src/mount/mount-tool.c @@ -776,53 +776,51 @@ static int find_mount_points(const char *what, char ***list) { return n; } -static int find_loop_device(const char *backing_file, char **loop_dev) { - _cleanup_closedir_ DIR *d = NULL; - _cleanup_free_ char *l = NULL; +static int find_loop_device(const char *backing_file, sd_device **ret) { + _cleanup_(sd_device_enumerator_unrefp) sd_device_enumerator *e = NULL; + sd_device *dev; + int r; assert(backing_file); - assert(loop_dev); + assert(ret); - d = opendir("/sys/devices/virtual/block"); - if (!d) - return -errno; + r = sd_device_enumerator_new(&e); + if (r < 0) + return log_oom(); - FOREACH_DIRENT(de, d, return -errno) { - _cleanup_free_ char *sys = NULL, *fname = NULL; - int r; + r = sd_device_enumerator_add_match_subsystem(e, "block", /* match = */ true); + if (r < 0) + return log_error_errno(r, "Failed to add subsystem match: %m"); - if (de->d_type != DT_DIR) - continue; + r = sd_device_enumerator_add_match_property(e, "ID_FS_USAGE", "filesystem"); + if (r < 0) + return log_error_errno(r, "Failed to add property match: %m"); - if (!startswith(de->d_name, "loop")) - continue; + r = sd_device_enumerator_add_match_sysname(e, "loop*"); + if (r < 0) + return log_error_errno(r, "Failed to add sysname match: %m"); - sys = path_join("/sys/devices/virtual/block", de->d_name, "loop/backing_file"); - if (!sys) - return -ENOMEM; + r = sd_device_enumerator_add_match_sysattr(e, "loop/backing_file", /* value = */ NULL, /* match = */ true); + if (r < 0) + return log_error_errno(r, "Failed to add sysattr match: %m"); + + FOREACH_DEVICE(e, dev) { + const char *s; - r = read_one_line_file(sys, &fname); + r = sd_device_get_sysattr_value(dev, "loop/backing_file", &s); if (r < 0) { - log_debug_errno(r, "Failed to read %s, ignoring: %m", sys); + log_device_debug_errno(dev, r, "Failed to read \"loop/backing_file\" sysattr, ignoring: %m"); continue; } - if (files_same(fname, 
backing_file, 0) <= 0) + if (files_same(s, backing_file, 0) <= 0) continue; - l = path_join("/dev", de->d_name); - if (!l) - return -ENOMEM; - - break; + *ret = sd_device_ref(dev); + return 0; } - if (!l) - return -ENXIO; - - *loop_dev = TAKE_PTR(l); - - return 0; + return -ENXIO; } static int stop_mount( @@ -914,62 +912,69 @@ static int stop_mounts( return 0; } -static int umount_by_device(sd_bus *bus, const char *what) { +static int umount_by_device(sd_bus *bus, sd_device *dev) { _cleanup_(sd_device_unrefp) sd_device *d = NULL; _cleanup_strv_free_ char **list = NULL; - struct stat st; const char *v; - char **l; - int r, r2 = 0; - - assert(what); + int r, ret = 0; - if (stat(what, &st) < 0) - return log_error_errno(errno, "Can't stat %s: %m", what); + assert(bus); + assert(dev); - if (!S_ISBLK(st.st_mode)) - return log_error_errno(SYNTHETIC_ERRNO(ENOTBLK), - "Not a block device: %s", what); - - r = sd_device_new_from_stat_rdev(&d, &st); - if (r < 0) - return log_error_errno(r, "Failed to get device from device number: %m"); + if (sd_device_get_property_value(dev, "SYSTEMD_MOUNT_WHERE", &v) >= 0) + ret = stop_mounts(bus, v); - r = sd_device_get_property_value(d, "ID_FS_USAGE", &v); + r = sd_device_get_devname(dev, &v); if (r < 0) - return log_device_error_errno(d, r, "Failed to get device property: %m"); - - if (!streq(v, "filesystem")) - return log_device_error_errno(d, SYNTHETIC_ERRNO(EINVAL), - "%s does not contain a known file system.", what); - - if (sd_device_get_property_value(d, "SYSTEMD_MOUNT_WHERE", &v) >= 0) - r2 = stop_mounts(bus, v); + return r; - r = find_mount_points(what, &list); + r = find_mount_points(v, &list); if (r < 0) return r; - for (l = list; *l; l++) { + STRV_FOREACH(l, list) { r = stop_mounts(bus, *l); if (r < 0) - r2 = r; + ret = r; } - return r2; + return ret; +} + +static int umount_by_device_node(sd_bus *bus, const char *node) { + _cleanup_(sd_device_unrefp) sd_device *dev = NULL; + const char *v; + int r; + + assert(bus); +
assert(node); + + r = sd_device_new_from_devname(&dev, node); + if (r < 0) + return log_error_errno(r, "Failed to get device from %s: %m", node); + + r = sd_device_get_property_value(dev, "ID_FS_USAGE", &v); + if (r < 0) + return log_device_error_errno(dev, r, "Failed to get \"ID_FS_USAGE\" device property: %m"); + + if (!streq(v, "filesystem")) + return log_device_error_errno(dev, SYNTHETIC_ERRNO(EINVAL), + "%s does not contain a known file system.", node); + + return umount_by_device(bus, dev); } static int umount_loop(sd_bus *bus, const char *backing_file) { - _cleanup_free_ char *loop_dev = NULL; + _cleanup_(sd_device_unrefp) sd_device *dev = NULL; int r; assert(backing_file); - r = find_loop_device(backing_file, &loop_dev); + r = find_loop_device(backing_file, &dev); if (r < 0) return log_error_errno(r, r == -ENXIO ? "File %s is not mounted." : "Can't get loop device for %s: %m", backing_file); - return umount_by_device(bus, loop_dev); + return umount_by_device(bus, dev); } static int action_umount( @@ -1014,7 +1019,7 @@ static int action_umount( return log_error_errno(errno, "Can't stat %s (from %s): %m", p, argv[i]); if (S_ISBLK(st.st_mode)) - r = umount_by_device(bus, p); + r = umount_by_device_node(bus, p); else if (S_ISREG(st.st_mode)) r = umount_loop(bus, p); else if (S_ISDIR(st.st_mode)) @@ -1136,24 +1141,31 @@ static int acquire_mount_where(sd_device *d) { return 1; } -static int acquire_mount_where_for_loop_dev(const char *loop_dev) { +static int acquire_mount_where_for_loop_dev(sd_device *dev) { _cleanup_strv_free_ char **list = NULL; + const char *node; int r; + assert(dev); + if (arg_mount_where) return 0; - r = find_mount_points(loop_dev, &list); + r = sd_device_get_devname(dev, &node); if (r < 0) return r; - else if (r == 0) - return log_error_errno(SYNTHETIC_ERRNO(EINVAL), - "Can't find mount point of %s. 
It is expected that %s is already mounted on a place.", - loop_dev, loop_dev); - else if (r >= 2) - return log_error_errno(SYNTHETIC_ERRNO(EINVAL), - "%s is mounted on %d places. It is expected that %s is mounted on a place.", - loop_dev, r, loop_dev); + + r = find_mount_points(node, &list); + if (r < 0) + return r; + if (r == 0) + return log_device_error_errno(dev, SYNTHETIC_ERRNO(EINVAL), + "Can't find mount point of %s. It is expected that %s is already mounted on a place.", + node, node); + if (r >= 2) + return log_device_error_errno(dev, SYNTHETIC_ERRNO(EINVAL), + "%s is mounted on %d places. It is expected that %s is mounted on a place.", + node, r, node); arg_mount_where = strdup(list[0]); if (!arg_mount_where) @@ -1234,12 +1246,9 @@ static int acquire_removable(sd_device *d) { static int discover_loop_backing_file(void) { _cleanup_(sd_device_unrefp) sd_device *d = NULL; - _cleanup_free_ char *loop_dev = NULL; - struct stat st; - const char *v; int r; - r = find_loop_device(arg_mount_what, &loop_dev); + r = find_loop_device(arg_mount_what, &d); if (r < 0 && r != -ENXIO) return log_error_errno(errno, "Can't get loop device for %s: %m", arg_mount_what); @@ -1265,21 +1274,6 @@ static int discover_loop_backing_file(void) { return 0; } - if (stat(loop_dev, &st) < 0) - return log_error_errno(errno, "Can't stat %s: %m", loop_dev); - - if (!S_ISBLK(st.st_mode)) - return log_error_errno(SYNTHETIC_ERRNO(EINVAL), - "Invalid file type: %s", loop_dev); - - r = sd_device_new_from_stat_rdev(&d, &st); - if (r < 0) - return log_error_errno(r, "Failed to get device from device number: %m"); - - if (sd_device_get_property_value(d, "ID_FS_USAGE", &v) < 0 || !streq(v, "filesystem")) - return log_device_error_errno(d, SYNTHETIC_ERRNO(EINVAL), - "%s does not contain a known file system.", arg_mount_what); - r = acquire_mount_type(d); if (r < 0) return r; @@ -1288,7 +1282,7 @@ static int discover_loop_backing_file(void) { if (r < 0) return r; - r = 
acquire_mount_where_for_loop_dev(loop_dev); + r = acquire_mount_where_for_loop_dev(d); if (r < 0) return r; diff --git a/src/network/networkd-link.c b/src/network/networkd-link.c index 4ed622cfe6..c46b8f5b2e 100644 --- a/src/network/networkd-link.c +++ b/src/network/networkd-link.c @@ -1178,7 +1178,7 @@ static int link_get_network(Link *link, Network **ret) { return -ENOENT; } -static int link_reconfigure_impl(Link *link, bool force) { +int link_reconfigure_impl(Link *link, bool force) { Network *network = NULL; NetDev *netdev = NULL; int r; diff --git a/src/network/networkd-link.h b/src/network/networkd-link.h index 4d397da79a..0d601ab548 100644 --- a/src/network/networkd-link.h +++ b/src/network/networkd-link.h @@ -238,8 +238,8 @@ int link_stop_engines(Link *link, bool may_keep_dhcp); const char* link_state_to_string(LinkState s) _const_; LinkState link_state_from_string(const char *s) _pure_; +int link_reconfigure_impl(Link *link, bool force); int link_reconfigure(Link *link, bool force); -int link_reconfigure_after_sleep(Link *link); int manager_udev_process_link(Manager *m, sd_device *device, sd_device_action_t action); int manager_rtnl_process_link(sd_netlink *rtnl, sd_netlink_message *message, Manager *m); diff --git a/src/network/networkd-ndisc.c b/src/network/networkd-ndisc.c index ce7dff222b..6ee098a015 100644 --- a/src/network/networkd-ndisc.c +++ b/src/network/networkd-ndisc.c @@ -233,7 +233,7 @@ static int ndisc_request_address(Address *in, Link *link, sd_ndisc_router *rt) { link->ndisc_configured = false; return link_request_address(link, TAKE_PTR(address), true, &link->ndisc_messages, - ndisc_address_handler, NULL); + ndisc_address_handler, NULL); } static int ndisc_router_process_default(Link *link, sd_ndisc_router *rt) { @@ -442,7 +442,6 @@ static int ndisc_router_process_onlink_prefix(Link *link, sd_ndisc_router *rt) { return log_oom(); route->family = AF_INET6; - route->flags = RTM_F_PREFIX; route->dst.in6 = prefix; route->dst_prefixlen = 
prefixlen; route->lifetime_usec = sec_to_usec(lifetime_sec, timestamp_usec); diff --git a/src/network/networkd-network-gperf.gperf b/src/network/networkd-network-gperf.gperf index 70dead97ab..762eef5b91 100644 --- a/src/network/networkd-network-gperf.gperf +++ b/src/network/networkd-network-gperf.gperf @@ -420,6 +420,8 @@ CAKE.PriorityQueueingPreset, config_parse_cake_priority_queueing CAKE.FirewallMark, config_parse_cake_fwmark, QDISC_KIND_CAKE, 0 CAKE.Wash, config_parse_cake_tristate, QDISC_KIND_CAKE, 0 CAKE.SplitGSO, config_parse_cake_tristate, QDISC_KIND_CAKE, 0 +CAKE.RTTSec, config_parse_cake_rtt, QDISC_KIND_CAKE, 0 +CAKE.AckFilter, config_parse_cake_ack_filter, QDISC_KIND_CAKE, 0 ControlledDelay.Parent, config_parse_qdisc_parent, QDISC_KIND_CODEL, 0 ControlledDelay.Handle, config_parse_qdisc_handle, QDISC_KIND_CODEL, 0 ControlledDelay.PacketLimit, config_parse_controlled_delay_u32, QDISC_KIND_CODEL, 0 diff --git a/src/network/networkd-wifi.c b/src/network/networkd-wifi.c index 4bf798a9eb..62cbca0cf9 100644 --- a/src/network/networkd-wifi.c +++ b/src/network/networkd-wifi.c @@ -269,6 +269,18 @@ int manager_genl_process_nl80211_mlme(sd_netlink *genl, sd_netlink_message *mess if (link->wlan_iftype == NL80211_IFTYPE_STATION && link->ssid) log_link_info(link, "Connected WiFi access point: %s (%s)", link->ssid, ETHER_ADDR_TO_STR(&link->bssid)); + + /* Sometimes, RTM_NEWLINK message with carrier is received earlier than NL80211_CMD_CONNECT. + * To make SSID= or other WiFi related settings in [Match] section work, let's try to + * reconfigure the interface. 
*/ + if (link->ssid && link_has_carrier(link)) { + r = link_reconfigure_impl(link, /* force = */ false); + if (r < 0) { + log_link_warning_errno(link, r, "Failed to reconfigure interface: %m"); + link_enter_failed(link); + return 0; + } + } break; } case NL80211_CMD_DISCONNECT: diff --git a/src/network/tc/cake.c b/src/network/tc/cake.c index 8d770b0896..a22a11cce3 100644 --- a/src/network/tc/cake.c +++ b/src/network/tc/cake.c @@ -27,6 +27,7 @@ static int cake_init(QDisc *qdisc) { c->preset = _CAKE_PRESET_INVALID; c->wash = -1; c->split_gso = -1; + c->ack_filter = _CAKE_ACK_FILTER_INVALID; return 0; } @@ -118,6 +119,18 @@ static int cake_fill_message(Link *link, QDisc *qdisc, sd_netlink_message *req) return r; } + if (c->rtt > 0) { + r = sd_netlink_message_append_u32(req, TCA_CAKE_RTT, c->rtt); + if (r < 0) + return r; + } + + if (c->ack_filter >= 0) { + r = sd_netlink_message_append_u32(req, TCA_CAKE_ACK_FILTER, c->ack_filter); + if (r < 0) + return r; + } + r = sd_netlink_message_close_container(req); if (r < 0) return r; @@ -605,6 +618,124 @@ int config_parse_cake_fwmark( return 0; } +int config_parse_cake_rtt( + const char *unit, + const char *filename, + unsigned line, + const char *section, + unsigned section_line, + const char *lvalue, + int ltype, + const char *rvalue, + void *data, + void *userdata) { + + _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL; + CommonApplicationsKeptEnhanced *c; + Network *network = ASSERT_PTR(data); + usec_t t; + int r; + + assert(filename); + assert(lvalue); + assert(rvalue); + + r = qdisc_new_static(QDISC_KIND_CAKE, network, filename, section_line, &qdisc); + if (r == -ENOMEM) + return log_oom(); + if (r < 0) { + log_syntax(unit, LOG_WARNING, filename, line, r, + "More than one kind of queueing discipline, ignoring assignment: %m"); + return 0; + } + + c = CAKE(qdisc); + + if (isempty(rvalue)) { + c->rtt = 0; + TAKE_PTR(qdisc); + return 0; + } + + r = parse_sec(rvalue, &t); + if (r < 0) { + log_syntax(unit, 
LOG_WARNING, filename, line, r, + "Failed to parse '%s=', ignoring assignment: %s", + lvalue, rvalue); + return 0; + } + if (t <= 0 || t > UINT32_MAX) { + log_syntax(unit, LOG_WARNING, filename, line, 0, + "Invalid '%s=', ignoring assignment: %s", + lvalue, rvalue); + return 0; + } + + c->rtt = t; + TAKE_PTR(qdisc); + return 0; +} + +static const char * const cake_ack_filter_table[_CAKE_ACK_FILTER_MAX] = { + [CAKE_ACK_FILTER_NO] = "no", + [CAKE_ACK_FILTER_YES] = "yes", + [CAKE_ACK_FILTER_AGGRESSIVE] = "aggressive", +}; + +DEFINE_PRIVATE_STRING_TABLE_LOOKUP_FROM_STRING_WITH_BOOLEAN(cake_ack_filter, CakeAckFilter, CAKE_ACK_FILTER_YES); + +int config_parse_cake_ack_filter( + const char *unit, + const char *filename, + unsigned line, + const char *section, + unsigned section_line, + const char *lvalue, + int ltype, + const char *rvalue, + void *data, + void *userdata) { + + _cleanup_(qdisc_free_or_set_invalidp) QDisc *qdisc = NULL; + CommonApplicationsKeptEnhanced *c; + CakeAckFilter ack_filter; + Network *network = ASSERT_PTR(data); + int r; + + assert(filename); + assert(lvalue); + assert(rvalue); + + r = qdisc_new_static(QDISC_KIND_CAKE, network, filename, section_line, &qdisc); + if (r == -ENOMEM) + return log_oom(); + if (r < 0) { + log_syntax(unit, LOG_WARNING, filename, line, r, + "More than one kind of queueing discipline, ignoring assignment: %m"); + return 0; + } + + c = CAKE(qdisc); + + if (isempty(rvalue)) { + c->ack_filter = _CAKE_ACK_FILTER_INVALID; + TAKE_PTR(qdisc); + return 0; + } + + ack_filter = cake_ack_filter_from_string(rvalue); + if (ack_filter < 0) { + log_syntax(unit, LOG_WARNING, filename, line, ack_filter, + "Failed to parse '%s=', ignoring assignment: %s", + lvalue, rvalue); + return 0; + } + + c->ack_filter = ack_filter; + TAKE_PTR(qdisc); + return 0; +} + const QDiscVTable cake_vtable = { .object_size = sizeof(CommonApplicationsKeptEnhanced), .tca_kind = "cake", diff --git a/src/network/tc/cake.h b/src/network/tc/cake.h index 
ff68cedabf..5ca6dc6470 100644 --- a/src/network/tc/cake.h +++ b/src/network/tc/cake.h @@ -38,6 +38,14 @@ typedef enum CakePriorityQueueingPreset { _CAKE_PRESET_INVALID = -EINVAL, } CakePriorityQueueingPreset; +typedef enum CakeAckFilter { + CAKE_ACK_FILTER_NO = CAKE_ACK_NONE, + CAKE_ACK_FILTER_YES = CAKE_ACK_FILTER, + CAKE_ACK_FILTER_AGGRESSIVE = CAKE_ACK_AGGRESSIVE, + _CAKE_ACK_FILTER_MAX, + _CAKE_ACK_FILTER_INVALID = -EINVAL, +} CakeAckFilter; + typedef struct CommonApplicationsKeptEnhanced { QDisc meta; @@ -63,7 +71,8 @@ typedef struct CommonApplicationsKeptEnhanced { /* Other parameters */ int wash; int split_gso; - + usec_t rtt; + CakeAckFilter ack_filter; } CommonApplicationsKeptEnhanced; DEFINE_QDISC_CAST(CAKE, CommonApplicationsKeptEnhanced); @@ -77,3 +86,5 @@ CONFIG_PARSER_PROTOTYPE(config_parse_cake_compensation_mode); CONFIG_PARSER_PROTOTYPE(config_parse_cake_flow_isolation_mode); CONFIG_PARSER_PROTOTYPE(config_parse_cake_priority_queueing_preset); CONFIG_PARSER_PROTOTYPE(config_parse_cake_fwmark); +CONFIG_PARSER_PROTOTYPE(config_parse_cake_rtt); +CONFIG_PARSER_PROTOTYPE(config_parse_cake_ack_filter); diff --git a/src/nspawn/nspawn-seccomp.c b/src/nspawn/nspawn-seccomp.c index 77f4c2ac88..27044fadd2 100644 --- a/src/nspawn/nspawn-seccomp.c +++ b/src/nspawn/nspawn-seccomp.c @@ -88,6 +88,7 @@ static int add_syscall_filters( { 0, "sched_getparam" }, { 0, "sched_getscheduler" }, { 0, "sched_rr_get_interval" }, + { 0, "sched_rr_get_interval_time64" }, { 0, "sched_yield" }, { 0, "seccomp" }, { 0, "sendfile" }, diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c index ef5825ef75..d7b636209e 100644 --- a/src/nspawn/nspawn.c +++ b/src/nspawn/nspawn.c @@ -5753,7 +5753,7 @@ static int run(int argc, char *argv[]) { log_notice("Note that the disk image needs to\n" " a) either contain only a single MBR partition of type 0x83 that is marked bootable\n" " b) or contain a single GPT partition of type 0FC63DAF-8483-4772-8E79-3D69D8477DE4\n" - " c) or follow 
https://systemd.io/DISCOVERABLE_PARTITIONS\n" + " c) or follow https://uapi-group.org/specifications/specs/discoverable_partitions_specification\n" " d) or contain a file system without a partition table\n" "in order to be bootable with systemd-nspawn."); goto finish; diff --git a/src/nss-myhostname/nss-myhostname.c b/src/nss-myhostname/nss-myhostname.c index 120e76be45..3af1d2f0c1 100644 --- a/src/nss-myhostname/nss-myhostname.c +++ b/src/nss-myhostname/nss-myhostname.c @@ -12,6 +12,7 @@ #include "local-addresses.h" #include "macro.h" #include "nss-util.h" +#include "resolve-util.h" #include "signal-util.h" #include "socket-util.h" #include "string-util.h" @@ -21,7 +22,7 @@ * IPv6 we use ::1 which unfortunately will not translate back to the * hostname but instead something like "localhost" or so. */ -#define LOCALADDRESS_IPV4 (htobe32(0x7F000002)) +#define LOCALADDRESS_IPV4 (htobe32(INADDR_LOCALADDRESS)) #define LOCALADDRESS_IPV6 &in6addr_loopback NSS_GETHOSTBYNAME_PROTOTYPES(myhostname); diff --git a/src/oom/oomd-util.c b/src/oom/oomd-util.c index 1fc81d1843..70a1dc941e 100644 --- a/src/oom/oomd-util.c +++ b/src/oom/oomd-util.c @@ -164,7 +164,7 @@ int oomd_fetch_cgroup_oom_preference(OomdCGroupContext *ctx, const char *prefix) if (r < 0) return log_debug_errno(r, "Failed to get owner/group from %s: %m", ctx->path); - if (uid == prefix_uid) { + if (uid == prefix_uid || uid == 0) { /* Ignore most errors when reading the xattr since it is usually unset and cgroup xattrs are only used * as an optional feature of systemd-oomd (and the system might not even support them). 
*/ r = cg_get_xattr_bool(SYSTEMD_CGROUP_CONTROLLER, ctx->path, "user.oomd_avoid"); diff --git a/src/oom/test-oomd-util.c b/src/oom/test-oomd-util.c index 176e3a8d69..faa75c5578 100644 --- a/src/oom/test-oomd-util.c +++ b/src/oom/test-oomd-util.c @@ -475,9 +475,9 @@ static void test_oomd_fetch_cgroup_oom_preference(void) { /* Assert that avoid/omit are not set if the cgroup and prefix are not * owned by the same user.*/ - if (test_xattrs && !empty_or_root(ctx->path)) { + if (test_xattrs && !empty_or_root(cgroup)) { ctx = oomd_cgroup_context_free(ctx); - assert_se(cg_set_access(SYSTEMD_CGROUP_CONTROLLER, cgroup, 65534, 0) >= 0); + assert_se(cg_set_access(SYSTEMD_CGROUP_CONTROLLER, cgroup, 61183, 0) >= 0); assert_se(oomd_cgroup_context_acquire(cgroup, &ctx) == 0); assert_se(oomd_fetch_cgroup_oom_preference(ctx, NULL) == 0); diff --git a/src/partition/repart.c b/src/partition/repart.c index 8a1a8411cf..7857f7b3d1 100644 --- a/src/partition/repart.c +++ b/src/partition/repart.c @@ -76,19 +76,25 @@ #include "utf8.h" /* If not configured otherwise use a minimal partition size of 10M */ -#define DEFAULT_MIN_SIZE (10*1024*1024) +#define DEFAULT_MIN_SIZE (10ULL*1024ULL*1024ULL) /* Hard lower limit for new partition sizes */ -#define HARD_MIN_SIZE 4096 +#define HARD_MIN_SIZE 4096ULL /* We know up front we're never going to put more than this in a verity sig partition. */ -#define VERITY_SIG_SIZE (HARD_MIN_SIZE * 4) +#define VERITY_SIG_SIZE (HARD_MIN_SIZE*4ULL) /* libfdisk takes off slightly more than 1M of the disk size when creating a GPT disk label */ -#define GPT_METADATA_SIZE (1044*1024) +#define GPT_METADATA_SIZE (1044ULL*1024ULL) /* LUKS2 takes off 16M of the partition size with its metadata by default */ -#define LUKS2_METADATA_SIZE (16*1024*1024) +#define LUKS2_METADATA_SIZE (16ULL*1024ULL*1024ULL) + +/* To do LUKS2 offline encryption, we need to keep some extra free space at the end of the partition. 
*/ +#define LUKS2_METADATA_KEEP_FREE (LUKS2_METADATA_SIZE*2ULL) + +/* LUKS2 volume key size. */ +#define VOLUME_KEY_SIZE (512ULL/8ULL) /* Note: When growing and placing new partitions we always align to 4K sector size. It's how newer hard disks * are designed, and if everything is aligned to that performance is best. And for older hard disks with 512B @@ -103,6 +109,14 @@ static enum { EMPTY_CREATE, /* create disk as loopback file, create a partition table always */ } arg_empty = EMPTY_REFUSE; +typedef enum FilterPartitionType { + FILTER_PARTITIONS_NONE, + FILTER_PARTITIONS_EXCLUDE, + FILTER_PARTITIONS_INCLUDE, + _FILTER_PARTITIONS_MAX, + _FILTER_PARTITIONS_INVALID = -EINVAL, +} FilterPartitionsType; + static bool arg_dry_run = true; static const char *arg_node = NULL; static char *arg_root = NULL; @@ -128,6 +142,11 @@ static uint32_t arg_tpm2_pcr_mask = UINT32_MAX; static char *arg_tpm2_public_key = NULL; static uint32_t arg_tpm2_public_key_pcr_mask = UINT32_MAX; static bool arg_split = false; +static sd_id128_t *arg_filter_partitions = NULL; +static size_t arg_n_filter_partitions = 0; +static FilterPartitionsType arg_filter_partitions_type = FILTER_PARTITIONS_NONE; +static sd_id128_t *arg_skip_partitions = NULL; +static size_t arg_n_skip_partitions = 0; STATIC_DESTRUCTOR_REGISTER(arg_root, freep); STATIC_DESTRUCTOR_REGISTER(arg_image, freep); @@ -137,6 +156,7 @@ STATIC_DESTRUCTOR_REGISTER(arg_private_key, EVP_PKEY_freep); STATIC_DESTRUCTOR_REGISTER(arg_certificate, X509_freep); STATIC_DESTRUCTOR_REGISTER(arg_tpm2_device, freep); STATIC_DESTRUCTOR_REGISTER(arg_tpm2_public_key, freep); +STATIC_DESTRUCTOR_REGISTER(arg_filter_partitions, freep); typedef struct Partition Partition; typedef struct FreeArea FreeArea; @@ -164,10 +184,11 @@ struct Partition { char *definition_path; char **drop_in_files; - sd_id128_t type_uuid; + GptPartitionType type; sd_id128_t current_uuid, new_uuid; bool new_uuid_is_set; char *current_label, *new_label; + sd_id128_t fs_uuid; bool 
dropped; bool factory_reset; @@ -191,6 +212,7 @@ struct Partition { char *copy_blocks_path; bool copy_blocks_auto; + const char *copy_blocks_root; int copy_blocks_fd; uint64_t copy_blocks_size; @@ -200,6 +222,7 @@ struct Partition { EncryptMode encrypt; VerityMode verity; char *verity_match_key; + bool minimize; uint64_t gpt_flags; int no_auto; @@ -256,12 +279,7 @@ static const char *verity_mode_table[_VERITY_MODE_MAX] = { [VERITY_SIG] = "signature", }; -#if HAVE_LIBCRYPTSETUP -DEFINE_PRIVATE_STRING_TABLE_LOOKUP_WITH_BOOLEAN(encrypt_mode, EncryptMode, ENCRYPT_KEY_FILE); -#else DEFINE_PRIVATE_STRING_TABLE_LOOKUP_FROM_STRING_WITH_BOOLEAN(encrypt_mode, EncryptMode, ENCRYPT_KEY_FILE); -#endif - DEFINE_PRIVATE_STRING_TABLE_LOOKUP(verity_mode, VerityMode); static uint64_t round_down_size(uint64_t v, uint64_t p) { @@ -344,20 +362,18 @@ static void partition_foreignize(Partition *p) { /* Reset several parameters set through definition file to make the partition foreign. */ - p->new_label = mfree(p->new_label); p->definition_path = mfree(p->definition_path); p->drop_in_files = strv_free(p->drop_in_files); p->copy_blocks_path = mfree(p->copy_blocks_path); p->copy_blocks_fd = safe_close(p->copy_blocks_fd); + p->copy_blocks_root = NULL; p->format = mfree(p->format); p->copy_files = strv_free(p->copy_files); p->make_directories = strv_free(p->make_directories); p->verity_match_key = mfree(p->verity_match_key); - p->new_uuid = SD_ID128_NULL; - p->new_uuid_is_set = false; p->priority = 0; p->weight = 1000; p->padding_weight = 0; @@ -371,6 +387,29 @@ static void partition_foreignize(Partition *p) { p->verity = VERITY_OFF; } +static bool partition_exclude(const Partition *p) { + assert(p); + + if (arg_filter_partitions_type == FILTER_PARTITIONS_NONE) + return false; + + for (size_t i = 0; i < arg_n_filter_partitions; i++) + if (sd_id128_equal(p->type.uuid, arg_filter_partitions[i])) + return arg_filter_partitions_type == FILTER_PARTITIONS_EXCLUDE; + + return 
arg_filter_partitions_type == FILTER_PARTITIONS_INCLUDE; +} + +static bool partition_skip(const Partition *p) { + assert(p); + + for (size_t i = 0; i < arg_n_skip_partitions; i++) + if (sd_id128_equal(p->type.uuid, arg_skip_partitions[i])) + return true; + + return false; +} + static Partition* partition_unlink_and_free(Context *context, Partition *p) { if (!p) return NULL; @@ -537,7 +576,7 @@ static uint64_t partition_min_size(const Context *context, const Partition *p) { uint64_t d = 0; if (p->encrypt != ENCRYPT_OFF) - d += round_up_size(LUKS2_METADATA_SIZE, context->grain_size); + d += round_up_size(LUKS2_METADATA_KEEP_FREE, context->grain_size); if (p->copy_blocks_size != UINT64_MAX) d += round_up_size(p->copy_blocks_size, context->grain_size); @@ -1024,21 +1063,27 @@ static int context_grow_partitions(Context *context) { return 0; } -static void context_place_partitions(Context *context) { +static uint64_t find_first_unused_partno(Context *context) { uint64_t partno = 0; assert(context); - /* Determine next partition number to assign */ - LIST_FOREACH(partitions, p, context->partitions) { - if (!PARTITION_EXISTS(p)) - continue; - - assert(p->partno != UINT64_MAX); - if (p->partno >= partno) - partno = p->partno + 1; + for (partno = 0;; partno++) { + bool found = false; + LIST_FOREACH(partitions, p, context->partitions) + if (p->partno != UINT64_MAX && p->partno == partno) + found = true; + if (!found) + break; } + return partno; +} + +static void context_place_partitions(Context *context) { + + assert(context); + for (size_t i = 0; i < context->n_free_areas; i++) { FreeArea *a = context->free_areas[i]; _unused_ uint64_t left; @@ -1061,7 +1106,7 @@ static void context_place_partitions(Context *context) { continue; p->offset = start; - p->partno = partno++; + p->partno = find_first_unused_partno(context); assert(left >= p->new_size); start += p->new_size; @@ -1086,12 +1131,12 @@ static int config_parse_type( void *data, void *userdata) { - sd_id128_t *type_uuid 
= ASSERT_PTR(data); + GptPartitionType *type = ASSERT_PTR(data); int r; assert(rvalue); - r = gpt_partition_type_uuid_from_string(rvalue, type_uuid); + r = gpt_partition_type_from_string(rvalue, type); if (r < 0) return log_syntax(unit, LOG_ERR, filename, line, r, "Failed to parse partition type: %s", rvalue); @@ -1338,6 +1383,7 @@ static int config_parse_copy_blocks( if (streq(rvalue, "auto")) { partition->copy_blocks_path = mfree(partition->copy_blocks_path); partition->copy_blocks_auto = true; + partition->copy_blocks_root = arg_root; return 0; } @@ -1354,6 +1400,7 @@ static int config_parse_copy_blocks( free_and_replace(partition->copy_blocks_path, d); partition->copy_blocks_auto = false; + partition->copy_blocks_root = arg_root; return 0; } @@ -1475,7 +1522,7 @@ static DEFINE_CONFIG_PARSE_ENUM_WITH_DEFAULT(config_parse_verity, verity_mode, V static int partition_read_definition(Partition *p, const char *path, const char *const *conf_file_dirs) { ConfigTableItem table[] = { - { "Partition", "Type", config_parse_type, 0, &p->type_uuid }, + { "Partition", "Type", config_parse_type, 0, &p->type }, { "Partition", "Label", config_parse_label, 0, &p->new_label }, { "Partition", "UUID", config_parse_uuid, 0, p }, { "Partition", "Priority", config_parse_int32, 0, &p->priority }, @@ -1498,6 +1545,7 @@ static int partition_read_definition(Partition *p, const char *path, const char { "Partition", "NoAuto", config_parse_tristate, 0, &p->no_auto }, { "Partition", "GrowFileSystem", config_parse_tristate, 0, &p->growfs }, { "Partition", "SplitName", config_parse_string, 0, &p->split_name_format }, + { "Partition", "Minimize", config_parse_bool, 0, &p->minimize }, {} }; int r; @@ -1506,7 +1554,7 @@ static int partition_read_definition(Partition *p, const char *path, const char r = path_extract_filename(path, &filename); if (r < 0) - return log_error_errno(r, "Failed to extract filename from path '%s': %m", path);; + return log_error_errno(r, "Failed to extract filename from 
path '%s': %m", path); dropin_dirname = strjoina(filename, ".d"); @@ -1523,6 +1571,9 @@ static int partition_read_definition(Partition *p, const char *path, const char if (r < 0) return r; + if (partition_exclude(p)) + return 0; + if (p->size_min != UINT64_MAX && p->size_max != UINT64_MAX && p->size_min > p->size_max) return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EINVAL), "SizeMinBytes= larger than SizeMaxBytes=, refusing."); @@ -1531,7 +1582,7 @@ static int partition_read_definition(Partition *p, const char *path, const char return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EINVAL), "PaddingMinBytes= larger than PaddingMaxBytes=, refusing."); - if (sd_id128_is_null(p->type_uuid)) + if (sd_id128_is_null(p->type.uuid)) return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EINVAL), "Type= not defined, refusing."); @@ -1551,6 +1602,19 @@ static int partition_read_definition(Partition *p, const char *path, const char return log_oom(); } + if (p->minimize && !p->format) + return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EINVAL), + "Minimize= can only be enabled if Format= is set"); + + if ((!strv_isempty(p->copy_files) || !strv_isempty(p->make_directories)) && !mkfs_supports_root_option(p->format) && geteuid() != 0) + return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EPERM), + "Need to be root to populate %s filesystems with CopyFiles=/MakeDirectories=", + p->format); + + if (p->format && fstype_is_ro(p->format) && strv_isempty(p->copy_files) && strv_isempty(p->make_directories)) + return log_syntax(NULL, LOG_ERR, path, 1, SYNTHETIC_ERRNO(EINVAL), + "Cannot format %s filesystem without source files, refusing", p->format); + if (p->verity != VERITY_OFF || p->encrypt != ENCRYPT_OFF) { r = dlopen_cryptsetup(); if (r < 0) @@ -1591,13 +1655,11 @@ static int partition_read_definition(Partition *p, const char *path, const char verity_mode_to_string(p->verity)); /* Verity partitions are read only, let's imply the RO flag hence, unless 
explicitly configured otherwise. */ - if ((gpt_partition_type_is_root_verity(p->type_uuid) || - gpt_partition_type_is_usr_verity(p->type_uuid)) && - p->read_only < 0) + if (IN_SET(p->type.designator, PARTITION_ROOT_VERITY, PARTITION_USR_VERITY) && p->read_only < 0) p->read_only = true; /* Default to "growfs" on, unless read-only */ - if (gpt_partition_type_knows_growfs(p->type_uuid) && + if (gpt_partition_type_knows_growfs(p->type) && p->read_only <= 0) p->growfs = true; @@ -1610,7 +1672,7 @@ static int partition_read_definition(Partition *p, const char *path, const char } else if (streq(p->split_name_format, "-")) p->split_name_format = mfree(p->split_name_format); - return 0; + return 1; } static int find_verity_sibling(Context *context, Partition *p, VerityMode mode, Partition **ret) { @@ -1685,6 +1747,8 @@ static int context_read_definitions( r = partition_read_definition(p, *f, dirs); if (r < 0) return r; + if (r == 0) + continue; LIST_INSERT_AFTER(partitions, context->partitions, last, p); last = TAKE_PTR(p); @@ -1881,16 +1945,17 @@ static int context_load_partition_table( assert(context->end == UINT64_MAX); assert(context->total == UINT64_MAX); - c = fdisk_new_context(); - if (!c) - return log_oom(); - /* libfdisk doesn't have an API to operate on arbitrary fds, hence reopen the fd going via the * /proc/self/fd/ magic path if we have an existing fd. Open the original file otherwise. 
*/ - if (*backing_fd < 0) + if (*backing_fd < 0) { + c = fdisk_new_context(); + if (!c) + return log_oom(); + r = fdisk_assign_device(c, node, arg_dry_run); - else - r = fdisk_assign_device(c, FORMAT_PROC_FD_PATH(*backing_fd), arg_dry_run); + } else + r = fdisk_new_context_fd(*backing_fd, arg_dry_run, &c); + if (r == -EINVAL && arg_size_auto) { struct stat st; @@ -2091,7 +2156,7 @@ static int context_load_partition_table( LIST_FOREACH(partitions, pp, context->partitions) { last = pp; - if (!sd_id128_equal(pp->type_uuid, ptid)) + if (!sd_id128_equal(pp->type.uuid, ptid)) continue; if (!pp->current_partition) { @@ -2128,7 +2193,7 @@ static int context_load_partition_table( return log_oom(); np->current_uuid = id; - np->type_uuid = ptid; + np->type = gpt_partition_type_from_uuid(ptid); np->current_size = sz; np->offset = start; np->partno = partno; @@ -2287,7 +2352,7 @@ static const char *partition_label(const Partition *p) { if (p->current_label) return p->current_label; - return gpt_partition_type_uuid_to_string(p->type_uuid); + return gpt_partition_type_uuid_to_string(p->type.uuid); } static int context_dump_partitions(Context *context, const char *node) { @@ -2361,7 +2426,7 @@ static int context_dump_partitions(Context *context, const char *node) { r = table_add_many( t, - TABLE_STRING, gpt_partition_type_uuid_to_string_harder(p->type_uuid, uuid_buffer), + TABLE_STRING, gpt_partition_type_uuid_to_string_harder(p->type.uuid, uuid_buffer), TABLE_STRING, empty_to_null(label) ?: "-", TABLE_SET_COLOR, empty_to_null(label) ? NULL : ansi_grey(), TABLE_UUID, p->new_uuid_is_set ? p->new_uuid : p->current_uuid, TABLE_STRING, p->definition_path ? basename(p->definition_path) : "-", TABLE_SET_COLOR, p->definition_path ? 
NULL : ansi_grey(), @@ -2495,7 +2560,7 @@ static int partition_hint(const Partition *p, const char *node, char **ret) { else if (!sd_id128_is_null(p->current_uuid)) id = p->current_uuid; else - id = p->type_uuid; + id = p->type.uuid; buf = strdup(SD_ID128_TO_UUID_STRING(id)); @@ -2913,6 +2978,9 @@ static int context_wipe_and_discard(Context *context, bool from_scratch) { if (!p->allocated_to_area) continue; + if (partition_skip(p)) + continue; + r = context_wipe_partition(context, p); if (r < 0) return r; @@ -2937,71 +3005,257 @@ static int context_wipe_and_discard(Context *context, bool from_scratch) { return 0; } -static int partition_encrypt( +typedef struct { + LoopDevice *loop; + int fd; + char *path; + int whole_fd; +} PartitionTarget; + +static int partition_target_fd(PartitionTarget *t) { + assert(t); + assert(t->loop || t->fd >= 0 || t->whole_fd >= 0); + return t->loop ? t->loop->fd : t->fd >= 0 ? t->fd : t->whole_fd; +} + +static const char* partition_target_path(PartitionTarget *t) { + assert(t); + assert(t->loop || t->path); + return t->loop ? 
t->loop->node : t->path; +} + +static PartitionTarget *partition_target_free(PartitionTarget *t) { + if (!t) + return NULL; + + loop_device_unref(t->loop); + safe_close(t->fd); + unlink_and_free(t->path); + + return mfree(t); +} + +DEFINE_TRIVIAL_CLEANUP_FUNC(PartitionTarget*, partition_target_free); + +static int prepare_temporary_file(PartitionTarget *t, uint64_t size) { + _cleanup_(unlink_and_freep) char *temp = NULL; + _cleanup_close_ int fd = -1; + const char *vt; + int r; + + assert(t); + + r = var_tmp_dir(&vt); + if (r < 0) + return log_error_errno(r, "Could not determine temporary directory: %m"); + + temp = path_join(vt, "repart-XXXXXX"); + if (!temp) + return log_oom(); + + fd = mkostemp_safe(temp); + if (fd < 0) + return log_error_errno(fd, "Failed to create temporary file: %m"); + + if (ftruncate(fd, size) < 0) + return log_error_errno(errno, "Failed to truncate temporary file to %s: %m", + FORMAT_BYTES(size)); + + t->fd = TAKE_FD(fd); + t->path = TAKE_PTR(temp); + + return 0; +} + +static int partition_target_prepare( Context *context, Partition *p, - const char *node, - struct crypt_device **ret_cd, - char **ret_volume, - int *ret_fd) { -#if HAVE_LIBCRYPTSETUP + uint64_t size, + bool need_path, + PartitionTarget **ret) { + + _cleanup_(partition_target_freep) PartitionTarget *t = NULL; + _cleanup_(loop_device_unrefp) LoopDevice *d = NULL; + int whole_fd, r; + + assert(context); + assert(p); + assert(ret); + + assert_se((whole_fd = fdisk_get_devfd(context->fdisk_context)) >= 0); + + t = new(PartitionTarget, 1); + if (!t) + return log_oom(); + *t = (PartitionTarget) { + .fd = -1, + .whole_fd = -1, + }; + + if (!need_path) { + if (lseek(whole_fd, p->offset, SEEK_SET) == (off_t) -1) + return log_error_errno(errno, "Failed to seek to partition offset: %m"); + + t->whole_fd = whole_fd; + *ret = TAKE_PTR(t); + return 0; + } + + /* Loopback block devices are not only useful to turn regular files into block devices, but + * also to cut out sections of block 
devices into new block devices. */ + + r = loop_device_make(whole_fd, O_RDWR, p->offset, size, 0, 0, LOCK_EX, &d); + if (r < 0 && r != -ENOENT && !ERRNO_IS_PRIVILEGE(r)) + return log_error_errno(r, "Failed to make loopback device of future partition %" PRIu64 ": %m", p->partno); + if (r >= 0) { + t->loop = TAKE_PTR(d); + *ret = TAKE_PTR(t); + return 0; + } + + /* If we can't allocate a loop device, let's write to a regular file that we copy into the final + * image so we can run in containers and without needing root privileges. On filesystems with + * reflinking support, we can take advantage of this and just reflink the result into the image. + */ + + log_debug_errno(r, "No access to loop devices, falling back to a regular file"); + + r = prepare_temporary_file(t, size); + if (r < 0) + return r; + + *ret = TAKE_PTR(t); + + return 0; +} + +static int partition_target_grow(PartitionTarget *t, uint64_t size) { + int r; + + assert(t); + + if (t->loop) { + r = loop_device_refresh_size(t->loop, UINT64_MAX, size); + if (r < 0) + return log_error_errno(r, "Failed to refresh loopback device size: %m"); + } else if (t->fd >= 0) { + if (ftruncate(t->fd, size) < 0) + return log_error_errno(errno, "Failed to grow '%s' to %s by truncation: %m", + t->path, FORMAT_BYTES(size)); + } + + return 0; +} + +static int partition_target_sync(Context *context, Partition *p, PartitionTarget *t) { + int whole_fd, r; + + assert(context); + assert(p); + assert(t); + + assert_se((whole_fd = fdisk_get_devfd(context->fdisk_context)) >= 0); + + if (t->loop) { + r = loop_device_sync(t->loop); + if (r < 0) + return log_error_errno(r, "Failed to sync loopback device: %m"); + } else if (t->fd >= 0) { + if (lseek(whole_fd, p->offset, SEEK_SET) == (off_t) -1) + return log_error_errno(errno, "Failed to seek to partition offset: %m"); + + r = copy_bytes(t->fd, whole_fd, UINT64_MAX, COPY_REFLINK|COPY_HOLES|COPY_FSYNC); + if (r < 0) + return log_error_errno(r, "Failed to copy bytes to partition: %m"); + } 
else { + if (fsync(t->whole_fd) < 0) + return log_error_errno(errno, "Failed to sync changes: %m"); + } + + return 0; +} + +static int partition_encrypt(Context *context, Partition *p, const char *node) { +#if HAVE_LIBCRYPTSETUP && HAVE_CRYPT_SET_DATA_OFFSET && HAVE_CRYPT_REENCRYPT_INIT_BY_PASSPHRASE && HAVE_CRYPT_REENCRYPT + struct crypt_params_luks2 luks_params = { + .label = strempty(ASSERT_PTR(p)->new_label), + .sector_size = ASSERT_PTR(context)->sector_size, + .data_device = node, + }; + struct crypt_params_reencrypt reencrypt_params = { + .mode = CRYPT_REENCRYPT_ENCRYPT, + .direction = CRYPT_REENCRYPT_BACKWARD, + .resilience = "datashift", + .data_shift = LUKS2_METADATA_SIZE / 512, + .luks2 = &luks_params, + .flags = CRYPT_REENCRYPT_INITIALIZE_ONLY|CRYPT_REENCRYPT_MOVE_FIRST_SEGMENT, + }; _cleanup_(sym_crypt_freep) struct crypt_device *cd = NULL; - _cleanup_(erase_and_freep) void *volume_key = NULL; - _cleanup_free_ char *dm_name = NULL, *vol = NULL; - size_t volume_key_size = 256 / 8; + _cleanup_(erase_and_freep) char *base64_encoded = NULL; + _cleanup_fclose_ FILE *h = NULL; + _cleanup_free_ char *hp = NULL; + const char *passphrase = NULL; + size_t passphrase_size = 0; sd_id128_t uuid; + const char *vt; int r; assert(context); assert(p); assert(p->encrypt != ENCRYPT_OFF); - log_debug("Encryption mode for partition %" PRIu64 ": %s", p->partno, encrypt_mode_to_string(p->encrypt)); - r = dlopen_cryptsetup(); if (r < 0) return log_error_errno(r, "libcryptsetup not found, cannot encrypt: %m"); - if (asprintf(&dm_name, "luks-repart-%08" PRIx64, random_u64()) < 0) - return log_oom(); - - if (ret_volume) { - vol = path_join("/dev/mapper/", dm_name); - if (!vol) - return log_oom(); - } - r = derive_uuid(p->new_uuid, "luks-uuid", &uuid); if (r < 0) return r; log_info("Encrypting future partition %" PRIu64 "...", p->partno); - volume_key = malloc(volume_key_size); - if (!volume_key) - return log_oom(); + r = var_tmp_dir(&vt); + if (r < 0) + return log_error_errno(r, 
"Failed to determine temporary files directory: %m"); - r = crypto_random_bytes(volume_key, volume_key_size); + r = fopen_temporary_child(vt, &h, &hp); if (r < 0) - return log_error_errno(r, "Failed to generate volume key: %m"); + return log_error_errno(r, "Failed to create temporary LUKS header file: %m"); - r = sym_crypt_init(&cd, node); + /* Weird cryptsetup requirement which requires the header file to be the size of at least one sector. */ + r = ftruncate(fileno(h), context->sector_size); if (r < 0) - return log_error_errno(r, "Failed to allocate libcryptsetup context: %m"); + return log_error_errno(r, "Failed to grow temporary LUKS header file: %m"); + + r = sym_crypt_init(&cd, hp); + if (r < 0) + return log_error_errno(r, "Failed to allocate libcryptsetup context for %s: %m", hp); cryptsetup_enable_logging(cd); + /* Disable kernel keyring usage by libcryptsetup as a workaround for + * https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/273. This makes sure that we can do + * offline encryption even when repart is running in a container. 
*/ + r = sym_crypt_volume_key_keyring(cd, false); + if (r < 0) + return log_error_errno(r, "Failed to disable kernel keyring: %m"); + + r = sym_crypt_metadata_locking(cd, false); + if (r < 0) + return log_error_errno(r, "Failed to disable metadata locking: %m"); + + r = sym_crypt_set_data_offset(cd, LUKS2_METADATA_SIZE / 512); + if (r < 0) + return log_error_errno(r, "Failed to set data offset: %m"); + r = sym_crypt_format(cd, CRYPT_LUKS2, "aes", "xts-plain64", SD_ID128_TO_UUID_STRING(uuid), - volume_key, - volume_key_size, - &(struct crypt_params_luks2) { - .label = strempty(p->new_label), - .sector_size = context->sector_size, - }); + NULL, + VOLUME_KEY_SIZE, + &luks_params); if (r < 0) return log_error_errno(r, "Failed to LUKS2 format future partition: %m"); @@ -3009,17 +3263,19 @@ static int partition_encrypt( r = sym_crypt_keyslot_add_by_volume_key( cd, CRYPT_ANY_SLOT, - volume_key, - volume_key_size, + NULL, + VOLUME_KEY_SIZE, strempty(arg_key), arg_key_size); if (r < 0) return log_error_errno(r, "Failed to add LUKS2 key: %m"); + + passphrase = strempty(arg_key); + passphrase_size = arg_key_size; } if (IN_SET(p->encrypt, ENCRYPT_TPM2, ENCRYPT_KEY_FILE_TPM2)) { #if HAVE_TPM2 - _cleanup_(erase_and_freep) char *base64_encoded = NULL; _cleanup_(json_variant_unrefp) JsonVariant *v = NULL; _cleanup_(erase_and_freep) void *secret = NULL; _cleanup_free_ void *pubkey = NULL; @@ -3063,12 +3319,12 @@ static int partition_encrypt( keyslot = sym_crypt_keyslot_add_by_volume_key( cd, CRYPT_ANY_SLOT, - volume_key, - volume_key_size, + NULL, + VOLUME_KEY_SIZE, base64_encoded, strlen(base64_encoded)); if (keyslot < 0) - return log_error_errno(keyslot, "Failed to add new TPM2 key to %s: %m", node); + return log_error_errno(keyslot, "Failed to add new TPM2 key: %m"); r = tpm2_make_luks2_json( keyslot, @@ -3087,79 +3343,292 @@ static int partition_encrypt( r = cryptsetup_add_token_json(cd, v); if (r < 0) return log_error_errno(r, "Failed to add TPM2 JSON token to LUKS2 header: 
%m"); + + passphrase = base64_encoded; + passphrase_size = strlen(base64_encoded); #else return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "Support for TPM2 enrollment not enabled."); #endif } - r = sym_crypt_activate_by_volume_key( + r = sym_crypt_reencrypt_init_by_passphrase( cd, - dm_name, - volume_key, - volume_key_size, - arg_discard ? CRYPT_ACTIVATE_ALLOW_DISCARDS : 0); + NULL, + passphrase, + passphrase_size, + CRYPT_ANY_SLOT, + 0, + sym_crypt_get_cipher(cd), + sym_crypt_get_cipher_mode(cd), + &reencrypt_params); if (r < 0) - return log_error_errno(r, "Failed to activate LUKS superblock: %m"); + return log_error_errno(r, "Failed to prepare for reencryption: %m"); - log_info("Successfully encrypted future partition %" PRIu64 ".", p->partno); + /* crypt_reencrypt_init_by_passphrase() doesn't actually put the LUKS header at the front, we have + * to do that ourselves. */ - if (ret_fd) { - _cleanup_close_ int dev_fd = -1; + sym_crypt_free(cd); + cd = NULL; - dev_fd = open(vol, O_RDWR|O_CLOEXEC|O_NOCTTY); - if (dev_fd < 0) - return log_error_errno(errno, "Failed to open LUKS volume '%s': %m", vol); + r = sym_crypt_init(&cd, node); + if (r < 0) + return log_error_errno(r, "Failed to allocate libcryptsetup context for %s: %m", node); - *ret_fd = TAKE_FD(dev_fd); - } + r = sym_crypt_header_restore(cd, CRYPT_LUKS2, hp); + if (r < 0) + return log_error_errno(r, "Failed to place new LUKS header at head of %s: %m", node); - if (ret_cd) - *ret_cd = TAKE_PTR(cd); - if (ret_volume) - *ret_volume = TAKE_PTR(vol); + reencrypt_params.flags &= ~CRYPT_REENCRYPT_INITIALIZE_ONLY; + + r = sym_crypt_reencrypt_init_by_passphrase( + cd, + NULL, + passphrase, + passphrase_size, + CRYPT_ANY_SLOT, + 0, + NULL, + NULL, + &reencrypt_params); + if (r < 0) + return log_error_errno(r, "Failed to load reencryption context: %m"); + + r = sym_crypt_reencrypt(cd, NULL); + if (r < 0) + return log_error_errno(r, "Failed to encrypt %s: %m", node); + + log_info("Successfully encrypted future 
partition %" PRIu64 ".", p->partno); return 0; #else - return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "libcryptsetup is not supported, cannot encrypt: %m"); + return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), + "libcryptsetup is not supported or is missing required symbols, cannot encrypt: %m"); #endif } -static int deactivate_luks(struct crypt_device *cd, const char *node) { +static int partition_format_verity_hash( + Context *context, + Partition *p, + const char *data_node) { + #if HAVE_LIBCRYPTSETUP + Partition *dp; + _cleanup_(partition_target_freep) PartitionTarget *t = NULL; + _cleanup_(sym_crypt_freep) struct crypt_device *cd = NULL; + _cleanup_free_ uint8_t *rh = NULL; + size_t rhs; int r; - if (!cd) + assert(context); + assert(p); + assert(data_node); + + if (p->dropped) + return 0; + + if (PARTITION_EXISTS(p)) /* Never format existing partitions */ return 0; - assert(node); + if (p->verity != VERITY_HASH) + return 0; + + if (partition_skip(p)) + return 0; - /* udev or so might access out block device in the background while we are done. Let's hence force - * detach the volume. We sync'ed before, hence this should be safe. 
*/ + assert_se(dp = p->siblings[VERITY_DATA]); + assert(!dp->dropped); - r = sym_crypt_deactivate_by_name(cd, basename(node), CRYPT_DEACTIVATE_FORCE); + r = dlopen_cryptsetup(); if (r < 0) - return log_error_errno(r, "Failed to deactivate LUKS device: %m"); + return log_error_errno(r, "libcryptsetup not found, cannot setup verity: %m"); - return 1; + r = partition_target_prepare(context, p, p->new_size, /*need_path=*/ true, &t); + if (r < 0) + return r; + + r = sym_crypt_init(&cd, partition_target_path(t)); + if (r < 0) + return log_error_errno(r, "Failed to allocate libcryptsetup context: %m"); + + r = sym_crypt_format( + cd, CRYPT_VERITY, NULL, NULL, NULL, NULL, 0, + &(struct crypt_params_verity){ + .data_device = data_node, + .flags = CRYPT_VERITY_CREATE_HASH, + .hash_name = "sha256", + .hash_type = 1, + .data_block_size = context->sector_size, + .hash_block_size = context->sector_size, + .salt_size = 32, + }); + if (r < 0) + return log_error_errno(r, "Failed to setup verity hash data: %m"); + + r = partition_target_sync(context, p, t); + if (r < 0) + return r; + + r = sym_crypt_get_volume_key_size(cd); + if (r < 0) + return log_error_errno(r, "Failed to determine verity root hash size: %m"); + rhs = (size_t) r; + + rh = malloc(rhs); + if (!rh) + return log_oom(); + + r = sym_crypt_volume_key_get(cd, CRYPT_ANY_SLOT, (char *) rh, &rhs, NULL, 0); + if (r < 0) + return log_error_errno(r, "Failed to get verity root hash: %m"); + + assert(rhs >= sizeof(sd_id128_t) * 2); + + if (!dp->new_uuid_is_set) { + memcpy_safe(dp->new_uuid.bytes, rh, sizeof(sd_id128_t)); + dp->new_uuid_is_set = true; + } + + if (!p->new_uuid_is_set) { + memcpy_safe(p->new_uuid.bytes, rh + rhs - sizeof(sd_id128_t), sizeof(sd_id128_t)); + p->new_uuid_is_set = true; + } + + p->roothash = TAKE_PTR(rh); + p->roothash_size = rhs; + + return 0; #else + return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "libcryptsetup is not supported, cannot setup verity hashes: %m"); +#endif +} + +static int 
sign_verity_roothash( + const uint8_t *roothash, + size_t roothash_size, + uint8_t **ret_signature, + size_t *ret_signature_size) { + +#if HAVE_OPENSSL + _cleanup_(BIO_freep) BIO *rb = NULL; + _cleanup_(PKCS7_freep) PKCS7 *p7 = NULL; + _cleanup_free_ char *hex = NULL; + _cleanup_free_ uint8_t *sig = NULL; + int sigsz; + + assert(roothash); + assert(roothash_size > 0); + assert(ret_signature); + assert(ret_signature_size); + + hex = hexmem(roothash, roothash_size); + if (!hex) + return log_oom(); + + rb = BIO_new_mem_buf(hex, -1); + if (!rb) + return log_oom(); + + p7 = PKCS7_sign(arg_certificate, arg_private_key, NULL, rb, PKCS7_DETACHED|PKCS7_NOATTR|PKCS7_BINARY); + if (!p7) + return log_error_errno(SYNTHETIC_ERRNO(EIO), "Failed to calculate PKCS7 signature: %s", + ERR_error_string(ERR_get_error(), NULL)); + + sigsz = i2d_PKCS7(p7, &sig); + if (sigsz < 0) + return log_error_errno(SYNTHETIC_ERRNO(EIO), "Failed to convert PKCS7 signature to DER: %s", + ERR_error_string(ERR_get_error(), NULL)); + + *ret_signature = TAKE_PTR(sig); + *ret_signature_size = sigsz; + return 0; +#else + return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "openssl is not supported, cannot setup verity signature: %m"); #endif } +static int partition_format_verity_sig(Context *context, Partition *p) { + _cleanup_(json_variant_unrefp) JsonVariant *v = NULL; + _cleanup_free_ uint8_t *sig = NULL; + _cleanup_free_ char *text = NULL; + Partition *hp; + uint8_t fp[X509_FINGERPRINT_SIZE]; + size_t sigsz = 0, padsz; /* avoid false maybe-uninitialized warning */ + int whole_fd, r; + + assert(p->verity == VERITY_SIG); + + if (p->dropped) + return 0; + + if (PARTITION_EXISTS(p)) + return 0; + + if (partition_skip(p)) + return 0; + + assert_se(hp = p->siblings[VERITY_HASH]); + assert(!hp->dropped); + + assert(arg_certificate); + + assert_se((whole_fd = fdisk_get_devfd(context->fdisk_context)) >= 0); + + r = sign_verity_roothash(hp->roothash, hp->roothash_size, &sig, &sigsz); + if (r < 0) + return r; + + 
r = x509_fingerprint(arg_certificate, fp); + if (r < 0) + return log_error_errno(r, "Unable to calculate X509 certificate fingerprint: %m"); + + r = json_build(&v, + JSON_BUILD_OBJECT( + JSON_BUILD_PAIR("rootHash", JSON_BUILD_HEX(hp->roothash, hp->roothash_size)), + JSON_BUILD_PAIR( + "certificateFingerprint", + JSON_BUILD_HEX(fp, sizeof(fp)) + ), + JSON_BUILD_PAIR("signature", JSON_BUILD_BASE64(sig, sigsz)) + ) + ); + if (r < 0) + return log_error_errno(r, "Failed to build JSON object: %m"); + + r = json_variant_format(v, 0, &text); + if (r < 0) + return log_error_errno(r, "Failed to format JSON object: %m"); + + padsz = round_up_size(strlen(text), 4096); + assert_se(padsz <= p->new_size); + + r = strgrowpad0(&text, padsz); + if (r < 0) + return log_error_errno(r, "Failed to pad string to %s", FORMAT_BYTES(padsz)); + + if (lseek(whole_fd, p->offset, SEEK_SET) == (off_t) -1) + return log_error_errno(errno, "Failed to seek to partition offset: %m"); + + r = loop_write(whole_fd, text, padsz, /*do_poll=*/ false); + if (r < 0) + return log_error_errno(r, "Failed to write verity signature to partition: %m"); + + if (fsync(whole_fd) < 0) + return log_error_errno(errno, "Failed to synchronize verity signature JSON: %m"); + + return 0; +} + static int context_copy_blocks(Context *context) { - int whole_fd = -1, r; + int r; assert(context); /* Copy in file systems on the block level */ LIST_FOREACH(partitions, p, context->partitions) { - _cleanup_(sym_crypt_freep) struct crypt_device *cd = NULL; - _cleanup_(loop_device_unrefp) LoopDevice *d = NULL; - _cleanup_free_ char *encrypted = NULL; - _cleanup_close_ int encrypted_dev_fd = -1; - int target_fd; + _cleanup_(partition_target_freep) PartitionTarget *t = NULL; if (p->copy_blocks_fd < 0) continue; @@ -3170,65 +3639,62 @@ static int context_copy_blocks(Context *context) { if (PARTITION_EXISTS(p)) /* Never copy over existing partitions */ continue; + if (partition_skip(p)) + continue; + assert(p->new_size != UINT64_MAX); 
assert(p->copy_blocks_size != UINT64_MAX); - assert(p->new_size >= p->copy_blocks_size); - - if (whole_fd < 0) - assert_se((whole_fd = fdisk_get_devfd(context->fdisk_context)) >= 0); - - if (p->encrypt != ENCRYPT_OFF) { - r = loop_device_make(whole_fd, O_RDWR, p->offset, p->new_size, 0, 0, LOCK_EX, &d); - if (r < 0) - return log_error_errno(r, "Failed to make loopback device of future partition %" PRIu64 ": %m", p->partno); - - r = partition_encrypt(context, p, d->node, &cd, &encrypted, &encrypted_dev_fd); - if (r < 0) - return log_error_errno(r, "Failed to encrypt device: %m"); - - if (flock(encrypted_dev_fd, LOCK_EX) < 0) - return log_error_errno(errno, "Failed to lock LUKS device: %m"); + assert(p->new_size >= p->copy_blocks_size + (p->encrypt != ENCRYPT_OFF ? LUKS2_METADATA_KEEP_FREE : 0)); - target_fd = encrypted_dev_fd; - } else { - if (lseek(whole_fd, p->offset, SEEK_SET) == (off_t) -1) - return log_error_errno(errno, "Failed to seek to partition offset: %m"); - - target_fd = whole_fd; - } + r = partition_target_prepare(context, p, p->new_size, + /*need_path=*/ p->encrypt != ENCRYPT_OFF || p->siblings[VERITY_HASH], + &t); + if (r < 0) + return r; log_info("Copying in '%s' (%s) on block level into future partition %" PRIu64 ".", p->copy_blocks_path, FORMAT_BYTES(p->copy_blocks_size), p->partno); - r = copy_bytes_full(p->copy_blocks_fd, target_fd, p->copy_blocks_size, 0, NULL, NULL, NULL, NULL); + r = copy_bytes(p->copy_blocks_fd, partition_target_fd(t), p->copy_blocks_size, COPY_REFLINK); if (r < 0) return log_error_errno(r, "Failed to copy in data from '%s': %m", p->copy_blocks_path); - if (fsync(target_fd) < 0) - return log_error_errno(errno, "Failed to synchronize copied data blocks: %m"); - if (p->encrypt != ENCRYPT_OFF) { - encrypted_dev_fd = safe_close(encrypted_dev_fd); - - r = deactivate_luks(cd, encrypted); + r = partition_encrypt(context, p, partition_target_path(t)); if (r < 0) return r; + } + + r = partition_target_sync(context, p, t); + if (r < 
0) + return r; - sym_crypt_free(cd); - cd = NULL; + log_info("Copying in of '%s' on block level completed.", p->copy_blocks_path); - r = loop_device_sync(d); + if (p->siblings[VERITY_HASH]) { + r = partition_format_verity_hash(context, p->siblings[VERITY_HASH], + partition_target_path(t)); if (r < 0) - return log_error_errno(r, "Failed to sync loopback device: %m"); + return r; } - log_info("Copying in of '%s' on block level completed.", p->copy_blocks_path); + if (p->siblings[VERITY_SIG]) { + r = partition_format_verity_sig(context, p->siblings[VERITY_SIG]); + if (r < 0) + return r; + } } return 0; } -static int do_copy_files(Partition *p, const char *root, const Set *denylist) { +static int do_copy_files( + Partition *p, + const char *root, + uid_t override_uid, + gid_t override_gid, + const Set *denylist) { + int r; assert(p); @@ -3270,21 +3736,26 @@ static int do_copy_files(Partition *p, const char *root, const Set *denylist) { if (pfd < 0) return log_error_errno(pfd, "Failed to open parent directory of target: %m"); + /* Make sure everything is owned by the user running repart so that + * make_filesystem() can map the user running repart to "root" in a user + * namespace to have the files owned by root in the final image. 
*/ + r = copy_tree_at( sfd, ".", pfd, fn, - UID_INVALID, GID_INVALID, - COPY_REFLINK|COPY_MERGE|COPY_REPLACE|COPY_SIGINT|COPY_HARDLINKS|COPY_ALL_XATTRS, + override_uid, override_gid, + COPY_REFLINK|COPY_HOLES|COPY_MERGE|COPY_REPLACE|COPY_SIGINT|COPY_HARDLINKS|COPY_ALL_XATTRS, denylist); } else r = copy_tree_at( sfd, ".", tfd, ".", - UID_INVALID, GID_INVALID, - COPY_REFLINK|COPY_MERGE|COPY_REPLACE|COPY_SIGINT|COPY_HARDLINKS|COPY_ALL_XATTRS, + override_uid, override_gid, + COPY_REFLINK|COPY_HOLES|COPY_MERGE|COPY_REPLACE|COPY_SIGINT|COPY_HARDLINKS|COPY_ALL_XATTRS, denylist); if (r < 0) - return log_error_errno(r, "Failed to copy '%s' to '%s%s': %m", *source, strempty(arg_root), *target); + return log_error_errno(r, "Failed to copy '%s%s' to '%s%s': %m", + strempty(arg_root), *source, strempty(root), *target); } else { _cleanup_free_ char *dn = NULL, *fn = NULL; @@ -3313,10 +3784,13 @@ static int do_copy_files(Partition *p, const char *root, const Set *denylist) { if (tfd < 0) return log_error_errno(errno, "Failed to create target file '%s': %m", *target); - r = copy_bytes(sfd, tfd, UINT64_MAX, COPY_REFLINK|COPY_SIGINT); + r = copy_bytes(sfd, tfd, UINT64_MAX, COPY_REFLINK|COPY_HOLES|COPY_SIGINT); if (r < 0) return log_error_errno(r, "Failed to copy '%s' to '%s%s': %m", *source, strempty(arg_root), *target); + if (fchown(tfd, override_uid, override_gid) < 0) + return log_error_errno(errno, "Failed to change ownership of %s: %m", *target); + (void) copy_xattr(sfd, tfd, COPY_ALL_XATTRS); (void) copy_access(sfd, tfd); (void) copy_times(sfd, tfd, 0); @@ -3326,7 +3800,7 @@ static int do_copy_files(Partition *p, const char *root, const Set *denylist) { return 0; } -static int do_make_directories(Partition *p, const char *root) { +static int do_make_directories(Partition *p, uid_t override_uid, gid_t override_gid, const char *root) { int r; assert(p); @@ -3334,7 +3808,7 @@ static int do_make_directories(Partition *p, const char *root) { STRV_FOREACH(d, p->make_directories) { - r = 
mkdir_p_root(root, *d, UID_INVALID, GID_INVALID, 0755); + r = mkdir_p_root(root, *d, override_uid, override_gid, 0755); if (r < 0) return log_error_errno(r, "Failed to create directory '%s' in file system: %m", *d); } @@ -3342,56 +3816,34 @@ static int do_make_directories(Partition *p, const char *root) { return 0; } -static int partition_populate_directory(Partition *p, const Set *denylist, char **ret_root, char **ret_tmp_root) { +static bool partition_needs_populate(Partition *p) { + assert(p); + return !strv_isempty(p->copy_files) || !strv_isempty(p->make_directories); +} + +static int partition_populate_directory(Partition *p, const Set *denylist, char **ret) { _cleanup_(rm_rf_physical_and_freep) char *root = NULL; + _cleanup_close_ int rfd = -1; int r; - assert(ret_root); - assert(ret_tmp_root); - - /* When generating read-only filesystems, we need the source tree to be available when we generate - * the read-only filesystem. Because we might have multiple source trees, we build a temporary source - * tree beforehand where we merge all our inputs. We then use this merged source tree to create the - * read-only filesystem. */ - - if (!fstype_is_ro(p->format)) { - *ret_root = NULL; - *ret_tmp_root = NULL; - return 0; - } - - /* If we only have a single directory that's meant to become the root directory of the filesystem, - * we can shortcut this function and just use that directory as the root directory instead. If we - * allocate a temporary directory, it's stored in "ret_tmp_root" to indicate it should be removed. - * Otherwise, we return the directory to use in "root" to indicate it should not be removed. 
*/ - - if (strv_length(p->copy_files) == 2 && strv_length(p->make_directories) == 0 && - streq(p->copy_files[1], "/") && set_isempty(denylist)) { - _cleanup_free_ char *s = NULL; + assert(ret); - r = chase_symlinks(p->copy_files[0], arg_root, CHASE_PREFIX_ROOT, &s, NULL); - if (r < 0) - return log_error_errno(r, "Failed to resolve source '%s%s': %m", strempty(arg_root), p->copy_files[0]); + rfd = mkdtemp_open("/var/tmp/repart-XXXXXX", 0, &root); + if (rfd < 0) + return log_error_errno(rfd, "Failed to create temporary directory: %m"); - *ret_root = TAKE_PTR(s); - *ret_tmp_root = NULL; - return 0; - } + if (fchmod(rfd, 0755) < 0) + return log_error_errno(errno, "Failed to change mode of temporary directory: %m"); - r = mkdtemp_malloc("/var/tmp/repart-XXXXXX", &root); - if (r < 0) - return log_error_errno(r, "Failed to create temporary directory: %m"); - - r = do_copy_files(p, root, denylist); + r = do_copy_files(p, root, getuid(), getgid(), denylist); if (r < 0) return r; - r = do_make_directories(p, root); + r = do_make_directories(p, getuid(), getgid(), root); if (r < 0) return r; - *ret_root = NULL; - *ret_tmp_root = TAKE_PTR(root); + *ret = TAKE_PTR(root); return 0; } @@ -3401,13 +3853,7 @@ static int partition_populate_filesystem(Partition *p, const char *node, const S assert(p); assert(node); - if (fstype_is_ro(p->format)) - return 0; - - if (strv_isempty(p->copy_files) && strv_isempty(p->make_directories)) - return 0; - - log_info("Populating partition %" PRIu64 " with files.", p->partno); + log_info("Populating %s filesystem with files.", p->format); /* We copy in a child process, since we have to mount the fs for that, and we don't want that fs to * appear in the host namespace. 
Hence we fork a child that has its own file system namespace and @@ -3429,10 +3875,10 @@ static int partition_populate_filesystem(Partition *p, const char *node, const S if (mount_nofollow_verbose(LOG_ERR, node, fs, p->format, MS_NOATIME|MS_NODEV|MS_NOEXEC|MS_NOSUID, NULL) < 0) _exit(EXIT_FAILURE); - if (do_copy_files(p, fs, denylist) < 0) + if (do_copy_files(p, fs, 0, 0, denylist) < 0) _exit(EXIT_FAILURE); - if (do_make_directories(p, fs) < 0) + if (do_make_directories(p, 0, 0, fs) < 0) _exit(EXIT_FAILURE); r = syncfs_path(AT_FDCWD, fs); @@ -3444,7 +3890,7 @@ static int partition_populate_filesystem(Partition *p, const char *node, const S _exit(EXIT_SUCCESS); } - log_info("Successfully populated partition %" PRIu64 " with files.", p->partno); + log_info("Successfully populated %s filesystem with files.", p->format); return 0; } @@ -3456,7 +3902,7 @@ static int make_copy_files_denylist(Context *context, Set **ret) { assert(ret); LIST_FOREACH(partitions, p, context->partitions) { - const char *sources = gpt_partition_type_mountpoint_nulstr(p->type_uuid); + const char *sources = gpt_partition_type_mountpoint_nulstr(p->type); if (!sources) continue; @@ -3490,7 +3936,7 @@ static int make_copy_files_denylist(Context *context, Set **ret) { static int context_mkfs(Context *context) { _cleanup_set_free_ Set *denylist = NULL; - int fd = -1, r; + int r; assert(context); @@ -3501,13 +3947,8 @@ static int context_mkfs(Context *context) { return r; LIST_FOREACH(partitions, p, context->partitions) { - _cleanup_(sym_crypt_freep) struct crypt_device *cd = NULL; - _cleanup_(loop_device_unrefp) LoopDevice *d = NULL; - _cleanup_(rm_rf_physical_and_freep) char *tmp_root = NULL; - _cleanup_free_ char *encrypted = NULL, *root = NULL; - _cleanup_close_ int encrypted_dev_fd = -1; - const char *fsdev; - sd_id128_t fs_uuid; + _cleanup_(rm_rf_physical_and_freep) char *root = NULL; + _cleanup_(partition_target_freep) PartitionTarget *t = NULL; if (p->dropped) continue; @@ -3518,214 +3959,89 
@@ static int context_mkfs(Context *context) { if (!p->format) continue; - assert(p->offset != UINT64_MAX); - assert(p->new_size != UINT64_MAX); - - if (fd < 0) - assert_se((fd = fdisk_get_devfd(context->fdisk_context)) >= 0); + /* Minimized partitions will use the copy blocks logic so let's make sure to skip those here. */ + if (p->copy_blocks_fd >= 0) + continue; - /* Loopback block devices are not only useful to turn regular files into block devices, but - * also to cut out sections of block devices into new block devices. */ + if (partition_skip(p)) + continue; - r = loop_device_make(fd, O_RDWR, p->offset, p->new_size, 0, 0, LOCK_EX, &d); + assert(p->offset != UINT64_MAX); + assert(p->new_size != UINT64_MAX); + assert(p->new_size >= (p->encrypt != ENCRYPT_OFF ? LUKS2_METADATA_KEEP_FREE : 0)); + + /* If we're doing encryption, we make sure we keep free space at the end which is required + * for cryptsetup's offline encryption. */ + r = partition_target_prepare(context, p, + p->new_size - (p->encrypt != ENCRYPT_OFF ? LUKS2_METADATA_KEEP_FREE : 0), + /*need_path=*/ true, + &t); if (r < 0) - return log_error_errno(r, "Failed to make loopback device of future partition %" PRIu64 ": %m", p->partno); - - if (p->encrypt != ENCRYPT_OFF) { - r = partition_encrypt(context, p, d->node, &cd, &encrypted, &encrypted_dev_fd); - if (r < 0) - return log_error_errno(r, "Failed to encrypt device: %m"); + return r; - if (flock(encrypted_dev_fd, LOCK_EX) < 0) - return log_error_errno(errno, "Failed to lock LUKS device: %m"); + log_info("Formatting future partition %" PRIu64 ".", p->partno); - fsdev = encrypted; - } else - fsdev = d->node; + /* If we're not writing to a loop device or if we're populating a read-only filesystem, we + * have to populate using the filesystem's mkfs's --root (or equivalent) option. To do that, + * we need to set up the final directory tree beforehand. 
*/ - log_info("Formatting future partition %" PRIu64 ".", p->partno); + if (partition_needs_populate(p) && (!t->loop || fstype_is_ro(p->format))) { + if (!mkfs_supports_root_option(p->format)) + return log_error_errno(SYNTHETIC_ERRNO(ENODEV), + "Loop device access is required to populate %s filesystems.", + p->format); - /* Calculate the UUID for the file system as HMAC-SHA256 of the string "file-system-uuid", - * keyed off the partition UUID. */ - r = derive_uuid(p->new_uuid, "file-system-uuid", &fs_uuid); - if (r < 0) - return r; + r = partition_populate_directory(p, denylist, &root); + if (r < 0) + return r; + } - /* Ideally, we populate filesystems using our own code after creating the filesystem to - * ensure consistent handling of chattrs, xattrs and other similar things. However, when - * using read-only filesystems such as squashfs, we can't populate after creating the - * filesystem because it's read-only, so instead we create a temporary root to use as the - * source tree when generating the read-only filesystem. */ - r = partition_populate_directory(p, denylist, &root, &tmp_root); + r = make_filesystem(partition_target_path(t), p->format, strempty(p->new_label), root, + p->fs_uuid, arg_discard); if (r < 0) return r; - r = make_filesystem(fsdev, p->format, strempty(p->new_label), root ?: tmp_root, fs_uuid, arg_discard); - if (r < 0) { - encrypted_dev_fd = safe_close(encrypted_dev_fd); - (void) deactivate_luks(cd, encrypted); - return r; - } - log_info("Successfully formatted future partition %" PRIu64 ".", p->partno); - /* The file system is now created, no need to delay udev further */ - if (p->encrypt != ENCRYPT_OFF) - if (flock(encrypted_dev_fd, LOCK_UN) < 0) - return log_error_errno(errno, "Failed to unlock LUKS device: %m"); + /* If we're writing to a loop device, we can now mount the empty filesystem and populate it. 
*/ + if (partition_needs_populate(p) && !root) { + assert(t->loop); - /* Now, we can populate all the other filesystems that aren't read-only. */ - r = partition_populate_filesystem(p, fsdev, denylist); - if (r < 0) { - encrypted_dev_fd = safe_close(encrypted_dev_fd); - (void) deactivate_luks(cd, encrypted); - return r; + r = partition_populate_filesystem(p, t->loop->node, denylist); + if (r < 0) + return r; } - /* Note that we always sync explicitly here, since mkfs.fat doesn't do that on its own, and - * if we don't sync before detaching a block device the in-flight sectors possibly won't hit - * the disk. */ - if (p->encrypt != ENCRYPT_OFF) { - if (fsync(encrypted_dev_fd) < 0) - return log_error_errno(errno, "Failed to synchronize LUKS volume: %m"); - encrypted_dev_fd = safe_close(encrypted_dev_fd); - - r = deactivate_luks(cd, encrypted); + r = partition_target_grow(t, p->new_size); if (r < 0) return r; - sym_crypt_free(cd); - cd = NULL; + r = partition_encrypt(context, p, partition_target_path(t)); + if (r < 0) + return log_error_errno(r, "Failed to encrypt device: %m"); } - r = loop_device_sync(d); - if (r < 0) - return log_error_errno(r, "Failed to sync loopback device: %m"); - } - - return 0; -} - -static int do_verity_format( - LoopDevice *data_device, - LoopDevice *hash_device, - uint64_t sector_size, - uint8_t **ret_roothash, - size_t *ret_roothash_size) { - -#if HAVE_LIBCRYPTSETUP - _cleanup_(sym_crypt_freep) struct crypt_device *cd = NULL; - _cleanup_free_ uint8_t *rh = NULL; - size_t rhs; - int r; - - assert(data_device); - assert(hash_device); - assert(sector_size > 0); - assert(ret_roothash); - assert(ret_roothash_size); - - r = dlopen_cryptsetup(); - if (r < 0) - return log_error_errno(r, "libcryptsetup not found, cannot setup verity: %m"); - - r = sym_crypt_init(&cd, hash_device->node); - if (r < 0) - return log_error_errno(r, "Failed to allocate libcryptsetup context: %m"); - - r = sym_crypt_format( - cd, CRYPT_VERITY, NULL, NULL, NULL, NULL, 0, - 
&(struct crypt_params_verity){ - .data_device = data_device->node, - .flags = CRYPT_VERITY_CREATE_HASH, - .hash_name = "sha256", - .hash_type = 1, - .data_block_size = sector_size, - .hash_block_size = sector_size, - .salt_size = 32, - }); - if (r < 0) - return log_error_errno(r, "Failed to setup verity hash data: %m"); - - r = sym_crypt_get_volume_key_size(cd); - if (r < 0) - return log_error_errno(r, "Failed to determine verity root hash size: %m"); - rhs = (size_t) r; - - rh = malloc(rhs); - if (!rh) - return log_oom(); - - r = sym_crypt_volume_key_get(cd, CRYPT_ANY_SLOT, (char *) rh, &rhs, NULL, 0); - if (r < 0) - return log_error_errno(r, "Failed to get verity root hash: %m"); - - *ret_roothash = TAKE_PTR(rh); - *ret_roothash_size = rhs; - - return 0; -#else - return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "libcryptsetup is not supported, cannot setup verity hashes: %m"); -#endif -} - -static int context_verity_hash(Context *context) { - int fd = -1, r; - - assert(context); - - LIST_FOREACH(partitions, p, context->partitions) { - Partition *dp; - _cleanup_(loop_device_unrefp) LoopDevice *hash_device = NULL, *data_device = NULL; - _cleanup_free_ uint8_t *rh = NULL; - size_t rhs = 0; /* Initialize to work around for GCC false positive. 
*/ - - if (p->dropped) - continue; - - if (PARTITION_EXISTS(p)) /* Never format existing partitions */ - continue; - - if (p->verity != VERITY_HASH) - continue; - - assert_se(dp = p->siblings[VERITY_DATA]); - assert(!dp->dropped); - - if (fd < 0) - assert_se((fd = fdisk_get_devfd(context->fdisk_context)) >= 0); - - r = loop_device_make(fd, O_RDONLY, dp->offset, dp->new_size, 0, 0, LOCK_EX, &data_device); - if (r < 0) - return log_error_errno(r, - "Failed to make loopback device of verity data partition %" PRIu64 ": %m", - p->partno); - - r = loop_device_make(fd, O_RDWR, p->offset, p->new_size, 0, 0, LOCK_EX, &hash_device); - if (r < 0) - return log_error_errno(r, - "Failed to make loopback device of verity hash partition %" PRIu64 ": %m", - p->partno); + /* Note that we always sync explicitly here, since mkfs.fat doesn't do that on its own, and + * if we don't sync before detaching a block device the in-flight sectors possibly won't hit + * the disk. */ - r = do_verity_format(data_device, hash_device, context->sector_size, &rh, &rhs); + r = partition_target_sync(context, p, t); if (r < 0) return r; - assert(rhs >= sizeof(sd_id128_t) * 2); - - if (!dp->new_uuid_is_set) { - memcpy_safe(dp->new_uuid.bytes, rh, sizeof(sd_id128_t)); - dp->new_uuid_is_set = true; + if (p->siblings[VERITY_HASH]) { + r = partition_format_verity_hash(context, p->siblings[VERITY_HASH], + partition_target_path(t)); + if (r < 0) + return r; } - if (!p->new_uuid_is_set) { - memcpy_safe(p->new_uuid.bytes, rh + rhs - sizeof(sd_id128_t), sizeof(sd_id128_t)); - p->new_uuid_is_set = true; + if (p->siblings[VERITY_SIG]) { + r = partition_format_verity_sig(context, p->siblings[VERITY_SIG]); + if (r < 0) + return r; } - - p->roothash = TAKE_PTR(rh); - p->roothash_size = rhs; } return 0; @@ -3785,127 +4101,6 @@ static int parse_private_key(const char *key, size_t key_size, EVP_PKEY **ret) { #endif } -static int sign_verity_roothash( - const uint8_t *roothash, - size_t roothash_size, - uint8_t 
**ret_signature, - size_t *ret_signature_size) { - -#if HAVE_OPENSSL - _cleanup_(BIO_freep) BIO *rb = NULL; - _cleanup_(PKCS7_freep) PKCS7 *p7 = NULL; - _cleanup_free_ char *hex = NULL; - _cleanup_free_ uint8_t *sig = NULL; - int sigsz; - - assert(roothash); - assert(roothash_size > 0); - assert(ret_signature); - assert(ret_signature_size); - - hex = hexmem(roothash, roothash_size); - if (!hex) - return log_oom(); - - rb = BIO_new_mem_buf(hex, -1); - if (!rb) - return log_oom(); - - p7 = PKCS7_sign(arg_certificate, arg_private_key, NULL, rb, PKCS7_DETACHED|PKCS7_NOATTR|PKCS7_BINARY); - if (!p7) - return log_error_errno(SYNTHETIC_ERRNO(EIO), "Failed to calculate PKCS7 signature: %s", - ERR_error_string(ERR_get_error(), NULL)); - - sigsz = i2d_PKCS7(p7, &sig); - if (sigsz < 0) - return log_error_errno(SYNTHETIC_ERRNO(EIO), "Failed to convert PKCS7 signature to DER: %s", - ERR_error_string(ERR_get_error(), NULL)); - - *ret_signature = TAKE_PTR(sig); - *ret_signature_size = sigsz; - - return 0; -#else - return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "openssl is not supported, cannot setup verity signature: %m"); -#endif -} - -static int context_verity_sig(Context *context) { - int fd = -1, r; - - assert(context); - - LIST_FOREACH(partitions, p, context->partitions) { - _cleanup_(json_variant_unrefp) JsonVariant *v = NULL; - _cleanup_free_ uint8_t *sig = NULL; - _cleanup_free_ char *text = NULL; - Partition *hp; - uint8_t fp[X509_FINGERPRINT_SIZE]; - size_t sigsz = 0, padsz; /* avoid false maybe-uninitialized warning */ - - if (p->dropped) - continue; - - if (PARTITION_EXISTS(p)) - continue; - - if (p->verity != VERITY_SIG) - continue; - - assert_se(hp = p->siblings[VERITY_HASH]); - assert(!hp->dropped); - - assert(arg_certificate); - - if (fd < 0) - assert_se((fd = fdisk_get_devfd(context->fdisk_context)) >= 0); - - r = sign_verity_roothash(hp->roothash, hp->roothash_size, &sig, &sigsz); - if (r < 0) - return r; - - r = x509_fingerprint(arg_certificate, fp); - if 
(r < 0) - return log_error_errno(r, "Unable to calculate X509 certificate fingerprint: %m"); - - r = json_build(&v, - JSON_BUILD_OBJECT( - JSON_BUILD_PAIR("rootHash", JSON_BUILD_HEX(hp->roothash, hp->roothash_size)), - JSON_BUILD_PAIR( - "certificateFingerprint", - JSON_BUILD_HEX(fp, sizeof(fp)) - ), - JSON_BUILD_PAIR("signature", JSON_BUILD_BASE64(sig, sigsz)) - ) - ); - if (r < 0) - return log_error_errno(r, "Failed to build JSON object: %m"); - - r = json_variant_format(v, 0, &text); - if (r < 0) - return log_error_errno(r, "Failed to format JSON object: %m"); - - padsz = round_up_size(strlen(text), 4096); - assert_se(padsz <= p->new_size); - - r = strgrowpad0(&text, padsz); - if (r < 0) - return log_error_errno(r, "Failed to pad string to %s", FORMAT_BYTES(padsz)); - - if (lseek(fd, p->offset, SEEK_SET) == (off_t) -1) - return log_error_errno(errno, "Failed to seek to partition offset: %m"); - - r = loop_write(fd, text, padsz, /*do_poll=*/ false); - if (r < 0) - return log_error_errno(r, "Failed to write verity signature to partition: %m"); - - if (fsync(fd) < 0) - return log_error_errno(errno, "Failed to synchronize verity signature JSON: %m"); - } - - return 0; -} - static int partition_acquire_uuid(Context *context, Partition *p, sd_id128_t *ret) { struct { sd_id128_t type_uuid; @@ -3946,13 +4141,13 @@ static int partition_acquire_uuid(Context *context, Partition *p, sd_id128_t *re if (p == q) break; - if (!sd_id128_equal(p->type_uuid, q->type_uuid)) + if (!sd_id128_equal(p->type.uuid, q->type.uuid)) continue; k++; } - plaintext.type_uuid = p->type_uuid; + plaintext.type_uuid = p->type.uuid; plaintext.counter = htole64(k); hmac_sha256(context->seed.bytes, sizeof(context->seed.bytes), @@ -3993,7 +4188,7 @@ static int partition_acquire_label(Context *context, Partition *p, char **ret) { assert(p); assert(ret); - prefix = gpt_partition_type_uuid_to_string(p->type_uuid); + prefix = gpt_partition_type_uuid_to_string(p->type.uuid); if (!prefix) prefix = "linux"; 
@@ -4060,6 +4255,12 @@ static int context_acquire_partition_uuids_and_labels(Context *context) { p->new_uuid_is_set = true; } + /* Calculate the UUID for the file system as HMAC-SHA256 of the string "file-system-uuid", + * keyed off the partition UUID. */ + r = derive_uuid(p->new_uuid, "file-system-uuid", &p->fs_uuid); + if (r < 0) + return r; + if (!isempty(p->current_label)) { /* never change initialized labels */ r = free_and_strdup_warn(&p->new_label, p->current_label); @@ -4103,35 +4304,35 @@ static uint64_t partition_merge_flags(Partition *p) { f = p->gpt_flags; if (p->no_auto >= 0) { - if (gpt_partition_type_knows_no_auto(p->type_uuid)) + if (gpt_partition_type_knows_no_auto(p->type)) SET_FLAG(f, SD_GPT_FLAG_NO_AUTO, p->no_auto); else { char buffer[SD_ID128_UUID_STRING_MAX]; log_warning("Configured NoAuto=%s for partition type '%s' that doesn't support it, ignoring.", yes_no(p->no_auto), - gpt_partition_type_uuid_to_string_harder(p->type_uuid, buffer)); + gpt_partition_type_uuid_to_string_harder(p->type.uuid, buffer)); } } if (p->read_only >= 0) { - if (gpt_partition_type_knows_read_only(p->type_uuid)) + if (gpt_partition_type_knows_read_only(p->type)) SET_FLAG(f, SD_GPT_FLAG_READ_ONLY, p->read_only); else { char buffer[SD_ID128_UUID_STRING_MAX]; log_warning("Configured ReadOnly=%s for partition type '%s' that doesn't support it, ignoring.", yes_no(p->read_only), - gpt_partition_type_uuid_to_string_harder(p->type_uuid, buffer)); + gpt_partition_type_uuid_to_string_harder(p->type.uuid, buffer)); } } if (p->growfs >= 0) { - if (gpt_partition_type_knows_growfs(p->type_uuid)) + if (gpt_partition_type_knows_growfs(p->type)) SET_FLAG(f, SD_GPT_FLAG_GROWFS, p->growfs); else { char buffer[SD_ID128_UUID_STRING_MAX]; log_warning("Configured GrowFileSystem=%s for partition type '%s' that doesn't support it, ignoring.", yes_no(p->growfs), - gpt_partition_type_uuid_to_string_harder(p->type_uuid, buffer)); + gpt_partition_type_uuid_to_string_harder(p->type.uuid, buffer)); 
} } @@ -4147,6 +4348,9 @@ static int context_mangle_partitions(Context *context) { if (p->dropped) continue; + if (partition_skip(p)) + continue; + assert(p->new_size != UINT64_MAX); assert(p->offset != UINT64_MAX); assert(p->partno != UINT64_MAX); @@ -4210,7 +4414,7 @@ static int context_mangle_partitions(Context *context) { if (!t) return log_oom(); - r = fdisk_parttype_set_typestr(t, SD_ID128_TO_UUID_STRING(p->type_uuid)); + r = fdisk_parttype_set_typestr(t, SD_ID128_TO_UUID_STRING(p->type.uuid)); if (r < 0) return log_error_errno(r, "Failed to initialize partition type: %m"); @@ -4269,8 +4473,8 @@ static int split_name_printf(Partition *p) { assert(p); const Specifier table[] = { - { 't', specifier_string, GPT_PARTITION_TYPE_UUID_TO_STRING_HARDER(p->type_uuid) }, - { 'T', specifier_id128, &p->type_uuid }, + { 't', specifier_string, GPT_PARTITION_TYPE_UUID_TO_STRING_HARDER(p->type.uuid) }, + { 'T', specifier_id128, &p->type.uuid }, { 'U', specifier_id128, &p->new_uuid }, { 'n', specifier_uint64, &p->partno }, @@ -4387,6 +4591,9 @@ static int context_split(Context *context) { if (!p->split_name_resolved) continue; + if (partition_skip(p)) + continue; + fname = strjoin(base, ".", p->split_name_resolved, ext); if (!fname) return log_oom(); @@ -4401,7 +4608,7 @@ static int context_split(Context *context) { if (lseek(fd, p->offset, SEEK_SET) < 0) return log_error_errno(errno, "Failed to seek to partition offset: %m"); - r = copy_bytes_full(fd, fdt, p->new_size, COPY_REFLINK|COPY_HOLES, NULL, NULL, NULL, NULL); + r = copy_bytes(fd, fdt, p->new_size, COPY_REFLINK|COPY_HOLES); if (r < 0) return log_error_errno(r, "Failed to copy to split partition %s: %m", fname); } @@ -4438,13 +4645,15 @@ static int context_write_partition_table( log_info("Wiped block device."); - r = context_discard_range(context, 0, context->total); - if (r == -EOPNOTSUPP) - log_info("Storage does not support discard, not discarding entire block device data."); - else if (r < 0) - return 
log_error_errno(r, "Failed to discard entire block device: %m"); - else if (r > 0) - log_info("Discarded entire block device."); + if (arg_discard) { + r = context_discard_range(context, 0, context->total); + if (r == -EOPNOTSUPP) + log_info("Storage does not support discard, not discarding entire block device data."); + else if (r < 0) + return log_error_errno(r, "Failed to discard entire block device: %m"); + else if (r > 0) + log_info("Discarded entire block device."); + } } r = fdisk_get_partitions(context->fdisk_context, &original_table); @@ -4465,14 +4674,6 @@ static int context_write_partition_table( if (r < 0) return r; - r = context_verity_hash(context); - if (r < 0) - return r; - - r = context_verity_sig(context); - if (r < 0) - return r; - r = context_mangle_partitions(context); if (r < 0) return r; @@ -4599,7 +4800,7 @@ static int context_can_factory_reset(Context *context) { static int resolve_copy_blocks_auto_candidate( dev_t partition_devno, - sd_id128_t partition_type_uuid, + GptPartitionType partition_type, dev_t restrict_devno, sd_id128_t *ret_uuid) { @@ -4691,10 +4892,10 @@ static int resolve_copy_blocks_auto_candidate( return false; } - if (!sd_id128_equal(pt_parsed, partition_type_uuid)) { + if (!sd_id128_equal(pt_parsed, partition_type.uuid)) { log_debug("Partition %u:%u has non-matching partition type " SD_ID128_FORMAT_STR " (needed: " SD_ID128_FORMAT_STR "), ignoring.", major(partition_devno), minor(partition_devno), - SD_ID128_FORMAT_VAL(pt_parsed), SD_ID128_FORMAT_VAL(partition_type_uuid)); + SD_ID128_FORMAT_VAL(pt_parsed), SD_ID128_FORMAT_VAL(partition_type.uuid)); return false; } @@ -4751,7 +4952,7 @@ static int find_backing_devno( } static int resolve_copy_blocks_auto( - sd_id128_t type_uuid, + GptPartitionType type, const char *root, dev_t restrict_devno, dev_t *ret_devno, @@ -4781,30 +4982,30 @@ static int resolve_copy_blocks_auto( * partitions in the host, using the appropriate directory as key and ensuring that the partition * type 
matches. */ - if (gpt_partition_type_is_root(type_uuid)) + if (type.designator == PARTITION_ROOT) try1 = "/"; - else if (gpt_partition_type_is_usr(type_uuid)) + else if (type.designator == PARTITION_USR) try1 = "/usr/"; - else if (gpt_partition_type_is_root_verity(type_uuid)) + else if (type.designator == PARTITION_ROOT_VERITY) try1 = "/"; - else if (gpt_partition_type_is_usr_verity(type_uuid)) + else if (type.designator == PARTITION_USR_VERITY) try1 = "/usr/"; - else if (sd_id128_equal(type_uuid, SD_GPT_ESP)) { + else if (type.designator == PARTITION_ESP) { try1 = "/efi/"; try2 = "/boot/"; - } else if (sd_id128_equal(type_uuid, SD_GPT_XBOOTLDR)) + } else if (type.designator == PARTITION_XBOOTLDR) try1 = "/boot/"; else return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), "Partition type " SD_ID128_FORMAT_STR " not supported from automatic source block device discovery.", - SD_ID128_FORMAT_VAL(type_uuid)); + SD_ID128_FORMAT_VAL(type.uuid)); r = find_backing_devno(try1, root, &devno); if (r == -ENOENT && try2) r = find_backing_devno(try2, root, &devno); if (r < 0) return log_error_errno(r, "Failed to resolve automatic CopyBlocks= path for partition type " SD_ID128_FORMAT_STR ", sorry: %m", - SD_ID128_FORMAT_VAL(type_uuid)); + SD_ID128_FORMAT_VAL(type.uuid)); xsprintf_sys_block_path(p, "/slaves", devno); d = opendir(p); @@ -4846,7 +5047,7 @@ static int resolve_copy_blocks_auto( continue; } - r = resolve_copy_blocks_auto_candidate(sl, type_uuid, restrict_devno, &u); + r = resolve_copy_blocks_auto_candidate(sl, type, restrict_devno, &u); if (r < 0) return r; if (r > 0) { @@ -4862,7 +5063,7 @@ static int resolve_copy_blocks_auto( } else if (errno != ENOENT) return log_error_errno(errno, "Failed open %s: %m", p); else { - r = resolve_copy_blocks_auto_candidate(devno, type_uuid, restrict_devno, &found_uuid); + r = resolve_copy_blocks_auto_candidate(devno, type, restrict_devno, &found_uuid); if (r < 0) return r; if (r > 0) @@ -4884,7 +5085,6 @@ static int 
resolve_copy_blocks_auto( static int context_open_copy_block_paths( Context *context, - const char *root, dev_t restrict_devno) { int r; @@ -4906,7 +5106,7 @@ static int context_open_copy_block_paths( if (p->copy_blocks_path) { - source_fd = chase_symlinks_and_open(p->copy_blocks_path, root, CHASE_PREFIX_ROOT, O_RDONLY|O_CLOEXEC|O_NONBLOCK, &opened); + source_fd = chase_symlinks_and_open(p->copy_blocks_path, p->copy_blocks_root, CHASE_PREFIX_ROOT, O_RDONLY|O_CLOEXEC|O_NONBLOCK, &opened); if (source_fd < 0) return log_error_errno(source_fd, "Failed to open '%s': %m", p->copy_blocks_path); @@ -4920,7 +5120,7 @@ static int context_open_copy_block_paths( } else if (p->copy_blocks_auto) { dev_t devno; - r = resolve_copy_blocks_auto(p->type_uuid, root, restrict_devno, &devno, &uuid); + r = resolve_copy_blocks_auto(p->type, p->copy_blocks_root, restrict_devno, &devno, &uuid); if (r < 0) return r; @@ -4989,6 +5189,229 @@ static int context_open_copy_block_paths( return 0; } +static int fd_apparent_size(int fd, uint64_t *ret) { + off_t initial = 0; + uint64_t size = 0; + + assert(fd >= 0); + assert(ret); + + initial = lseek(fd, 0, SEEK_CUR); + if (initial < 0) + return log_error_errno(errno, "Failed to get file offset: %m"); + + for (off_t off = 0;;) { + off_t r; + + r = lseek(fd, off, SEEK_DATA); + if (r < 0 && errno == ENXIO) + /* If errno == ENXIO, that means we've reached the final hole of the file and + * that hole isn't followed by more data. */ + break; + if (r < 0) + return log_error_errno(errno, "Failed to seek data in file from offset %"PRIi64": %m", off); + + off = r; /* Set the offset to the start of the data segment. */ + + /* After copying a potential hole, find the end of the data segment by looking for + * the next hole. If we get ENXIO, we're at EOF. 
*/ + r = lseek(fd, off, SEEK_HOLE); + if (r < 0) { + if (errno == ENXIO) + break; + return log_error_errno(errno, "Failed to seek hole in file from offset %"PRIi64": %m", off); + } + + size += r - off; + off = r; + } + + if (lseek(fd, initial, SEEK_SET) < 0) + return log_error_errno(errno, "Failed to reset file offset: %m"); + + *ret = size; + + return 0; +} + +static int context_minimize(Context *context) { + _cleanup_set_free_ Set *denylist = NULL; + const char *vt; + int r; + + assert(context); + + r = make_copy_files_denylist(context, &denylist); + if (r < 0) + return r; + + r = var_tmp_dir(&vt); + if (r < 0) + return log_error_errno(r, "Could not determine temporary directory: %m"); + + LIST_FOREACH(partitions, p, context->partitions) { + _cleanup_(rm_rf_physical_and_freep) char *root = NULL; + _cleanup_(unlink_and_freep) char *temp = NULL; + _cleanup_(loop_device_unrefp) LoopDevice *d = NULL; + _cleanup_close_ int fd = -1; + sd_id128_t fs_uuid; + uint64_t fsz; + + if (p->dropped) + continue; + + if (PARTITION_EXISTS(p)) /* Never format existing partitions */ + continue; + + if (!p->format) + continue; + + if (!p->minimize) + continue; + + if (!partition_needs_populate(p)) + continue; + + assert(!p->copy_blocks_path); + + r = tempfn_random_child(vt, "repart", &temp); + if (r < 0) + return log_error_errno(r, "Failed to generate temporary file path: %m"); + + if (fstype_is_ro(p->format)) + fs_uuid = p->fs_uuid; + else { + fd = open(temp, O_CREAT|O_EXCL|O_CLOEXEC|O_RDWR|O_NOCTTY, 0600); + if (fd < 0) + return log_error_errno(errno, "Failed to open temporary file %s: %m", temp); + + /* This may seem huge but it will be created sparse so it doesn't take up any space + * on disk until written to. 
*/ + if (ftruncate(fd, 1024ULL * 1024ULL * 1024ULL * 1024ULL) < 0) + return log_error_errno(errno, "Failed to truncate temporary file to %s: %m", + FORMAT_BYTES(1024ULL * 1024ULL * 1024ULL * 1024ULL)); + + r = loop_device_make(fd, O_RDWR, 0, UINT64_MAX, 0, 0, LOCK_EX, &d); + if (r < 0 && r != -ENOENT && !ERRNO_IS_PRIVILEGE(r)) + return log_error_errno(r, "Failed to make loopback device of %s: %m", temp); + + /* We're going to populate this filesystem twice so use a random UUID the first time + * to avoid UUID conflicts. */ + r = sd_id128_randomize(&fs_uuid); + if (r < 0) + return r; + } + + if (!d || fstype_is_ro(p->format)) { + if (!mkfs_supports_root_option(p->format)) + return log_error_errno(SYNTHETIC_ERRNO(ENODEV), + "Loop device access is required to populate %s filesystems", + p->format); + + r = partition_populate_directory(p, denylist, &root); + if (r < 0) + return r; + } + + r = make_filesystem(d ? d->node : temp, p->format, strempty(p->new_label), root, fs_uuid, arg_discard); + if (r < 0) + return r; + + /* Read-only filesystems are minimal from the first try because they create and size the + * loopback file for us. */ + if (fstype_is_ro(p->format)) { + p->copy_blocks_path = TAKE_PTR(temp); + continue; + } + + if (!root) { + assert(d); + + r = partition_populate_filesystem(p, d->node, denylist); + if (r < 0) + return r; + } + + /* Other filesystems need to be provided with a pre-sized loopback file and will adapt to + * fully occupy it. Because we gave the filesystem a 1T sparse file, we need to shrink the + * filesystem down to a reasonable size again to fit it in the disk image. While there are + * some filesystems that support shrinking, it doesn't always work properly (e.g. shrinking + * btrfs gives us a 2.0G filesystem regardless of what we put in it). 
Instead, let's populate + * the filesystem again, but this time, instead of providing the filesystem with a 1T sparse + * loopback file, let's size the loopback file based on the actual data used by the + * filesystem in the sparse file after the first attempt. This should be a good guess of the + * minimal amount of space needed in the filesystem to fit all the required data. + */ + r = fd_apparent_size(fd, &fsz); + if (r < 0) + return r; + + /* Massage the size a bit because just going by actual data used in the sparse file isn't + * fool-proof. */ + fsz = round_up_size(fsz + (fsz / 2), context->grain_size); + if (minimal_size_by_fs_name(p->format) != UINT64_MAX) + fsz = MAX(minimal_size_by_fs_name(p->format), fsz); + + d = loop_device_unref(d); + + /* Erase the previous filesystem first. */ + if (ftruncate(fd, 0)) + return log_error_errno(errno, "Failed to erase temporary file: %m"); + + if (ftruncate(fd, fsz)) + return log_error_errno(errno, "Failed to truncate temporary file to %s: %m", FORMAT_BYTES(fsz)); + + r = loop_device_make(fd, O_RDWR, 0, UINT64_MAX, 0, 0, LOCK_EX, &d); + if (r < 0 && r != -ENOENT && !ERRNO_IS_PRIVILEGE(r)) + return log_error_errno(r, "Failed to make loopback device of %s: %m", temp); + + r = make_filesystem(d ? 
d->node : temp, p->format, strempty(p->new_label), root, p->fs_uuid, arg_discard); + if (r < 0) + return r; + + if (!root) { + assert(d); + + r = partition_populate_filesystem(p, d->node, denylist); + if (r < 0) + return r; + } + + p->copy_blocks_path = TAKE_PTR(temp); + } + + return 0; +} + +static int parse_partition_types(const char *p, sd_id128_t **partitions, size_t *n_partitions) { + int r; + + assert(partitions); + assert(n_partitions); + + for (;;) { + _cleanup_free_ char *name = NULL; + GptPartitionType type; + + r = extract_first_word(&p, &name, ",", EXTRACT_CUNESCAPE|EXTRACT_DONT_COALESCE_SEPARATORS); + if (r == 0) + break; + if (r < 0) + return log_error_errno(r, "Failed to extract partition type identifier or GUID: %s", p); + + r = gpt_partition_type_from_string(name, &type); + if (r < 0) + return log_error_errno(r, "'%s' is not a valid partition type identifier or GUID", name); + + if (!GREEDY_REALLOC(*partitions, *n_partitions + 1)) + return log_oom(); + + (*partitions)[(*n_partitions)++] = type.uuid; + } + + return 0; +} + static int help(void) { _cleanup_free_ char *link = NULL; int r; @@ -5031,6 +5454,13 @@ static int help(void) { " --json=pretty|short|off\n" " Generate JSON output\n" " --split=BOOL Whether to generate split artifacts\n" + " --include-partitions=PARTITION1,PARTITION2,PARTITION3,…\n" + " Ignore partitions not of the specified types\n" + " --exclude-partitions=PARTITION1,PARTITION2,PARTITION3,…\n" + " Ignore partitions of the specified types\n" + " --skip-partitions=PARTITION1,PARTITION2,PARTITION3,…\n" + " Take partitions of the specified types into account\n" + " but don't populate them yet\n" "\nSee the %s for details.\n", program_invocation_short_name, ansi_highlight(), @@ -5066,6 +5496,9 @@ static int parse_argv(int argc, char *argv[]) { ARG_TPM2_PUBLIC_KEY, ARG_TPM2_PUBLIC_KEY_PCRS, ARG_SPLIT, + ARG_INCLUDE_PARTITIONS, + ARG_EXCLUDE_PARTITIONS, + ARG_SKIP_PARTITIONS, }; static const struct option options[] = { @@ -5093,6 
+5526,9 @@ static int parse_argv(int argc, char *argv[]) { { "tpm2-public-key", required_argument, NULL, ARG_TPM2_PUBLIC_KEY }, { "tpm2-public-key-pcrs", required_argument, NULL, ARG_TPM2_PUBLIC_KEY_PCRS }, { "split", required_argument, NULL, ARG_SPLIT }, + { "include-partitions", required_argument, NULL, ARG_INCLUDE_PARTITIONS }, + { "exclude-partitions", required_argument, NULL, ARG_EXCLUDE_PARTITIONS }, + { "skip-partitions", required_argument, NULL, ARG_SKIP_PARTITIONS }, {} }; @@ -5347,6 +5783,39 @@ static int parse_argv(int argc, char *argv[]) { arg_split = r; break; + case ARG_INCLUDE_PARTITIONS: + if (arg_filter_partitions_type == FILTER_PARTITIONS_EXCLUDE) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Combination of --include-partitions= and --exclude-partitions= is invalid."); + + r = parse_partition_types(optarg, &arg_filter_partitions, &arg_n_filter_partitions); + if (r < 0) + return r; + + arg_filter_partitions_type = FILTER_PARTITIONS_INCLUDE; + + break; + + case ARG_EXCLUDE_PARTITIONS: + if (arg_filter_partitions_type == FILTER_PARTITIONS_INCLUDE) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Combination of --include-partitions= and --exclude-partitions= is invalid."); + + r = parse_partition_types(optarg, &arg_filter_partitions, &arg_n_filter_partitions); + if (r < 0) + return r; + + arg_filter_partitions_type = FILTER_PARTITIONS_EXCLUDE; + + break; + + case ARG_SKIP_PARTITIONS: + r = parse_partition_types(optarg, &arg_skip_partitions, &arg_n_skip_partitions); + if (r < 0) + return r; + + break; + case '?': return -EINVAL; @@ -5629,11 +6098,7 @@ static int resize_pt(int fd) { * possession of the enlarged backing file. For this it suffices to open the device with libfdisk and * immediately write it again, with no changes. 
*/ - c = fdisk_new_context(); - if (!c) - return log_oom(); - - r = fdisk_assign_device(c, FORMAT_PROC_FD_PATH(fd), 0); + r = fdisk_new_context_fd(fd, /* read_only= */ false, &c); if (r < 0) return log_error_errno(r, "Failed to open device '%s': %m", FORMAT_PROC_FD_PATH(fd)); @@ -5887,11 +6352,6 @@ static int run(int argc, char *argv[]) { if (r < 0) return r; - if (context->n_partitions <= 0 && arg_empty == EMPTY_REFUSE) { - log_info("Didn't find any partition definition files, nothing to do."); - return 0; - } - r = find_root(&node, &backing_fd); if (r < 0) return r; @@ -5949,10 +6409,18 @@ static int run(int argc, char *argv[]) { if (r < 0) return r; + /* Make sure each partition has a unique UUID and unique label */ + r = context_acquire_partition_uuids_and_labels(context); + if (r < 0) + return r; + + r = context_minimize(context); + if (r < 0) + return r; + /* Open all files to copy blocks from now, since we want to take their size into consideration */ r = context_open_copy_block_paths( context, - arg_root, loop_device ? loop_device->devno : /* if --image= is specified, only allow partitions on the loopback device */ arg_root && !arg_image ? 
0 : /* if --root= is specified, don't accept any block device */ (dev_t) -1); /* if neither is specified, make no restrictions */ @@ -6005,11 +6473,6 @@ static int run(int argc, char *argv[]) { /* Now calculate where each new partition gets placed */ context_place_partitions(context); - /* Make sure each partition has a unique UUID and unique label */ - r = context_acquire_partition_uuids_and_labels(context); - if (r < 0) - return r; - (void) context_dump(context, node, /*late=*/ false); r = context_write_partition_table(context, node, from_scratch); diff --git a/src/portable/portable.c b/src/portable/portable.c index dd0f6d3d13..76af743771 100644 --- a/src/portable/portable.c +++ b/src/portable/portable.c @@ -1131,7 +1131,7 @@ static int attach_unit_file( (void) mkdir_parents(where, 0755); if (mkdir(where, 0755) < 0) { if (errno != EEXIST) - return -errno; + return log_debug_errno(errno, "Failed to create attach directory %s: %m", where); } else (void) portable_changes_add(changes, n_changes, PORTABLE_MKDIR, where, NULL); @@ -1145,7 +1145,7 @@ static int attach_unit_file( if (mkdir(dropin_dir, 0755) < 0) { if (errno != EEXIST) - return -errno; + return log_debug_errno(errno, "Failed to create drop-in directory %s: %m", dropin_dir); } else (void) portable_changes_add(changes, n_changes, PORTABLE_MKDIR, dropin_dir, NULL); @@ -1392,7 +1392,7 @@ int portable_attach( r = attach_unit_file(&paths, image->path, image->type, extension_images, item, profile, flags, changes, n_changes); if (r < 0) - return r; + return sd_bus_error_set_errnof(error, r, "Failed to attach unit '%s': %m", item->name); } /* We don't care too much for the image symlink, it's just a convenience thing, it's not necessary for proper diff --git a/src/random-seed/random-seed.c b/src/random-seed/random-seed.c index 04c2a29762..020840e0df 100644 --- a/src/random-seed/random-seed.c +++ b/src/random-seed/random-seed.c @@ -16,7 +16,10 @@ #include "alloc-util.h" #include "build.h" +#include 
"chase-symlinks.h" +#include "efi-loader.h" #include "fd-util.h" +#include "find-esp.h" #include "fs-util.h" #include "io-util.h" #include "log.h" @@ -26,6 +29,7 @@ #include "mkdir.h" #include "parse-argument.h" #include "parse-util.h" +#include "path-util.h" #include "pretty-print.h" #include "random-util.h" #include "string-table.h" @@ -185,7 +189,7 @@ static int load_seed_file( if (ret_hash_state) { struct sha256_ctx *hash_state; - hash_state = malloc(sizeof(struct sha256_ctx)); + hash_state = new(struct sha256_ctx, 1); if (!hash_state) return log_oom(); @@ -311,6 +315,102 @@ static int save_seed_file( return 0; } +static int refresh_boot_seed(void) { + uint8_t buffer[RANDOM_EFI_SEED_SIZE]; + struct sha256_ctx hash_state; + _cleanup_free_ void *seed_file_bytes = NULL; + _cleanup_free_ char *esp_path = NULL; + _cleanup_close_ int seed_fd = -1, dir_fd = -1; + size_t len; + ssize_t n; + int r; + + assert_cc(RANDOM_EFI_SEED_SIZE == SHA256_DIGEST_SIZE); + + r = find_esp_and_warn(NULL, NULL, /* unprivileged_mode= */ false, &esp_path, + NULL, NULL, NULL, NULL, NULL); + if (r < 0) { + if (r == -ENOKEY) { + log_debug_errno(r, "Couldn't find any ESP, so not updating ESP random seed."); + return 0; + } + return r; /* find_esp_and_warn() already logged */ + } + + r = chase_symlinks("/loader", esp_path, CHASE_PREFIX_ROOT|CHASE_PROHIBIT_SYMLINKS, NULL, &dir_fd); + if (r < 0) { + if (r == -ENOENT) { + log_debug_errno(r, "Couldn't find ESP loader directory, so not updating ESP random seed."); + return 0; + } + return log_error_errno(r, "Failed to open ESP loader directory: %m"); + } + seed_fd = openat(dir_fd, "random-seed", O_NOFOLLOW|O_RDWR|O_CLOEXEC|O_NOCTTY); + if (seed_fd < 0 && errno == ENOENT) { + uint64_t features; + r = efi_loader_get_features(&features); + if (r == 0 && FLAGS_SET(features, EFI_LOADER_FEATURE_RANDOM_SEED)) + seed_fd = openat(dir_fd, "random-seed", O_CREAT|O_EXCL|O_RDWR|O_CLOEXEC|O_NOCTTY, 0600); + else { + log_debug_errno(seed_fd, "Couldn't find ESP 
random seed, and not booted with systemd-boot, so not updating ESP random seed."); + return 0; + } + } + if (seed_fd < 0) + return log_error_errno(errno, "Failed to open EFI seed path: %m"); + r = random_seed_size(seed_fd, &len); + if (r < 0) + return log_error_errno(r, "Failed to determine EFI seed path length: %m"); + seed_file_bytes = malloc(len); + if (!seed_file_bytes) + return log_oom(); + n = loop_read(seed_fd, seed_file_bytes, len, false); + if (n < 0) + return log_error_errno(n, "Failed to read EFI seed file: %m"); + + /* Hash the old seed in so that we never regress in entropy. */ + sha256_init_ctx(&hash_state); + sha256_process_bytes(&n, sizeof(n), &hash_state); + sha256_process_bytes(seed_file_bytes, n, &hash_state); + + /* We're doing this opportunistically, so if the seeding dance before didn't manage to initialize the + * RNG, there's no point in doing it here. Secondly, getrandom(GRND_NONBLOCK) has been around longer + * than EFI seeding anyway, so there's no point in having non-getrandom() fallbacks here. So if this + * fails, just return early to cut our losses. */ + n = getrandom(buffer, sizeof(buffer), GRND_NONBLOCK); + if (n < 0) { + if (errno == EAGAIN) { + log_debug_errno(errno, "Random pool not initialized yet, so skipping EFI seed update"); + return 0; + } + if (errno == ENOSYS) { + log_debug_errno(errno, "getrandom() not available, so skipping EFI seed update"); + return 0; + } + return log_error_errno(errno, "Failed to generate random bytes for EFI seed: %m"); + } + assert(n == sizeof(buffer)); + + /* Hash the new seed into the state containing the old one to generate our final seed. 
*/ + sha256_process_bytes(&n, sizeof(n), &hash_state); + sha256_process_bytes(buffer, n, &hash_state); + sha256_finish_ctx(&hash_state, buffer); + + if (lseek(seed_fd, 0, SEEK_SET) < 0) + return log_error_errno(errno, "Failed to seek to beginning of EFI seed file: %m"); + r = loop_write(seed_fd, buffer, sizeof(buffer), false); + if (r < 0) + return log_error_errno(r, "Failed to write new EFI seed file: %m"); + if (ftruncate(seed_fd, sizeof(buffer)) < 0) + return log_error_errno(errno, "Failed to truncate EFI seed file: %m"); + r = fsync_full(seed_fd); + if (r < 0) + return log_error_errno(r, "Failed to fsync EFI seed file: %m"); + + log_debug("Updated random seed in ESP"); + return 0; +} + static int help(int argc, char *argv[], void *userdata) { _cleanup_free_ char *link = NULL; int r; @@ -402,15 +502,15 @@ static int run(int argc, char *argv[]) { if (r < 0) return log_error_errno(r, "Failed to create directory " RANDOM_SEED_DIR ": %m"); + random_fd = open("/dev/urandom", O_RDWR|O_CLOEXEC|O_NOCTTY); + if (random_fd < 0) + return log_error_errno(errno, "Failed to open /dev/urandom: %m"); + /* When we load the seed we read it and write it to the device and then immediately update the saved * seed with new data, to make sure the next boot gets seeded differently. */ switch (arg_action) { case ACTION_LOAD: - random_fd = open("/dev/urandom", O_RDWR|O_CLOEXEC|O_NOCTTY); - if (random_fd < 0) - return log_error_errno(errno, "Failed to open /dev/urandom: %m"); - /* First, let's write the machine ID into /dev/urandom, not crediting entropy. See * load_machine_id() for an explanation why. */ load_machine_id(random_fd); @@ -428,8 +528,10 @@ static int run(int argc, char *argv[]) { log_full_errno(level, open_rw_error, "Failed to open " RANDOM_SEED " for writing: %m"); log_full_errno(level, errno, "Failed to open " RANDOM_SEED " for reading: %m"); + r = -errno; - return missing ? 0 : -errno; + (void) refresh_boot_seed(); + return missing ? 
0 : r; } } else write_seed_file = true; @@ -439,10 +541,7 @@ static int run(int argc, char *argv[]) { break; case ACTION_SAVE: - random_fd = open("/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY); - if (random_fd < 0) - return log_error_errno(errno, "Failed to open /dev/urandom: %m"); - + (void) refresh_boot_seed(); seed_fd = open(RANDOM_SEED, O_WRONLY|O_CLOEXEC|O_NOCTTY|O_CREAT, 0600); if (seed_fd < 0) return log_error_errno(errno, "Failed to open " RANDOM_SEED ": %m"); @@ -460,9 +559,11 @@ static int run(int argc, char *argv[]) { if (r < 0) return r; - if (read_seed_file) + if (read_seed_file) { r = load_seed_file(seed_fd, random_fd, seed_size, write_seed_file ? &hash_state : NULL); + (void) refresh_boot_seed(); + } if (r >= 0 && write_seed_file) r = save_seed_file(seed_fd, random_fd, seed_size, synchronous, hash_state); diff --git a/src/resolve/resolvectl.c b/src/resolve/resolvectl.c index ff645fc0d7..5889bd772f 100644 --- a/src/resolve/resolvectl.c +++ b/src/resolve/resolvectl.c @@ -480,7 +480,11 @@ static bool single_label_nonsynthetic(const char *name) { if (!dns_name_is_single_label(name)) return false; - if (is_localhost(name) || is_gateway_hostname(name)) + if (is_localhost(name) || + is_gateway_hostname(name) || + is_outbound_hostname(name) || + is_dns_stub_hostname(name) || + is_dns_proxy_stub_hostname(name)) return false; r = resolve_system_hostname(NULL, &first_label); diff --git a/src/resolve/resolved-dns-rr.c b/src/resolve/resolved-dns-rr.c index 8123ca1f98..d47cdbbd8e 100644 --- a/src/resolve/resolved-dns-rr.c +++ b/src/resolve/resolved-dns-rr.c @@ -1865,7 +1865,6 @@ static int type_bitmap_to_json(Bitmap *b, JsonVariant **ret) { unsigned t; int r; - assert(b); assert(ret); BITMAP_FOREACH(t, b) { diff --git a/src/resolve/resolved-dns-scope.c b/src/resolve/resolved-dns-scope.c index 852829569d..82f0c8f621 100644 --- a/src/resolve/resolved-dns-scope.c +++ b/src/resolve/resolved-dns-scope.c @@ -71,7 +71,7 @@ int dns_scope_new(Manager *m, DnsScope **ret, Link 
*l, DnsProtocol protocol, int log_debug("New scope on link %s, protocol %s, family %s", l ? l->ifname : "*", dns_protocol_to_string(protocol), family == AF_UNSPEC ? "*" : af_to_name(family)); /* Enforce ratelimiting for the multicast protocols */ - s->ratelimit = (RateLimit) { MULTICAST_RATELIMIT_INTERVAL_USEC, MULTICAST_RATELIMIT_BURST }; + s->ratelimit = (const RateLimit) { MULTICAST_RATELIMIT_INTERVAL_USEC, MULTICAST_RATELIMIT_BURST }; *ret = s; return 0; @@ -424,7 +424,7 @@ static int dns_scope_socket( return r; } - if (s->link) { + if (ifindex != 0) { r = socket_set_unicast_if(fd, sa.sa.sa_family, ifindex); if (r < 0) return r; @@ -635,8 +635,11 @@ DnsScopeMatch dns_scope_good_domain( if (dns_name_dont_resolve(domain)) return DNS_SCOPE_NO; - /* Never go to network for the _gateway or _outbound domain — they're something special, synthesized locally. */ - if (is_gateway_hostname(domain) || is_outbound_hostname(domain)) + /* Never go to network for the _gateway, _outbound, _localdnsstub, _localdnsproxy domain — they're something special, synthesized locally. 
*/ + if (is_gateway_hostname(domain) || + is_outbound_hostname(domain) || + is_dns_stub_hostname(domain) || + is_dns_proxy_stub_hostname(domain)) return DNS_SCOPE_NO; switch (s->protocol) { @@ -764,8 +767,6 @@ DnsScopeMatch dns_scope_good_domain( return DNS_SCOPE_MAYBE; if ((dns_name_is_single_label(domain) && /* only resolve single label names via LLMNR */ - !is_gateway_hostname(domain) && /* don't resolve "_gateway" with LLMNR, let local synthesizing logic handle that */ - !is_outbound_hostname(domain) && /* similar for "_outbound" */ dns_name_equal(domain, "local") == 0 && /* don't resolve "local" with LLMNR, it's the top-level domain of mDNS after all, see above */ manager_is_own_hostname(s->manager, domain) <= 0)) /* never resolve the local hostname via LLMNR */ return DNS_SCOPE_YES_BASE + 1; /* Return +1, as we consider ourselves authoritative @@ -1116,7 +1117,7 @@ DnsTransaction *dns_scope_find_transaction( !(t->query_flags & SD_RESOLVED_NO_CACHE)) continue; - /* If we are asked to clamp ttls an the existing transaction doesn't do it, we can't + /* If we are asked to clamp ttls and the existing transaction doesn't do it, we can't * reuse */ if ((query_flags & SD_RESOLVED_CLAMP_TTL) && !(t->query_flags & SD_RESOLVED_CLAMP_TTL)) diff --git a/src/resolve/resolved-dns-server.c b/src/resolve/resolved-dns-server.c index 04a4f53ed0..8ff513fa33 100644 --- a/src/resolve/resolved-dns-server.c +++ b/src/resolve/resolved-dns-server.c @@ -648,6 +648,11 @@ int dns_server_adjust_opt(DnsServer *server, DnsPacket *packet, DnsServerFeature int dns_server_ifindex(const DnsServer *s) { assert(s); + /* For loopback addresses, go via the loopback interface, regardless which interface this is linked + * to. 
*/ + if (in_addr_is_localhost(s->family, &s->address)) + return LOOPBACK_IFINDEX; + /* The link ifindex always takes precedence */ if (s->link) return s->link->ifindex; diff --git a/src/resolve/resolved-dns-synthesize.c b/src/resolve/resolved-dns-synthesize.c index b3442ad906..51e06bb91e 100644 --- a/src/resolve/resolved-dns-synthesize.c +++ b/src/resolve/resolved-dns-synthesize.c @@ -7,20 +7,6 @@ #include "missing_network.h" #include "resolved-dns-synthesize.h" -int dns_synthesize_ifindex(int ifindex) { - - /* When the caller asked for resolving on a specific - * interface, we synthesize the answer for that - * interface. However, if nothing specific was claimed and we - * only return localhost RRs, we synthesize the answer for - * localhost. */ - - if (ifindex > 0) - return ifindex; - - return LOOPBACK_IFINDEX; -} - int dns_synthesize_family(uint64_t flags) { /* Picks an address family depending on set flags. This is @@ -57,7 +43,7 @@ DnsProtocol dns_synthesize_protocol(uint64_t flags) { return DNS_PROTOCOL_DNS; } -static int synthesize_localhost_rr(Manager *m, const DnsResourceKey *key, int ifindex, DnsAnswer **answer) { +static int synthesize_localhost_rr(Manager *m, const DnsResourceKey *key, DnsAnswer **answer) { int r; assert(m); @@ -77,7 +63,7 @@ static int synthesize_localhost_rr(Manager *m, const DnsResourceKey *key, int if rr->a.in_addr.s_addr = htobe32(INADDR_LOOPBACK); - r = dns_answer_add(*answer, rr, dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED, NULL); + r = dns_answer_add(*answer, rr, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED, NULL); if (r < 0) return r; } @@ -91,7 +77,7 @@ static int synthesize_localhost_rr(Manager *m, const DnsResourceKey *key, int if rr->aaaa.in6_addr = in6addr_loopback; - r = dns_answer_add(*answer, rr, dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED, NULL); + r = dns_answer_add(*answer, rr, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED, NULL); if (r < 0) return r; } @@ -113,7 +99,7 @@ static int 
answer_add_ptr(DnsAnswer **answer, const char *from, const char *to, return dns_answer_add(*answer, rr, ifindex, flags, NULL); } -static int synthesize_localhost_ptr(Manager *m, const DnsResourceKey *key, int ifindex, DnsAnswer **answer) { +static int synthesize_localhost_ptr(Manager *m, const DnsResourceKey *key, DnsAnswer **answer) { int r; assert(m); @@ -125,7 +111,7 @@ static int synthesize_localhost_ptr(Manager *m, const DnsResourceKey *key, int i if (r < 0) return r; - r = answer_add_ptr(answer, dns_resource_key_name(key), "localhost", dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED); + r = answer_add_ptr(answer, dns_resource_key_name(key), "localhost", LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); if (r < 0) return r; } @@ -225,20 +211,19 @@ static int synthesize_system_hostname_rr(Manager *m, const DnsResourceKey *key, if (n == 0) { struct local_address buffer[2]; - /* If we have no local addresses then use ::1 - * and 127.0.0.2 as local ones. */ + /* If we have no local addresses then use ::1 and 127.0.0.2 as local ones. 
*/ if (IN_SET(af, AF_INET, AF_UNSPEC)) buffer[n++] = (struct local_address) { .family = AF_INET, - .ifindex = dns_synthesize_ifindex(ifindex), - .address.in.s_addr = htobe32(0x7F000002), + .ifindex = LOOPBACK_IFINDEX, + .address.in.s_addr = htobe32(INADDR_LOCALADDRESS), }; if (IN_SET(af, AF_INET6, AF_UNSPEC) && socket_ipv6_is_enabled()) buffer[n++] = (struct local_address) { .family = AF_INET6, - .ifindex = dns_synthesize_ifindex(ifindex), + .ifindex = LOOPBACK_IFINDEX, .address.in6 = in6addr_loopback, }; @@ -260,7 +245,7 @@ static int synthesize_system_hostname_ptr(Manager *m, int af, const union in_add assert(address); assert(answer); - if (af == AF_INET && address->in.s_addr == htobe32(0x7F000002)) { + if (af == AF_INET && address->in.s_addr == htobe32(INADDR_LOCALADDRESS)) { /* Always map the IPv4 address 127.0.0.2 to the local hostname, in addition to "localhost": */ @@ -268,19 +253,19 @@ static int synthesize_system_hostname_ptr(Manager *m, int af, const union in_add if (r < 0) return r; - r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->full_hostname, dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED); + r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->full_hostname, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); if (r < 0) return r; - r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->llmnr_hostname, dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED); + r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->llmnr_hostname, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); if (r < 0) return r; - r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->mdns_hostname, dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED); + r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", m->mdns_hostname, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); if (r < 0) return r; - r = answer_add_ptr(answer, "2.0.0.127.in-addr.arpa", "localhost", dns_synthesize_ifindex(ifindex), DNS_ANSWER_AUTHENTICATED); + r = answer_add_ptr(answer, 
"2.0.0.127.in-addr.arpa", "localhost", LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); if (r < 0) return r; @@ -356,7 +341,90 @@ static int synthesize_gateway_rr( return 1; /* > 0 means: we have some gateway */ } -static int synthesize_gateway_ptr(Manager *m, int af, const union in_addr_union *address, int ifindex, DnsAnswer **answer) { +static int synthesize_dns_stub_rr( + Manager *m, + const DnsResourceKey *key, + in_addr_t addr, + DnsAnswer **answer) { + + _cleanup_(dns_resource_record_unrefp) DnsResourceRecord *rr = NULL; + int r; + + assert(m); + assert(key); + assert(answer); + + if (!IN_SET(key->type, DNS_TYPE_A, DNS_TYPE_ANY)) + return 1; /* we still consider ourselves the owner of this name */ + + r = dns_answer_reserve(answer, 1); + if (r < 0) + return r; + + rr = dns_resource_record_new_full(DNS_CLASS_IN, DNS_TYPE_A, dns_resource_key_name(key)); + if (!rr) + return -ENOMEM; + + rr->a.in_addr.s_addr = htobe32(addr); + + r = dns_answer_add(*answer, rr, LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED, NULL); + if (r < 0) + return r; + + return 1; +} + +static int synthesize_dns_stub_ptr( + Manager *m, + int af, + const union in_addr_union *address, + DnsAnswer **answer) { + + int r; + + assert(m); + assert(address); + assert(answer); + + if (af != AF_INET) + return 0; + + if (address->in.s_addr == htobe32(INADDR_DNS_STUB)) { + + r = dns_answer_reserve(answer, 1); + if (r < 0) + return r; + + r = answer_add_ptr(answer, "53.0.0.127.in-addr.arpa", "_localdnsstub", LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); + if (r < 0) + return r; + + return 1; + } + + if (address->in.s_addr == htobe32(INADDR_DNS_PROXY_STUB)) { + + r = dns_answer_reserve(answer, 1); + if (r < 0) + return r; + + r = answer_add_ptr(answer, "54.0.0.127.in-addr.arpa", "_localdnsproxy", LOOPBACK_IFINDEX, DNS_ANSWER_AUTHENTICATED); + if (r < 0) + return r; + + return 1; + } + + return 0; +} + +static int synthesize_gateway_ptr( + Manager *m, + int af, + const union in_addr_union *address, + int ifindex, + 
DnsAnswer **answer) { + _cleanup_free_ struct local_address *addresses = NULL; int n; @@ -405,7 +473,7 @@ int dns_synthesize_answer( } else if (is_localhost(name)) { - r = synthesize_localhost_rr(m, key, ifindex, &answer); + r = synthesize_localhost_rr(m, key, &answer); if (r < 0) return log_error_errno(r, "Failed to synthesize localhost RRs: %m"); @@ -437,15 +505,30 @@ int dns_synthesize_answer( continue; } - } else if ((dns_name_endswith(name, "127.in-addr.arpa") > 0 && dns_name_equal(name, "2.0.0.127.in-addr.arpa") == 0) || + } else if (is_dns_stub_hostname(name)) { + + r = synthesize_dns_stub_rr(m, key, INADDR_DNS_STUB, &answer); + if (r < 0) + return log_error_errno(r, "Failed to synthesize local DNS stub RRs: %m"); + + } else if (is_dns_proxy_stub_hostname(name)) { + + r = synthesize_dns_stub_rr(m, key, INADDR_DNS_PROXY_STUB, &answer); + if (r < 0) + return log_error_errno(r, "Failed to synthesize local DNS stub RRs: %m"); + + } else if ((dns_name_endswith(name, "127.in-addr.arpa") > 0 && + dns_name_equal(name, "2.0.0.127.in-addr.arpa") == 0 && + dns_name_equal(name, "53.0.0.127.in-addr.arpa") == 0 && + dns_name_equal(name, "54.0.0.127.in-addr.arpa") == 0) || dns_name_equal(name, "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa") > 0) { - r = synthesize_localhost_ptr(m, key, ifindex, &answer); + r = synthesize_localhost_ptr(m, key, &answer); if (r < 0) return log_error_errno(r, "Failed to synthesize localhost PTR RRs: %m"); } else if (dns_name_address(name, &af, &address) > 0) { - int v, w; + int v, w, u; if (getenv_bool("SYSTEMD_RESOLVED_SYNTHESIZE_HOSTNAME") == 0) continue; @@ -458,7 +541,11 @@ int dns_synthesize_answer( if (w < 0) return log_error_errno(w, "Failed to synthesize gateway hostname PTR RR: %m"); - if (v == 0 && w == 0) /* This IP address is neither a local one nor a gateway */ + u = synthesize_dns_stub_ptr(m, af, &address, &answer); + if (u < 0) + return log_error_errno(u, "Failed to synthesize local stub hostname PTR 
PR: %m"); + + if (v == 0 && w == 0 && u == 0) /* This IP address is neither a local one, nor a gateway, nor a stub address */ continue; /* Note that we never synthesize reverse PTR for _outbound, since those are local diff --git a/src/resolve/resolved-dns-synthesize.h b/src/resolve/resolved-dns-synthesize.h index fb624589d7..bf271e862d 100644 --- a/src/resolve/resolved-dns-synthesize.h +++ b/src/resolve/resolved-dns-synthesize.h @@ -5,7 +5,6 @@ #include "resolved-dns-question.h" #include "resolved-manager.h" -int dns_synthesize_ifindex(int ifindex); int dns_synthesize_family(uint64_t flags); DnsProtocol dns_synthesize_protocol(uint64_t flags); diff --git a/src/resolve/resolved-dnstls-openssl.c b/src/resolve/resolved-dnstls-openssl.c index 4d3a88c8da..4a0132ad3d 100644 --- a/src/resolve/resolved-dnstls-openssl.c +++ b/src/resolve/resolved-dnstls-openssl.c @@ -14,6 +14,19 @@ #include "resolved-dnstls.h" #include "resolved-manager.h" +static char *dnstls_error_string(int ssl_error, char *buf, size_t count) { + assert(buf || count == 0); + if (ssl_error == SSL_ERROR_SSL) + ERR_error_string_n(ERR_get_error(), buf, count); + else + snprintf(buf, count, "SSL_get_error()=%d", ssl_error); + return buf; +} + +#define DNSTLS_ERROR_BUFSIZE 256 +#define DNSTLS_ERROR_STRING(error) \ + dnstls_error_string((error), (char[DNSTLS_ERROR_BUFSIZE]){}, DNSTLS_ERROR_BUFSIZE) + static int dnstls_flush_write_buffer(DnsStream *stream) { ssize_t ss; @@ -97,26 +110,18 @@ int dnstls_stream_connect_tls(DnsStream *stream, DnsServer *server) { if (server->server_name) { r = SSL_set_tlsext_host_name(s, server->server_name); - if (r <= 0) { - char errbuf[256]; - - error = ERR_get_error(); - ERR_error_string_n(error, errbuf, sizeof(errbuf)); - return log_debug_errno(SYNTHETIC_ERRNO(EINVAL), "Failed to set server name: %s", errbuf); - } + if (r <= 0) + return log_debug_errno(SYNTHETIC_ERRNO(EINVAL), + "Failed to set server name: %s", DNSTLS_ERROR_STRING(SSL_ERROR_SSL)); } ERR_clear_error(); 
stream->dnstls_data.handshake = SSL_do_handshake(s); if (stream->dnstls_data.handshake <= 0) { error = SSL_get_error(s, stream->dnstls_data.handshake); - if (!IN_SET(error, SSL_ERROR_WANT_READ, SSL_ERROR_WANT_WRITE)) { - char errbuf[256]; - - ERR_error_string_n(error, errbuf, sizeof(errbuf)); + if (!IN_SET(error, SSL_ERROR_WANT_READ, SSL_ERROR_WANT_WRITE)) return log_debug_errno(SYNTHETIC_ERRNO(ECONNREFUSED), - "Failed to invoke SSL_do_handshake: %s", errbuf); - } + "Failed to invoke SSL_do_handshake: %s", DNSTLS_ERROR_STRING(error)); } stream->encrypted = true; @@ -177,12 +182,8 @@ int dnstls_stream_on_io(DnsStream *stream, uint32_t revents) { } else if (error == SSL_ERROR_SYSCALL) { if (errno > 0) log_debug_errno(errno, "Failed to invoke SSL_shutdown, ignoring: %m"); - } else { - char errbuf[256]; - - ERR_error_string_n(error, errbuf, sizeof(errbuf)); - log_debug("Failed to invoke SSL_shutdown, ignoring: %s", errbuf); - } + } else + log_debug("Failed to invoke SSL_shutdown, ignoring: %s", DNSTLS_ERROR_STRING(error)); } stream->dnstls_events = 0; @@ -206,14 +207,10 @@ int dnstls_stream_on_io(DnsStream *stream, uint32_t revents) { return r; return -EAGAIN; - } else { - char errbuf[256]; - - ERR_error_string_n(error, errbuf, sizeof(errbuf)); + } else return log_debug_errno(SYNTHETIC_ERRNO(ECONNREFUSED), "Failed to invoke SSL_do_handshake: %s", - errbuf); - } + DNSTLS_ERROR_STRING(error)); } stream->dnstls_events = 0; @@ -275,12 +272,8 @@ int dnstls_stream_shutdown(DnsStream *stream, int error) { } else if (ssl_error == SSL_ERROR_SYSCALL) { if (errno > 0) log_debug_errno(errno, "Failed to invoke SSL_shutdown, ignoring: %m"); - } else { - char errbuf[256]; - - ERR_error_string_n(ssl_error, errbuf, sizeof(errbuf)); - log_debug("Failed to invoke SSL_shutdown, ignoring: %s", errbuf); - } + } else + log_debug("Failed to invoke SSL_shutdown, ignoring: %s", DNSTLS_ERROR_STRING(ssl_error)); } stream->dnstls_events = 0; @@ -307,10 +300,7 @@ static ssize_t 
dnstls_stream_write(DnsStream *stream, const char *buf, size_t co stream->dnstls_events = 0; ss = 0; } else { - char errbuf[256]; - - ERR_error_string_n(error, errbuf, sizeof(errbuf)); - log_debug("Failed to invoke SSL_write: %s", errbuf); + log_debug("Failed to invoke SSL_write: %s", DNSTLS_ERROR_STRING(error)); stream->dnstls_events = 0; ss = -EPIPE; } @@ -375,10 +365,7 @@ ssize_t dnstls_stream_read(DnsStream *stream, void *buf, size_t count) { stream->dnstls_events = 0; ss = 0; } else { - char errbuf[256]; - - ERR_error_string_n(error, errbuf, sizeof(errbuf)); - log_debug("Failed to invoke SSL_read: %s", errbuf); + log_debug("Failed to invoke SSL_read: %s", DNSTLS_ERROR_STRING(error)); stream->dnstls_events = 0; ss = -EPIPE; } diff --git a/src/resolve/resolved-manager.c b/src/resolve/resolved-manager.c index f62efa87aa..1c9048670b 100644 --- a/src/resolve/resolved-manager.c +++ b/src/resolve/resolved-manager.c @@ -868,11 +868,14 @@ int manager_recv(Manager *m, int fd, DnsProtocol protocol, DnsPacket **ret) { } static int sendmsg_loop(int fd, struct msghdr *mh, int flags) { + usec_t end; int r; assert(fd >= 0); assert(mh); + end = usec_add(now(CLOCK_MONOTONIC), SEND_TIMEOUT_USEC); + for (;;) { if (sendmsg(fd, mh, flags) >= 0) return 0; @@ -881,20 +884,26 @@ static int sendmsg_loop(int fd, struct msghdr *mh, int flags) { if (errno != EAGAIN) return -errno; - r = fd_wait_for_event(fd, POLLOUT, SEND_TIMEOUT_USEC); - if (r < 0) + r = fd_wait_for_event(fd, POLLOUT, LESS_BY(end, now(CLOCK_MONOTONIC))); + if (r < 0) { + if (ERRNO_IS_TRANSIENT(r)) + continue; return r; + } if (r == 0) return -ETIMEDOUT; } } static int write_loop(int fd, void *message, size_t length) { + usec_t end; int r; assert(fd >= 0); assert(message); + end = usec_add(now(CLOCK_MONOTONIC), SEND_TIMEOUT_USEC); + for (;;) { if (write(fd, message, length) >= 0) return 0; @@ -903,9 +912,12 @@ static int write_loop(int fd, void *message, size_t length) { if (errno != EAGAIN) return -errno; - r = 
fd_wait_for_event(fd, POLLOUT, SEND_TIMEOUT_USEC); - if (r < 0) + r = fd_wait_for_event(fd, POLLOUT, LESS_BY(end, now(CLOCK_MONOTONIC))); + if (r < 0) { + if (ERRNO_IS_TRANSIENT(r)) + continue; return r; + } if (r == 0) return -ETIMEDOUT; } diff --git a/src/shared/acpi-fpdt.c b/src/shared/acpi-fpdt.c index 668f6c3eee..9f77997d5a 100644 --- a/src/shared/acpi-fpdt.c +++ b/src/shared/acpi-fpdt.c @@ -61,6 +61,37 @@ struct acpi_fpdt_boot { uint64_t exit_services_exit; } _packed; +/* /dev/mem is deprecated on many systems, try using /sys/firmware/acpi/fpdt parsing instead. + * This code requires kernel version 5.12 on x86 based machines or 6.2 for arm64 */ +static int acpi_get_boot_usec_kernel_parsed(usec_t *ret_loader_start, usec_t *ret_loader_exit) { + usec_t start, end; + int r; + + r = read_timestamp_file("/sys/firmware/acpi/fpdt/boot/exitbootservice_end_ns", &end); + if (r < 0) + return r; + + if (end == 0) + /* Non-UEFI compatible boot. */ + return -ENODATA; + + r = read_timestamp_file("/sys/firmware/acpi/fpdt/boot/bootloader_launch_ns", &start); + if (r < 0) + return r; + + if (start == 0 || end < start) + return -EINVAL; + if (end > NSEC_PER_HOUR) + return -EINVAL; + + if (ret_loader_start) + *ret_loader_start = start / 1000; + if (ret_loader_exit) + *ret_loader_exit = end / 1000; + + return 0; +} + int acpi_get_boot_usec(usec_t *ret_loader_start, usec_t *ret_loader_exit) { _cleanup_free_ char *buf = NULL; struct acpi_table_header *tbl; @@ -73,6 +104,10 @@ int acpi_get_boot_usec(usec_t *ret_loader_start, usec_t *ret_loader_exit) { struct acpi_fpdt_boot_header hbrec; struct acpi_fpdt_boot brec; + r = acpi_get_boot_usec_kernel_parsed(ret_loader_start, ret_loader_exit); + if (r != -ENOENT) /* fallback to /dev/mem hack only if kernel doesn't support the new sysfs files */ + return r; + r = read_full_virtual_file("/sys/firmware/acpi/tables/FPDT", &buf, &l); if (r < 0) return r; diff --git a/src/shared/ask-password-api.c b/src/shared/ask-password-api.c index 
e7db23c201..1ad5ddd503 100644 --- a/src/shared/ask-password-api.c +++ b/src/shared/ask-password-api.c @@ -235,8 +235,7 @@ int ask_password_plymouth( if (notify < 0) return -errno; - r = inotify_add_watch(notify, flag_file, IN_ATTRIB); /* for the link count */ - if (r < 0) + if (inotify_add_watch(notify, flag_file, IN_ATTRIB) < 0) /* for the link count */ return -errno; } @@ -244,8 +243,7 @@ int ask_password_plymouth( if (fd < 0) return -errno; - r = connect(fd, &sa.sa, SOCKADDR_UN_LEN(sa.un)); - if (r < 0) + if (connect(fd, &sa.sa, SOCKADDR_UN_LEN(sa.un)) < 0) return -errno; if (FLAGS_SET(flags, ASK_PASSWORD_ACCEPT_CACHED)) { @@ -469,10 +467,9 @@ int ask_password_tty( new_termios.c_cc[VMIN] = 1; new_termios.c_cc[VTIME] = 0; - if (tcsetattr(ttyfd, TCSADRAIN, &new_termios) < 0) { - r = -errno; + r = RET_NERRNO(tcsetattr(ttyfd, TCSADRAIN, &new_termios)); + if (r < 0) goto finish; - } reset_tty = true; } @@ -496,11 +493,11 @@ int ask_password_tty( else timeout = USEC_INFINITY; - if (flag_file) - if (access(flag_file, F_OK) < 0) { - r = -errno; + if (flag_file) { + r = RET_NERRNO(access(flag_file, F_OK)); + if (r < 0) goto finish; - } + } r = ppoll_usec(pollfd, notify >= 0 ? 
2 : 1, timeout); if (r == -EINTR) @@ -752,10 +749,10 @@ int ask_password_agent( r = -errno; goto finish; } - if (inotify_add_watch(notify, "/run/systemd/ask-password", IN_ATTRIB /* for mtime */) < 0) { - r = -errno; + + r = RET_NERRNO(inotify_add_watch(notify, "/run/systemd/ask-password", IN_ATTRIB /* for mtime */)); + if (r < 0) goto finish; - } } fd = mkostemp_safe(temp); @@ -818,10 +815,9 @@ int ask_password_agent( final[sizeof(final)-10] = 's'; final[sizeof(final)-9] = 'k'; - if (rename(temp, final) < 0) { - r = -errno; + r = RET_NERRNO(rename(temp, final)); + if (r < 0) goto finish; - } zero(pollfd); pollfd[FD_SOCKET].fd = socket_fd; diff --git a/src/shared/barrier.c b/src/shared/barrier.c index cbe54a60cd..d76a61a5db 100644 --- a/src/shared/barrier.c +++ b/src/shared/barrier.c @@ -92,7 +92,6 @@ */ int barrier_create(Barrier *b) { _unused_ _cleanup_(barrier_destroyp) Barrier *staging = b; - int r; assert(b); @@ -104,8 +103,7 @@ int barrier_create(Barrier *b) { if (b->them < 0) return -errno; - r = pipe2(b->pipe, O_CLOEXEC | O_NONBLOCK); - if (r < 0) + if (pipe2(b->pipe, O_CLOEXEC | O_NONBLOCK) < 0) return -errno; staging = NULL; diff --git a/src/shared/bootspec.c b/src/shared/bootspec.c index 0afc41d200..789b89ea93 100644 --- a/src/shared/bootspec.c +++ b/src/shared/bootspec.c @@ -419,7 +419,6 @@ void boot_config_free(BootConfig *config) { free(config->auto_entries); free(config->auto_firmware); free(config->console_mode); - free(config->random_seed_mode); free(config->beep); free(config->entry_oneshot); @@ -486,7 +485,7 @@ int boot_loader_read_conf(BootConfig *config, FILE *file, const char *path) { else if (streq(field, "console-mode")) r = free_and_strdup(&config->console_mode, p); else if (streq(field, "random-seed-mode")) - r = free_and_strdup(&config->random_seed_mode, p); + log_syntax(NULL, LOG_WARNING, path, line, 0, "'random-seed-mode' has been deprecated, ignoring."); else if (streq(field, "beep")) r = free_and_strdup(&config->beep, p); else { @@ 
-710,7 +709,7 @@ static int boot_entry_load_unified( if (!tmp.root) return log_oom(); - tmp.kernel = strdup(skip_leading_chars(k, "/")); + tmp.kernel = path_make_absolute(k, "/"); if (!tmp.kernel) return log_oom(); @@ -978,6 +977,12 @@ static int boot_config_find(const BootConfig *config, const char *id) { if (!id) return -1; + if (id[0] == '@') { + if (!strcaseeq(id, "@saved")) + return -1; + id = config->entry_selected; + } + for (size_t i = 0; i < config->n_entries; i++) if (fnmatch(id, config->entries[i].id, FNM_CASEFOLD) == 0) return i; @@ -1262,7 +1267,11 @@ static void boot_entry_file_list( int status = chase_symlinks_and_access(p, root, CHASE_PREFIX_ROOT|CHASE_PROHIBIT_SYMLINKS, F_OK, NULL, NULL); - printf("%13s%s ", strempty(field), field ? ":" : " "); + /* Note that this shows two '/' between the root and the file. This is intentional to highlight (in + * the absence of color support) to the user that the boot loader is only interested in the second + * part of the file. */ + printf("%13s%s %s%s/%s", strempty(field), field ? ":" : " ", ansi_grey(), root, ansi_normal()); + if (status < 0) { errno = -status; printf("%s%s%s (%m)\n", ansi_highlight_red(), p, ansi_normal()); @@ -1314,14 +1323,21 @@ int show_boot_entry( if (e->id) printf(" id: %s\n", e->id); if (e->path) { - _cleanup_free_ char *link = NULL; + _cleanup_free_ char *text = NULL, *link = NULL; + + const char *p = e->root ? path_startswith(e->path, e->root) : NULL; + if (p) { + text = strjoin(ansi_grey(), e->root, "/", ansi_normal(), "/", p); + if (!text) + return log_oom(); + } /* Let's urlify the link to make it easy to view in an editor, but only if it is a text * file. Unified images are binary ELFs, and EFI variables are not pure text either. 
*/ if (e->type == BOOT_ENTRY_CONF) - (void) terminal_urlify_path(e->path, NULL, &link); + (void) terminal_urlify_path(e->path, text, &link); - printf(" source: %s\n", link ?: e->path); + printf(" source: %s\n", link ?: text ?: e->path); } if (e->tries_left != UINT_MAX) { printf(" tries: %u left", e->tries_left); diff --git a/src/shared/bootspec.h b/src/shared/bootspec.h index 7f5d496b95..ac4d1890b0 100644 --- a/src/shared/bootspec.h +++ b/src/shared/bootspec.h @@ -57,7 +57,6 @@ typedef struct BootConfig { char *auto_entries; char *auto_firmware; char *console_mode; - char *random_seed_mode; char *beep; char *entry_oneshot; diff --git a/src/shared/bpf-compat.h b/src/shared/bpf-compat.h index 04ade82fc1..9ccb7d8205 100644 --- a/src/shared/bpf-compat.h +++ b/src/shared/bpf-compat.h @@ -25,7 +25,7 @@ struct bpf_map_create_opts; * - before the compat static inline helpers that use them. * When removing this file move these back to bpf-dlopen.h */ extern int (*sym_bpf_map_create)(enum bpf_map_type, const char *, __u32, __u32, __u32, const struct bpf_map_create_opts *); -extern bool (*sym_libbpf_probe_bpf_prog_type)(enum bpf_prog_type, const void *); +extern int (*sym_libbpf_probe_bpf_prog_type)(enum bpf_prog_type, const void *); /* compat symbols removed in libbpf 1.0 */ extern int (*sym_bpf_create_map)(enum bpf_map_type, int key_size, int value_size, int max_entries, __u32 map_flags); diff --git a/src/shared/bpf-dlopen.c b/src/shared/bpf-dlopen.c index 2556053cbb..15301aee60 100644 --- a/src/shared/bpf-dlopen.c +++ b/src/shared/bpf-dlopen.c @@ -6,8 +6,20 @@ #include "strv.h" #if HAVE_LIBBPF -struct bpf_link* (*sym_bpf_program__attach_cgroup)(struct bpf_program *, int); -struct bpf_link* (*sym_bpf_program__attach_lsm)(struct bpf_program *); + +/* libbpf changed types of function prototypes around, so we need to disable some type checking for older + * libbpf. We consider everything older than 0.7 too old for accurate type checks. 
*/ +#if defined(__LIBBPF_CURRENT_VERSION_GEQ) +#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7) +#define MODERN_LIBBPF 1 +#endif +#endif +#if !defined(MODERN_LIBBPF) +#define MODERN_LIBBPF 0 +#endif + +struct bpf_link* (*sym_bpf_program__attach_cgroup)(const struct bpf_program *, int); +struct bpf_link* (*sym_bpf_program__attach_lsm)(const struct bpf_program *); int (*sym_bpf_link__fd)(const struct bpf_link *); int (*sym_bpf_link__destroy)(struct bpf_link *); int (*sym_bpf_map__fd)(const struct bpf_map *); @@ -22,7 +34,7 @@ int (*sym_bpf_object__load_skeleton)(struct bpf_object_skeleton *); int (*sym_bpf_object__attach_skeleton)(struct bpf_object_skeleton *); void (*sym_bpf_object__detach_skeleton)(struct bpf_object_skeleton *); void (*sym_bpf_object__destroy_skeleton)(struct bpf_object_skeleton *); -bool (*sym_libbpf_probe_bpf_prog_type)(enum bpf_prog_type, const void *); +int (*sym_libbpf_probe_bpf_prog_type)(enum bpf_prog_type, const void *); const char* (*sym_bpf_program__name)(const struct bpf_program *); libbpf_print_fn_t (*sym_libbpf_set_print)(libbpf_print_fn_t); long (*sym_libbpf_get_error)(const void *); @@ -49,6 +61,8 @@ int dlopen_bpf(void) { void *dl; int r; + DISABLE_WARNING_DEPRECATED_DECLARATIONS; + dl = dlopen("libbpf.so.1", RTLD_LAZY); if (!dl) { /* libbpf < 1.0.0 (we rely on 0.1.0+) provide most symbols we care about, but @@ -61,14 +75,29 @@ int dlopen_bpf(void) { "neither libbpf.so.1 nor libbpf.so.0 are installed: %s", dlerror()); /* symbols deprecated in 1.0 we use as compat */ - r = dlsym_many_or_warn(dl, LOG_DEBUG, + r = dlsym_many_or_warn( + dl, LOG_DEBUG, +#if MODERN_LIBBPF + /* Don't exist anymore in new libbpf, hence cannot type check them */ + DLSYM_ARG_FORCE(bpf_create_map), + DLSYM_ARG_FORCE(bpf_probe_prog_type)); +#else DLSYM_ARG(bpf_create_map), DLSYM_ARG(bpf_probe_prog_type)); +#endif } else { /* symbols available from 0.7.0 */ - r = dlsym_many_or_warn(dl, LOG_DEBUG, + r = dlsym_many_or_warn( + dl, LOG_DEBUG, +#if MODERN_LIBBPF 
DLSYM_ARG(bpf_map_create), - DLSYM_ARG(libbpf_probe_bpf_prog_type)); + DLSYM_ARG(libbpf_probe_bpf_prog_type) +#else + /* These symbols did not exist in old libbpf, hence we cannot type check them */ + DLSYM_ARG_FORCE(bpf_map_create), + DLSYM_ARG_FORCE(libbpf_probe_bpf_prog_type) +#endif + ); } r = dlsym_many_or_warn( @@ -86,8 +115,14 @@ int dlopen_bpf(void) { DLSYM_ARG(bpf_object__attach_skeleton), DLSYM_ARG(bpf_object__detach_skeleton), DLSYM_ARG(bpf_object__destroy_skeleton), +#if MODERN_LIBBPF DLSYM_ARG(bpf_program__attach_cgroup), DLSYM_ARG(bpf_program__attach_lsm), +#else + /* libbpf added a "const" to function parameters where it should not have, ignore this type incompatibility */ + DLSYM_ARG_FORCE(bpf_program__attach_cgroup), + DLSYM_ARG_FORCE(bpf_program__attach_lsm), +#endif DLSYM_ARG(bpf_program__name), DLSYM_ARG(libbpf_set_print), DLSYM_ARG(libbpf_get_error)); @@ -96,6 +131,9 @@ int dlopen_bpf(void) { /* We set the print helper unconditionally. Otherwise libbpf will emit not useful log messages. 
*/ (void) sym_libbpf_set_print(bpf_print_func); + + REENABLE_WARNING; + return r; } diff --git a/src/shared/bpf-dlopen.h b/src/shared/bpf-dlopen.h index 95951e63e0..0750abc56b 100644 --- a/src/shared/bpf-dlopen.h +++ b/src/shared/bpf-dlopen.h @@ -8,8 +8,8 @@ #include "bpf-compat.h" -extern struct bpf_link* (*sym_bpf_program__attach_cgroup)(struct bpf_program *, int); -extern struct bpf_link* (*sym_bpf_program__attach_lsm)(struct bpf_program *); +extern struct bpf_link* (*sym_bpf_program__attach_cgroup)(const struct bpf_program *, int); +extern struct bpf_link* (*sym_bpf_program__attach_lsm)(const struct bpf_program *); extern int (*sym_bpf_link__fd)(const struct bpf_link *); extern int (*sym_bpf_link__destroy)(struct bpf_link *); extern int (*sym_bpf_map__fd)(const struct bpf_map *); diff --git a/src/shared/btrfs-util.c b/src/shared/btrfs-util.c index 92a9bcde4f..4574a7899e 100644 --- a/src/shared/btrfs-util.c +++ b/src/shared/btrfs-util.c @@ -245,7 +245,6 @@ int btrfs_clone_range(int infd, uint64_t in_offset, int outfd, uint64_t out_offs assert(infd >= 0); assert(outfd >= 0); - assert(sz > 0); r = fd_verify_regular(outfd); if (r < 0) diff --git a/src/shared/bus-print-properties.c b/src/shared/bus-print-properties.c index 27b6f88cd0..b0267427fa 100644 --- a/src/shared/bus-print-properties.c +++ b/src/shared/bus-print-properties.c @@ -162,7 +162,7 @@ static int bus_print_property(const char *name, const char *expected_value, sd_b bus_print_property_value(name, expected_value, flags, "[not set]"); - else if ((STR_IN_SET(name, "DefaultMemoryLow", "DefaultMemoryMin", "MemoryLow", "MemoryHigh", "MemoryMax", "MemorySwapMax", "MemoryLimit", "MemoryAvailable") && u == CGROUP_LIMIT_MAX) || + else if ((STR_IN_SET(name, "DefaultMemoryLow", "DefaultMemoryMin", "MemoryLow", "MemoryHigh", "MemoryMax", "MemorySwapMax", "MemoryZSwapMax", "MemoryLimit", "MemoryAvailable") && u == CGROUP_LIMIT_MAX) || (STR_IN_SET(name, "TasksMax", "DefaultTasksMax") && u == UINT64_MAX) || 
(startswith(name, "Limit") && u == UINT64_MAX) || (startswith(name, "DefaultLimit") && u == UINT64_MAX)) diff --git a/src/shared/bus-unit-util.c b/src/shared/bus-unit-util.c index 922011eccd..6b6383b60b 100644 --- a/src/shared/bus-unit-util.c +++ b/src/shared/bus-unit-util.c @@ -523,6 +523,7 @@ static int bus_append_cgroup_property(sd_bus_message *m, const char *field, cons "MemoryHigh", "MemoryMax", "MemorySwapMax", + "MemoryZSwapMax", "MemoryLimit", "TasksMax")) { diff --git a/src/shared/cryptsetup-util.c b/src/shared/cryptsetup-util.c index 401e7a3f9c..f697429852 100644 --- a/src/shared/cryptsetup-util.c +++ b/src/shared/cryptsetup-util.c @@ -49,6 +49,18 @@ int (*sym_crypt_token_max)(const char *type); #endif crypt_token_info (*sym_crypt_token_status)(struct crypt_device *cd, int token, const char **type); int (*sym_crypt_volume_key_get)(struct crypt_device *cd, int keyslot, char *volume_key, size_t *volume_key_size, const char *passphrase, size_t passphrase_size); +#if HAVE_CRYPT_REENCRYPT_INIT_BY_PASSPHRASE +int (*sym_crypt_reencrypt_init_by_passphrase)(struct crypt_device *cd, const char *name, const char *passphrase, size_t passphrase_size, int keyslot_old, int keyslot_new, const char *cipher, const char *cipher_mode, const struct crypt_params_reencrypt *params); +#endif +#if HAVE_CRYPT_REENCRYPT +int (*sym_crypt_reencrypt)(struct crypt_device *cd, int (*progress)(uint64_t size, uint64_t offset, void *usrptr)); +#endif +int (*sym_crypt_metadata_locking)(struct crypt_device *cd, int enable); +#if HAVE_CRYPT_SET_DATA_OFFSET +int (*sym_crypt_set_data_offset)(struct crypt_device *cd, uint64_t data_offset); +#endif +int (*sym_crypt_header_restore)(struct crypt_device *cd, const char *requested_type, const char *backup_file); +int (*sym_crypt_volume_key_keyring)(struct crypt_device *cd, int enable); static void cryptsetup_log_glue(int level, const char *msg, void *usrptr) { @@ -193,6 +205,14 @@ int dlopen_cryptsetup(void) { #if HAVE_LIBCRYPTSETUP int r; + /* 
libcryptsetup added crypt_reencrypt() in 2.2.0, and marked it obsolete in 2.4.0, replacing it with + * crypt_reencrypt_run(), which takes one extra argument but is otherwise identical. The old call is + * still available though, and given we want to support 2.2.0 for a while longer, we'll stick to the + * old symbol. However, the old symbol now has a GCC deprecation decorator, hence let's turn off + * warnings about this for now. */ + + DISABLE_WARNING_DEPRECATED_DECLARATIONS; + + r = dlopen_many_sym_or_warn( &cryptsetup_dl, "libcryptsetup.so.12", LOG_DEBUG, DLSYM_ARG(crypt_activate_by_passphrase), @@ -234,10 +254,24 @@ int dlopen_cryptsetup(void) { DLSYM_ARG(crypt_token_max), #endif DLSYM_ARG(crypt_token_status), - DLSYM_ARG(crypt_volume_key_get)); + DLSYM_ARG(crypt_volume_key_get), +#if HAVE_CRYPT_REENCRYPT_INIT_BY_PASSPHRASE + DLSYM_ARG(crypt_reencrypt_init_by_passphrase), +#endif +#if HAVE_CRYPT_REENCRYPT + DLSYM_ARG(crypt_reencrypt), +#endif + DLSYM_ARG(crypt_metadata_locking), +#if HAVE_CRYPT_SET_DATA_OFFSET + DLSYM_ARG(crypt_set_data_offset), +#endif + DLSYM_ARG(crypt_header_restore), + DLSYM_ARG(crypt_volume_key_keyring)); if (r <= 0) return r; + REENABLE_WARNING; + /* Redirect the default logging calls of libcryptsetup to our own logging infra. (Note that * libcryptsetup also maintains per-"struct crypt_device" log functions, which we'll also set * whenever allocating a "struct crypt_device" context. Why set both? 
To be defensive: maybe some diff --git a/src/shared/cryptsetup-util.h b/src/shared/cryptsetup-util.h index b390dc9a5c..5ff439d9c2 100644 --- a/src/shared/cryptsetup-util.h +++ b/src/shared/cryptsetup-util.h @@ -64,6 +64,18 @@ static inline int crypt_token_max(_unused_ const char *type) { #endif extern crypt_token_info (*sym_crypt_token_status)(struct crypt_device *cd, int token, const char **type); extern int (*sym_crypt_volume_key_get)(struct crypt_device *cd, int keyslot, char *volume_key, size_t *volume_key_size, const char *passphrase, size_t passphrase_size); +#if HAVE_CRYPT_REENCRYPT_INIT_BY_PASSPHRASE +extern int (*sym_crypt_reencrypt_init_by_passphrase)(struct crypt_device *cd, const char *name, const char *passphrase, size_t passphrase_size, int keyslot_old, int keyslot_new, const char *cipher, const char *cipher_mode, const struct crypt_params_reencrypt *params); +#endif +#if HAVE_CRYPT_REENCRYPT +extern int (*sym_crypt_reencrypt)(struct crypt_device *cd, int (*progress)(uint64_t size, uint64_t offset, void *usrptr)); +#endif +extern int (*sym_crypt_metadata_locking)(struct crypt_device *cd, int enable); +#if HAVE_CRYPT_SET_DATA_OFFSET +extern int (*sym_crypt_set_data_offset)(struct crypt_device *cd, uint64_t data_offset); +#endif +extern int (*sym_crypt_header_restore)(struct crypt_device *cd, const char *requested_type, const char *backup_file); +extern int (*sym_crypt_volume_key_keyring)(struct crypt_device *cd, int enable); DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(struct crypt_device *, crypt_free, NULL); DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(struct crypt_device *, sym_crypt_free, NULL); diff --git a/src/shared/dissect-image.c b/src/shared/dissect-image.c index 8418ed6dc3..708355872f 100644 --- a/src/shared/dissect-image.c +++ b/src/shared/dissect-image.c @@ -333,6 +333,27 @@ static int open_partition(const char *node, bool is_partition, const LoopDevice return TAKE_FD(fd); } +static int compare_arch(Architecture a, Architecture b) { + if (a == b) + return 0; + + 
if (a == native_architecture()) + return 1; + + if (b == native_architecture()) + return -1; + +#ifdef ARCHITECTURE_SECONDARY + if (a == ARCHITECTURE_SECONDARY) + return 1; + + if (b == ARCHITECTURE_SECONDARY) + return -1; +#endif + + return 0; +} + static int dissect_image( DissectedImage *m, int fd, @@ -606,10 +627,9 @@ static int dissect_image( } if (is_gpt) { - PartitionDesignator designator = _PARTITION_DESIGNATOR_INVALID; - Architecture architecture = _ARCHITECTURE_INVALID; const char *stype, *sid, *fstype = NULL, *label; sd_id128_t type_id, id; + GptPartitionType type; bool rw = true, growfs = false; sid = blkid_partition_get_uuid(pp); @@ -624,9 +644,11 @@ static int dissect_image( if (sd_id128_from_string(stype, &type_id) < 0) continue; + type = gpt_partition_type_from_uuid(type_id); + label = blkid_partition_get_name(pp); /* libblkid returns NULL here if empty */ - if (sd_id128_equal(type_id, SD_GPT_HOME)) { + if (type.designator == PARTITION_HOME) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -634,11 +656,10 @@ static int dissect_image( if (pflags & SD_GPT_FLAG_NO_AUTO) continue; - designator = PARTITION_HOME; rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (sd_id128_equal(type_id, SD_GPT_SRV)) { + } else if (type.designator == PARTITION_SRV) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -646,11 +667,10 @@ static int dissect_image( if (pflags & SD_GPT_FLAG_NO_AUTO) continue; - designator = PARTITION_SRV; rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (sd_id128_equal(type_id, SD_GPT_ESP)) { + } else if (type.designator == PARTITION_ESP) { /* Note that we don't check the SD_GPT_FLAG_NO_AUTO flag for the ESP, as it is * not defined there. 
We instead check the SD_GPT_FLAG_NO_BLOCK_IO_PROTOCOL, as @@ -660,10 +680,9 @@ static int dissect_image( if (pflags & SD_GPT_FLAG_NO_BLOCK_IO_PROTOCOL) continue; - designator = PARTITION_ESP; fstype = "vfat"; - } else if (sd_id128_equal(type_id, SD_GPT_XBOOTLDR)) { + } else if (type.designator == PARTITION_XBOOTLDR) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -671,11 +690,10 @@ static int dissect_image( if (pflags & SD_GPT_FLAG_NO_AUTO) continue; - designator = PARTITION_XBOOTLDR; rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (gpt_partition_type_is_root(type_id)) { + } else if (type.designator == PARTITION_ROOT) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -687,12 +705,10 @@ static int dissect_image( if (!sd_id128_is_null(root_uuid) && !sd_id128_equal(root_uuid, id)) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_root_of_arch(architecture); rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (gpt_partition_type_is_root_verity(type_id)) { + } else if (type.designator == PARTITION_ROOT_VERITY) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY); @@ -712,12 +728,10 @@ static int dissect_image( if (!sd_id128_is_null(root_verity_uuid) && !sd_id128_equal(root_verity_uuid, id)) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_verity_of(partition_root_of_arch(architecture)); fstype = "DM_verity_hash"; rw = false; - } else if (gpt_partition_type_is_root_verity_sig(type_id)) { + } else if (type.designator == PARTITION_ROOT_VERITY_SIG) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY); @@ -732,12 +746,10 @@ static int dissect_image( if (verity->designator >= 0 && 
verity->designator != PARTITION_ROOT) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_verity_sig_of(partition_root_of_arch(architecture)); fstype = "verity_hash_signature"; rw = false; - } else if (gpt_partition_type_is_usr(type_id)) { + } else if (type.designator == PARTITION_USR) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -749,12 +761,10 @@ static int dissect_image( if (!sd_id128_is_null(usr_uuid) && !sd_id128_equal(usr_uuid, id)) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_usr_of_arch(architecture); rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (gpt_partition_type_is_usr_verity(type_id)) { + } else if (type.designator == PARTITION_USR_VERITY) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY); @@ -773,12 +783,10 @@ static int dissect_image( if (!sd_id128_is_null(usr_verity_uuid) && !sd_id128_equal(usr_verity_uuid, id)) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_verity_of(partition_usr_of_arch(architecture)); fstype = "DM_verity_hash"; rw = false; - } else if (gpt_partition_type_is_usr_verity_sig(type_id)) { + } else if (type.designator == PARTITION_USR_VERITY_SIG) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY); @@ -793,21 +801,18 @@ static int dissect_image( if (verity->designator >= 0 && verity->designator != PARTITION_USR) continue; - assert_se((architecture = gpt_partition_type_uuid_to_arch(type_id)) >= 0); - designator = partition_verity_sig_of(partition_usr_of_arch(architecture)); fstype = "verity_hash_signature"; rw = false; - } else if (sd_id128_equal(type_id, SD_GPT_SWAP)) { + } else if (type.designator == PARTITION_SWAP) { check_partition_flags(node, pflags, 
SD_GPT_FLAG_NO_AUTO); if (pflags & SD_GPT_FLAG_NO_AUTO) continue; - designator = PARTITION_SWAP; - - } else if (sd_id128_equal(type_id, SD_GPT_LINUX_GENERIC)) { + /* We don't have a designator for SD_GPT_LINUX_GENERIC so check the UUID instead. */ + } else if (sd_id128_equal(type.uuid, SD_GPT_LINUX_GENERIC)) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -827,7 +832,7 @@ static int dissect_image( return -ENOMEM; } - } else if (sd_id128_equal(type_id, SD_GPT_TMP)) { + } else if (type.designator == PARTITION_TMP) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -835,11 +840,10 @@ static int dissect_image( if (pflags & SD_GPT_FLAG_NO_AUTO) continue; - designator = PARTITION_TMP; rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); - } else if (sd_id128_equal(type_id, SD_GPT_VAR)) { + } else if (type.designator == PARTITION_VAR) { check_partition_flags(node, pflags, SD_GPT_FLAG_NO_AUTO | SD_GPT_FLAG_READ_ONLY | SD_GPT_FLAG_GROWFS); @@ -868,30 +872,33 @@ static int dissect_image( } } - designator = PARTITION_VAR; rw = !(pflags & SD_GPT_FLAG_READ_ONLY); growfs = FLAGS_SET(pflags, SD_GPT_FLAG_GROWFS); } - if (designator != _PARTITION_DESIGNATOR_INVALID) { + if (type.designator != _PARTITION_DESIGNATOR_INVALID) { _cleanup_free_ char *t = NULL, *o = NULL, *l = NULL; _cleanup_close_ int mount_node_fd = -1; const char *options = NULL; - if (m->partitions[designator].found) { + if (m->partitions[type.designator].found) { /* For most partition types the first one we see wins. Except for the * rootfs and /usr, where we do a version compare of the label, and * let the newest version win. This permits a simple A/B versioning * scheme in OS images. 
*/ - if (!partition_designator_is_versioned(designator) || - strverscmp_improved(m->partitions[designator].label, label) >= 0) + if (compare_arch(type.arch, m->partitions[type.designator].architecture) <= 0) + continue; + + if (!partition_designator_is_versioned(type.designator) || + strverscmp_improved(m->partitions[type.designator].label, label) >= 0) continue; - dissected_partition_done(m->partitions + designator); + dissected_partition_done(m->partitions + type.designator); } - if (FLAGS_SET(flags, DISSECT_IMAGE_OPEN_PARTITION_DEVICES)) { + if (FLAGS_SET(flags, DISSECT_IMAGE_OPEN_PARTITION_DEVICES) && + type.designator != PARTITION_SWAP) { mount_node_fd = open_partition(node, /* is_partition = */ true, m->loop); if (mount_node_fd < 0) return mount_node_fd; @@ -909,19 +916,19 @@ static int dissect_image( return -ENOMEM; } - options = mount_options_from_designator(mount_options, designator); + options = mount_options_from_designator(mount_options, type.designator); if (options) { o = strdup(options); if (!o) return -ENOMEM; } - m->partitions[designator] = (DissectedPartition) { + m->partitions[type.designator] = (DissectedPartition) { .found = true, .partno = nr, .rw = rw, .growfs = growfs, - .architecture = architecture, + .architecture = type.arch, .node = TAKE_PTR(node), .fstype = TAKE_PTR(t), .label = TAKE_PTR(l), @@ -1001,121 +1008,21 @@ static int dissect_image( } } - if (m->partitions[PARTITION_ROOT].found) { - /* If we found the primary arch, then invalidate the secondary and other arch to avoid any - * ambiguities, since we never want to mount the secondary or other arch in this case. 
*/ - m->partitions[PARTITION_ROOT_SECONDARY].found = false; - m->partitions[PARTITION_ROOT_SECONDARY_VERITY].found = false; - m->partitions[PARTITION_ROOT_SECONDARY_VERITY_SIG].found = false; - m->partitions[PARTITION_USR_SECONDARY].found = false; - m->partitions[PARTITION_USR_SECONDARY_VERITY].found = false; - m->partitions[PARTITION_USR_SECONDARY_VERITY_SIG].found = false; - - m->partitions[PARTITION_ROOT_OTHER].found = false; - m->partitions[PARTITION_ROOT_OTHER_VERITY].found = false; - m->partitions[PARTITION_ROOT_OTHER_VERITY_SIG].found = false; - m->partitions[PARTITION_USR_OTHER].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY_SIG].found = false; - - } else if (m->partitions[PARTITION_ROOT_VERITY].found || - m->partitions[PARTITION_ROOT_VERITY_SIG].found) - return -EADDRNOTAVAIL; /* Verity found but no matching rootfs? Something is off, refuse. */ - - else if (m->partitions[PARTITION_ROOT_SECONDARY].found) { - - /* No root partition found but there's one for the secondary architecture? Then upgrade - * secondary arch to first and invalidate the other arch. 
*/ - - log_debug("No root partition found of the native architecture, falling back to a root " - "partition of the secondary architecture."); - - m->partitions[PARTITION_ROOT] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_SECONDARY]); - m->partitions[PARTITION_ROOT_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_SECONDARY_VERITY]); - m->partitions[PARTITION_ROOT_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_SECONDARY_VERITY_SIG]); - - m->partitions[PARTITION_USR] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY]); - m->partitions[PARTITION_USR_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY_VERITY]); - m->partitions[PARTITION_USR_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY_VERITY_SIG]); - - m->partitions[PARTITION_ROOT_OTHER].found = false; - m->partitions[PARTITION_ROOT_OTHER_VERITY].found = false; - m->partitions[PARTITION_ROOT_OTHER_VERITY_SIG].found = false; - m->partitions[PARTITION_USR_OTHER].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY_SIG].found = false; - - } else if (m->partitions[PARTITION_ROOT_SECONDARY_VERITY].found || - m->partitions[PARTITION_ROOT_SECONDARY_VERITY_SIG].found) - return -EADDRNOTAVAIL; /* as above */ - - else if (m->partitions[PARTITION_ROOT_OTHER].found) { - - /* No root or secondary partition found but there's one for another architecture? Then - * upgrade the other architecture to first. 
*/ - - log_debug("No root partition found of the native architecture or the secondary architecture, " - "falling back to a root partition of a non-native architecture (%s).", - architecture_to_string(m->partitions[PARTITION_ROOT_OTHER].architecture)); - - m->partitions[PARTITION_ROOT] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_OTHER]); - m->partitions[PARTITION_ROOT_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_OTHER_VERITY]); - m->partitions[PARTITION_ROOT_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_ROOT_OTHER_VERITY_SIG]); - - m->partitions[PARTITION_USR] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER]); - m->partitions[PARTITION_USR_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER_VERITY]); - m->partitions[PARTITION_USR_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER_VERITY_SIG]); - } + if (!m->partitions[PARTITION_ROOT].found && + (m->partitions[PARTITION_ROOT_VERITY].found || + m->partitions[PARTITION_ROOT_VERITY_SIG].found)) + return -EADDRNOTAVAIL; /* Verity found but no matching rootfs? Something is off, refuse. */ /* Hmm, we found a signature partition but no Verity data? Something is off. 
*/ if (m->partitions[PARTITION_ROOT_VERITY_SIG].found && !m->partitions[PARTITION_ROOT_VERITY].found) return -EADDRNOTAVAIL; - if (m->partitions[PARTITION_USR].found) { - /* Invalidate secondary and other arch /usr/ if we found the primary arch */ - m->partitions[PARTITION_USR_SECONDARY].found = false; - m->partitions[PARTITION_USR_SECONDARY_VERITY].found = false; - m->partitions[PARTITION_USR_SECONDARY_VERITY_SIG].found = false; - - m->partitions[PARTITION_USR_OTHER].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY_SIG].found = false; - - } else if (m->partitions[PARTITION_USR_VERITY].found || - m->partitions[PARTITION_USR_VERITY_SIG].found) - return -EADDRNOTAVAIL; /* as above */ - - else if (m->partitions[PARTITION_USR_SECONDARY].found) { - - log_debug("No usr partition found of the native architecture, falling back to a usr " - "partition of the secondary architecture."); - - /* Upgrade secondary arch to primary */ - m->partitions[PARTITION_USR] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY]); - m->partitions[PARTITION_USR_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY_VERITY]); - m->partitions[PARTITION_USR_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_USR_SECONDARY_VERITY_SIG]); + if (!m->partitions[PARTITION_USR].found && + (m->partitions[PARTITION_USR_VERITY].found || + m->partitions[PARTITION_USR_VERITY_SIG].found)) + return -EADDRNOTAVAIL; /* as above */ - m->partitions[PARTITION_USR_OTHER].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY].found = false; - m->partitions[PARTITION_USR_OTHER_VERITY_SIG].found = false; - - } else if (m->partitions[PARTITION_USR_SECONDARY_VERITY].found || - m->partitions[PARTITION_USR_SECONDARY_VERITY_SIG].found) - return -EADDRNOTAVAIL; /* as above */ - - else if (m->partitions[PARTITION_USR_OTHER].found) { - - log_debug("No usr partition found of the native architecture or the secondary architecture, " - "falling 
back to a usr partition of a non-native architecture (%s).", - architecture_to_string(m->partitions[PARTITION_ROOT_OTHER].architecture)); - - /* Upgrade other arch to primary */ - m->partitions[PARTITION_USR] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER]); - m->partitions[PARTITION_USR_VERITY] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER_VERITY]); - m->partitions[PARTITION_USR_VERITY_SIG] = TAKE_PARTITION(m->partitions[PARTITION_USR_OTHER_VERITY_SIG]); - } - - /* Hmm, we found a signature partition but no Verity data? Something is off. */ + /* as above */ if (m->partitions[PARTITION_USR_VERITY_SIG].found && !m->partitions[PARTITION_USR_VERITY].found) return -EADDRNOTAVAIL; diff --git a/src/shared/dlfcn-util.h b/src/shared/dlfcn-util.h index d786d035d7..f8a16433fc 100644 --- a/src/shared/dlfcn-util.h +++ b/src/shared/dlfcn-util.h @@ -19,4 +19,8 @@ int dlopen_many_sym_or_warn_sentinel(void **dlp, const char *filename, int log_l * that each library symbol to resolve will be placed in a variable with the "sym_" prefix, i.e. a symbol * "foobar" is loaded into a variable "sym_foobar". */ #define DLSYM_ARG(arg) \ + ({ assert_cc(__builtin_types_compatible_p(typeof(sym_##arg), typeof(&arg))); &sym_##arg; }), STRINGIFY(arg) + +/* libbpf is a bit confused about type-safety and API compatibility. Provide a macro that can tape over that mess. Sad. 
*/ +#define DLSYM_ARG_FORCE(arg) \ &sym_##arg, STRINGIFY(arg) diff --git a/src/shared/fdisk-util.c b/src/shared/fdisk-util.c new file mode 100644 index 0000000000..1cdf09b18d --- /dev/null +++ b/src/shared/fdisk-util.c @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ + +#include "fd-util.h" +#include "fdisk-util.h" + +#if HAVE_LIBFDISK + +int fdisk_new_context_fd(int fd, bool read_only, struct fdisk_context **ret) { + _cleanup_(fdisk_unref_contextp) struct fdisk_context *c = NULL; + int r; + + assert(ret); + + if (fd < 0) + return -EBADF; + + c = fdisk_new_context(); + if (!c) + return -ENOMEM; + + r = fdisk_assign_device(c, FORMAT_PROC_FD_PATH(fd), read_only); + if (r < 0) + return r; + + *ret = TAKE_PTR(c); + return 0; +} + +#endif diff --git a/src/shared/fdisk-util.h b/src/shared/fdisk-util.h index 64c0c2f324..49cb840c33 100644 --- a/src/shared/fdisk-util.h +++ b/src/shared/fdisk-util.h @@ -12,4 +12,6 @@ DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(struct fdisk_partition*, fdisk_unref_partition, DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(struct fdisk_parttype*, fdisk_unref_parttype, NULL); DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(struct fdisk_table*, fdisk_unref_table, NULL); +int fdisk_new_context_fd(int fd, bool read_only, struct fdisk_context **ret); + #endif diff --git a/src/shared/find-esp.c b/src/shared/find-esp.c index dfe0574aba..fa234c8b5f 100644 --- a/src/shared/find-esp.c +++ b/src/shared/find-esp.c @@ -165,59 +165,65 @@ static int verify_esp_udev( r = sd_device_get_devname(d, &node); if (r < 0) - return log_error_errno(r, "Failed to get device node: %m"); + return log_device_error_errno(d, r, "Failed to get device node: %m"); r = sd_device_get_property_value(d, "ID_FS_TYPE", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); if (!streq(v, "vfat")) - return log_full_errno(searching ? LOG_DEBUG : LOG_ERR, - SYNTHETIC_ERRNO(searching ? 
EADDRNOTAVAIL : ENODEV), - "File system \"%s\" is not FAT.", node ); + return log_device_full_errno(d, + searching ? LOG_DEBUG : LOG_ERR, + SYNTHETIC_ERRNO(searching ? EADDRNOTAVAIL : ENODEV), + "File system \"%s\" is not FAT.", node ); r = sd_device_get_property_value(d, "ID_PART_ENTRY_SCHEME", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_full_errno(d, + searching && r == -ENOENT ? LOG_DEBUG : LOG_ERR, + searching && r == -ENOENT ? SYNTHETIC_ERRNO(EADDRNOTAVAIL) : r, + "Failed to get device property: %m"); if (!streq(v, "gpt")) - return log_full_errno(searching ? LOG_DEBUG : LOG_ERR, - SYNTHETIC_ERRNO(searching ? EADDRNOTAVAIL : ENODEV), - "File system \"%s\" is not on a GPT partition table.", node); + return log_device_full_errno(d, + searching ? LOG_DEBUG : LOG_ERR, + SYNTHETIC_ERRNO(searching ? EADDRNOTAVAIL : ENODEV), + "File system \"%s\" is not on a GPT partition table.", node); r = sd_device_get_property_value(d, "ID_PART_ENTRY_TYPE", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); if (sd_id128_string_equal(v, SD_GPT_ESP) <= 0) - return log_full_errno(searching ? LOG_DEBUG : LOG_ERR, - SYNTHETIC_ERRNO(searching ? EADDRNOTAVAIL : ENODEV), - "File system \"%s\" has wrong type for an EFI System Partition (ESP).", node); + return log_device_full_errno(d, + searching ? LOG_DEBUG : LOG_ERR, + SYNTHETIC_ERRNO(searching ? 
EADDRNOTAVAIL : ENODEV), + "File system \"%s\" has wrong type for an EFI System Partition (ESP).", node); r = sd_device_get_property_value(d, "ID_PART_ENTRY_UUID", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); r = sd_id128_from_string(v, &uuid); if (r < 0) - return log_error_errno(r, "Partition \"%s\" has invalid UUID \"%s\".", node, v); + return log_device_error_errno(d, r, "Partition \"%s\" has invalid UUID \"%s\".", node, v); r = sd_device_get_property_value(d, "ID_PART_ENTRY_NUMBER", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); r = safe_atou32(v, &part); if (r < 0) - return log_error_errno(r, "Failed to parse PART_ENTRY_NUMBER field."); + return log_device_error_errno(d, r, "Failed to parse PART_ENTRY_NUMBER field."); r = sd_device_get_property_value(d, "ID_PART_ENTRY_OFFSET", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); r = safe_atou64(v, &pstart); if (r < 0) - return log_error_errno(r, "Failed to parse PART_ENTRY_OFFSET field."); + return log_device_error_errno(d, r, "Failed to parse PART_ENTRY_OFFSET field."); r = sd_device_get_property_value(d, "ID_PART_ENTRY_SIZE", &v); if (r < 0) - return log_error_errno(r, "Failed to get device property: %m"); + return log_device_error_errno(d, r, "Failed to get device property: %m"); r = safe_atou64(v, &psize); if (r < 0) - return log_error_errno(r, "Failed to parse PART_ENTRY_SIZE field."); + return log_device_error_errno(d, r, "Failed to parse PART_ENTRY_SIZE field."); if (ret_part) *ret_part = part; @@ -572,10 +578,11 @@ static int verify_xbootldr_blkid( else if (r != 0) return log_error_errno(errno ?: SYNTHETIC_ERRNO(EIO), "%s: Failed to probe file system: %m", node); - errno = 0; r = 
blkid_probe_lookup_value(b, "PART_ENTRY_SCHEME", &type, NULL); if (r != 0) - return log_error_errno(errno ?: SYNTHETIC_ERRNO(EIO), "%s: Failed to probe PART_ENTRY_SCHEME: %m", node); + return log_full_errno(searching ? LOG_DEBUG : LOG_ERR, + searching ? SYNTHETIC_ERRNO(EADDRNOTAVAIL) : SYNTHETIC_ERRNO(EIO), + "%s: Failed to probe PART_ENTRY_SCHEME: %m", node); if (streq(type, "gpt")) { errno = 0; @@ -634,11 +641,14 @@ static int verify_xbootldr_udev( r = sd_device_get_devname(d, &node); if (r < 0) - return log_error_errno(r, "Failed to get device node: %m"); + return log_device_error_errno(d, r, "Failed to get device node: %m"); r = sd_device_get_property_value(d, "ID_PART_ENTRY_SCHEME", &type); if (r < 0) - return log_device_error_errno(d, r, "Failed to query ID_PART_ENTRY_SCHEME: %m"); + return log_device_full_errno(d, + searching && r == -ENOENT ? LOG_DEBUG : LOG_ERR, + searching && r == -ENOENT ? SYNTHETIC_ERRNO(EADDRNOTAVAIL) : r, + "Failed to query ID_PART_ENTRY_SCHEME: %m"); if (streq(type, "gpt")) { diff --git a/src/shared/gpt.c b/src/shared/gpt.c index af969ff9d5..99795530bd 100644 --- a/src/shared/gpt.c +++ b/src/shared/gpt.c @@ -23,23 +23,11 @@ bool partition_designator_is_versioned(PartitionDesignator d) { return IN_SET(d, PARTITION_ROOT, - PARTITION_ROOT_SECONDARY, - PARTITION_ROOT_OTHER, PARTITION_USR, - PARTITION_USR_SECONDARY, - PARTITION_USR_OTHER, PARTITION_ROOT_VERITY, - PARTITION_ROOT_SECONDARY_VERITY, - PARTITION_ROOT_OTHER_VERITY, PARTITION_USR_VERITY, - PARTITION_USR_SECONDARY_VERITY, - PARTITION_USR_OTHER_VERITY, PARTITION_ROOT_VERITY_SIG, - PARTITION_ROOT_SECONDARY_VERITY_SIG, - PARTITION_ROOT_OTHER_VERITY_SIG, - PARTITION_USR_VERITY_SIG, - PARTITION_USR_SECONDARY_VERITY_SIG, - PARTITION_USR_OTHER_VERITY_SIG); + PARTITION_USR_VERITY_SIG); } PartitionDesignator partition_verity_of(PartitionDesignator p) { @@ -48,21 +36,9 @@ PartitionDesignator partition_verity_of(PartitionDesignator p) { case PARTITION_ROOT: return PARTITION_ROOT_VERITY; - 
case PARTITION_ROOT_SECONDARY: - return PARTITION_ROOT_SECONDARY_VERITY; - - case PARTITION_ROOT_OTHER: - return PARTITION_ROOT_OTHER_VERITY; - case PARTITION_USR: return PARTITION_USR_VERITY; - case PARTITION_USR_SECONDARY: - return PARTITION_USR_SECONDARY_VERITY; - - case PARTITION_USR_OTHER: - return PARTITION_USR_OTHER_VERITY; - default: return _PARTITION_DESIGNATOR_INVALID; } @@ -74,97 +50,36 @@ PartitionDesignator partition_verity_sig_of(PartitionDesignator p) { case PARTITION_ROOT: return PARTITION_ROOT_VERITY_SIG; - case PARTITION_ROOT_SECONDARY: - return PARTITION_ROOT_SECONDARY_VERITY_SIG; - - case PARTITION_ROOT_OTHER: - return PARTITION_ROOT_OTHER_VERITY_SIG; - case PARTITION_USR: return PARTITION_USR_VERITY_SIG; - case PARTITION_USR_SECONDARY: - return PARTITION_USR_SECONDARY_VERITY_SIG; - - case PARTITION_USR_OTHER: - return PARTITION_USR_OTHER_VERITY_SIG; - default: return _PARTITION_DESIGNATOR_INVALID; } } -PartitionDesignator partition_root_of_arch(Architecture arch) { - switch (arch) { - - case native_architecture(): - return PARTITION_ROOT; - -#ifdef ARCHITECTURE_SECONDARY - case ARCHITECTURE_SECONDARY: - return PARTITION_ROOT_SECONDARY; -#endif - default: - return PARTITION_ROOT_OTHER; - } -} - -PartitionDesignator partition_usr_of_arch(Architecture arch) { - switch (arch) { - - case native_architecture(): - return PARTITION_USR; - -#ifdef ARCHITECTURE_SECONDARY - case ARCHITECTURE_SECONDARY: - return PARTITION_USR_SECONDARY; -#endif - - default: - return PARTITION_USR_OTHER; - } -} - -static const char *const partition_designator_table[] = { +static const char *const partition_designator_table[_PARTITION_DESIGNATOR_MAX] = { [PARTITION_ROOT] = "root", - [PARTITION_ROOT_SECONDARY] = "root-secondary", - [PARTITION_ROOT_OTHER] = "root-other", [PARTITION_USR] = "usr", - [PARTITION_USR_SECONDARY] = "usr-secondary", - [PARTITION_USR_OTHER] = "usr-other", [PARTITION_HOME] = "home", [PARTITION_SRV] = "srv", [PARTITION_ESP] = "esp", [PARTITION_XBOOTLDR] 
= "xbootldr", [PARTITION_SWAP] = "swap", [PARTITION_ROOT_VERITY] = "root-verity", - [PARTITION_ROOT_SECONDARY_VERITY] = "root-secondary-verity", - [PARTITION_ROOT_OTHER_VERITY] = "root-other-verity", [PARTITION_USR_VERITY] = "usr-verity", - [PARTITION_USR_SECONDARY_VERITY] = "usr-secondary-verity", - [PARTITION_USR_OTHER_VERITY] = "usr-other-verity", [PARTITION_ROOT_VERITY_SIG] = "root-verity-sig", - [PARTITION_ROOT_SECONDARY_VERITY_SIG] = "root-secondary-verity-sig", - [PARTITION_ROOT_OTHER_VERITY_SIG] = "root-other-verity-sig", [PARTITION_USR_VERITY_SIG] = "usr-verity-sig", - [PARTITION_USR_SECONDARY_VERITY_SIG] = "usr-secondary-verity-sig", - [PARTITION_USR_OTHER_VERITY_SIG] = "usr-other-verity-sig", [PARTITION_TMP] = "tmp", [PARTITION_VAR] = "var", - [PARTITION_USER_HOME] = "user-home", - [PARTITION_LINUX_GENERIC] = "linux-generic", }; DEFINE_STRING_TABLE_LOOKUP(partition_designator, PartitionDesignator); -static const char *const partition_mountpoint_table[] = { +static const char *const partition_mountpoint_table[_PARTITION_DESIGNATOR_MAX] = { [PARTITION_ROOT] = "/\0", - [PARTITION_ROOT_SECONDARY] = "/\0", - [PARTITION_ROOT_OTHER] = "/\0", [PARTITION_USR] = "/usr\0", - [PARTITION_USR_SECONDARY] = "/usr\0", - [PARTITION_USR_OTHER] = "/usr\0", [PARTITION_HOME] = "/home\0", [PARTITION_SRV] = "/srv\0", [PARTITION_ESP] = "/efi\0/boot\0", @@ -176,12 +91,12 @@ static const char *const partition_mountpoint_table[] = { DEFINE_PRIVATE_STRING_TABLE_LOOKUP_TO_STRING(partition_mountpoint, PartitionDesignator); #define _GPT_ARCH_SEXTET(arch, name) \ - { SD_GPT_ROOT_##arch, "root-" name, ARCHITECTURE_##arch, .designator = PARTITION_ROOT_OTHER }, \ - { SD_GPT_ROOT_##arch##_VERITY, "root-" name "-verity", ARCHITECTURE_##arch, .designator = PARTITION_ROOT_OTHER_VERITY }, \ - { SD_GPT_ROOT_##arch##_VERITY_SIG, "root-" name "-verity-sig", ARCHITECTURE_##arch, .designator = PARTITION_ROOT_OTHER_VERITY_SIG }, \ - { SD_GPT_USR_##arch, "usr-" name, ARCHITECTURE_##arch, .designator = 
PARTITION_USR_OTHER }, \ - { SD_GPT_USR_##arch##_VERITY, "usr-" name "-verity", ARCHITECTURE_##arch, .designator = PARTITION_USR_OTHER_VERITY }, \ - { SD_GPT_USR_##arch##_VERITY_SIG, "usr-" name "-verity-sig", ARCHITECTURE_##arch, .designator = PARTITION_USR_OTHER_VERITY_SIG } + { SD_GPT_ROOT_##arch, "root-" name, ARCHITECTURE_##arch, .designator = PARTITION_ROOT }, \ + { SD_GPT_ROOT_##arch##_VERITY, "root-" name "-verity", ARCHITECTURE_##arch, .designator = PARTITION_ROOT_VERITY }, \ + { SD_GPT_ROOT_##arch##_VERITY_SIG, "root-" name "-verity-sig", ARCHITECTURE_##arch, .designator = PARTITION_ROOT_VERITY_SIG }, \ + { SD_GPT_USR_##arch, "usr-" name, ARCHITECTURE_##arch, .designator = PARTITION_USR }, \ + { SD_GPT_USR_##arch##_VERITY, "usr-" name "-verity", ARCHITECTURE_##arch, .designator = PARTITION_USR_VERITY }, \ + { SD_GPT_USR_##arch##_VERITY_SIG, "usr-" name "-verity-sig", ARCHITECTURE_##arch, .designator = PARTITION_USR_VERITY_SIG } const GptPartitionType gpt_partition_type_table[] = { _GPT_ARCH_SEXTET(ALPHA, "alpha"), @@ -212,12 +127,12 @@ const GptPartitionType gpt_partition_type_table[] = { { SD_GPT_USR_NATIVE_VERITY_SIG, "usr-verity-sig", native_architecture(), .designator = PARTITION_USR_VERITY_SIG }, #endif #ifdef SD_GPT_ROOT_SECONDARY - { SD_GPT_ROOT_NATIVE, "root-secondary", native_architecture(), .designator = PARTITION_ROOT_SECONDARY }, - { SD_GPT_ROOT_NATIVE_VERITY, "root-secondary-verity", native_architecture(), .designator = PARTITION_ROOT_SECONDARY_VERITY }, - { SD_GPT_ROOT_NATIVE_VERITY_SIG, "root-secondary-verity-sig", native_architecture(), .designator = PARTITION_ROOT_SECONDARY_VERITY_SIG }, - { SD_GPT_USR_NATIVE, "usr-secondary", native_architecture(), .designator = PARTITION_USR_SECONDARY }, - { SD_GPT_USR_NATIVE_VERITY, "usr-secondary-verity", native_architecture(), .designator = PARTITION_USR_SECONDARY_VERITY }, - { SD_GPT_USR_NATIVE_VERITY_SIG, "usr-secondary-verity-sig", native_architecture(), .designator = 
PARTITION_USR_SECONDARY_VERITY_SIG }, + { SD_GPT_ROOT_NATIVE, "root-secondary", native_architecture(), .designator = PARTITION_ROOT }, + { SD_GPT_ROOT_NATIVE_VERITY, "root-secondary-verity", native_architecture(), .designator = PARTITION_ROOT_VERITY }, + { SD_GPT_ROOT_NATIVE_VERITY_SIG, "root-secondary-verity-sig", native_architecture(), .designator = PARTITION_ROOT_VERITY_SIG }, + { SD_GPT_USR_NATIVE, "usr-secondary", native_architecture(), .designator = PARTITION_USR }, + { SD_GPT_USR_NATIVE_VERITY, "usr-secondary-verity", native_architecture(), .designator = PARTITION_USR_VERITY }, + { SD_GPT_USR_NATIVE_VERITY_SIG, "usr-secondary-verity-sig", native_architecture(), .designator = PARTITION_USR_VERITY_SIG }, #endif { SD_GPT_ESP, "esp", _ARCHITECTURE_INVALID, .designator = PARTITION_ESP }, @@ -227,8 +142,8 @@ const GptPartitionType gpt_partition_type_table[] = { { SD_GPT_SRV, "srv", _ARCHITECTURE_INVALID, .designator = PARTITION_SRV }, { SD_GPT_VAR, "var", _ARCHITECTURE_INVALID, .designator = PARTITION_VAR }, { SD_GPT_TMP, "tmp", _ARCHITECTURE_INVALID, .designator = PARTITION_TMP }, - { SD_GPT_USER_HOME, "user-home", _ARCHITECTURE_INVALID, .designator = PARTITION_USER_HOME }, - { SD_GPT_LINUX_GENERIC, "linux-generic", _ARCHITECTURE_INVALID, .designator = PARTITION_LINUX_GENERIC }, + { SD_GPT_USER_HOME, "user-home", _ARCHITECTURE_INVALID, .designator = _PARTITION_DESIGNATOR_INVALID }, + { SD_GPT_LINUX_GENERIC, "linux-generic", _ARCHITECTURE_INVALID, .designator = _PARTITION_DESIGNATOR_INVALID }, {} }; @@ -266,17 +181,27 @@ const char *gpt_partition_type_uuid_to_string_harder( return sd_id128_to_uuid_string(id, buffer); } -int gpt_partition_type_uuid_from_string(const char *s, sd_id128_t *ret) { +int gpt_partition_type_from_string(const char *s, GptPartitionType *ret) { + sd_id128_t id; + int r; + assert(s); for (size_t i = 0; i < ELEMENTSOF(gpt_partition_type_table) - 1; i++) if (streq(s, gpt_partition_type_table[i].name)) { if (ret) - *ret = 
gpt_partition_type_table[i].uuid; + *ret = gpt_partition_type_table[i]; return 0; } - return sd_id128_from_string(s, ret); + r = sd_id128_from_string(s, &id); + if (r < 0) + return r; + + if (ret) + *ret = gpt_partition_type_from_uuid(id); + + return 0; } Architecture gpt_partition_type_uuid_to_arch(sd_id128_t id) { @@ -299,7 +224,7 @@ int gpt_partition_label_valid(const char *s) { return char16_strlen(recoded) <= GPT_LABEL_MAX; } -static GptPartitionType gpt_partition_type_from_uuid(sd_id128_t id) { +GptPartitionType gpt_partition_type_from_uuid(sd_id128_t id) { const GptPartitionType *pt; pt = gpt_partition_type_find_by_uuid(id); @@ -313,91 +238,45 @@ static GptPartitionType gpt_partition_type_from_uuid(sd_id128_t id) { }; } -bool gpt_partition_type_is_root(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_ROOT, - PARTITION_ROOT_SECONDARY, - PARTITION_ROOT_OTHER); +const char *gpt_partition_type_mountpoint_nulstr(GptPartitionType type) { + return partition_mountpoint_to_string(type.designator); } -bool gpt_partition_type_is_root_verity(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, +bool gpt_partition_type_knows_read_only(GptPartitionType type) { + return IN_SET(type.designator, + PARTITION_ROOT, + PARTITION_USR, + /* pretty much implied, but let's set the bit to make things really clear */ PARTITION_ROOT_VERITY, - PARTITION_ROOT_SECONDARY_VERITY, - PARTITION_ROOT_OTHER_VERITY); -} - -bool gpt_partition_type_is_root_verity_sig(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_ROOT_VERITY_SIG, - PARTITION_ROOT_SECONDARY_VERITY_SIG, - PARTITION_ROOT_OTHER_VERITY_SIG); + PARTITION_USR_VERITY, + PARTITION_HOME, + PARTITION_SRV, + PARTITION_VAR, + PARTITION_TMP, + PARTITION_XBOOTLDR); } -bool gpt_partition_type_is_usr(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, +bool gpt_partition_type_knows_growfs(GptPartitionType type) { + 
return IN_SET(type.designator, + PARTITION_ROOT, PARTITION_USR, - PARTITION_USR_SECONDARY, - PARTITION_USR_OTHER); + PARTITION_HOME, + PARTITION_SRV, + PARTITION_VAR, + PARTITION_TMP, + PARTITION_XBOOTLDR); } -bool gpt_partition_type_is_usr_verity(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, +bool gpt_partition_type_knows_no_auto(GptPartitionType type) { + return IN_SET(type.designator, + PARTITION_ROOT, + PARTITION_ROOT_VERITY, + PARTITION_USR, PARTITION_USR_VERITY, - PARTITION_USR_SECONDARY_VERITY, - PARTITION_USR_OTHER_VERITY); -} - -bool gpt_partition_type_is_usr_verity_sig(sd_id128_t id) { - return IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_USR_VERITY_SIG, - PARTITION_USR_SECONDARY_VERITY_SIG, - PARTITION_USR_OTHER_VERITY_SIG); -} - -const char *gpt_partition_type_mountpoint_nulstr(sd_id128_t id) { - PartitionDesignator d = gpt_partition_type_from_uuid(id).designator; - if (d < 0) - return NULL; - - return partition_mountpoint_to_string(d); -} - -bool gpt_partition_type_knows_read_only(sd_id128_t id) { - return gpt_partition_type_is_root(id) || - gpt_partition_type_is_usr(id) || - /* pretty much implied, but let's set the bit to make things really clear */ - gpt_partition_type_is_root_verity(id) || - gpt_partition_type_is_usr_verity(id) || - IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_HOME, - PARTITION_SRV, - PARTITION_VAR, - PARTITION_TMP, - PARTITION_XBOOTLDR); -} - -bool gpt_partition_type_knows_growfs(sd_id128_t id) { - return gpt_partition_type_is_root(id) || - gpt_partition_type_is_usr(id) || - IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_HOME, - PARTITION_SRV, - PARTITION_VAR, - PARTITION_TMP, - PARTITION_XBOOTLDR); -} - -bool gpt_partition_type_knows_no_auto(sd_id128_t id) { - return gpt_partition_type_is_root(id) || - gpt_partition_type_is_root_verity(id) || - gpt_partition_type_is_usr(id) || - gpt_partition_type_is_usr_verity(id) || - 
IN_SET(gpt_partition_type_from_uuid(id).designator, - PARTITION_HOME, - PARTITION_SRV, - PARTITION_VAR, - PARTITION_TMP, - PARTITION_XBOOTLDR, - PARTITION_SWAP); + PARTITION_HOME, + PARTITION_SRV, + PARTITION_VAR, + PARTITION_TMP, + PARTITION_XBOOTLDR, + PARTITION_SWAP); } diff --git a/src/shared/gpt.h b/src/shared/gpt.h index e0ab44a642..03af12c9e3 100644 --- a/src/shared/gpt.h +++ b/src/shared/gpt.h @@ -12,32 +12,18 @@ typedef enum PartitionDesignator { PARTITION_ROOT, /* Primary architecture */ - PARTITION_ROOT_SECONDARY, /* Secondary architecture */ - PARTITION_ROOT_OTHER, /* Any architecture not covered by the primary or secondary architecture. */ PARTITION_USR, - PARTITION_USR_SECONDARY, - PARTITION_USR_OTHER, PARTITION_HOME, PARTITION_SRV, PARTITION_ESP, PARTITION_XBOOTLDR, PARTITION_SWAP, PARTITION_ROOT_VERITY, /* verity data for the PARTITION_ROOT partition */ - PARTITION_ROOT_SECONDARY_VERITY, /* verity data for the PARTITION_ROOT_SECONDARY partition */ - PARTITION_ROOT_OTHER_VERITY, PARTITION_USR_VERITY, - PARTITION_USR_SECONDARY_VERITY, - PARTITION_USR_OTHER_VERITY, PARTITION_ROOT_VERITY_SIG, /* PKCS#7 signature for root hash for the PARTITION_ROOT partition */ - PARTITION_ROOT_SECONDARY_VERITY_SIG, /* ditto for the PARTITION_ROOT_SECONDARY partition */ - PARTITION_ROOT_OTHER_VERITY_SIG, PARTITION_USR_VERITY_SIG, - PARTITION_USR_SECONDARY_VERITY_SIG, - PARTITION_USR_OTHER_VERITY_SIG, PARTITION_TMP, PARTITION_VAR, - PARTITION_USER_HOME, - PARTITION_LINUX_GENERIC, _PARTITION_DESIGNATOR_MAX, _PARTITION_DESIGNATOR_INVALID = -EINVAL, } PartitionDesignator; @@ -46,8 +32,6 @@ bool partition_designator_is_versioned(PartitionDesignator d); PartitionDesignator partition_verity_of(PartitionDesignator p); PartitionDesignator partition_verity_sig_of(PartitionDesignator p); -PartitionDesignator partition_root_of_arch(Architecture arch); -PartitionDesignator partition_usr_of_arch(Architecture arch); const char* partition_designator_to_string(PartitionDesignator d) 
_const_; PartitionDesignator partition_designator_from_string(const char *name) _pure_; @@ -56,7 +40,6 @@ const char *gpt_partition_type_uuid_to_string(sd_id128_t id); const char *gpt_partition_type_uuid_to_string_harder( sd_id128_t id, char buffer[static SD_ID128_UUID_STRING_MAX]); -int gpt_partition_type_uuid_from_string(const char *s, sd_id128_t *ret); #define GPT_PARTITION_TYPE_UUID_TO_STRING_HARDER(id) \ gpt_partition_type_uuid_to_string_harder((id), (char[SD_ID128_UUID_STRING_MAX]) {}) @@ -74,15 +57,11 @@ extern const GptPartitionType gpt_partition_type_table[]; int gpt_partition_label_valid(const char *s); -bool gpt_partition_type_is_root(sd_id128_t id); -bool gpt_partition_type_is_root_verity(sd_id128_t id); -bool gpt_partition_type_is_root_verity_sig(sd_id128_t id); -bool gpt_partition_type_is_usr(sd_id128_t id); -bool gpt_partition_type_is_usr_verity(sd_id128_t id); -bool gpt_partition_type_is_usr_verity_sig(sd_id128_t id); +GptPartitionType gpt_partition_type_from_uuid(sd_id128_t id); +int gpt_partition_type_from_string(const char *s, GptPartitionType *ret); -const char *gpt_partition_type_mountpoint_nulstr(sd_id128_t id); +const char *gpt_partition_type_mountpoint_nulstr(GptPartitionType type); -bool gpt_partition_type_knows_read_only(sd_id128_t id); -bool gpt_partition_type_knows_growfs(sd_id128_t id); -bool gpt_partition_type_knows_no_auto(sd_id128_t id); +bool gpt_partition_type_knows_read_only(GptPartitionType type); +bool gpt_partition_type_knows_growfs(GptPartitionType type); +bool gpt_partition_type_knows_no_auto(GptPartitionType type); diff --git a/src/shared/idn-util.c b/src/shared/idn-util.c index d4108d0c8e..6f36688dc0 100644 --- a/src/shared/idn-util.c +++ b/src/shared/idn-util.c @@ -17,7 +17,7 @@ static void* idn_dl = NULL; #if HAVE_LIBIDN2 int (*sym_idn2_lookup_u8)(const uint8_t* src, uint8_t** lookupname, int flags) = NULL; -const char *(*sym_idn2_strerror)(int rc) = NULL; +const char *(*sym_idn2_strerror)(int rc) _const_ = NULL; int 
(*sym_idn2_to_unicode_8z8z)(const char * input, char ** output, int flags) = NULL; int dlopen_idn(void) { @@ -31,7 +31,7 @@ int dlopen_idn(void) { #if HAVE_LIBIDN int (*sym_idna_to_ascii_4i)(const uint32_t * in, size_t inlen, char *out, int flags); -int (*sym_idna_to_unicode_44i)(const uint32_t * in, size_t inlen,uint32_t * out, size_t * outlen, int flags); +int (*sym_idna_to_unicode_44i)(const uint32_t * in, size_t inlen, uint32_t * out, size_t * outlen, int flags); char* (*sym_stringprep_ucs4_to_utf8)(const uint32_t * str, ssize_t len, size_t * items_read, size_t * items_written); uint32_t* (*sym_stringprep_utf8_to_ucs4)(const char *str, ssize_t len, size_t *items_written); diff --git a/src/shared/idn-util.h b/src/shared/idn-util.h index 4698eed3b8..e64bd99747 100644 --- a/src/shared/idn-util.h +++ b/src/shared/idn-util.h @@ -20,7 +20,7 @@ static inline int dlopen_idn(void) { #if HAVE_LIBIDN2 extern int (*sym_idn2_lookup_u8)(const uint8_t* src, uint8_t** lookupname, int flags); -extern const char *(*sym_idn2_strerror)(int rc); +extern const char *(*sym_idn2_strerror)(int rc) _const_; extern int (*sym_idn2_to_unicode_8z8z)(const char * input, char ** output, int flags); #endif diff --git a/src/shared/meson.build b/src/shared/meson.build index 5f66b865de..3be7ba17bf 100644 --- a/src/shared/meson.build +++ b/src/shared/meson.build @@ -125,7 +125,6 @@ shared_sources = files( 'exit-status.h', 'extension-release.c', 'extension-release.h', - 'fdisk-util.h', 'fdset.c', 'fdset.h', 'fileio-label.c', @@ -493,3 +492,18 @@ libshared = shared_library( dependencies : libshared_deps, install : true, install_dir : rootpkglibdir) + +shared_fdisk_sources = files( + 'fdisk-util.h', + 'fdisk-util.c', +) + +if get_option('fdisk') != 'false' + libshared_fdisk = static_library( + 'shared-fdisk', + shared_fdisk_sources, + include_directories : includes, + dependencies : [libfdisk], + c_args : ['-fvisibility=default'], + build_by_default : false) +endif diff --git a/src/shared/mkfs-util.c 
b/src/shared/mkfs-util.c index 8161dbf825..3edeaa5285 100644 --- a/src/shared/mkfs-util.c +++ b/src/shared/mkfs-util.c @@ -2,13 +2,20 @@ #include <unistd.h> +#include "dirent-util.h" +#include "fd-util.h" +#include "fileio.h" +#include "fs-util.h" #include "id128-util.h" #include "mkfs-util.h" #include "mountpoint-util.h" #include "path-util.h" #include "process-util.h" +#include "recurse-dir.h" +#include "stat-util.h" #include "stdio-util.h" #include "string-util.h" +#include "tmpfile-util.h" #include "utf8.h" int mkfs_exists(const char *fstype) { @@ -33,6 +40,10 @@ int mkfs_exists(const char *fstype) { return true; } +int mkfs_supports_root_option(const char *fstype) { + return fstype_is_ro(fstype) || STR_IN_SET(fstype, "ext2", "ext3", "ext4", "btrfs", "vfat", "xfs"); +} + static int mangle_linux_fs_label(const char *s, size_t max_len, char **ret) { /* Not more than max_len bytes (12 or 16) */ @@ -87,6 +98,202 @@ static int mangle_fat_label(const char *s, char **ret) { return 0; } +static int setup_userns(uid_t uid, gid_t gid) { + int r; + + /* mkfs programs tend to keep ownership intact when bootstrapping themselves from a root directory. + * However, we'd like for the files to be owned by root instead, so we fork off a user namespace and + * inside of it, map the uid/gid of the root directory to root in the user namespace. mkfs programs + * will pick up on this and the files will be owned by root in the generated filesystem. 
*/ + + r = write_string_filef("/proc/self/uid_map", WRITE_STRING_FILE_DISABLE_BUFFER, + UID_FMT " " UID_FMT " " UID_FMT, 0u, uid, 1u); + if (r < 0) + return log_error_errno(r, + "Failed to write mapping for "UID_FMT" to /proc/self/uid_map: %m", + uid); + + r = write_string_file("/proc/self/setgroups", "deny", WRITE_STRING_FILE_DISABLE_BUFFER); + if (r < 0) + return log_error_errno(r, "Failed to write 'deny' to /proc/self/setgroups: %m"); + + r = write_string_filef("/proc/self/gid_map", WRITE_STRING_FILE_DISABLE_BUFFER, + GID_FMT " " GID_FMT " " GID_FMT, 0u, gid, 1u); + if (r < 0) + return log_error_errno(r, + "Failed to write mapping for "GID_FMT" to /proc/self/gid_map: %m", + gid); + + return 0; +} + +static int do_mcopy(const char *node, const char *root) { + _cleanup_free_ char *mcopy = NULL; + _cleanup_strv_free_ char **argv = NULL; + _cleanup_close_ int rfd = -1; + _cleanup_free_ DirectoryEntries *de = NULL; + struct stat st; + int r; + + assert(node); + assert(root); + + /* Return early if there's nothing to copy. */ + if (dir_is_empty(root, /*ignore_hidden_or_backup=*/ false)) + return 0; + + r = find_executable("mcopy", &mcopy); + if (r == -ENOENT) + return log_error_errno(SYNTHETIC_ERRNO(EPROTONOSUPPORT), "Could not find mcopy binary."); + if (r < 0) + return log_error_errno(r, "Failed to determine whether mcopy binary exists: %m"); + + argv = strv_new(mcopy, "-s", "-p", "-Q", "-m", "-i", node); + if (!argv) + return log_oom(); + + /* mcopy copies the top level directory instead of everything in it so we have to pass all + * the subdirectories to mcopy instead to end up with the correct directory structure. 
*/ + + rfd = open(root, O_RDONLY|O_DIRECTORY|O_CLOEXEC); + if (rfd < 0) + return log_error_errno(errno, "Failed to open directory '%s': %m", root); + + r = readdir_all(rfd, RECURSE_DIR_SORT|RECURSE_DIR_ENSURE_TYPE, &de); + if (r < 0) + return log_error_errno(r, "Failed to read '%s' contents: %m", root); + + for (size_t i = 0; i < de->n_entries; i++) { + char *p = path_join(root, de->entries[i]->d_name); + if (!p) + return log_oom(); + + if (!IN_SET(de->entries[i]->d_type, DT_REG, DT_DIR)) { + log_debug("%s is not a file/directory which are the only file types supported by vfat, ignoring", p); + continue; + } + + r = strv_consume(&argv, TAKE_PTR(p)); + if (r < 0) + return log_oom(); + } + + r = strv_extend(&argv, "::"); + if (r < 0) + return log_oom(); + + if (fstat(rfd, &st) < 0) + return log_error_errno(errno, "Failed to stat '%s': %m", root); + + r = safe_fork("(mcopy)", FORK_RESET_SIGNALS|FORK_RLIMIT_NOFILE_SAFE|FORK_DEATHSIG|FORK_LOG|FORK_WAIT|FORK_STDOUT_TO_STDERR|FORK_NEW_USERNS|FORK_CLOSE_ALL_FDS, NULL); + if (r < 0) + return r; + if (r == 0) { + r = setup_userns(st.st_uid, st.st_gid); + if (r < 0) + _exit(EXIT_FAILURE); + + /* Avoid failures caused by mismatch in expectations between mkfs.vfat and mcopy by disabling + * the stricter mcopy checks using MTOOLS_SKIP_CHECK. */ + execve(mcopy, argv, STRV_MAKE("MTOOLS_SKIP_CHECK=1")); + + log_error_errno(errno, "Failed to execute mcopy: %m"); + + _exit(EXIT_FAILURE); + } + + return 0; +} + +static int protofile_print_item( + RecurseDirEvent event, + const char *path, + int dir_fd, + int inode_fd, + const struct dirent *de, + const struct statx *sx, + void *userdata) { + + FILE *f = ASSERT_PTR(userdata); + int r; + + if (event == RECURSE_DIR_LEAVE) { + fputs("$\n", f); + return 0; + } + + if (!IN_SET(event, RECURSE_DIR_ENTER, RECURSE_DIR_ENTRY)) + return RECURSE_DIR_CONTINUE; + + char type = S_ISDIR(sx->stx_mode) ? 'd' : + S_ISREG(sx->stx_mode) ? '-' : + S_ISLNK(sx->stx_mode) ? 'l' : + S_ISFIFO(sx->stx_mode) ? 
'p' : + S_ISBLK(sx->stx_mode) ? 'b' : + S_ISCHR(sx->stx_mode) ? 'c' : 0; + if (type == 0) + return RECURSE_DIR_CONTINUE; + + fprintf(f, "%s %c%c%c%03o 0 0 ", + de->d_name, + type, + sx->stx_mode & S_ISUID ? 'u' : '-', + sx->stx_mode & S_ISGID ? 'g' : '-', + (unsigned) (sx->stx_mode & 0777)); + + if (S_ISREG(sx->stx_mode)) + fputs(path, f); + else if (S_ISLNK(sx->stx_mode)) { + _cleanup_free_ char *p = NULL; + + r = readlinkat_malloc(dir_fd, de->d_name, &p); + if (r < 0) + return log_error_errno(r, "Failed to read symlink %s: %m", path); + + fputs(p, f); + } else if (S_ISBLK(sx->stx_mode) || S_ISCHR(sx->stx_mode)) + fprintf(f, "%" PRIu32 " %" PRIu32, sx->stx_rdev_major, sx->stx_rdev_minor); + + fputc('\n', f); + + return RECURSE_DIR_CONTINUE; +} + +static int make_protofile(const char *root, char **ret) { + _cleanup_fclose_ FILE *f = NULL; + _cleanup_(unlink_and_freep) char *p = NULL; + const char *vt; + int r; + + assert(ret); + + r = var_tmp_dir(&vt); + if (r < 0) + return log_error_errno(r, "Failed to get persistent temporary directory: %m"); + + r = fopen_temporary_child(vt, &f, &p); + if (r < 0) + return log_error_errno(r, "Failed to open temporary file: %m"); + + fputs("/\n" + "0 0\n" + "d--755 0 0\n", f); + + r = recurse_dir_at(AT_FDCWD, root, STATX_TYPE|STATX_MODE, UINT_MAX, RECURSE_DIR_SORT, protofile_print_item, f); + if (r < 0) + return log_error_errno(r, "Failed to recurse through %s: %m", root); + + fputs("$\n", f); + + r = fflush_and_check(f); + if (r < 0) + return log_error_errno(r, "Failed to flush %s: %m", p); + + *ret = TAKE_PTR(p); + + return 0; +} + int make_filesystem( const char *node, const char *fstype, @@ -96,7 +303,10 @@ int make_filesystem( bool discard) { _cleanup_free_ char *mkfs = NULL, *mangled_label = NULL; + _cleanup_strv_free_ char **argv = NULL; + _cleanup_(unlink_and_freep) char *protofile = NULL; char vol_id[CONST_MAX(SD_ID128_UUID_STRING_MAX, 8U + 1U)] = {}; + struct stat st; int r; assert(node); @@ -128,9 +338,9 @@ int 
make_filesystem( "Don't know how to create read-only file system '%s', refusing.", fstype); } else { - if (root) + if (root && !mkfs_supports_root_option(fstype)) return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), - "Populating with source tree is only supported for read-only filesystems"); + "Populating with source tree is not supported for %s", fstype); r = mkfs_exists(fstype); if (r < 0) return log_error_errno(r, "Failed to determine whether mkfs binary for %s exists: %m", fstype); @@ -169,101 +379,161 @@ int make_filesystem( if (isempty(vol_id)) assert_se(sd_id128_to_uuid_string(uuid, vol_id)); - r = safe_fork("(mkfs)", FORK_RESET_SIGNALS|FORK_RLIMIT_NOFILE_SAFE|FORK_DEATHSIG|FORK_LOG|FORK_WAIT|FORK_STDOUT_TO_STDERR, NULL); + /* When changing this conditional, also adjust the log statement below. */ + if (streq(fstype, "ext2")) { + argv = strv_new(mkfs, + "-q", + "-L", label, + "-U", vol_id, + "-I", "256", + "-m", "0", + "-E", discard ? "discard,lazy_itable_init=1" : "nodiscard,lazy_itable_init=1", + node); + if (!argv) + return log_oom(); + + if (root) { + r = strv_extend_strv(&argv, STRV_MAKE("-d", root), false); + if (r < 0) + return log_oom(); + } + + } else if (STR_IN_SET(fstype, "ext3", "ext4")) { + argv = strv_new(mkfs, + "-q", + "-L", label, + "-U", vol_id, + "-I", "256", + "-O", "has_journal", + "-m", "0", + "-E", discard ? 
"discard,lazy_itable_init=1" : "nodiscard,lazy_itable_init=1", + node); + + if (root) { + r = strv_extend_strv(&argv, STRV_MAKE("-d", root), false); + if (r < 0) + return log_oom(); + } + + } else if (streq(fstype, "btrfs")) { + argv = strv_new(mkfs, + "-q", + "-L", label, + "-U", vol_id, + node); + if (!argv) + return log_oom(); + + if (!discard) { + r = strv_extend(&argv, "--nodiscard"); + if (r < 0) + return log_oom(); + } + + if (root) { + r = strv_extend_strv(&argv, STRV_MAKE("-r", root), false); + if (r < 0) + return log_oom(); + } + + } else if (streq(fstype, "f2fs")) { + argv = strv_new(mkfs, + "-q", + "-g", /* "default options" */ + "-f", /* force override, without this it doesn't seem to want to write to an empty partition */ + "-l", label, + "-U", vol_id, + "-t", one_zero(discard), + node); + + } else if (streq(fstype, "xfs")) { + const char *j; + + j = strjoina("uuid=", vol_id); + + argv = strv_new(mkfs, + "-q", + "-L", label, + "-m", j, + "-m", "reflink=1", + node); + if (!argv) + return log_oom(); + + if (!discard) { + r = strv_extend(&argv, "-K"); + if (r < 0) + return log_oom(); + } + + if (root) { + r = make_protofile(root, &protofile); + if (r < 0) + return r; + + r = strv_extend_strv(&argv, STRV_MAKE("-p", protofile), false); + if (r < 0) + return log_oom(); + } + + } else if (streq(fstype, "vfat")) + + argv = strv_new(mkfs, + "-i", vol_id, + "-n", label, + "-F", "32", /* yes, we force FAT32 here */ + node); + + else if (streq(fstype, "swap")) + /* TODO: add --quiet here if + * https://github.com/util-linux/util-linux/issues/1499 resolved. 
*/ + + argv = strv_new(mkfs, + "-L", label, + "-U", vol_id, + node); + + else if (streq(fstype, "squashfs")) + + argv = strv_new(mkfs, + root, node, + "-quiet", + "-noappend"); + else + /* Generic fallback for all other file systems */ + argv = strv_new(mkfs, node); + + if (!argv) + return log_oom(); + + if (root && stat(root, &st) < 0) + return log_error_errno(errno, "Failed to stat %s: %m", root); + + r = safe_fork("(mkfs)", FORK_RESET_SIGNALS|FORK_RLIMIT_NOFILE_SAFE|FORK_DEATHSIG|FORK_LOG|FORK_WAIT|FORK_STDOUT_TO_STDERR|FORK_CLOSE_ALL_FDS|(root ? FORK_NEW_USERNS : 0), NULL); if (r < 0) return r; if (r == 0) { /* Child */ - /* When changing this conditional, also adjust the log statement below. */ - if (streq(fstype, "ext2")) - (void) execlp(mkfs, mkfs, - "-q", - "-L", label, - "-U", vol_id, - "-I", "256", - "-m", "0", - "-E", discard ? "discard,lazy_itable_init=1" : "nodiscard,lazy_itable_init=1", - node, NULL); - - else if (STR_IN_SET(fstype, "ext3", "ext4")) - (void) execlp(mkfs, mkfs, - "-q", - "-L", label, - "-U", vol_id, - "-I", "256", - "-O", "has_journal", - "-m", "0", - "-E", discard ? "discard,lazy_itable_init=1" : "nodiscard,lazy_itable_init=1", - node, NULL); - - else if (streq(fstype, "btrfs")) { - (void) execlp(mkfs, mkfs, - "-q", - "-L", label, - "-U", vol_id, - node, - discard ? NULL : "--nodiscard", - NULL); - - } else if (streq(fstype, "f2fs")) { - (void) execlp(mkfs, mkfs, - "-q", - "-g", /* "default options" */ - "-f", /* force override, without this it doesn't seem to want to write to an empty partition */ - "-l", label, - "-U", vol_id, - "-t", one_zero(discard), - node, - NULL); - - } else if (streq(fstype, "xfs")) { - const char *j; - - j = strjoina("uuid=", vol_id); - - (void) execlp(mkfs, mkfs, - "-q", - "-L", label, - "-m", j, - "-m", "reflink=1", - node, - discard ? 
NULL : "-K", - NULL); - - } else if (streq(fstype, "vfat")) - - (void) execlp(mkfs, mkfs, - "-i", vol_id, - "-n", label, - "-F", "32", /* yes, we force FAT32 here */ - node, NULL); - - else if (streq(fstype, "swap")) - /* TODO: add --quiet here if - * https://github.com/util-linux/util-linux/issues/1499 resolved. */ - - (void) execlp(mkfs, mkfs, - "-L", label, - "-U", vol_id, - node, NULL); - - else if (streq(fstype, "squashfs")) - - (void) execlp(mkfs, mkfs, - root, node, - "-quiet", - "-noappend", - NULL); - else - /* Generic fallback for all other file systems */ - (void) execlp(mkfs, mkfs, node, NULL); + if (root) { + r = setup_userns(st.st_uid, st.st_gid); + if (r < 0) + _exit(EXIT_FAILURE); + } + + execvp(mkfs, argv); log_error_errno(errno, "Failed to execute %s: %m", mkfs); _exit(EXIT_FAILURE); } + if (root && streq(fstype, "vfat")) { + r = do_mcopy(node, root); + if (r < 0) + return r; + } + if (STR_IN_SET(fstype, "ext2", "ext3", "ext4", "btrfs", "f2fs", "xfs", "vfat", "swap")) log_info("%s successfully formatted as %s (label \"%s\", uuid %s)", node, fstype, label, vol_id); diff --git a/src/shared/mkfs-util.h b/src/shared/mkfs-util.h index 3125ef176d..62061c6647 100644 --- a/src/shared/mkfs-util.h +++ b/src/shared/mkfs-util.h @@ -9,4 +9,6 @@ int mkfs_exists(const char *fstype); +int mkfs_supports_root_option(const char *fstype); + int make_filesystem(const char *node, const char *fstype, const char *label, const char *root, sd_id128_t uuid, bool discard); diff --git a/src/shared/mount-util.c b/src/shared/mount-util.c index 1f827e2061..681d698800 100644 --- a/src/shared/mount-util.c +++ b/src/shared/mount-util.c @@ -21,6 +21,7 @@ #include "fs-util.h" #include "glyph-util.h" #include "hashmap.h" +#include "initrd-util.h" #include "label.h" #include "libmount-util.h" #include "missing_mount.h" @@ -489,6 +490,52 @@ int mount_move_root(const char *path) { return RET_NERRNO(chdir("/")); } +int mount_pivot_root(const char *path) { + _cleanup_close_ int fd_oldroot 
= -EBADF, fd_newroot = -EBADF; + + assert(path); + + /* pivot_root() isn't currently supported in the initramfs. */ + if (in_initrd()) + return mount_move_root(path); + + fd_oldroot = open("/", O_PATH|O_DIRECTORY|O_CLOEXEC|O_NOFOLLOW); + if (fd_oldroot < 0) + return log_debug_errno(errno, "Failed to open old rootfs"); + + fd_newroot = open(path, O_PATH|O_DIRECTORY|O_CLOEXEC|O_NOFOLLOW); + if (fd_newroot < 0) + return log_debug_errno(errno, "Failed to open new rootfs '%s': %m", path); + + /* Change into the new rootfs. */ + if (fchdir(fd_newroot) < 0) + return log_debug_errno(errno, "Failed to change into new rootfs '%s': %m", path); + + /* Let the kernel tuck the new root under the old one. */ + if (pivot_root(".", ".") < 0) + return log_debug_errno(errno, "Failed to pivot root to new rootfs '%s': %m", path); + + + /* At this point the new root is tucked under the old root. If we want + * to unmount it we cannot be fchdir()ed into it. So escape back to the + * old root. */ + if (fchdir(fd_oldroot) < 0) + return log_debug_errno(errno, "Failed to change back to old rootfs: %m"); + + /* Note, usually we should set mount propagation up here but we'll + * assume that the caller has already done that. */ + + /* Get rid of the old root and reveal our brand new root. 
*/ + if (umount2(".", MNT_DETACH) < 0) + return log_debug_errno(errno, "Failed to unmount old rootfs: %m"); + + if (fchdir(fd_newroot) < 0) + return log_debug_errno(errno, "Failed to switch to new rootfs '%s': %m", path); + + return 0; +} + + int repeat_unmount(const char *path, int flags) { bool done = false; diff --git a/src/shared/mount-util.h b/src/shared/mount-util.h index 8b07611ec8..29b9ed02f7 100644 --- a/src/shared/mount-util.h +++ b/src/shared/mount-util.h @@ -55,6 +55,7 @@ static inline int bind_remount_recursive(const char *prefix, unsigned long new_f int bind_remount_one_with_mountinfo(const char *path, unsigned long new_flags, unsigned long flags_mask, FILE *proc_self_mountinfo); int mount_move_root(const char *path); +int mount_pivot_root(const char *path); DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(FILE*, endmntent, NULL); #define _cleanup_endmntent_ _cleanup_(endmntentp) diff --git a/src/shared/resize-fs.h b/src/shared/resize-fs.h index 312005f7e2..ac185d180b 100644 --- a/src/shared/resize-fs.h +++ b/src/shared/resize-fs.h @@ -8,7 +8,7 @@ int resize_fs(int fd, uint64_t sz, uint64_t *ret_size); #define BTRFS_MINIMAL_SIZE (256U*1024U*1024U) -#define XFS_MINIMAL_SIZE (14U*1024U*1024U) +#define XFS_MINIMAL_SIZE (16U*1024U*1024U) #define EXT4_MINIMAL_SIZE (1024U*1024U) uint64_t minimal_size_by_fs_magic(statfs_f_type_t magic); diff --git a/src/shared/resolve-util.h b/src/shared/resolve-util.h index e58173d864..7c9008c705 100644 --- a/src/shared/resolve-util.h +++ b/src/shared/resolve-util.h @@ -11,6 +11,9 @@ /* 127.0.0.54 in native endian (The IP address we listen on we only implement "proxy" mode) */ #define INADDR_DNS_PROXY_STUB ((in_addr_t) 0x7f000036U) +/* 127.0.0.2 is an address we always map to the local hostname. 
This is different from 127.0.0.1 which maps to "localhost" */ +#define INADDR_LOCALADDRESS ((in_addr_t) 0x7f000002U) + typedef enum DnsCacheMode DnsCacheMode; enum DnsCacheMode { diff --git a/src/shared/tmpfile-util-label.c b/src/shared/tmpfile-util-label.c index d37c0b0845..17c5038b51 100644 --- a/src/shared/tmpfile-util-label.c +++ b/src/shared/tmpfile-util-label.c @@ -14,6 +14,8 @@ int fopen_temporary_label( int r; + assert(path); + r = mac_selinux_create_file_prepare(target, S_IFREG); if (r < 0) return r; diff --git a/src/shared/tpm2-util.c b/src/shared/tpm2-util.c index ba8a23e18c..327caa439f 100644 --- a/src/shared/tpm2-util.c +++ b/src/shared/tpm2-util.c @@ -152,8 +152,19 @@ int tpm2_context_init(const char *device, struct tpm2_context *ret) { if (r < 0) return log_error_errno(r, "TPM2 support not installed: %m"); - if (!device) + if (!device) { device = secure_getenv("SYSTEMD_TPM2_DEVICE"); + if (device) + /* Setting the env var to an empty string forces tpm2-tss' own device picking + * logic to be used. */ + device = empty_to_null(device); + else + /* If nothing was specified explicitly, we'll use a hardcoded default: the "device" tcti + * driver and the "/dev/tpmrm0" device. We do this since on some distributions the tpm2-abrmd + * might be used and we really don't want that, since it is a system service and that creates + * various ordering issues/deadlocks during early boot. 
*/ + device = "device:/dev/tpmrm0"; + } if (device) { const char *param, *driver, *fn; @@ -163,15 +174,27 @@ int tpm2_context_init(const char *device, struct tpm2_context *ret) { param = strchr(device, ':'); if (param) { + /* Syntax #1: Pair of driver string and arbitrary parameter */ driver = strndupa_safe(device, param - device); + if (isempty(driver)) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "TPM2 driver name is empty, refusing."); + param++; - } else { + } else if (path_is_absolute(device) && path_is_valid(device)) { + /* Syntax #2: TPM device node */ driver = "device"; param = device; - } + } else + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "Invalid TPM2 driver string, refusing."); + + log_debug("Using TPM2 TCTI driver '%s' with device '%s'.", driver, param); fn = strjoina("libtss2-tcti-", driver, ".so.0"); + /* Better safe than sorry, let's refuse strings that cannot possibly be valid driver early, before going to disk. */ + if (!filename_is_valid(fn)) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "TPM2 driver name '%s' not valid, refusing.", driver); + dl = dlopen(fn, RTLD_NOW); if (!dl) return log_error_errno(SYNTHETIC_ERRNO(ENOTRECOVERABLE), "Failed to load %s: %s", fn, dlerror()); @@ -1094,7 +1117,13 @@ static int tpm2_make_policy_session( ESYS_TR_NONE, NULL, &pubkey_tpm2, +#if HAVE_TSS2_ESYS3 + /* tpm2-tss >= 3.0.0 requires a ESYS_TR_RH_* constant specifying the requested + * hierarchy, older versions need TPM2_RH_* instead. 
*/ + ESYS_TR_RH_OWNER, +#else TPM2_RH_OWNER, +#endif &pubkey_handle); if (rc != TSS2_RC_SUCCESS) { r = log_error_errno(SYNTHETIC_ERRNO(ENOTRECOVERABLE), diff --git a/src/shared/utmp-wtmp.c b/src/shared/utmp-wtmp.c index d2c8473c60..37a5bf7990 100644 --- a/src/shared/utmp-wtmp.c +++ b/src/shared/utmp-wtmp.c @@ -12,6 +12,7 @@ #include <utmpx.h> #include "alloc-util.h" +#include "errno-util.h" #include "fd-util.h" #include "hostname-util.h" #include "io-util.h" @@ -292,13 +293,15 @@ static int write_to_terminal(const char *tty, const char *message) { assert(message); fd = open(tty, O_WRONLY|O_NONBLOCK|O_NOCTTY|O_CLOEXEC); - if (fd < 0 || !isatty(fd)) + if (fd < 0) return -errno; + if (!isatty(fd)) + return -ENOTTY; p = message; left = strlen(message); - end = now(CLOCK_MONOTONIC) + TIMEOUT_USEC; + end = usec_add(now(CLOCK_MONOTONIC), TIMEOUT_USEC); while (left > 0) { ssize_t n; @@ -306,19 +309,21 @@ static int write_to_terminal(const char *tty, const char *message) { int k; t = now(CLOCK_MONOTONIC); - if (t >= end) return -ETIME; k = fd_wait_for_event(fd, POLLOUT, end - t); - if (k < 0) + if (k < 0) { + if (ERRNO_IS_TRANSIENT(k)) + continue; return k; + } if (k == 0) return -ETIME; n = write(fd, p, left); if (n < 0) { - if (errno == EAGAIN) + if (ERRNO_IS_TRANSIENT(errno)) continue; return -errno; diff --git a/src/shared/varlink.c b/src/shared/varlink.c index 4f7ac97689..4d2cfee491 100644 --- a/src/shared/varlink.c +++ b/src/shared/varlink.c @@ -1025,7 +1025,7 @@ static void handle_revents(Varlink *v, int revents) { if ((revents & (POLLOUT|POLLHUP)) == 0) return; - varlink_log(v, "Anynchronous connection completed."); + varlink_log(v, "Asynchronous connection completed."); v->connecting = false; } else { /* Note that we don't care much about POLLIN/POLLOUT here, we'll just try reading and writing @@ -1075,6 +1075,9 @@ int varlink_wait(Varlink *v, usec_t timeout) { return events; r = fd_wait_for_event(fd, events, t); + if (r < 0 && ERRNO_IS_TRANSIENT(r)) /* Treat EINTR 
as not a timeout, but also nothing happened, and + * the caller gets a chance to call back into us */ + return 1; if (r <= 0) return r; @@ -1161,8 +1164,12 @@ int varlink_flush(Varlink *v) { } r = fd_wait_for_event(v->fd, POLLOUT, USEC_INFINITY); - if (r < 0) + if (r < 0) { + if (ERRNO_IS_TRANSIENT(r)) + continue; + return varlink_log_errno(v, r, "Poll failed on fd: %m"); + } assert(r != 0); diff --git a/src/stdio-bridge/stdio-bridge.c b/src/stdio-bridge/stdio-bridge.c index 3c5ba074c7..6e8f2bbe3c 100644 --- a/src/stdio-bridge/stdio-bridge.c +++ b/src/stdio-bridge/stdio-bridge.c @@ -242,8 +242,11 @@ static int run(int argc, char *argv[]) { }; r = ppoll_usec(p, ELEMENTSOF(p), t); - if (r < 0) + if (r < 0) { + if (ERRNO_IS_TRANSIENT(r)) /* don't be bothered by signals, i.e. EINTR */ + continue; return log_error_errno(r, "ppoll() failed: %m"); + } } return 0; diff --git a/src/systemctl/systemctl-edit.c b/src/systemctl/systemctl-edit.c index fe47f73d4a..e8e6eeda29 100644 --- a/src/systemctl/systemctl-edit.c +++ b/src/systemctl/systemctl-edit.c @@ -483,6 +483,18 @@ static int trim_edit_markers(const char *path) { strshorten(contents_start, contents_end - contents_start); contents_start = strstrip(contents_start); + if (*contents_start && !endswith(contents_start, "\n")) { + char *tmp = contents_start; + if (MALLOC_SIZEOF_SAFE(contents) - (contents_start - contents) - strlen(contents_start) < 2) { + if ((tmp = realloc(contents, size + 1))) { + contents_start = tmp + (contents_start - contents); + contents = tmp; + } + } + + if (tmp) + strcat(contents_start, "\n"); + } /* Write new contents if the trimming actually changed anything */ if (strlen(contents) != size) { diff --git a/src/systemctl/systemctl-show.c b/src/systemctl/systemctl-show.c index 24c7d564b8..77dd075eb3 100644 --- a/src/systemctl/systemctl-show.c +++ b/src/systemctl/systemctl-show.c @@ -250,6 +250,7 @@ typedef struct UnitStatusInfo { uint64_t memory_high; uint64_t memory_max; uint64_t memory_swap_max; + 
uint64_t memory_zswap_max; uint64_t memory_limit; uint64_t memory_available; uint64_t cpu_usage_nsec; @@ -700,6 +701,7 @@ static void print_status_info( if (i->memory_min > 0 || i->memory_low > 0 || i->memory_high != CGROUP_LIMIT_MAX || i->memory_max != CGROUP_LIMIT_MAX || i->memory_swap_max != CGROUP_LIMIT_MAX || + i->memory_zswap_max != CGROUP_LIMIT_MAX || i->memory_available != CGROUP_LIMIT_MAX || i->memory_limit != CGROUP_LIMIT_MAX) { const char *prefix = ""; @@ -725,6 +727,10 @@ static void print_status_info( printf("%sswap max: %s", prefix, FORMAT_BYTES(i->memory_swap_max)); prefix = " "; } + if (i->memory_zswap_max != CGROUP_LIMIT_MAX) { + printf("%szswap max: %s", prefix, FORMAT_BYTES(i->memory_zswap_max)); + prefix = " "; + } if (i->memory_limit != CGROUP_LIMIT_MAX) { printf("%slimit: %s", prefix, FORMAT_BYTES(i->memory_limit)); prefix = " "; @@ -1935,6 +1941,7 @@ static int show_one( { "MemoryHigh", "t", NULL, offsetof(UnitStatusInfo, memory_high) }, { "MemoryMax", "t", NULL, offsetof(UnitStatusInfo, memory_max) }, { "MemorySwapMax", "t", NULL, offsetof(UnitStatusInfo, memory_swap_max) }, + { "MemoryZSwapMax", "t", NULL, offsetof(UnitStatusInfo, memory_zswap_max) }, { "MemoryLimit", "t", NULL, offsetof(UnitStatusInfo, memory_limit) }, { "CPUUsageNSec", "t", NULL, offsetof(UnitStatusInfo, cpu_usage_nsec) }, { "TasksCurrent", "t", NULL, offsetof(UnitStatusInfo, tasks_current) }, @@ -1969,6 +1976,7 @@ static int show_one( .memory_high = CGROUP_LIMIT_MAX, .memory_max = CGROUP_LIMIT_MAX, .memory_swap_max = CGROUP_LIMIT_MAX, + .memory_zswap_max = CGROUP_LIMIT_MAX, .memory_limit = UINT64_MAX, .memory_available = CGROUP_LIMIT_MAX, .cpu_usage_nsec = UINT64_MAX, diff --git a/src/systemctl/systemctl-start-special.c b/src/systemctl/systemctl-start-special.c index 9363764cd7..edc907c832 100644 --- a/src/systemctl/systemctl-start-special.c +++ b/src/systemctl/systemctl-start-special.c @@ -153,19 +153,8 @@ int verb_start_special(int argc, char *argv[], void *userdata) { 
return r; if (a == ACTION_REBOOT) { - const char *arg = NULL; - - if (argc > 1) { - if (arg_reboot_argument) - return log_error_errno(SYNTHETIC_ERRNO(EINVAL), "Both --reboot-argument= and positional argument passed to reboot command, refusing."); - - log_notice("Positional argument to reboot command is deprecated, please use --reboot-argument= instead. Accepting anyway."); - arg = argv[1]; - } else - arg = arg_reboot_argument; - - if (arg) { - r = update_reboot_parameter_and_warn(arg, false); + if (arg_reboot_argument) { + r = update_reboot_parameter_and_warn(arg_reboot_argument, false); if (r < 0) return r; } diff --git a/src/systemctl/systemctl.c b/src/systemctl/systemctl.c index 3f28bcc3dc..4f2637f0f1 100644 --- a/src/systemctl/systemctl.c +++ b/src/systemctl/systemctl.c @@ -1087,7 +1087,7 @@ static int systemctl_main(int argc, char *argv[]) { { "import-environment", VERB_ANY, VERB_ANY, VERB_ONLINE_ONLY, verb_import_environment }, { "halt", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, { "poweroff", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, - { "reboot", VERB_ANY, 2, VERB_ONLINE_ONLY, verb_start_system_special }, + { "reboot", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, { "kexec", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, { "suspend", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, { "hibernate", VERB_ANY, 1, VERB_ONLINE_ONLY, verb_start_system_special }, diff --git a/src/sysupdate/sysupdate-partition.c b/src/sysupdate/sysupdate-partition.c index fa46574fd6..33d0e584ba 100644 --- a/src/sysupdate/sysupdate-partition.c +++ b/src/sysupdate/sysupdate-partition.c @@ -111,6 +111,7 @@ int read_partition_info( struct fdisk_parttype *pt; uint64_t start, size, flags; sd_id128_t ptid, id; + GptPartitionType type; size_t partno; int r; @@ -178,6 +179,8 @@ int read_partition_info( if (!label_copy) return log_oom(); + type = gpt_partition_type_from_uuid(ptid); + *ret = (PartitionInfo) { .partno = 
partno, .start = start, @@ -187,9 +190,9 @@ int read_partition_info( .uuid = id, .label = TAKE_PTR(label_copy), .device = TAKE_PTR(device), - .no_auto = FLAGS_SET(flags, SD_GPT_FLAG_NO_AUTO) && gpt_partition_type_knows_no_auto(ptid), - .read_only = FLAGS_SET(flags, SD_GPT_FLAG_READ_ONLY) && gpt_partition_type_knows_read_only(ptid), - .growfs = FLAGS_SET(flags, SD_GPT_FLAG_GROWFS) && gpt_partition_type_knows_growfs(ptid), + .no_auto = FLAGS_SET(flags, SD_GPT_FLAG_NO_AUTO) && gpt_partition_type_knows_no_auto(type), + .read_only = FLAGS_SET(flags, SD_GPT_FLAG_READ_ONLY) && gpt_partition_type_knows_read_only(type), + .growfs = FLAGS_SET(flags, SD_GPT_FLAG_GROWFS) && gpt_partition_type_knows_growfs(type), }; return 1; /* found! */ @@ -269,6 +272,7 @@ int patch_partition( _cleanup_(fdisk_unref_partitionp) struct fdisk_partition *pa = NULL; _cleanup_(fdisk_unref_contextp) struct fdisk_context *c = NULL; bool tweak_no_auto, tweak_read_only, tweak_growfs; + GptPartitionType type; int r, fd; assert(device); @@ -313,16 +317,18 @@ int patch_partition( return log_error_errno(r, "Failed to update partition UUID: %m"); } + type = gpt_partition_type_from_uuid(info->type); + /* Tweak the read-only flag, but only if supported by the partition type */ tweak_no_auto = FLAGS_SET(change, PARTITION_NO_AUTO) && - gpt_partition_type_knows_no_auto(info->type); + gpt_partition_type_knows_no_auto(type); tweak_read_only = FLAGS_SET(change, PARTITION_READ_ONLY) && - gpt_partition_type_knows_read_only(info->type); + gpt_partition_type_knows_read_only(type); tweak_growfs = FLAGS_SET(change, PARTITION_GROWFS) && - gpt_partition_type_knows_growfs(info->type); + gpt_partition_type_knows_growfs(type); if (change & PARTITION_FLAGS) { uint64_t flags; diff --git a/src/sysupdate/sysupdate-resource.c b/src/sysupdate/sysupdate-resource.c index 8104e9c82e..2814fdb6fa 100644 --- a/src/sysupdate/sysupdate-resource.c +++ b/src/sysupdate/sysupdate-resource.c @@ -8,6 +8,7 @@ #include "blockdev-util.h" #include 
"chase-symlinks.h" #include "device-util.h" +#include "devnum-util.h" #include "dirent-util.h" #include "env-util.h" #include "fd-util.h" @@ -194,7 +195,7 @@ static int resource_load_from_blockdev(Resource *rr) { continue; /* Check if partition type matches */ - if (rr->partition_type_set && !sd_id128_equal(pinfo.type, rr->partition_type)) + if (rr->partition_type_set && !sd_id128_equal(pinfo.type, rr->partition_type.uuid)) continue; /* A label of "_empty" means "not used so far" for us */ @@ -525,10 +526,14 @@ int resource_resolve_path( assert(rr); if (rr->path_auto) { + struct stat orig_root_stats; - /* NB: we don't actually check the backing device of the root fs "/", but of "/usr", in order - * to support environments where the root fs is a tmpfs, and the OS itself placed exclusively - * in /usr/. */ + /* NB: If the root mount has been replaced by some form of volatile file system (overlayfs), + * the original root block device node is symlinked in /run/systemd/volatile-root. Let's + * follow that link here. If that doesn't exist, we check the backing device of "/usr". We + * don't actually check the backing device of the root fs "/", in order to support + * environments where the root fs is a tmpfs, and the OS itself placed exclusively in + * /usr/. 
*/ if (rr->type != RESOURCE_PARTITION) return log_error_errno(SYNTHETIC_ERRNO(EOPNOTSUPP), @@ -546,7 +551,16 @@ int resource_resolve_path( return log_error_errno(SYNTHETIC_ERRNO(EPERM), "Block device is not allowed when using --root= mode."); - r = get_block_device_harder("/usr/", &d); + r = stat("/run/systemd/volatile-root", &orig_root_stats); + if (r < 0) { + if (errno == ENOENT) /* volatile-root not found */ + r = get_block_device_harder("/usr/", &d); + else + return log_error_errno(errno, "Failed to stat /run/systemd/volatile-root: %m"); + } else if (!S_ISBLK(orig_root_stats.st_mode)) /* symlink was present but not block device */ + return log_error_errno(SYNTHETIC_ERRNO(ENOTBLK), "/run/systemd/volatile-root is not linked to a block device."); + else /* symlink was present and a block device */ + d = orig_root_stats.st_rdev; } else if (rr->type == RESOURCE_PARTITION) { _cleanup_close_ int fd = -1, real_fd = -1; diff --git a/src/sysupdate/sysupdate-resource.h b/src/sysupdate/sysupdate-resource.h index 86be0d3389..3209988c24 100644 --- a/src/sysupdate/sysupdate-resource.h +++ b/src/sysupdate/sysupdate-resource.h @@ -5,8 +5,7 @@ #include <stdbool.h> #include <sys/types.h> -#include "sd-id128.h" - +#include "gpt.h" #include "hashmap.h" #include "macro.h" @@ -74,7 +73,7 @@ struct Resource { char *path; bool path_auto; /* automatically find root path (only available if target resource, not source resource) */ char **patterns; - sd_id128_t partition_type; + GptPartitionType partition_type; bool partition_type_set; /* All instances of this resource we found */ diff --git a/src/sysupdate/sysupdate-transfer.c b/src/sysupdate/sysupdate-transfer.c index d6705cd12e..0c3d65a00d 100644 --- a/src/sysupdate/sysupdate-transfer.c +++ b/src/sysupdate/sysupdate-transfer.c @@ -344,7 +344,7 @@ static int config_parse_resource_ptype( assert(rvalue); - r = gpt_partition_type_uuid_from_string(rvalue, &rr->partition_type); + r = gpt_partition_type_from_string(rvalue, &rr->partition_type);
if (r < 0) { log_syntax(unit, LOG_WARNING, filename, line, r, "Failed parse partition type, ignoring: %s", rvalue); @@ -654,18 +654,18 @@ int transfer_vacuum( if (t->target.n_empty + t->target.n_instances < 2) return log_error_errno(SYNTHETIC_ERRNO(ENOSPC), "Partition table has less than two partition slots of the right type " SD_ID128_UUID_FORMAT_STR " (%s), refusing.", - SD_ID128_FORMAT_VAL(t->target.partition_type), - gpt_partition_type_uuid_to_string(t->target.partition_type)); + SD_ID128_FORMAT_VAL(t->target.partition_type.uuid), + gpt_partition_type_uuid_to_string(t->target.partition_type.uuid)); if (space > t->target.n_empty + t->target.n_instances) return log_error_errno(SYNTHETIC_ERRNO(ENOSPC), "Partition table does not have enough partition slots of right type " SD_ID128_UUID_FORMAT_STR " (%s) for operation.", - SD_ID128_FORMAT_VAL(t->target.partition_type), - gpt_partition_type_uuid_to_string(t->target.partition_type)); + SD_ID128_FORMAT_VAL(t->target.partition_type.uuid), + gpt_partition_type_uuid_to_string(t->target.partition_type.uuid)); if (space == t->target.n_empty + t->target.n_instances) return log_error_errno(SYNTHETIC_ERRNO(ENOSPC), "Asked to empty all partition table slots of the right type " SD_ID128_UUID_FORMAT_STR " (%s), can't allow that. One instance must always remain.", - SD_ID128_FORMAT_VAL(t->target.partition_type), - gpt_partition_type_uuid_to_string(t->target.partition_type)); + SD_ID128_FORMAT_VAL(t->target.partition_type.uuid), + gpt_partition_type_uuid_to_string(t->target.partition_type.uuid)); rm = LESS_BY(space, t->target.n_empty); remain = LESS_BY(t->target.n_instances, rm); @@ -858,7 +858,7 @@ int transfer_acquire_instance(Transfer *t, Instance *i) { r = find_suitable_partition( t->target.path, i->metadata.size, - t->target.partition_type_set ? &t->target.partition_type : NULL, + t->target.partition_type_set ? 
&t->target.partition_type.uuid : NULL, &t->partition_info); if (r < 0) return r; diff --git a/src/test/test-gpt.c b/src/test/test-gpt.c index 8c313c66cc..377b79f155 100644 --- a/src/test/test-gpt.c +++ b/src/test/test-gpt.c @@ -19,13 +19,13 @@ TEST(gpt_types_against_architectures) { for (Architecture a = 0; a < _ARCHITECTURE_MAX; a++) FOREACH_STRING(suffix, "", "-verity", "-verity-sig") { _cleanup_free_ char *joined = NULL; - sd_id128_t id; + GptPartitionType type; joined = strjoin(prefix, architecture_to_string(a), suffix); if (!joined) return (void) log_oom(); - r = gpt_partition_type_uuid_from_string(joined, &id); + r = gpt_partition_type_from_string(joined, &type); if (r < 0) { printf("%s %s\n", RED_CROSS_MARK(), joined); continue; @@ -34,15 +34,15 @@ TEST(gpt_types_against_architectures) { printf("%s %s\n", GREEN_CHECK_MARK(), joined); if (streq(prefix, "root-") && streq(suffix, "")) - assert_se(gpt_partition_type_is_root(id)); + assert_se(type.designator == PARTITION_ROOT); if (streq(prefix, "root-") && streq(suffix, "-verity")) - assert_se(gpt_partition_type_is_root_verity(id)); + assert_se(type.designator == PARTITION_ROOT_VERITY); if (streq(prefix, "usr-") && streq(suffix, "")) - assert_se(gpt_partition_type_is_usr(id)); + assert_se(type.designator == PARTITION_USR); if (streq(prefix, "usr-") && streq(suffix, "-verity")) - assert_se(gpt_partition_type_is_usr_verity(id)); + assert_se(type.designator == PARTITION_USR_VERITY); - assert_se(gpt_partition_type_uuid_to_arch(id) == a); + assert_se(type.arch == a); } } diff --git a/src/test/test-ratelimit.c b/src/test/test-ratelimit.c index d82bda5347..de208c7408 100644 --- a/src/test/test-ratelimit.c +++ b/src/test/test-ratelimit.c @@ -18,7 +18,7 @@ TEST(ratelimit_below) { for (i = 0; i < 10; i++) assert_se(ratelimit_below(&ratelimit)); - ratelimit = (RateLimit) { 0, 10 }; + ratelimit = (const RateLimit) { 0, 10 }; for (i = 0; i < 10000; i++) assert_se(ratelimit_below(&ratelimit)); } diff --git 
a/src/test/test-uid-range.c b/src/test/test-uid-range.c index c759573173..186f6ee29c 100644 --- a/src/test/test-uid-range.c +++ b/src/test/test-uid-range.c @@ -114,7 +114,7 @@ TEST(load_userns) { assert_se(uid_range_covers(p, 0, UINT32_MAX)); } - assert_se(fopen_temporary(NULL, &f, &fn) >= 0); + assert_se(fopen_temporary_child(NULL, &f, &fn) >= 0); fputs("0 0 20\n" "100 0 20\n", f); assert_se(fflush_and_check(f) >= 0); diff --git a/src/timesync/timesyncd-manager.c b/src/timesync/timesyncd-manager.c index da540856ee..5b076157aa 100644 --- a/src/timesync/timesyncd-manager.c +++ b/src/timesync/timesyncd-manager.c @@ -1112,7 +1112,7 @@ int manager_new(Manager **ret) { .server_socket = -1, - .ratelimit = (RateLimit) { + .ratelimit = (const RateLimit) { RATELIMIT_INTERVAL_USEC, RATELIMIT_BURST }, diff --git a/src/tmpfiles/tmpfiles.c b/src/tmpfiles/tmpfiles.c index bf5192c56f..f156d90073 100644 --- a/src/tmpfiles/tmpfiles.c +++ b/src/tmpfiles/tmpfiles.c @@ -1979,7 +1979,7 @@ static int create_fifo(Item *i) { creation = r >= 0 ? CREATION_NORMAL : CREATION_EXISTING; - /* Open the inode via O_PATH, regardless if we managed to create it or not. Maybe it is is already the FIFO we want */ + /* Open the inode via O_PATH, regardless if we managed to create it or not. Maybe it is already the FIFO we want */ fd = openat(pfd, bn, O_NOFOLLOW|O_CLOEXEC|O_PATH); if (fd < 0) { if (r < 0) diff --git a/src/udev/udev-builtin-blkid.c b/src/udev/udev-builtin-blkid.c index 92ea43eef0..9f5646ffdd 100644 --- a/src/udev/udev-builtin-blkid.c +++ b/src/udev/udev-builtin-blkid.c @@ -120,14 +120,14 @@ static int find_gpt_root(sd_device *dev, blkid_probe pr, bool test) { #if defined(SD_GPT_ROOT_NATIVE) && ENABLE_EFI _cleanup_free_ char *root_id = NULL, *root_label = NULL; - bool found_esp = false; + bool found_esp_or_xbootldr = false; int r; assert(pr); - /* Iterate through the partitions on this disk, and see if the - * EFI ESP we booted from is on it. 
If so, find the first root - * disk, and add a property indicating its partition UUID. */ + /* Iterate through the partitions on this disk, and see if the UEFI ESP or XBOOTLDR partition we + * booted from is on it. If so, find the first root disk, and add a property indicating its partition + * UUID. */ errno = 0; blkid_partlist pl = blkid_probe_get_partitions(pr); @@ -157,21 +157,20 @@ static int find_gpt_root(sd_device *dev, blkid_probe pr, bool test) { if (sd_id128_from_string(stype, &type) < 0) continue; - if (sd_id128_equal(type, SD_GPT_ESP)) { - sd_id128_t id, esp; + if (sd_id128_in_set(type, SD_GPT_ESP, SD_GPT_XBOOTLDR)) { + sd_id128_t id, esp_or_xbootldr; - /* We found an ESP, let's see if it matches - * the ESP we booted from. */ + /* We found an ESP or XBOOTLDR, let's see if it matches the ESP/XBOOTLDR we booted from. */ if (sd_id128_from_string(sid, &id) < 0) continue; - r = efi_loader_get_device_part_uuid(&esp); + r = efi_loader_get_device_part_uuid(&esp_or_xbootldr); if (r < 0) return r; - if (sd_id128_equal(id, esp)) - found_esp = true; + if (sd_id128_equal(id, esp_or_xbootldr)) + found_esp_or_xbootldr = true; } else if (sd_id128_equal(type, SD_GPT_ROOT_NATIVE)) { unsigned long long flags; @@ -195,9 +194,9 @@ static int find_gpt_root(sd_device *dev, blkid_probe pr, bool test) { } } - /* We found the ESP on this disk, and also found a root - * partition, nice! Let's export its UUID */ - if (found_esp && root_id) + /* We found the ESP/XBOOTLDR on this disk, and also found a root partition, nice! Let's export its + * UUID */ + if (found_esp_or_xbootldr && root_id) udev_builtin_add_property(dev, test, "ID_PART_GPT_AUTO_ROOT_UUID", root_id); #endif diff --git a/test/README.testsuite b/test/README.testsuite index 40240599a0..2136800ff5 100644 --- a/test/README.testsuite +++ b/test/README.testsuite @@ -88,10 +88,10 @@ NSPAWN_ARGUMENTS='...' 
Specify additional arguments for systemd-nspawn QEMU_TIMEOUT=infinity - Set a timeout for tests under qemu (defaults to infinity) + Set a timeout for tests under qemu (defaults to 1800 sec) NSPAWN_TIMEOUT=infinity - Set a timeout for tests under systemd-nspawn (defaults to infinity) + Set a timeout for tests under systemd-nspawn (defaults to 1800 sec) INTERACTIVE_DEBUG=1 Configure the machine to be more *user-friendly* for interactive debuggung diff --git a/test/TEST-58-REPART/test.sh b/test/TEST-58-REPART/test.sh index 31c5e67e6a..0d513cf85b 100755 --- a/test/TEST-58-REPART/test.sh +++ b/test/TEST-58-REPART/test.sh @@ -3,6 +3,8 @@ set -e TEST_DESCRIPTION="test systemd-repart" +IMAGE_NAME="repart" +TEST_FORCE_NEWIMAGE=1 # shellcheck source=test/test-functions . "$TEST_BASE_DIR/test-functions" @@ -10,11 +12,14 @@ TEST_DESCRIPTION="test systemd-repart" test_append_files() { if ! get_bool "${TEST_NO_QEMU:=}"; then install_dmevent - if command -v openssl >/dev/null 2>&1; then - inst_binary openssl - fi instmods dm_verity =md generate_module_dependencies + image_install -o /sbin/mksquashfs + fi + + inst_binary mcopy + if command -v openssl >/dev/null 2>&1; then + inst_binary openssl fi } diff --git a/test/fuzz/fuzz-unit-file/directives-all.service b/test/fuzz/fuzz-unit-file/directives-all.service index 621fb1cf1b..b4cfca2814 100644 --- a/test/fuzz/fuzz-unit-file/directives-all.service +++ b/test/fuzz/fuzz-unit-file/directives-all.service @@ -160,6 +160,7 @@ MemoryLimit= MemoryLow= MemoryMax= MemorySwapMax= +MemoryZSwapMax= MessageQueueMaxMessages= MessageQueueMessageSize= MountAPIVFS= diff --git a/test/test-functions b/test/test-functions index ff0cc963ce..28331c2412 100644 --- a/test/test-functions +++ b/test/test-functions @@ -955,6 +955,8 @@ install_modules() { if get_bool "$LOOKS_LIKE_SUSE"; then instmods ext4 + # for TEST-54-CREDS + instmods dmi-sysfs fi } @@ -1148,7 +1150,6 @@ install_debian_systemd() { } install_suse_systemd() { - local 
testsdir=/usr/lib/systemd/tests local pkgs dinfo "Install SUSE systemd" @@ -1159,6 +1160,9 @@ install_suse_systemd() { systemd-coredump systemd-experimental systemd-journal-remote + # Since commit fb6f25d7b979134a, systemd-resolved, which is shipped by + # systemd-network sub-package on openSUSE, has its own testsuite. + systemd-network systemd-portable udev ) @@ -1174,15 +1178,21 @@ install_suse_systemd() { done < <(rpm -ql "$p") done - # we only need testsdata dir as well as the unit tests (for - # TEST-02-UNITTESTS) in the image. - dinfo "Install unit tests and testdata directory" - - mkdir -p "$initdir/$testsdir" - cp "$testsdir"/test-* "$initdir/$testsdir/" - cp -a "$testsdir/testdata" "$initdir/$testsdir/" - - # On openSUSE, these dirs are not created at package install for now on. + # Embed the files needed by the extended testsuite at runtime. Also include + # the unit tests needed by TEST-02-UNITTESTS. This is mostly equivalent to + # what `ninja install` does for the tests when '-Dinstall-tests=true'. + # + # Why? openSUSE ships a package named 'systemd-testsuite' which contains + # the minimal set of files that allows to run the testsuite on the host (as + # long as it runs an equivalent version of systemd) getting rid of the + # hassles of fetching, configuring, building the source code. + dinfo "Install the files needed by the tests at runtime" + image_install "${SOURCE_DIR}"/test-* + inst_recursive "${SOURCE_DIR}/testdata" + inst_recursive "${SOURCE_DIR}/manual" + + # On openSUSE, this directory is not created at package install, at least + # for now. mkdir -p "$initdir/var/log/journal/remote" } @@ -1310,6 +1320,11 @@ install_missing_libraries() { inst_simple "${path}/engines-3/capi.so" || true inst_simple "${path}/engines-3/loader_attic.so" || true inst_simple "${path}/engines-3/padlock.so" || true + + # Binaries from mtools depend on the gconv modules to translate between codepages. 
Because there's no + # pkg-config file for these, we copy every gconv/ directory we can find in /usr/lib and /usr/lib64. + # shellcheck disable=SC2046 + inst_recursive $(find /usr/lib* -name gconv 2>/dev/null) } cleanup_loopdev() { @@ -1346,6 +1361,9 @@ create_empty_image() { root_size=$((4 * root_size)) data_size=$((2 * data_size)) fi + if [ "$IMAGE_NAME" = "repart" ]; then + root_size=$((root_size+=1000)) + fi echo "Setting up ${IMAGE_PUBLIC:?} (${root_size} MB)" rm -f "${IMAGE_PRIVATE:?}" "$IMAGE_PUBLIC" diff --git a/test/test-network/conf/25-qdisc-cake.network b/test/test-network/conf/25-qdisc-cake.network index b13720c6dd..6a1364214d 100644 --- a/test/test-network/conf/25-qdisc-cake.network +++ b/test/test-network/conf/25-qdisc-cake.network @@ -21,3 +21,5 @@ PriorityQueueingPreset=diffserv8 FirewallMark=0xff00 Wash=yes SplitGSO=yes +RTTSec=1sec +AckFilter=aggressive diff --git a/test/test-network/systemd-networkd-tests.py b/test/test-network/systemd-networkd-tests.py index 7da22349fa..e4e92d5a37 100755 --- a/test/test-network/systemd-networkd-tests.py +++ b/test/test-network/systemd-networkd-tests.py @@ -3485,6 +3485,8 @@ class NetworkdTCTests(unittest.TestCase, Utilities): self.assertIn('overhead 128', output) self.assertIn('mpu 20', output) self.assertIn('fwmark 0xff00', output) + self.assertIn('rtt 1s', output) + self.assertIn('ack-filter-aggressive', output) @expectedFailureIfModuleIsNotAvailable('sch_codel') def test_qdisc_codel(self): diff --git a/test/units/testsuite-58.sh b/test/units/testsuite-58.sh index 640b8440dc..01eec20745 100755 --- a/test/units/testsuite-58.sh +++ b/test/units/testsuite-58.sh @@ -3,6 +3,13 @@ set -eux set -o pipefail +runas() { + declare userid=$1 + shift + # shellcheck disable=SC2016 + su "$userid" -s /bin/sh -c 'XDG_RUNTIME_DIR=/run/user/$UID exec "$@"' -- sh "$@" +} + if ! 
command -v systemd-repart &>/dev/null; then echo "no systemd-repart" >/skipped exit 0 @@ -89,17 +96,17 @@ test_basic() { local defs imgs output local loop volume - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN # 1. create an empty image - systemd-repart --empty=create \ - --size=1G \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --empty=create \ + --size=1G \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -133,10 +140,44 @@ SizeMaxBytes=64M PaddingMinBytes=92M EOF - systemd-repart --definitions="$defs" \ - --dry-run=no \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --dry-run=no \ + --seed="$seed" \ + --include-partitions=home,swap \ + "$imgs/zzz" + + output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') + + assert_eq "$output" "label: gpt +label-id: 1D2CE291-7CCE-4F7D-BC83-FDB49AD74EBD +device: $imgs/zzz +unit: sectors +first-lba: 2048 +last-lba: 2097118 +$imgs/zzz1 : start= 2048, size= 1775576, type=933AC7E1-2EB4-4F13-B844-0E14E2AEF915, uuid=4980595D-D74A-483A-AA9E-9903879A0EE5, name=\"home-first\", attrs=\"GUID:59\" +$imgs/zzz2 : start= 1777624, size= 131072, type=0657FD6D-A4AB-43C4-84E5-0933C84B4F4F, uuid=78C92DB8-3D2B-4823-B0DC-792B78F66F1E, name=\"swap\"" + + runas testuser systemd-repart --definitions="$defs" \ + --dry-run=no \ + --seed="$seed" \ + --empty=force \ + --skip-partitions=home,root \ + "$imgs/zzz" + + output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') + + assert_eq "$output" "label: gpt +label-id: 1D2CE291-7CCE-4F7D-BC83-FDB49AD74EBD +device: $imgs/zzz +unit: sectors +first-lba: 2048 +last-lba: 2097118 
+$imgs/zzz4 : start= 1777624, size= 131072, type=0657FD6D-A4AB-43C4-84E5-0933C84B4F4F, uuid=78C92DB8-3D2B-4823-B0DC-792B78F66F1E, name=\"swap\"" + + runas testuser systemd-repart --definitions="$defs" \ + --dry-run=no \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -169,10 +210,10 @@ EOF echo "Label=ignored_label" >>"$defs/home.conf" echo "UUID=b0b1b2b3b4b5b6b7b8b9babbbcbdbebf" >>"$defs/home.conf" - systemd-repart --definitions="$defs" \ - --dry-run=no \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --dry-run=no \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -190,11 +231,11 @@ $imgs/zzz5 : start= 1908696, size= 188416, type=0FC63DAF-8483-4772-8E79 # 4. Resizing to 2G - systemd-repart --definitions="$defs" \ - --size=2G \ - --dry-run=no \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --size=2G \ + --dry-run=no \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -222,11 +263,11 @@ UUID=2a1d97e1d0a346cca26eadc643926617 CopyBlocks=$imgs/block-copy EOF - systemd-repart --definitions="$defs" \ - --size=3G \ - --dry-run=no \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --size=3G \ + --dry-run=no \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -245,11 +286,6 @@ $imgs/zzz6 : start= 4194264, size= 2097152, type=0FC63DAF-8483-4772-8E79 cmp --bytes=$((4096*10240)) --ignore-initial=0:$((512*4194264)) "$imgs/block-copy" "$imgs/zzz" - if systemd-detect-virt --quiet --container; then - echo "Skipping encrypt tests in container." - return - fi - # 6. 
Testing Format=/Encrypt=/CopyFiles= cat >"$defs/extra3.conf" <<EOF @@ -263,11 +299,11 @@ CopyFiles=$defs:/def SizeMinBytes=48M EOF - systemd-repart --definitions="$defs" \ - --size=auto \ - --dry-run=no \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --size=auto \ + --dry-run=no \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk -d "$imgs/zzz" | grep -v -e 'sector-size' -e '^$') @@ -285,6 +321,11 @@ $imgs/zzz5 : start= 1908696, size= 2285568, type=0FC63DAF-8483-4772-8E79 $imgs/zzz6 : start= 4194264, size= 2097152, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, uuid=2A1D97E1-D0A3-46CC-A26E-ADC643926617, name=\"block-copy\" $imgs/zzz7 : start= 6291416, size= 98304, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, uuid=7B93D1F2-595D-4CE3-B0B9-837FBD9E63B0, name=\"luks-format-copy\"" + if systemd-detect-virt --quiet --container; then + echo "Skipping encrypt mount tests in container." + return + fi + loop="$(losetup -P --show --find "$imgs/zzz")" udevadm wait --timeout 60 --settle "${loop:?}" @@ -304,8 +345,8 @@ $imgs/zzz7 : start= 6291416, size= 98304, type=0FC63DAF-8483-4772-8E79 test_dropin() { local defs imgs output - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -328,7 +369,11 @@ EOF Label=label2 EOF - output=$(systemd-repart --definitions="$defs" --empty=create --size=100M --json=pretty "$imgs/zzz") + output=$(runas testuser systemd-repart --definitions="$defs" \ + --empty=create \ + --size=100M \ + --json=pretty \ + "$imgs/zzz") diff -u <(echo "$output") - <<EOF [ @@ -358,8 +403,8 @@ EOF test_multiple_definitions() { local defs imgs output - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory 
"/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -383,7 +428,12 @@ UUID=837c3d67-21b3-478e-be82-7e7f83bf96d3 Label=label2 EOF - output=$(systemd-repart --definitions="$defs/1" --definitions="$defs/2" --empty=create --size=100M --json=pretty "$imgs/zzz") + output=$(runas testuser systemd-repart --definitions="$defs/1" \ + --definitions="$defs/2" \ + --empty=create \ + --size=100M \ + --json=pretty \ + "$imgs/zzz") diff -u <(echo "$output") - <<EOF [ @@ -424,13 +474,8 @@ EOF test_copy_blocks() { local defs imgs output - if systemd-detect-virt --quiet --container; then - echo "Skipping copy blocks tests in container." - return - fi - - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -459,11 +504,11 @@ Format=ext4 MakeDirectories=/usr /efi EOF - systemd-repart --definitions="$defs" \ - --empty=create \ - --size=auto \ - --seed="$seed" \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --empty=create \ + --size=auto \ + --seed="$seed" \ + "$imgs/zzz" output=$(sfdisk --dump "$imgs/zzz") @@ -471,6 +516,11 @@ EOF assert_in "$imgs/zzz2 : start= 22528, size= 20480, type=${root_guid}, uuid=${root_uuid}, name=\"root-${architecture}\", attrs=\"GUID:59\"" "$output" assert_in "$imgs/zzz3 : start= 43008, size= 20480, type=${usr_guid}, uuid=${usr_uuid}, name=\"usr-${architecture}\", attrs=\"GUID:60\"" "$output" + if systemd-detect-virt --quiet --container; then + echo "Skipping second part of copy blocks tests in container." 
+ return + fi + # Then, create another image with CopyBlocks=auto cat >"$defs/esp.conf" <<EOF @@ -492,6 +542,7 @@ Type=root-${architecture} CopyBlocks=auto EOF + # --image needs root privileges so skip runas testuser here. systemd-repart --definitions="$defs" \ --empty=create \ --size=auto \ @@ -505,8 +556,8 @@ EOF test_unaligned_partition() { local defs imgs output - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -517,7 +568,7 @@ test_unaligned_partition() { Type=root-${architecture} EOF - truncate -s 10g "$imgs/unaligned" + runas testuser truncate -s 10g "$imgs/unaligned" sfdisk "$imgs/unaligned" <<EOF label: gpt @@ -525,10 +576,10 @@ start=2048, size=69044 start=71092, size=3591848 EOF - systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - "$imgs/unaligned" + runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/unaligned" output=$(sfdisk --dump "$imgs/unaligned") @@ -542,8 +593,8 @@ test_issue_21817() { # testcase for #21817 - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -552,7 +603,7 @@ test_issue_21817() { Type=root EOF - truncate -s 100m "$imgs/21817.img" + runas testuser truncate -s 100m "$imgs/21817.img" sfdisk "$imgs/21817.img" <<EOF label: gpt @@ -560,11 +611,11 @@ size=50M, type=${root_guid} , EOF - systemd-repart --pretty=yes \ - --definitions "$imgs" \ - --seed="$seed" \ - --dry-run=no \ - 
"$imgs/21817.img" + runas testuser systemd-repart --pretty=yes \ + --definitions "$imgs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/21817.img" output=$(sfdisk --dump "$imgs/21817.img") @@ -578,8 +629,8 @@ test_issue_24553() { # testcase for #24553 - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -601,28 +652,28 @@ start=524328, size=14848000, type=${root_guid}, uuid=${root_uuid}, name="root-${ EOF # 1. Operate on a small image compared with SizeMinBytes=. - truncate -s 8g "$imgs/zzz" + runas testuser truncate -s 8g "$imgs/zzz" sfdisk "$imgs/zzz" <"$imgs/partscript" # This should fail, but not trigger assertions. - assert_rc 1 systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - "$imgs/zzz" + assert_rc 1 runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/zzz" output=$(sfdisk --dump "$imgs/zzz") assert_in "$imgs/zzz2 : start= 524328, size= 14848000, type=${root_guid}, uuid=${root_uuid}, name=\"root-${architecture}\"" "$output" # 2. Operate on an larger image compared with SizeMinBytes=. rm -f "$imgs/zzz" - truncate -s 12g "$imgs/zzz" + runas testuser truncate -s 12g "$imgs/zzz" sfdisk "$imgs/zzz" <"$imgs/partscript" # This should succeed. 
- systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/zzz" output=$(sfdisk --dump "$imgs/zzz") assert_in "$imgs/zzz2 : start= 524328, size= 24641456, type=${root_guid}, uuid=${root_uuid}, name=\"root-${architecture}\"" "$output" @@ -644,14 +695,14 @@ Priority=10 EOF rm -f "$imgs/zzz" - truncate -s 8g "$imgs/zzz" + runas testuser truncate -s 8g "$imgs/zzz" sfdisk "$imgs/zzz" <"$imgs/partscript" # This should also succeed, but root is not extended. - systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/zzz" output=$(sfdisk --dump "$imgs/zzz") assert_in "$imgs/zzz2 : start= 524328, size= 14848000, type=${root_guid}, uuid=${root_uuid}, name=\"root-${architecture}\"" "$output" @@ -659,14 +710,14 @@ EOF # 4. Multiple partitions with Priority= (large disk) rm -f "$imgs/zzz" - truncate -s 12g "$imgs/zzz" + runas testuser truncate -s 12g "$imgs/zzz" sfdisk "$imgs/zzz" <"$imgs/partscript" # This should also succeed, and root is extended. 
- systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - "$imgs/zzz" + runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + "$imgs/zzz" output=$(sfdisk --dump "$imgs/zzz") assert_in "$imgs/zzz2 : start= 524328, size= 20971520, type=${root_guid}, uuid=${root_uuid}, name=\"root-${architecture}\"" "$output" @@ -676,8 +727,8 @@ EOF test_zero_uuid() { local defs imgs output - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -689,12 +740,12 @@ Type=root-${architecture} UUID=null EOF - systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - --empty=create \ - --size=auto \ - "$imgs/zero" + runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + --empty=create \ + --size=auto \ + "$imgs/zero" output=$(sfdisk --dump "$imgs/zero") @@ -704,13 +755,8 @@ EOF test_verity() { local defs imgs output - if systemd-detect-virt --quiet --container; then - echo "Skipping verity test in container." 
- return - fi - - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs'" RETURN @@ -752,25 +798,36 @@ CN = Common Name emailAddress = test@email.com EOF - openssl req -config "$defs/verity.openssl.cnf" -new -x509 -newkey rsa:1024 -keyout "$defs/verity.key" -out "$defs/verity.crt" -days 365 -nodes + runas testuser openssl req -config "$defs/verity.openssl.cnf" \ + -new -x509 \ + -newkey rsa:1024 \ + -keyout "$defs/verity.key" \ + -out "$defs/verity.crt" \ + -days 365 \ + -nodes mkdir -p /run/verity.d ln -s "$defs/verity.crt" /run/verity.d/ok.crt - output=$(systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - --empty=create \ - --size=auto \ - --json=pretty \ - --private-key="$defs/verity.key" \ - --certificate="$defs/verity.crt" \ - "$imgs/verity") + output=$(runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + --empty=create \ + --size=auto \ + --json=pretty \ + --private-key="$defs/verity.key" \ + --certificate="$defs/verity.crt" \ + "$imgs/verity") roothash=$(jq -r ".[] | select(.type == \"root-${architecture}-verity\") | .roothash" <<< "$output") # Check that we can dissect, mount and unmount a repart verity image. (and that the image UUID is deterministic) + if systemd-detect-virt --quiet --container; then + echo "Skipping verity test dissect part in container." 
+ return + fi + systemd-dissect "$imgs/verity" --root-hash "$roothash" systemd-dissect "$imgs/verity" --root-hash "$roothash" --json=short | grep -q '"imageUuid":"1d2ce291-7cce-4f7d-bc83-fdb49ad74ebd"' systemd-dissect "$imgs/verity" --root-hash "$roothash" -M "$imgs/mnt" @@ -780,14 +837,9 @@ EOF test_issue_24786() { local defs imgs root output - if systemd-detect-virt --quiet --container; then - echo "Skipping verity test in container." - return - fi - - defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" - imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" - root="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + defs="$(runas testuser mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + root="$(runas testuser mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" # shellcheck disable=SC2064 trap "rm -rf '$defs' '$imgs' '$root'" RETURN @@ -807,14 +859,19 @@ Type=usr-${architecture} CopyFiles=/usr:/ EOF - output=$(systemd-repart --definitions="$defs" \ - --seed="$seed" \ - --dry-run=no \ - --empty=create \ - --size=auto \ - --json=pretty \ - --root="$root" \ - "$imgs/zzz") + output=$(runas testuser systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + --empty=create \ + --size=auto \ + --json=pretty \ + --root="$root" \ + "$imgs/zzz") + + if systemd-detect-virt --quiet --container; then + echo "Skipping issue 24786 test loop/mount parts in container." + return + fi loop=$(losetup -P --show -f "$imgs/zzz") udevadm wait --timeout 60 --settle "${loop:?}" @@ -831,6 +888,58 @@ EOF losetup -d "$loop" } +test_minimize() { + local defs imgs output + + if systemd-detect-virt --quiet --container; then + echo "Skipping minimize test in container." 
+ return + fi + + defs="$(mktemp --directory "/tmp/test-repart.XXXXXXXXXX")" + imgs="$(mktemp --directory "/var/tmp/test-repart.XXXXXXXXXX")" + # shellcheck disable=SC2064 + trap "rm -rf '$defs' '$imgs'" RETURN + + for format in ext4 vfat; do + if ! command -v "mkfs.$format" >/dev/null; then + continue + fi + + cat >"$defs/root-$format.conf" <<EOF +[Partition] +Type=root-${architecture} +Format=${format} +CopyFiles=${defs} +Minimize=yes +EOF + done + + if ! command -v mksquashfs >/dev/null; then + cat >"$defs/root-squashfs.conf" <<EOF +[Partition] +Type=root-${architecture} +Format=squashfs +CopyFiles=${defs} +Minimize=yes +EOF + fi + + output=$(systemd-repart --definitions="$defs" \ + --seed="$seed" \ + --dry-run=no \ + --empty=create \ + --size=auto \ + --json=pretty \ + "$imgs/zzz") + + # Check that we can dissect, mount and unmount a minimized image. + + systemd-dissect "$imgs/zzz" + systemd-dissect "$imgs/zzz" -M "$imgs/mnt" + systemd-dissect -U "$imgs/mnt" +} + test_sector() { local defs imgs output loop local start size ratio @@ -867,6 +976,8 @@ EOF truncate -s 100m "$imgs/$sector.img" loop=$(losetup -b "$sector" -P --show -f "$imgs/$sector.img" ) udevadm wait --timeout 60 --settle "${loop:?}" + # This operates on a loop device which we don't support doing without root privileges so we skip runas + # here. systemd-repart --pretty=yes \ --definitions="$defs" \ --seed="$seed" \ @@ -900,6 +1011,7 @@ test_issue_24553 test_zero_uuid test_verity test_issue_24786 +test_minimize # Valid block sizes on the Linux block layer are >= 512 and <= PAGE_SIZE, and # must be powers of 2. 
Which leaves exactly four different ones to test on diff --git a/test/units/testsuite-64.sh b/test/units/testsuite-64.sh index 7673036335..8e4653312b 100755 --- a/test/units/testsuite-64.sh +++ b/test/units/testsuite-64.sh @@ -243,6 +243,7 @@ EOF echo "${FUNCNAME[0]}: test failover" local device expected link mpoint part local -a devices + mkdir -p /mnt mpoint="$(mktemp -d /mnt/mpathXXX)" wwid="deaddeadbeef0000" path="/dev/disk/by-id/wwn-0x$wwid" diff --git a/test/units/testsuite-74.cgtop.sh b/test/units/testsuite-74.cgtop.sh index 8141ec1b1f..6f08362e7c 100755 --- a/test/units/testsuite-74.cgtop.sh +++ b/test/units/testsuite-74.cgtop.sh @@ -15,8 +15,8 @@ systemd-cgtop --cpu=percentage systemd-cgtop --cpu=time systemd-cgtop -P systemd-cgtop -k -# FIXME: https://github.com/systemd/systemd/issues/25248 -#systemd-cgtop --recursive=no +systemd-cgtop --recursive=no -P +systemd-cgtop --recursive=no -k systemd-cgtop --depth=0 systemd-cgtop --depth=100 @@ -29,4 +29,5 @@ systemd-cgtop -p -t -c -m -i (! systemd-cgtop --order=foo) (! systemd-cgtop --depth=-1) (! systemd-cgtop --recursive=foo) +(! systemd-cgtop --recursive=no) (! systemd-cgtop --delay=1foo) diff --git a/test/units/testsuite-74.firstboot.sh b/test/units/testsuite-74.firstboot.sh index fdea34b5c8..36e265edfe 100755 --- a/test/units/testsuite-74.firstboot.sh +++ b/test/units/testsuite-74.firstboot.sh @@ -24,6 +24,12 @@ ROOT_HASHED_PASSWORD1='$6$foobarsalt$YbwdaATX6IsFxvWbY3QcZj2gB31R/LFRFrjlFrJtTTq # shellcheck disable=SC2016 ROOT_HASHED_PASSWORD2='$6$foobarsalt$q.P2932zYMLbKnjFwIxPI8y3iuxeuJ2BgE372LcZMMnj3Gcg/9mJg2LPKUl.ha0TG/.fRNNnRQcLfzM0SNot3.' +# Debian and Ubuntu use /etc/default/locale instead of /etc/locale.conf. Make +# sure we use the appropriate path for locale configuration. 
+LOCALE_PATH="/etc/locale.conf" +[ -e "$LOCALE_PATH" ] || LOCALE_PATH="/etc/default/locale" +[ -e "$LOCALE_PATH" ] || systemd-firstboot --locale=C.UTF-8 + # Create a minimal root so we don't modify the testbed ROOT=test-root mkdir -p "$ROOT/bin" @@ -31,14 +37,14 @@ mkdir -p "$ROOT/bin" touch "$ROOT/bin/fooshell" "$ROOT/bin/barshell" systemd-firstboot --root="$ROOT" --locale=foo -grep -q "LANG=foo" "$ROOT/etc/locale.conf" -rm -fv "$ROOT/etc/locale.conf" +grep -q "LANG=foo" "$ROOT$LOCALE_PATH" +rm -fv "$ROOT$LOCALE_PATH" systemd-firstboot --root="$ROOT" --locale-messages=foo -grep -q "LC_MESSAGES=foo" "$ROOT/etc/locale.conf" -rm -fv "$ROOT/etc/locale.conf" +grep -q "LC_MESSAGES=foo" "$ROOT$LOCALE_PATH" +rm -fv "$ROOT$LOCALE_PATH" systemd-firstboot --root="$ROOT" --locale=foo --locale-messages=bar -grep -q "LANG=foo" "$ROOT/etc/locale.conf" -grep -q "LC_MESSAGES=bar" "$ROOT/etc/locale.conf" +grep -q "LANG=foo" "$ROOT$LOCALE_PATH" +grep -q "LC_MESSAGES=bar" "$ROOT$LOCALE_PATH" systemd-firstboot --root="$ROOT" --keymap=foo grep -q "KEYMAP=foo" "$ROOT/etc/vconsole.conf" @@ -82,8 +88,8 @@ systemd-firstboot --root="$ROOT" \ --root-password-hashed="$ROOT_HASHED_PASSWORD2" \ --root-shell=/bin/barshell \ --kernel-command-line="hello.world=0" -grep -q "LANG=foo" "$ROOT/etc/locale.conf" -grep -q "LC_MESSAGES=bar" "$ROOT/etc/locale.conf" +grep -q "LANG=foo" "$ROOT$LOCALE_PATH" +grep -q "LC_MESSAGES=bar" "$ROOT$LOCALE_PATH" grep -q "KEYMAP=foo" "$ROOT/etc/vconsole.conf" readlink "$ROOT/etc/localtime" | grep -q "Europe/Berlin$" grep -q "foobar" "$ROOT/etc/hostname" @@ -103,8 +109,8 @@ systemd-firstboot --root="$ROOT" --force \ --root-password-hashed="$ROOT_HASHED_PASSWORD2" \ --root-shell=/bin/barshell \ --kernel-command-line="hello.world=0" -grep -q "LANG=locale-overwrite" "$ROOT/etc/locale.conf" -grep -q "LC_MESSAGES=messages-overwrite" "$ROOT/etc/locale.conf" +grep -q "LANG=locale-overwrite" "$ROOT$LOCALE_PATH" +grep -q "LC_MESSAGES=messages-overwrite" "$ROOT$LOCALE_PATH" grep 
-q "KEYMAP=keymap-overwrite" "$ROOT/etc/vconsole.conf" readlink "$ROOT/etc/localtime" | grep -q "/CET$" grep -q "hostname-overwrite" "$ROOT/etc/hostname" @@ -118,7 +124,7 @@ rm -fr "$ROOT" mkdir "$ROOT" # Copy everything at once (--copy) systemd-firstboot --root="$ROOT" --copy -diff /etc/locale.conf "$ROOT/etc/locale.conf" +diff $LOCALE_PATH "$ROOT$LOCALE_PATH" diff <(awk -F: '/^root/ { print $7; }' /etc/passwd) <(awk -F: '/^root/ { print $7; }' "$ROOT/etc/passwd") diff <(awk -F: '/^root/ { print $2; }' /etc/shadow) <(awk -F: '/^root/ { print $2; }' "$ROOT/etc/shadow") [[ -e /etc/vconsole.conf ]] && diff /etc/vconsole.conf "$ROOT/etc/vconsole.conf" @@ -127,7 +133,7 @@ rm -fr "$ROOT" mkdir "$ROOT" # Copy everything at once, but now by using separate switches systemd-firstboot --root="$ROOT" --copy-locale --copy-keymap --copy-timezone --copy-root-password --copy-root-shell -diff /etc/locale.conf "$ROOT/etc/locale.conf" +diff $LOCALE_PATH "$ROOT$LOCALE_PATH" diff <(awk -F: '/^root/ { print $7; }' /etc/passwd) <(awk -F: '/^root/ { print $7; }' "$ROOT/etc/passwd") diff <(awk -F: '/^root/ { print $2; }' /etc/shadow) <(awk -F: '/^root/ { print $2; }' "$ROOT/etc/shadow") [[ -e /etc/vconsole.conf ]] && diff /etc/vconsole.conf "$ROOT/etc/vconsole.conf" @@ -140,8 +146,8 @@ touch "$ROOT/bin/fooshell" "$ROOT/bin/barshell" # We can do only limited testing here, since it's all an interactive stuff, # so --prompt and --prompt-root-password are skipped on purpose echo -ne "\nfoo\nbar\n" | systemd-firstboot --root="$ROOT" --prompt-locale -grep -q "LANG=foo" "$ROOT/etc/locale.conf" -grep -q "LC_MESSAGES=bar" "$ROOT/etc/locale.conf" +grep -q "LANG=foo" "$ROOT$LOCALE_PATH" +grep -q "LC_MESSAGES=bar" "$ROOT$LOCALE_PATH" echo -ne "\nfoo\n" | systemd-firstboot --root="$ROOT" --prompt-keymap grep -q "KEYMAP=foo" "$ROOT/etc/vconsole.conf" echo -ne "\nEurope/Berlin\n" | systemd-firstboot --root="$ROOT" --prompt-timezone diff --git a/test/units/testsuite-75.sh b/test/units/testsuite-75.sh 
index 1a656fcdc1..0c68e0636f 100755 --- a/test/units/testsuite-75.sh +++ b/test/units/testsuite-75.sh @@ -56,6 +56,17 @@ echo nameserver 10.0.3.3 10.0.3.4 | "$RESOLVCONF" -a hoge.foo.dhcp assert_in '10.0.3.1 10.0.3.2' "$(resolvectl dns hoge)" assert_in '10.0.3.3 10.0.3.4' "$(resolvectl dns hoge.foo)" +# Tests for _localdnsstub and _localdnsproxy +assert_in '127.0.0.53' "$(resolvectl query _localdnsstub)" +assert_in '_localdnsstub' "$(resolvectl query 127.0.0.53)" +assert_in '127.0.0.54' "$(resolvectl query _localdnsproxy)" +assert_in '_localdnsproxy' "$(resolvectl query 127.0.0.54)" + +assert_in '127.0.0.53' "$(dig @127.0.0.53 _localdnsstub)" +assert_in '_localdnsstub' "$(dig @127.0.0.53 -x 127.0.0.53)" +assert_in '127.0.0.54' "$(dig @127.0.0.53 _localdnsproxy)" +assert_in '_localdnsproxy' "$(dig @127.0.0.53 -x 127.0.0.54)" + # Tests for mDNS and LLMNR settings mkdir -p /run/systemd/resolved.conf.d { diff --git a/tools/list-discoverable-partitions.py b/tools/list-discoverable-partitions.py index 153c904774..8376a7cdeb 100644 --- a/tools/list-discoverable-partitions.py +++ b/tools/list-discoverable-partitions.py @@ -90,7 +90,8 @@ DESCRIPTIONS = { 'The Extended Boot Loader Partition (XBOOTLDR) used for the current boot is automatically ' 'mounted to `/boot/`, unless a different partition is mounted there (possibly via ' '`/etc/fstab`) or the directory is non-empty on the root disk. This partition type ' - 'is defined by the [Boot Loader Specification](https://systemd.io/BOOT_LOADER_SPECIFICATION).'), + 'is defined by the [Boot Loader ' + 'Specification](https://uapi-group.org/specifications/specs/boot_loader_specification).'), 'SWAP': ( 'Swap, optionally in LUKS', 'All swap partitions on the disk containing the root partition are automatically enabled. 
' diff --git a/units/systemd-boot-system-token.service b/units/systemd-boot-system-token.service index 662a1fda04..ef5577549e 100644 --- a/units/systemd-boot-system-token.service +++ b/units/systemd-boot-system-token.service @@ -16,18 +16,12 @@ After=local-fs.target systemd-random-seed.service Conflicts=shutdown.target initrd-switch-root.target Before=shutdown.target initrd-switch-root.target -# Don't run this in a VM environment, because there EFI variables are not -# actually stored in NVRAM, independent of regular storage. -ConditionVirtualization=no - # Only run this if the boot loader can support random seed initialization. -ConditionPathExists=/sys/firmware/efi/efivars/LoaderFeatures-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f - -# Only run this if there is no system token defined yet, or … -ConditionPathExists=|!/sys/firmware/efi/efivars/LoaderSystemToken-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f +ConditionPathExists=|/sys/firmware/efi/efivars/LoaderFeatures-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f +ConditionPathExists=|/sys/firmware/efi/efivars/StubFeatures-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f -# … if the boot loader didn't pass the OS a random seed (and thus probably was missing the random seed file) -ConditionPathExists=|!/sys/firmware/efi/efivars/LoaderRandomSeed-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f +# Only run this if there is no system token defined yet +ConditionPathExists=!/sys/firmware/efi/efivars/LoaderSystemToken-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f [Service] Type=oneshot diff --git a/units/systemd-networkd-wait-online.service.in b/units/systemd-networkd-wait-online.service.in index 10d8b08c8e..09698fc535 100644 --- a/units/systemd-networkd-wait-online.service.in +++ b/units/systemd-networkd-wait-online.service.in @@ -12,7 +12,7 @@ Description=Wait for Network to be Configured Documentation=man:systemd-networkd-wait-online.service(8) DefaultDependencies=no Conflicts=shutdown.target -Requires=systemd-networkd.service +BindsTo=systemd-networkd.service 
After=systemd-networkd.service Before=network-online.target shutdown.target