systemd - systemd

	Commit message (Collapse)	Author	Files	Lines
2024-04-06	nsresourced: add client-side helpers around nsresourced APIs	Lennart Poettering	3	-0/+340
	This adds simple functions that wrap the Varlink IPC calls.
2024-04-06	nsresourced: add new daemon for granting clients user namespaces and ↵	Lennart Poettering	27	-2/+4292
	assigning resources to them This adds a small, socket-activated Varlink daemon that can delegate UID ranges for user namespaces to clients asking for it. The primary call is AllocateUserRange() where the user passes in an uninitialized userns fd, which is then set up. There are other calls that allow assigning a mount fd to a userns allocated that way, to set up permissions for a cgroup subtree, and to allocate a veth for such a user namespace. Since the UID assignments are supposed to be transient, i.e. not permanent, care is taken to ensure that users cannot create inodes owned by these UIDs, so that persistence cannot be acquired. This is implemented via a BPF-LSM module that ensures that any member of a userns allocated that way cannot create files unless the mount it operates on is owned by the userns itself, or is explicitly allowlisted. BPF LSM program with contributions from Alexei Starovoitov.
2024-04-06	build-sys: pick up vmlinux.h from running kernel BTF or user	Lennart Poettering	2	-2/+79

2024-04-06	dissect-image: document one more dissected_image_decrypt() error code	Lennart Poettering	1	-0/+1

2024-04-06	dissect-image: make dissected_image_acquire_metadata() operate within a ↵	Lennart Poettering	4	-6/+21
	userns if possible This opens the door for making the call work without privileges: if we pass in a userns fd and DissectedImage that has mount fds then we can acquire all information without privs.
2024-04-06	dissect-image: add a new helper that checks if VeritySettings has anything ↵	Lennart Poettering	1	-0/+8
	set at all
2024-04-06	dissect-image: add dissected_image_close() that closes all references to ↵	Lennart Poettering	2	-0/+15
	resources
2024-04-06	discover-image: export search paths array	Lennart Poettering	2	-1/+3
	This way we can use it to validate image paths later.
2024-04-06	cgroup-setup: add fd-based version of cg_attach()	Lennart Poettering	2	-0/+15

2024-04-06	cgroup-util: add helpers for opening cgroup by id	Lennart Poettering	5	-13/+109

2024-04-06	lock-util: make global lock return parameter to image_path_lock() optional	Lennart Poettering	2	-19/+29
	When adding unprivileged nspawn support we don't really want a global lock file, since we cannot even access the dir they are stored in, hence make the concept optional. Some minor other modernizations.
2024-04-06	bpf-dlopen: pick up more symbols from libbpf	Lennart Poettering	3	-26/+69

2024-04-06	namespace-util: add new helper is_our_namespace()	Lennart Poettering	2	-0/+41

2024-04-06	namespace-util: add namespace_open_by_type() helper	Lennart Poettering	2	-0/+18

2024-04-06	namespace-util: add detach_mount_namespace_userns()	Lennart Poettering	2	-0/+16

2024-04-06	namespace-util: add helper for allocating an empty userns fd	Lennart Poettering	2	-0/+22

2024-04-06	namespace-util: add detach_mount_namespace_harder()	Lennart Poettering	2	-0/+49
	This is just like detach_mount_namespace() but if need be uses unpriv user namespaces to be able to execute CLONE_NEWNS.
2024-04-06	uid-range: add some basic operations on UidRange objects	Lennart Poettering	3	-7/+80
	Helpers to compare and get size, and whether the object is empty.
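The kind of helpers described here can be sketched as follows. This is a minimal illustration only: `EntrySketch`, `RangeSketch` and the function names are hypothetical stand-ins, not systemd's actual `UidRange` API, and the comparison assumes both lists are already sorted and coalesced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a list of UID ranges. */
typedef struct {
        uint32_t start; /* first UID of the entry */
        uint32_t nr;    /* number of UIDs in the entry */
} EntrySketch;

typedef struct {
        const EntrySketch *entries;
        size_t n_entries;
} RangeSketch;

/* Total number of UIDs covered by all entries (64-bit to avoid overflow). */
static uint64_t range_size(const RangeSketch *r) {
        uint64_t total = 0;

        if (!r)
                return 0;

        for (size_t i = 0; i < r->n_entries; i++)
                total += r->entries[i].nr;

        return total;
}

/* A missing object or one covering no UIDs counts as empty. */
static bool range_is_empty(const RangeSketch *r) {
        return range_size(r) == 0;
}

/* Entry-wise comparison; assumes both lists are sorted and coalesced. */
static bool range_equal(const RangeSketch *a, const RangeSketch *b) {
        assert(a);
        assert(b);

        if (a->n_entries != b->n_entries)
                return false;

        for (size_t i = 0; i < a->n_entries; i++)
                if (a->entries[i].start != b->entries[i].start ||
                    a->entries[i].nr != b->entries[i].nr)
                        return false;

        return true;
}
```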
2024-04-06	uid-range: add new uid_range_load_userns_by_fd() helper	Lennart Poettering	2	-0/+62
	This is similar to uid_range_load_userns() but instead of reading the uid_map off a process it reads it off a userns fd. (Of course the kernel has no API for this right now, hence we fork off a throw-away process which joins the user namespace, and then read off the data from there.)
2024-04-06	uid-range: optionally load outside view of UID range from uid_map procfs file	Lennart Poettering	6	-10/+25
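For context, each line of a `uid_map` procfs file holds three numbers: the first ID inside the namespace, the first ID outside of it, and the length of the range. A minimal sketch of picking either view while parsing, assuming a hypothetical `parse_uid_map_line()` helper and `MapEntrySketch` type (neither is systemd's actual code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type: one contiguous range of UIDs. */
typedef struct {
        uint32_t start;
        uint32_t nr;
} MapEntrySketch;

/* Parse one uid_map line of the form "<inside start> <outside start> <length>".
 * If 'outside' is true the outside view is stored, otherwise the inside view.
 * Returns 0 on success, -1 on malformed input. */
static int parse_uid_map_line(const char *line, bool outside, MapEntrySketch *ret) {
        unsigned long in, out, len;

        assert(line);
        assert(ret);

        if (sscanf(line, "%lu %lu %lu", &in, &out, &len) != 3)
                return -1;

        ret->start = (uint32_t) (outside ? out : in);
        ret->nr = (uint32_t) len;
        return 0;
}
```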

2024-04-06	uid-range: add uid_range_overlaps() helper	Lennart Poettering	2	-0/+22
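An overlap check between two UID ranges boils down to an interval intersection test. The following is a sketch under that assumption; `range_overlaps_sketch()` and its flat parameter list are hypothetical and do not mirror the signature of `uid_range_overlaps()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [a_start, a_start + a_nr) and [b_start, b_start + b_nr) share at
 * least one UID. End points are computed in 64 bits so that ranges reaching
 * up to UINT32_MAX do not wrap around. Empty ranges never overlap. */
static bool range_overlaps_sketch(uint32_t a_start, uint32_t a_nr,
                                  uint32_t b_start, uint32_t b_nr) {
        uint64_t a_end = (uint64_t) a_start + a_nr;
        uint64_t b_end = (uint64_t) b_start + b_nr;

        if (a_nr == 0 || b_nr == 0)
                return false;

        return a_start < b_end && b_start < a_end;
}
```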

2024-04-06	image-policy: add a new image_policy_intersect() call	Lennart Poettering	3	-0/+133
	This new call takes two image policy objects and generates an "intersection" policy, i.e. only allows what is allowed by both. Or in other words it conceptually implements a binary AND of the policy flags. (Except that it's a bit harder, due to normalization, and underspecified flags). We can use this later for mountfsd: a client can specify a policy, and mountfsd can specify another policy, and we'll then apply only what both allow. Note that a policy generated like this might be invalid. For example, if one policy says root must exist and be verity or luks protected, and the other policy says root must be absent, then the intersection is invalid, since one policy only allows what the other prohibits and vice versa. We'll return a clear error code in that case (ENAVAIL). (This is because we simply don't allow encoding such impossible policies in an ImagePolicy structure, for good reasons.)
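The "binary AND" idea can be illustrated for a single partition as follows. This is a deliberately simplified sketch: the flag names and `policy_flags_intersect_sketch()` are hypothetical, and the normalization and handling of underspecified flags that the message says make the real call harder are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-partition policy flags, for illustration only. */
enum {
        USE_VERITY      = 1 << 0,
        USE_ENCRYPTED   = 1 << 1,
        USE_UNPROTECTED = 1 << 2,
        USE_ABSENT      = 1 << 3,
};

/* Intersect two flag sets: only what both allow survives. If nothing
 * survives, the combined policy would be impossible to satisfy, so a
 * distinct error is returned instead of an empty result. */
static int policy_flags_intersect_sketch(uint32_t a, uint32_t b, uint32_t *ret) {
        uint32_t both;

        assert(ret);

        both = a & b;
        if (both == 0)
                return -ENAVAIL;

        *ret = both;
        return 0;
}
```

With these assumed flags, intersecting "verity or encrypted" with "absent" fails with `-ENAVAIL`, which matches the root partition example in the message.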
2024-04-06	varlink: add varlink_peek_dup_fd() helper	Lennart Poettering	2	-2/+13
	This new call is like varlink_peek_fd() (i.e. gets an fd out of the connection but leaving it also in there), and combines it with F_DUPFD_CLOEXEC to make a copy of it. We previously already had varlink_dup_fd() which was a duplicating version for pushing an fd into the connection. To reduce confusion, let's rename that one varlink_push_dup_fd() to make the symmetry to varlink_push_fd() clear so that we have now: varlink_peer_push_fd() → put fd in without dup'ing; varlink_peer_push_dup_fd() → same with F_DUPFD_CLOEXEC; varlink_peer_peek_fd() → get fd out without dup'ing; varlink_peer_peek_dup_fd() → same with F_DUPFD_CLOEXEC
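The duplication step itself is a plain `fcntl()` call. A minimal sketch, assuming a hypothetical `peek_dup_fd_sketch()` wrapper that is unrelated to the Varlink connection object and only shows what F_DUPFD_CLOEXEC does:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Duplicate 'fd' without consuming it. The copy gets FD_CLOEXEC set
 * atomically and the lowest free descriptor number that is >= 3.
 * Returns the new descriptor, or -1 with errno set. */
static int peek_dup_fd_sketch(int fd) {
        assert(fd >= 0);

        return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
```

The caller owns the returned copy and must close it; the original descriptor stays open and untouched.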
2024-04-06	varlink: add varlink_get_peer_gid() helper	Lennart Poettering	2	-1/+19
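On Linux, the credentials of the peer of an AF_UNIX socket can be queried with the SO_PEERCRED socket option. The commit does not say how the helper is implemented, so the following is only a sketch of that general technique; `get_peer_gid_sketch()` is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Query the GID of the process on the other end of a connected AF_UNIX
 * socket (Linux-specific). Returns 0 on success, negative errno on failure. */
static int get_peer_gid_sketch(int fd, gid_t *ret) {
        struct ucred ucred;
        socklen_t n = sizeof(ucred);

        assert(fd >= 0);
        assert(ret);

        if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &ucred, &n) < 0)
                return -errno;

        *ret = ucred.gid;
        return 0;
}
```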

2024-04-06	test: improve debug-ability of test-execute	Frantisek Sumsal	1	-1/+5
	Since e56a8790a0 debugging test-execute failures has been a royal PITA, since we ditch all potentially useful output from the test units (that, for the most part, run `sh -x ...`). Let's improve the situation a bit by setting EXEC_OUTPUT_NULL only when running the single test case that needs it, and inheriting stdout otherwise.
	For example, with a purposefully introduced error we get this output with this patch:
	exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
	Serializing sd-executor-state to memfd.
	...
	Personality: x86-64
	LockPersonality: no
	SystemCallErrorNumber: kill
	++ uname -m
	+ c=x86_64
	+ test x86_64 = foo_bar
	Received SIGCHLD from PID 1520588 (sh).
	Child 1520588 (sh) died (code=exited, status=1/FAILURE)
	exec-personality-x86-64.service: Child 1520588 belongs to exec-personality-x86-64.service.
	exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
	exec-personality-x86-64.service: Failed with result 'exit-code'.
	...
	Exit Status: 1
	src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
	(test-execute-root) terminated by signal ABRT.
	Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
	Aborted
	But without it, we'd miss the most important part:
	exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
	Serializing sd-executor-state to memfd.
	...
	Personality: x86-64
	LockPersonality: no
	SystemCallErrorNumber: kill
	Received SIGCHLD from PID 1521365 (sh).
	Child 1521365 (sh) died (code=exited, status=1/FAILURE)
	exec-personality-x86-64.service: Child 1521365 belongs to exec-personality-x86-64.service.
	exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
	exec-personality-x86-64.service: Failed with result 'exit-code'.
	...
	Exit Status: 1
	src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
	(test-execute-root) terminated by signal ABRT.
	Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
	Aborted
2024-04-06	man: fix typo s/veno/reno/	Vito Caputo	1	-1/+1

2024-04-05	core/service: add a FIXME to use pidfd to monitor foreign processes	Mike Yuan	1	-2/+2

2024-04-05	core/service: complain louder if new MAINPID= is refused	Mike Yuan	1	-1/+1

2024-04-05	core/service: make service_set_main_pidref consume pidref	Mike Yuan	1	-20/+19
	Currently, the memory management of service_set_main_pidref is a bit odd. Normally we either invalidate the original resource on caller's side after the call succeeds, or just pass the ownership wholly. But service_set_main_pidref takes a pointer, and calls pidref_done() internally. Let's just make it consume the passed pidref. This is more straightforward.
2024-04-05	sleep: rename SleepMemMode= to MemorySleepMode=	Mike Yuan	3	-3/+3
	Addresses https://github.com/systemd/systemd/pull/31986#discussion_r1554053623
2024-04-05	os-util: use ENDSWITH_SET where appropriate	Mike Yuan	1	-9/+4
	Addresses https://github.com/systemd/systemd/pull/31435#discussion_r1553969156 Co-authored-by: Lennart Poettering <lennart@poettering.net>
2024-04-05	base-filesystem: check for __s390x__ first	Frantisek Sumsal	1	-2/+2
	On s390x both __s390__ and __s390x__ are defined, and with the original order we'd go through the __s390__ branch and emit a warning:
	[169/2118] Compiling C object src/shared/libsystemd-shared-256.a.p/base-filesystem.c.o
	../src/shared/base-filesystem.c:136:11: note: ‘#pragma message: Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links.’
	136 | # pragma message "Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links."
	    | ^~~~~~~
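The ordering issue can be shown with a small preprocessor chain. This is a sketch only: `arch_label_sketch()` and its return strings are hypothetical and not what base-filesystem.c actually defines.

```c
#include <assert.h>

/* The more specific macro must be tested first: on s390x both __s390x__
 * and __s390__ are defined, so checking __s390__ first would shadow the
 * 64-bit branch. */
static const char *arch_label_sketch(void) {
#if defined(__s390x__)
        return "s390x";
#elif defined(__s390__)
        return "s390";
#else
        return "other";
#endif
}
```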
2024-04-05	test: account for build dir being under one of the tmpfs-ed directories	Frantisek Sumsal	1	-1/+30
	If we're running test-execute from the build directory which is under one of the tmpfs-ed directories (i.e. /root or /tmp), test-execute might behave strangely, since in that case manager_new() pins the system systemd-executor binary instead of the build dir one, which may lead to very confusing test failures (if there's enough difference between the system and built sd-executor binary). Let's account for that and bind-mount the build dir under the tmpfs-ed directory if necessary.
2024-04-05	test: make test-fd-util more lenient when using fd_move_above_stdio()	Frantisek Sumsal	1	-9/+13
	On s390x this test fails when the SUT uses the z90crypt kernel module, as it's another FD the test doesn't account for:
	/* test_rearrange_stdio */
	Successfully forked off 'rearrange' as PID 57293.
	test_rearrange_stdio: r=0
	/proc/57293/fd:
	total 0
	lrwx------. 1 root root 64 Apr 5 06:18 0 -> /dev/pts/0
	lrwx------. 1 root root 64 Apr 5 06:18 1 -> /dev/pts/0
	lrwx------. 1 root root 64 Apr 5 06:18 2 -> /dev/pts/0
	lrwx------. 1 root root 64 Apr 5 06:18 3 -> /dev/z90crypt
	rearrange terminated by signal ABRT.
	Debugging this was a pain, since the child process didn't log anything once we closed stdout/stderr (for obvious reasons). Let's fix both issues by switching logging to kmsg once we close stdin/stdout/stderr, and also by making the test work fine when there are some extra FDs in the child's environment.
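Moving a descriptor out of the 0..2 range can be sketched as below. This is not systemd's fd_move_above_stdio() implementation; `move_fd_above_stdio_sketch()` is a hypothetical simplification that always sets FD_CLOEXEC on the copy. It also shows why a test can only assume the result is 3 or higher, not exactly 3, when other descriptors are already open.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* If 'fd' is one of stdin/stdout/stderr (0..2), replace it with a copy
 * numbered >= 3 and close the original. The kernel picks the lowest free
 * number >= 3, so with extra descriptors open the result may be 4, 5, ...
 * On failure the original descriptor is returned unchanged. */
static int move_fd_above_stdio_sketch(int fd) {
        int copy;

        if (fd < 0 || fd > 2)
                return fd; /* nothing to do */

        copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
        if (copy < 0)
                return fd; /* keep the original on failure */

        close(fd);
        return copy;
}
```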
2024-04-05	sd-journal: fix check in `journal_file_verify_header()`	Antonio Alvarez Feijoo	1	-3/+3
	Fixes 6ea51363c8e39fb0924dda972a212936456a2b4f
2024-04-05	log: fix comment	Frantisek Sumsal	1	-1/+1

2024-04-05	core: Serialize both pid and pidfd to keep downgrades working	Daan De Meyer	5	-17/+22
	Currently, when downgrading from a version with pidfd support to a version without pidfd support, all information about running processes is lost as the newer systemd will serialize pidfds which are not recognized by the older systemd when deserializing. To improve the situation, let's serialize both the pid and the pidfd. This is safe because existing versions will either replace the first deserialized pidref with the second one or discard the second one in favor of the first one depending on the unit and field. Older versions that don't support pidfd's will silently discard any fields that contain a pidfd as those will try to parse the field as a pid and since a pidfd field will start with '@', those versions will debug error log and ignore the value. To make sure we reuse the existing pidfd as much as possible, the pidfd is serialized first. Both for scopes and service main pids, if the same pid is seen multiple times, the first pidref is kept. So by serializing the pidfd first we make sure the original pidfd is used instead of the new one which is opened when deserializing the first pid field. For other control units, older versions with pidfd support will discard the first pidfd and replace it with a new pidfd from the second pid field. This is a slight regression on downgrades, but we make sure it doesn't happen for future versions (and older versions when this commit is backported) by modifying the logic to only use the first successfully deserialized pidref so that the raw pid without pidfd is discarded instead of it replacing the existing pidfd.
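The "first successfully deserialized pidref wins" rule can be sketched as follows. Everything here is a hypothetical model: `PidRefSketch` and `deserialize_pidref_sketch()` are invented for illustration, and the only detail taken from the message is that a pidfd value starts with '@' while a plain pid does not.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified process reference. */
typedef struct {
        long pid;  /* 0 if only a pidfd was recorded */
        int fd;    /* -1 if only a raw pid was recorded */
        bool set;  /* true once a value was deserialized successfully */
} PidRefSketch;

/* Deserialize one field value, either "@<fd>" or "<pid>".
 * Returns 1 if stored, 0 if ignored because an earlier field already
 * succeeded, and -EINVAL if the value cannot be parsed. */
static int deserialize_pidref_sketch(const char *value, PidRefSketch *ref) {
        char *end = NULL;
        bool is_fd;
        long n;

        assert(value);
        assert(ref);

        if (ref->set)
                return 0; /* first successfully deserialized reference wins */

        is_fd = value[0] == '@';
        if (is_fd)
                value++;

        errno = 0;
        n = strtol(value, &end, 10);
        if (errno != 0 || end == value || *end != '\0' || n > INT_MAX)
                return -EINVAL;
        if (is_fd ? n < 0 : n <= 0)
                return -EINVAL;

        if (is_fd) {
                ref->fd = (int) n;
                ref->pid = 0;
        } else {
                ref->pid = n;
                ref->fd = -1;
        }

        ref->set = true;
        return 1;
}
```

Feeding "@7" and then "1234" keeps the pidfd, which is why the message has the pidfd serialized first. A parser that only understands plain pids would reject "@7" and fall through to "1234".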
2024-04-05	meson: set -fno-ssa-phiopt when building bpf with gcc	Luca Boccassi	1	-0/+1
	There are bugs in the kernel verifier that cause legitimate code to be rejected. Disabling this optimization makes bpf programs built with a new enough gcc work again. Fixes https://github.com/systemd/systemd/issues/31888
2024-04-05	hwdb: fix missing colon (#32108)	Kirk	1	-1/+1
	Missing colon prevents this from working correctly on the Chuwi UBook X and UBook X Pro.
2024-04-04	udevadm-test: also show security labels if specified	Yu Watanabe	1	-0/+7
	Follow-up for 03b6879f4d45c49264708aef872fd05af30ddcf0.
2024-04-04	backlight: fix detection of multiple graphic cards	Yu Watanabe	1	-0/+4
	Follow-up for e0504dd011189d97a1ea813aabfe1e696742bcf5. Hopefully, devices in PCI subsystem have some properties, thus have their udev database file. But, that may not be true. Here, we only read sysattrs of enumerated devices, hence it is not necessary to check if the device is initialized or not.
2024-04-04	udev: do not update sysattr and sysctl value on testing	Yu Watanabe	1	-12/+21
	Follow-up for 089bef66316e5bdc91b9984148e5a6455449c1da.
2024-04-04	man/kernel-command-line: document resume_offset= too	Mike Yuan	1	-0/+10

2024-04-04	hibernate-util: say "HibernateLocation EFI variable" consistently	Mike Yuan	1	-1/+1

2024-04-04	udevadm-test: insert missing line break	Yu Watanabe	1	-1/+1
	Addresses post-merge comment: https://github.com/systemd/systemd/commit/03b6879f4d45c49264708aef872fd05af30ddcf0#r140587790
2024-04-04	TEST-50: add tests for riscv{32,64}	Zbigniew Jędrzejewski-Szmek	1	-5/+15
	Requested for the testing of F40 riscv bringup. Numbers copied from https://uapi-group.org/specifications/specs/discoverable_partitions_specification/. It'd be nice to do the same in TEST-58, but the code there is rather involved and I don't have a system to test on. We can probably try that later on when F40 is available.
2024-04-04	Fixed resolution for pen and touchpad	mkubiak	1	-3/+5

2024-04-04	network/ndisc: drop NDisc configurations when received NA without Router flag	Yu Watanabe	2	-4/+157
	Closes #28421.
2024-04-04	test-ndisc: add basic tests for Neighbor Advertisement handling	Yu Watanabe	1	-3/+117

2024-04-04	sd-ndisc: add basic support of Neighbor Advertisement message	Yu Watanabe	8	-2/+240
	This adds basic support of receiving and parsing Neighbor Advertisement message defined in RFC 4861.
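For reference, RFC 4861 defines the Neighbor Advertisement as ICMPv6 type 136, code 0, followed by a checksum, a flags byte carrying the Router (R), Solicited (S) and Override (O) bits, reserved bits, and a 16-byte target address. A minimal parsing sketch under those assumptions follows; `NaSketch` and `parse_na_sketch()` are hypothetical names, not sd-ndisc's API, and the sketch neither verifies the checksum nor parses trailing options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
        NA_TYPE    = 136, /* ICMPv6 type of a Neighbor Advertisement */
        NA_MIN_LEN = 24,  /* 4 byte ICMPv6 header + 4 byte flags/reserved + 16 byte target */
};

/* Hypothetical parsed representation of the fixed part of the message. */
typedef struct {
        bool router;    /* R flag: the sender is a router */
        bool solicited; /* S flag: sent in response to a solicitation */
        bool override;  /* O flag: should override an existing cache entry */
        uint8_t target[16];
} NaSketch;

/* Parse the fixed header of a Neighbor Advertisement.
 * Returns 0 on success, -1 if the buffer is too short or not an NA. */
static int parse_na_sketch(const uint8_t *buf, size_t len, NaSketch *ret) {
        assert(buf);
        assert(ret);

        if (len < NA_MIN_LEN)
                return -1;
        if (buf[0] != NA_TYPE || buf[1] != 0)
                return -1;

        ret->router    = (buf[4] & 0x80) != 0;
        ret->solicited = (buf[4] & 0x40) != 0;
        ret->override  = (buf[4] & 0x20) != 0;
        memcpy(ret->target, buf + 8, sizeof(ret->target));
        return 0;
}
```

The `router` field corresponds to the Router flag that the ndisc commit listed above reacts to when it is cleared.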