Commit message (Collapse)AuthorFilesLines
2024-04-06nsresourced: add client-side helpers around nsresourced APIsLennart Poettering3-0/+340
This adds simple functions that wrap the Varlink IPC calls.
2024-04-06nsresourced: add new daemon for granting clients user namespaces and ↵Lennart Poettering27-2/+4292
assigning resources to them
This adds a small, socket-activated Varlink daemon that can delegate UID ranges for user namespaces to clients asking for them. The primary call is AllocateUserRange(), where the user passes in an uninitialized userns fd, which is then set up. There are other calls that allow assigning a mount fd to a userns allocated that way, setting up permissions for a cgroup subtree, and allocating a veth for such a user namespace.
Since the UID assignments are supposed to be transient, i.e. not permanent, care is taken to ensure that users cannot create inodes owned by these UIDs, so that persistence cannot be acquired. This is implemented via a BPF-LSM module that ensures that any member of a userns allocated that way cannot create files unless the mount it operates on is owned by the userns itself, or is explicitly allowlisted.
BPF LSM program with contributions from Alexei Starovoitov.
2024-04-06build-sys: pick up vmlinux.h from running kernel BTF or userLennart Poettering2-2/+79
2024-04-06dissect-image: document one more dissected_image_decrypt() error codeLennart Poettering1-0/+1
2024-04-06dissect-image: make dissected_image_acquire_metadata() operate within a ↵Lennart Poettering4-6/+21
userns if possible This opens the door for making the call work without privileges: if we pass in a userns fd and DissectedImage that has mount fds then we can acquire all information without privs.
2024-04-06dissect-image: add a new helper that checks if VeritySettings has anything ↵Lennart Poettering1-0/+8
set at all
2024-04-06dissect-image: add dissected_image_close() that closes all references to ↵Lennart Poettering2-0/+15
resources
2024-04-06discover-image: export search paths arrayLennart Poettering2-1/+3
This way we can use it to validate image paths later.
2024-04-06cgroup-setup: add fd-based version of cg_attach()Lennart Poettering2-0/+15
2024-04-06cgroup-util: add helpers for opening cgroup by idLennart Poettering5-13/+109
2024-04-06lock-util: make global lock return parameter to image_path_lock() optionalLennart Poettering2-19/+29
When adding unprivileged nspawn support we don't really want a global lock file, since we cannot even access the dir they are stored in, hence make the concept optional. Some minor other modernizations.
2024-04-06bpf-dlopen: pick up more symbols from libbpfLennart Poettering3-26/+69
2024-04-06namespace-util: add new helper is_our_namespace()Lennart Poettering2-0/+41
2024-04-06namespace-util: add namespace_open_by_type() helperLennart Poettering2-0/+18
2024-04-06namespace-util: add detach_mount_namespace_userns()Lennart Poettering2-0/+16
2024-04-06namespace-util: add helper for allocating an empty userns fdLennart Poettering2-0/+22
2024-04-06namespace-util: add detach_mount_namespace_harder()Lennart Poettering2-0/+49
This is just like detach_mount_namespace() but if need be uses unpriv user namespaces to be able to execute CLONE_NEWNS.
2024-04-06uid-range: add some basic operations on UidRange objectsLennart Poettering3-7/+80
Helpers to compare and get size, and whether the object is empty.
2024-04-06uid-range: add new uid_range_load_userns_by_fd() helperLennart Poettering2-0/+62
This is similar to uid_range_load_userns() but instead of reading the uid_map off a process it reads it off a userns fd. (Of course the kernel has no API for this right now, hence we fork off a throw-away process which joins the user namespace, and then read off the data from there.)
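For illustration, the uid_map data that such a throw-away process reads consists of lines with three numeric columns: the first ID inside the namespace, the first ID as seen from outside, and the number of consecutive IDs mapped. The following is a minimal sketch of parsing one such line; the type and function names are hypothetical and are not systemd's UidRange API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical type for this sketch; not systemd's UidRange. */
typedef struct ExampleUidMapEntry {
    uint32_t inside;   /* first UID as seen inside the user namespace */
    uint32_t outside;  /* first UID as seen from outside the namespace */
    uint32_t count;    /* number of consecutive UIDs covered by this line */
} ExampleUidMapEntry;

/* Parses one line in the uid_map format ("inside outside count").
 * Returns 0 on success, -1 if the line lacks three numeric fields
 * or a field does not fit into 32 bits. */
static int example_parse_uid_map_line(const char *line, ExampleUidMapEntry *ret) {
    unsigned long long inside, outside, count;

    assert(line);
    assert(ret);

    if (sscanf(line, "%llu %llu %llu", &inside, &outside, &count) != 3)
        return -1;
    if (inside > UINT32_MAX || outside > UINT32_MAX || count > UINT32_MAX)
        return -1;

    *ret = (ExampleUidMapEntry) {
        .inside = (uint32_t) inside,
        .outside = (uint32_t) outside,
        .count = (uint32_t) count,
    };
    return 0;
}
```

A caller would feed each line read from the uid_map file to this function and collect the resulting entries.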
2024-04-06uid-range: optionally load outside view of UID range from uid_map procfs fileLennart Poettering6-10/+25
2024-04-06uid-range: add uid_range_overlaps() helperLennart Poettering2-0/+22
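An overlap check between two UID ranges can be sketched as an interval intersection test. This is only an illustration under assumed types: ExampleRange and example_range_overlaps() are hypothetical names, and the actual uid_range_overlaps() signature may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open range [start, start + nr). */
typedef struct ExampleRange {
    uint32_t start;
    uint32_t nr;
} ExampleRange;

/* Returns true if the two ranges share at least one UID.
 * End points are computed in 64 bits so that start + nr cannot wrap. */
static bool example_range_overlaps(const ExampleRange *a, const ExampleRange *b) {
    uint64_t a_end, b_end;

    assert(a);
    assert(b);

    if (a->nr == 0 || b->nr == 0)
        return false; /* an empty range overlaps nothing */

    a_end = (uint64_t) a->start + a->nr;
    b_end = (uint64_t) b->start + b->nr;

    return a->start < b_end && b->start < a_end;
}
```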
2024-04-06image-policy: add a new image_policy_intersect() callLennart Poettering3-0/+133
This new call takes two image policy objects and generates an "intersection" policy, i.e. only allows what is allowed by both. Or in other words it conceptually implements a binary AND of the policy flags. (Except that it's a bit harder, due to normalization, and underspecified flags). We can use this later for mountfsd: a client can specify a policy, and mountfsd can specify another policy, and we'll then apply only what both allow. Note that a policy generated like this might be invalid. For example, if one policy says root must exist and be verity or luks protected, and the other policy says root must be absent, then the intersection is invalid, since one policy only allows what the other prohibits and vice versa. We'll return a clear error code in that case (ENAVAIL). (This is because we simply don't allow encoding such impossible policies in an ImagePolicy structure, for good reasons.)
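The "binary AND" idea can be sketched for a single partition as follows. This is a simplified illustration only: the flag names are hypothetical, and the normalization and handling of underspecified flags mentioned above are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef ENAVAIL
#define ENAVAIL EINVAL /* fallback for platforms lacking the Linux-specific ENAVAIL */
#endif

/* Hypothetical per-partition policy flags; not systemd's real flag names. */
enum {
    EXAMPLE_POLICY_VERITY      = 1 << 0,
    EXAMPLE_POLICY_ENCRYPTED   = 1 << 1,
    EXAMPLE_POLICY_UNPROTECTED = 1 << 2,
    EXAMPLE_POLICY_ABSENT      = 1 << 3,
};

/* Computes what both policies allow for one partition.
 * Returns 0 and stores the result in *ret, or -ENAVAIL if the two
 * policies have nothing in common, i.e. the intersection is impossible. */
static int example_policy_intersect(uint64_t a, uint64_t b, uint64_t *ret) {
    uint64_t both;

    assert(ret);

    both = a & b;
    if (both == 0)
        return -ENAVAIL;

    *ret = both;
    return 0;
}
```

With these assumed flags, "verity or encrypted" intersected with "absent" yields -ENAVAIL, which mirrors the root partition example in the commit message.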
2024-04-06varlink: add varlink_peek_dup_fd() helperLennart Poettering2-2/+13
This new call is like varlink_peek_fd() (i.e. gets an fd out of the connection but leaving it also in there), and combines it with F_DUPFD_CLOEXEC to make a copy of it.
We previously already had varlink_dup_fd() which was a duplicating version for pushing an fd *into* the connection. To reduce confusion, let's rename that one varlink_push_dup_fd() to make the symmetry to varlink_push_fd() clear so that we have now:
varlink_peer_push_fd() → put fd in without dup'ing
varlink_peer_push_dup_fd() → same with F_DUPFD_CLOEXEC
varlink_peer_peek_fd() → get fd out without dup'ing
varlink_peer_peek_dup_fd() → same with F_DUPFD_CLOEXEC
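The difference between peeking and peeking with duplication can be sketched like this. ExampleConnection and both functions are hypothetical stand-ins, not the Varlink implementation; only fcntl() with F_DUPFD_CLOEXEC is the real system interface.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical connection object that owns a list of received fds. */
typedef struct ExampleConnection {
    const int *fds;
    size_t n_fds;
} ExampleConnection;

/* Returns the fd stored at index i. The connection keeps ownership,
 * so the caller must not close the returned fd. */
static int example_peek_fd(const ExampleConnection *c, size_t i) {
    assert(c);

    if (i >= c->n_fds)
        return -ENXIO;

    return c->fds[i];
}

/* Like example_peek_fd(), but returns a close-on-exec duplicate
 * (numbered 3 or higher) that the caller owns and must close. */
static int example_peek_dup_fd(const ExampleConnection *c, size_t i) {
    int fd, copy;

    fd = example_peek_fd(c, i);
    if (fd < 0)
        return fd;

    copy = fcntl(fd, F_DUPFD_CLOEXEC, 3);
    if (copy < 0)
        return -errno;

    return copy;
}
```

The duplicating variant is convenient when the caller wants to keep the fd beyond the lifetime of the connection object.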
2024-04-06varlink: add varlink_get_peer_gid() helperLennart Poettering2-1/+19
2024-04-06test: improve debug-ability of test-executeFrantisek Sumsal1-1/+5
Since e56a8790a0 debugging test-execute failures has been a royal PITA, since we ditch all potentially useful output from the test units (that, for the most part, run `sh -x ...`). Let's improve the situation a bit by setting EXEC_OUTPUT_NULL only when running the single test case that needs it, and inheriting stdout otherwise.
For example, with a purposefully introduced error we get this output with this patch:
    exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
    Serializing sd-executor-state to memfd.
    ...
    Personality: x86-64
    LockPersonality: no
    SystemCallErrorNumber: kill
    ++ uname -m
    + c=x86_64
    + test x86_64 = foo_bar
    Received SIGCHLD from PID 1520588 (sh).
    Child 1520588 (sh) died (code=exited, status=1/FAILURE)
    exec-personality-x86-64.service: Child 1520588 belongs to exec-personality-x86-64.service.
    exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
    exec-personality-x86-64.service: Failed with result 'exit-code'.
    ...
    Exit Status: 1
    src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
    (test-execute-root) terminated by signal ABRT.
    Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
    Aborted
But without it, we'd miss the most important part:
    exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
    Serializing sd-executor-state to memfd.
    ...
    Personality: x86-64
    LockPersonality: no
    SystemCallErrorNumber: kill
    Received SIGCHLD from PID 1521365 (sh).
    Child 1521365 (sh) died (code=exited, status=1/FAILURE)
    exec-personality-x86-64.service: Child 1521365 belongs to exec-personality-x86-64.service.
    exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
    exec-personality-x86-64.service: Failed with result 'exit-code'.
    ...
    Exit Status: 1
    src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
    (test-execute-root) terminated by signal ABRT.
    Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
    Aborted
2024-04-06man: fix typo s/veno/reno/Vito Caputo1-1/+1
2024-04-05core/service: add a FIXME to use pidfd to monitor foreign processesMike Yuan1-2/+2
2024-04-05core/service: complain louder if new MAINPID= is refusedMike Yuan1-1/+1
2024-04-05core/service: make service_set_main_pidref consume pidrefMike Yuan1-20/+19
Currently, the memory management of service_set_main_pidref is a bit odd. Normally we either invalidate the original resource on the caller's side after the call succeeds, or just pass the ownership wholly. But service_set_main_pidref takes a pointer, and calls pidref_done() internally. Let's just make it consume the passed pidref. This is more straightforward.
2024-04-05sleep: rename SleepMemMode= to MemorySleepMode=Mike Yuan3-3/+3
Addresses https://github.com/systemd/systemd/pull/31986#discussion_r1554053623
2024-04-05os-util: use ENDSWITH_SET where appropriateMike Yuan1-9/+4
Addresses https://github.com/systemd/systemd/pull/31435#discussion_r1553969156 Co-authored-by: Lennart Poettering <lennart@poettering.net>
2024-04-05base-filesystem: check for __s390x__ firstFrantisek Sumsal1-2/+2
On s390x both __s390__ and __s390x__ are defined, and with the original order we'd go through the __s390__ branch and emit a warning:
    [169/2118] Compiling C object src/shared/libsystemd-shared-256.a.p/base-filesystem.c.o
    ../src/shared/base-filesystem.c:136:11: note: ‘#pragma message: Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links.’
      136 | # pragma message "Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links."
          | ^~~~~~~
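The ordering issue can be illustrated with a small preprocessor chain. The function below is a hypothetical sketch, not the code from base-filesystem.c; it only shows why the more specific macro has to be tested first.

```c
#include <assert.h>

/* Stores the name of the branch selected at compile time in *ret.
 * Because 64-bit s390x compilers define both __s390x__ and __s390__,
 * the __s390x__ test must come first; otherwise the generic __s390__
 * branch would be taken on s390x as well. */
static void example_arch_branch(const char **ret) {
    assert(ret);

#if defined(__s390x__)
    *ret = "s390x";
#elif defined(__s390__)
    *ret = "s390";
#else
    *ret = "other";
#endif
}
```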
2024-04-05test: account for build dir being under one of the tmpfs-ed directoriesFrantisek Sumsal1-1/+30
If we're running test-execute from the build directory which is under one of the tmpfs-ed directories (i.e. /root or /tmp), test-execute might behave strangely, since in that case manager_new() pins the system systemd-executor binary instead of the build dir one, which may lead to very confusing test failures (if there's enough difference between the system and built sd-executor binary). Let's account for that and bind-mount the build dir under the tmpfs-ed directory if necessary.
2024-04-05test: make test-fd-util more lenient when using fd_move_above_stdio()Frantisek Sumsal1-9/+13
On s390x this test fails when the SUT uses the z90crypt kernel module, as it's another FD the test doesn't account for:
    /* test_rearrange_stdio */
    Successfully forked off 'rearrange' as PID 57293.
    test_rearrange_stdio: r=0
    /proc/57293/fd:
    total 0
    lrwx------. 1 root root 64 Apr 5 06:18 0 -> /dev/pts/0
    lrwx------. 1 root root 64 Apr 5 06:18 1 -> /dev/pts/0
    lrwx------. 1 root root 64 Apr 5 06:18 2 -> /dev/pts/0
    lrwx------. 1 root root 64 Apr 5 06:18 3 -> /dev/z90crypt
    rearrange terminated by signal ABRT.
Debugging this was a pain, since the child process didn't log anything once we closed stdout/stderr (for obvious reasons). Let's fix both issues by switching logging to kmsg once we close stdin/stdout/stderr, and also by making the test work fine when there are some extra FDs in the child's environment.
2024-04-05sd-journal: fix check in `journal_file_verify_header()`Antonio Alvarez Feijoo1-3/+3
Fixes 6ea51363c8e39fb0924dda972a212936456a2b4f
2024-04-05log: fix commentFrantisek Sumsal1-1/+1
2024-04-05core: Serialize both pid and pidfd to keep downgrades workingDaan De Meyer5-17/+22
Currently, when downgrading from a version with pidfd support to a version without pidfd support, all information about running processes is lost, as the newer systemd will serialize pidfds which are not recognized by the older systemd when deserializing.
To improve the situation, let's serialize both the pid and the pidfd. This is safe because existing versions will either replace the first deserialized pidref with the second one or discard the second one in favor of the first one, depending on the unit and field. Older versions that don't support pidfds will silently discard any fields that contain a pidfd, as those will try to parse the field as a pid, and since a pidfd field will start with '@', those versions will log an error at debug level and ignore the value.
To make sure we reuse the existing pidfd as much as possible, the pidfd is serialized first. Both for scopes and service main pids, if the same pid is seen multiple times, the first pidref is kept. So by serializing the pidfd first we make sure the original pidfd is used instead of the new one which is opened when deserializing the first pid field.
For other control units, older versions with pidfd support will discard the first pidfd and replace it with a new pidfd from the second pid field. This is a slight regression on downgrades, but we make sure it doesn't happen for future versions (and older versions when this commit is backported) by modifying the logic to only use the first successfully deserialized pidref, so that the raw pid without pidfd is discarded instead of replacing the existing pidfd.
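The "keep the first successfully deserialized pidref" rule can be sketched as follows. Everything here is a hypothetical simplification: ExampleRef stands in for a pidref, and the only detail taken from the description above is that a pidfd field starts with '@' while a plain pid field is a bare number.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deserialized process reference. */
typedef struct ExampleRef {
    bool set;         /* true once a value has been accepted */
    bool from_pidfd;  /* true if the value came from an '@' field */
    long value;       /* pid, or fd number for pidfd fields */
} ExampleRef;

/* Parses a non-negative decimal number; anything else, including a
 * leading '@', is rejected. This is how a parser without pidfd support
 * would end up ignoring pidfd fields. */
static int example_parse_number(const char *s, long *ret) {
    char *end;
    long v;

    assert(s);
    assert(ret);

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0)
        return -EINVAL;

    *ret = v;
    return 0;
}

/* Accepts either "@N" (pidfd) or "N" (pid). Only the first successfully
 * parsed field is stored; later fields are parsed but then dropped. */
static int example_deserialize(ExampleRef *ref, const char *field) {
    bool is_pidfd;
    long v;
    int r;

    assert(ref);
    assert(field);

    is_pidfd = field[0] == '@';
    r = example_parse_number(is_pidfd ? field + 1 : field, &v);
    if (r < 0)
        return r;
    if (!is_pidfd && v == 0)
        return -EINVAL; /* 0 is not a valid pid */

    if (ref->set)
        return 0; /* keep the first reference, ignore this one */

    *ref = (ExampleRef) { .set = true, .from_pidfd = is_pidfd, .value = v };
    return 0;
}
```

Feeding "@7" followed by "1234" leaves the pidfd-based reference in place, which corresponds to serializing the pidfd first.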
2024-04-05meson: set -fno-ssa-phiopt when building bpf with gccLuca Boccassi1-0/+1
There are bugs in the kernel verifier that cause legitimate code to be rejected. Disabling this optimization makes bpf programs built with a new enough gcc work again. Fixes https://github.com/systemd/systemd/issues/31888
2024-04-05hwdb: fix missing colon (#32108)Kirk1-1/+1
Missing colon prevents this from working correctly on the Chuwi UBook X and UBook X Pro.
2024-04-04udevadm-test: also show security labels if specifiedYu Watanabe1-0/+7
Follow-up for 03b6879f4d45c49264708aef872fd05af30ddcf0.
2024-04-04backlight: fix detection of multiple graphic cardsYu Watanabe1-0/+4
Follow-up for e0504dd011189d97a1ea813aabfe1e696742bcf5. Hopefully, devices in the PCI subsystem have some properties, and thus have their udev database file. But that may not be true. Here, we only read sysattrs of enumerated devices, hence it is not necessary to check whether the device is initialized.
2024-04-04udev: do not update sysattr and sysctl value on testingYu Watanabe1-12/+21
Follow-up for 089bef66316e5bdc91b9984148e5a6455449c1da.
2024-04-04man/kernel-command-line: document resume_offset= tooMike Yuan1-0/+10
2024-04-04hibernate-util: say "HibernateLocation EFI variable" consistentlyMike Yuan1-1/+1
2024-04-04udevadm-test: insert missing line breakYu Watanabe1-1/+1
Addresses post-merge comment: https://github.com/systemd/systemd/commit/03b6879f4d45c49264708aef872fd05af30ddcf0#r140587790
2024-04-04TEST-50: add tests for riscv{32,64}Zbigniew Jędrzejewski-Szmek1-5/+15
Requested for the testing of F40 riscv bringup. Numbers copied from https://uapi-group.org/specifications/specs/discoverable_partitions_specification/. It'd be nice to do the same in TEST-58, but the code there is rather involved and I don't have a system to test on. We can probably try that later on when F40 is available.
2024-04-04Fixed resolution for pen and touchpadmkubiak1-3/+5
2024-04-04network/ndisc: drop NDisc configurations when received NA without Router flagYu Watanabe2-4/+157
Closes #28421.
2024-04-04test-ndisc: add basic tests for Neighbor Advertisement handlingYu Watanabe1-3/+117
2024-04-04sd-ndisc: add basic support of Neighbor Advertisement messageYu Watanabe8-2/+240
This adds basic support for receiving and parsing the Neighbor Advertisement message defined in RFC 4861.
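For reference, the fixed part of a Neighbor Advertisement as laid out in RFC 4861 section 4.4 is 24 bytes: the ICMPv6 type (136), code (0) and checksum, then the R, S and O flag bits followed by reserved bits, then the 16-byte target address. The sketch below parses just that fixed part; the type and function names are hypothetical and not the sd-ndisc API, options are ignored, and the checksum is not verified.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_ND_NEIGHBOR_ADVERT 136 /* ICMPv6 type of a Neighbor Advertisement */
#define EXAMPLE_NA_MIN_SIZE 24         /* 4 header + 4 flags/reserved + 16 target */

/* Hypothetical parsed representation of the fixed NA fields. */
typedef struct ExampleNeighborAdvert {
    bool router;      /* R flag: sender is a router */
    bool solicited;   /* S flag: sent in response to a solicitation */
    bool override;    /* O flag: override existing cache entry */
    uint8_t target[16];
} ExampleNeighborAdvert;

/* Parses the fixed part of a Neighbor Advertisement from buf.
 * Returns 0 on success, -EBADMSG if the message is too short or
 * does not carry the expected type and code. */
static int example_parse_na(const uint8_t *buf, size_t len, ExampleNeighborAdvert *ret) {
    assert(buf);
    assert(ret);

    if (len < EXAMPLE_NA_MIN_SIZE)
        return -EBADMSG;
    if (buf[0] != EXAMPLE_ND_NEIGHBOR_ADVERT || buf[1] != 0)
        return -EBADMSG;

    /* The three flags are the top bits of the byte after the checksum. */
    ret->router    = (buf[4] & 0x80) != 0;
    ret->solicited = (buf[4] & 0x40) != 0;
    ret->override  = (buf[4] & 0x20) != 0;
    memcpy(ret->target, buf + 8, sizeof(ret->target));

    return 0;
}
```

A message parsed this way with the router field unset is the kind of "NA without Router flag" that the networkd change listed above reacts to.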