| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Without this change, the fd is closed twice on failure.
Fixes a bug introduced by dff9808a628c31b7ecb1f1aba8fdc3be06ce8372.
Fixes #35288.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, get_fixed_user() employs USER_CREDS_SUPPRESS_PLACEHOLDER,
meaning home path is set to NULL if it's empty or root. However,
the path is also used for applying WorkingDirectory=~, and we'd
spuriously use the invoking user's home as fallback even if
User= is changed in that case.
Let's instead delegate such suppression to build_environment(),
so that home is proper initialized for usage at other steps.
shell doesn't actually suffer from such problem, but it's changed
too for consistency.
Alternative to #34789
|
|
|
|
|
| |
Assign exit_status at the same site where error log is emitted,
for readability.
|
|
|
|
|
|
| |
into its own flag
No functional change, preparation for later commits.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use the $WATCHDOG_USEC variable for two very closely uses: as part of
the sd_watchdog_enabled() protocol for implementing service watchdogs.
And as part of the protocol between the service manager and
systemd-shutdown across the PID 1 execve() transition during shutdown.
Apparently some exitrds tools got confused by the latter use. Let's
address that by setting $WATCHDOG_PID to 1, in accordance to the
sd_watchdog_enabled() protocol to make clear this is only intended for
PID 1 and nothing else.
Replaces: #35135
|
| |
|
|\
| |
| |
| |
| | |
This is a follow for https://github.com/systemd/systemd/pull/34853. In
particular, this comment
https://github.com/systemd/systemd/pull/34853#discussion_r1825837705.
|
| | |
|
| |
| |
| |
| |
| |
| | |
Follow-up for 406f1775017a5631bc91a1f53ac5e50f4fbfac0c.
Closes CID#1564897.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since v256 we completely fail to boot if v1 is configured. Fedora 41 was just
released with v256.7 and this is probably the first major exposure of users to
this code. It turns out not work very well. Fedora switched to v2 as default in
F31 (2019) and at that time some people added configuration to use v1 either
because of Docker or for other reasons. But it's been long enough ago that
people don't remember this and are now very unhappy when the system refuses to
boot after an upgrade.
Refusing to boot is also unnecessarilly punishing to users. For machines that
are used remotely, this could mean somebody needs to physically access the
machine. For other users, the machine might be the only way to access the net
and help, and people might not know how to set kernel parameters without some
docs. And because this is in systemd, after an upgrade all boot choices are
affected, and it's not possible to e.g. select an older kernel for boot. And
crashing the machine doesn't really serve our goal either: we were giving a
hint how to continue using v1 and nothing else.
If the new override is configured, warn and immediately boot to v1.
If v1 is configured w/o the override, warn and wait 30 s and boot to v2.
Also give a hint how to switch to v2.
https://bugzilla.redhat.com/show_bug.cgi?id=2323323
https://bugzilla.redhat.com/show_bug.cgi?id=2323345
https://bugzilla.redhat.com/show_bug.cgi?id=2322467
https://www.reddit.com/r/Fedora/comments/1gfcyw9/refusing_to_run_under_cgroup_01_sy_specified_on/
The advice is to set systemd.unified_cgroup_hierarchy=1 (instead of removing
systemd.unified_cgroup_hierarchy=0). I think this is easier to convey. Users
who are understand what is going on can just remove the option instead.
The caching is dropped in cg_is_legacy_wanted(). It turns out that the
order in which those functions are called during early setup is very fragile.
If cg_is_legacy_wanted() is called before we have set up the v2 hierarchy,
we incorrectly cache a true answer. The function is called just a handful
of times at most, so we don't really need to cache the response.
|
|
|
|
| |
For justification, see 3f9a0a522f2029e9295ea5e9984259022be88413.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new setting allows unsharing the pid namespace in a unit. Because
you have to fork to get a process into a pid namespace, we fork in
systemd-executor to get into the new pid namespace. The parent then
sends the pid of the child process back to the manager and exits while
the child process continues on with the rest of exec_invoke() and then
executes the actual payload.
Communicating the child pid is done via a new pidref socket pair that is
set up on manager startup.
We unshare the PID namespace right before the mount namespace so we
mount procfs correctly. Note PrivatePIDs=yes always implies MountAPIVFS=yes
to mount procfs.
When running unprivileged in a user session, user namespace is set up first
to allow for PID namespace to be unshared. However, when running in
privileged mode, we unshare the user namespace last to ensure the user
namespace does not own the PID namespace and cannot break out of the sandbox.
Note we disallow Type=forking services from using PrivatePIDs=yes since the
init proess inside the PID namespace must not exit for other processes in
the namespace to exist.
Note Daan De Meyer did the original work for this commit with Ryan Wilson
addressing follow-ups.
Co-authored-by: Daan De Meyer <daan.j.demeyer@gmail.com>
|
| |
|
|
|
|
|
|
|
|
| |
The names of these conflict with macros from efi.h that we'll move
to efi-fundamental.h in a later commit. Let's avoid the conflict by
getting rid of these helpers. Arguably this also improves readability
by clearly indicating we're passing arbitrary strings and not constants
to the macros when we invoke them.
|
| |
|
|\
| |
| | |
Fixes https://github.com/systemd/systemd/issues/34758
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The goal of RestartMode=direct is to make restarts invisible
to dependents, so auto restart jobs shouldn't bring them down
at all. So far we only skipped going through failed/dead states
in service_enter_dead(), i.e. the unit would never be considered
dead. But when constructing restart transaction, the stop job
would be propagated to dependents. Consider the following 2 units:
dependent.target:
[Unit]
BindsTo=a.service
After=a.service
a.service:
[Service]
ExecStart=bash -c 'sleep 100 && exit 1'
Restart=on-failure
RestartMode=direct
Before this commit, even though BindsTo= isn't triggered since
a.service never failed, when a.service auto-restarts, dependent.target
is also restarted. Let's suppress it by using JOB_REPLACE instead of
JOB_RESTART_DEPENDENCIES in service_enter_restart().
Fixes #34758
The example above is subtly different from the original report,
to illustrate that the new behavior makes sense for less exotic
use cases too.
|
| |
| |
| |
| | |
TRANSACTION_REENQUEUE_ANCHOR
|
| |
| |
| |
| |
| |
| | |
TransactionAddFlags
No functional change. Preparation for later commits.
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Restart jobs are always run as stop jobs initially, and later gets
converted to start jobs by job engine. Hence UNIT_ATOM_PROPAGATE_STOP
should and does cover the restart case, as currently all dep types
with _RESTART also carries _STOP. Drop UNIT_ATOM_PROPAGATE_RESTART.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| | |
When an exec directory is shared between services, this allows one of the
service to be the producer of files, and the other the consumer, without
letting the consumer modify the shared files.
This will be especially useful in conjunction with id-mapped exec directories
so that fully sandboxed services can share directories in one direction, safely.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
By default mount(8), umount(8), swapon(8) and swapoff(8) should run with
with the SMACK label inherited from systemd rather than the default one
meant for services.
Fixes: aa5ae9711ef3cd0c69b7fcfbd65bca05fb704a8a
Follow-up-for: 20bbf5ee4c6c80599a91e7a4b7474e931a27db4a
|
| |
| |
| |
| |
| |
| | |
Let's make ConfigurationDirectory= a bit less "special-casey", by hiding
the fact that it's the only per-service dir we do not do chown()ing for
inside of a new EXEC_DIRECTORY_TYPE_SHALL_CHOWN() helper.
|
| |
| |
| |
| | |
These serve as race-free alternatives for MAINPID= notification.
|
| |
| |
| |
| |
| |
| |
| |
| | |
This applies the existing SocketUser=/SocketGroup= options to units
defining a POSIX message queue, bringing them in line with UNIX
sockets and FIFOs. They are set on the file descriptor rather than
a file system path because the /dev/mqueue path interface is an
optional mount unit.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit adds two settings private and strict to
the ProtectControlGroups= property. Private will unshare the cgroup
namespace and mount a read-write private cgroup2 filesystem at /sys/fs/cgroup.
Strict does the same except the mount is read-only. Since the unit is
running in a cgroup namespace, the new root of /sys/fs/cgroup is the unit's
own cgroup.
We also add a new dbus property ProtectControlGroupsEx which accepts strings
instead of boolean. This will allow users to use private/strict via dbus
and systemd-run in addition to service files.
Note private and strict fall back to no and yes respectively if the kernel
doesn't support cgroup2 or system is not using unified hierarchy.
Fixes: #34634
|
|/
|
|
|
|
|
| |
This commit refactors ProtectControlGroups= from using a boolean
in the dbus/execute backend to using an enum. There is no functional
change but this will allow adding new non-boolean values (e.g. strict,
private) a la PrivateHome.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A colleague reported when RootDirectory= does not exist, systemd reports an error like:
```
Failed to set up mount namespacing: No such file or directory
```
Unfortunately, with large spec files, it can be hard to diagnose which path systemd is talking
about. Thus, to make the error message more helpful and similar to mount error messages, we add
the root directory/image path into the error message like:
```
Failed to set up mount namespacing: /tmp/thisdoesnotexist: No such file or directory
```
|
|
|
|
|
| |
This is a non-functional change to ensure error_path used to print out the
offending mount causing an error follows coding guidelines.
|
|
|
|
|
|
|
|
|
| |
No functional change, but let's print yes/no rather than on/off in systemd-analyze.
Similar to 2e8a581b9cc1132743c2341fc334461096266ad4 and
edd3f4d9b7a63dc9a142ef20119e80d1d9527f2f.
(Note, the commit messages of those commits are wrong, as
parse_boolean() supports on/off anyway.)
|
|
|
|
| |
(#34893)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
even if no user is specified explicitly
When PAMName= is set this should be enough to go through our entire user
changing story, so that PAM is definitely run, and environment variables
definitely pulled in and so on.
Previously, it would happen that under some circumstances we might no do
this when transitioning from root to root itself even though PAM was
enabled.
Fixes: #34682
|
|\
| |
| | |
core: follow-ups for live mount
|
| |
| |
| |
| |
| |
| | |
* Use SD_BUS_ERROR_NOT_SUPPORTED where appropriate
* Use Service object in service_can_live_mount()
* Include errno in bus error message
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
service_enter_running() would re-arm timer for RuntimeMaxSec=,
hence it should be called instead of disabling timer completely
when live mount operation fails, in a similar fashion as
service_enter_reload_by_notify().
|
| |
| |
| |
| |
| | |
that combines updating Service.live_mount_result and
service_mount_request_reply()
|
| | |
|
| | |
|
|\ \
| | |
| | | |
core/namespace: make ProtectHome=tmpfs makes /home and friends read-only as documented
|
| | |
| | |
| | |
| | | |
with .read_only = true
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
entries
Otherwise, ProtectHome=tmpfs makes /home/ and friends not read-only.
Also, mount options for /run/ specified in MountAPIVFS=yes are not
applied.
The function append_static_mounts() was introduced in
5327c910d2fc1ae91bd0b891be92b30379c7467b, but at that time, there were
neither .read_only nor .options in the struct. But, when later the
struct is extended, the function was not updated and they were not
copied from the static table.
The fields has been used in static tables since
e4da7d8c796a1fd11ecfa80fb8a48eac9e823f06, and also in
94293d65cd4125347e21b3e423d0e245226b1be2.
Fixes #34825.
|
|/ /
| |
| |
| |
| |
| | |
Call setup_smack() also when only fallback_smack_process_label is set.
Fixes: 75689fb2d41f
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
WRITE_STRING_FILE_LABEL flag
Given that we have the LabelOps abstraction these days, we can teach
write_string_file() to use it, which means we can get rid of
fileio-label.[ch] as a separate concept.
(The only reason that fileio-label.[ch] exists independently of
fileio.[ch] was that the former linekd to libselinux potentially, and
thus had to be in src/shared/ while the other always was in src/basic/.
But the LabelOps vtable provides us with a nice work-around)
|
|\
| |
| | |
modernize the ask-password logic, and add unpriv askpw agents to the concept
|
| | |
|