path: root/src/basic/unit-def.c
Commit message | Author | Age | Files | Lines
* man: consolidate list of active unit states into a shared table | Luca Boccassi | 2024-10-04 | 1 | -0/+1
  Avoids the need to maintain the same list over and over again, and links it to the definition table in the implementation as a reminder too.
* core: do BindMount/MountImage operations in async control process | Luca Boccassi | 2024-08-29 | 1 | -0/+3
  These operations might require slow I/O, and thus might block PID1's main loop for an indeterminate amount of time. Instead of performing them inline, fork a worker process and stash away the D-Bus message, and reply once we get a SIGCHLD indicating they have completed. That way we don't break compatibility and callers can continue to rely on the fact that when they get the method reply the operation either succeeded or failed.

  To keep backward compatibility, unlike reload control processes, these are run inside init.scope and not the target cgroup. Unlike ExecReload, this is under our control and is not defined by the unit. This is necessary because previously the operation also wasn't run from the target cgroup, so suddenly forking a copy-on-write copy of pid1 into the target cgroup will make memory usage spike, and if there is a MemoryMax= or MemoryHigh= set and the cgroup is already close to the limit, it will cause an OOM kill, where previously it would have worked fine.
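The fork-and-reply-later pattern described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `sketch_slow_operation()`, `sketch_spawn_worker()` and `sketch_collect_result()` are hypothetical names, the slow operation is a stub, and the result is collected with a blocking `waitpid()` where a real manager would react to SIGCHLD from its event loop and then send the stashed D-Bus reply.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a slow mount operation: 0 on success, -1 on failure. */
static int sketch_slow_operation(int should_fail) {
        return should_fail ? -1 : 0;
}

/* Fork a worker so the caller's main loop is not blocked by the slow operation.
 * Returns the worker PID in the parent, or -1 if fork() failed. */
static pid_t sketch_spawn_worker(int should_fail) {
        pid_t pid = fork();

        if (pid == 0)
                _exit(sketch_slow_operation(should_fail) == 0 ? 0 : 1);
        return pid;
}

/* Reap the worker and translate its exit status into the result that the
 * deferred method reply would carry. Blocking here is a simplification. */
static int sketch_collect_result(pid_t pid) {
        int status;

        assert(pid > 0);
        if (waitpid(pid, &status, 0) < 0)
                return -1;
        return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

The caller only learns the outcome once the worker has exited, which mirrors the guarantee in the message that a method reply implies the operation has either succeeded or failed.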
* core,unit-def: use our usual way of asserting enums | Mike Yuan | 2024-07-17 | 1 | -1/+3
* various: move ptr indicator to return value | Zbigniew Jędrzejewski-Szmek | 2024-06-19 | 1 | -1/+1
* various: move const ptr indicator to return value | Zbigniew Jędrzejewski-Szmek | 2024-06-19 | 1 | -1/+1
* unit-def: append trailing comma for the last entry | Yu Watanabe | 2024-03-29 | 1 | -7/+7
* core: Rework recursive freeze/thaw | Adrian Vovk | 2024-01-30 | 1 | -4/+23
  This commit overhauls the way freeze/thaw works recursively:

  First, it introduces new FreezerActions that are like the existing FREEZE and THAW but indicate that the action was initiated by a parent unit. We also refactored the code to pass these FreezerActions through the whole call stack so that we can make use of them. FreezerState was extended similarly, to be able to differentiate between a unit that's frozen manually and a unit that's frozen because a parent is frozen.

  Next, slices were changed to check recursively that all their child units can be frozen before attempting to freeze them. This is different from the previous behavior, which would just check if the unit's type supported freezing at all. This cleans up the code, and also ensures that the behavior of slices corresponds to the unit's actual ability to be frozen.

  Next, we make it so that if you FREEZE a slice, it'll PARENT_FREEZE all of its children. Similarly, if you THAW a slice it will PARENT_THAW its children.

  Finally, we use the new states available to us to refactor the code that actually does the cgroup freezing. The code now looks at the unit's existing freezer state and the action being requested, and decides what next state is most appropriate. Then it puts the unit in that state. For instance, a RUNNING unit with a request to PARENT_FREEZE will put the unit into the PARENT_FREEZING state. As another example, a FROZEN unit whose parent is also FROZEN will transition to PARENT_FROZEN in response to a request to THAW.

  Fixes https://github.com/systemd/systemd/issues/30640
  Fixes https://github.com/systemd/systemd/issues/15850
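The "existing state plus requested action gives next state" logic can be sketched as a small pure function. This is a simplified model, not the code from the commit: the `Sk*` names are hypothetical, the function only accepts settled input states, and the rule that a unit moving between two frozen variants skips the transitional states is an assumption made for the sketch. It does reproduce the two examples given in the message.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
        SK_ACTION_FREEZE,
        SK_ACTION_THAW,
        SK_ACTION_PARENT_FREEZE,
        SK_ACTION_PARENT_THAW,
} SkFreezerAction;

typedef enum {
        SK_RUNNING,
        SK_FREEZING,
        SK_FROZEN,
        SK_PARENT_FREEZING,
        SK_PARENT_FROZEN,
        SK_THAWING,
} SkFreezerState;

/* Decide the next state from a settled current state, whether the parent is
 * currently frozen, and the requested action. */
static SkFreezerState sk_next_state(SkFreezerState cur, bool parent_frozen, SkFreezerAction action) {
        bool by_user, by_parent, was_frozen;

        assert(cur == SK_RUNNING || cur == SK_FROZEN || cur == SK_PARENT_FROZEN);

        by_user = cur == SK_FROZEN;          /* frozen on explicit request */
        by_parent = parent_frozen;           /* frozen because a parent is frozen */
        was_frozen = cur != SK_RUNNING;

        if (action == SK_ACTION_FREEZE)
                by_user = true;
        else if (action == SK_ACTION_THAW)
                by_user = false;
        else if (action == SK_ACTION_PARENT_FREEZE)
                by_parent = true;
        else
                by_parent = false;

        if (!by_user && !by_parent)
                return was_frozen ? SK_THAWING : SK_RUNNING;
        if (was_frozen)
                return by_user ? SK_FROZEN : SK_PARENT_FROZEN;
        return by_user ? SK_FREEZING : SK_PARENT_FREEZING;
}
```

Tracking the two reasons for being frozen separately is what lets a manual THAW leave a unit frozen while its parent slice is still frozen.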
* Add unit_type_to_capitalized_string() | Daan De Meyer | 2023-10-20 | 1 | -0/+8
* Revert "mount: check right before invoking /bin/umount if it makes sense" | Yu Watanabe | 2023-08-14 | 1 | -1/+0
  This reverts commit 1483892a421ca34bc841a8e8b1f385744c0407ed.

  As the commit says, it does not solve the race. Moreover, it introduces a regression, #28410.

  Also, checking by `path_is_mount_point()` may trigger automount. From statx(2),

  > AT_NO_AUTOMOUNT
  > Don't automount the terminal ("basename") component of pathname
  > if it is a directory that is an automount point.

  Similar statements can be found in fstatat(2), which is used in the fallback call for statx() in glibc, and name_to_handle_at(2), which is used as the fallback when statx() failed. So, `path_is_mount_point()` may indeed trigger automount for parent paths. That should be avoided, especially on shutdown.

  The original issue #25527 that is 'fixed' by the commit is not serious, and should be fixed by making the umount command handle the path gracefully: https://github.com/util-linux/util-linux/issues/2132

  Fixes #28410.
* core: introduce a new job mode JOB_RESTART_DEPENDENCIES | Lennart Poettering | 2023-07-03 | 1 | -0/+1
  This new job mode will enqueue a start job for a unit, and all units depending on the unit will get a restart job enqueued.

  This is then used for automatic service restarts: the unit itself is only started, the depending units restarted. This way the unit will not go down unnecessarily, triggering OnSuccess= needlessly.

  This also introduces a new state SERVICE_AUTO_RESTART_QUEUED that is entered once the restart jobs are enqueued. Previously we'd stay in SERVICE_AUTO_RESTART, but that's problematic, since we'd lose information whether we still need to enqueue the restart job during a serialization/deserialization cycle or not. By having an explicit state for this we know exactly whether we still need to enqueue the job or not.

  It's also good since when we are in SERVICE_AUTO_RESTART_QUEUED we want to act on unit_start(), but on SERVICE_AUTO_RESTART we want to wait for the holdoff time to pass before we act on unit_start().

  Fixes: #27722
* mount: check right before invoking /bin/umount if it makes sense | Lennart Poettering | 2023-05-24 | 1 | -0/+1
  Notifications from /proc/self/mountinfo are async, so if we stop a service (and while doing so get rid of the credentials mount point of it), then it will take a while until the notification reaches us and we actually scan the table again. In particular as we nowadays ratelimit notifications on the table, since it's so inefficient. And as I learnt the ratelimiting is actually quite regularly hit during shutdown, where a flurry of umount events is generated.

  Hence, let's check if a mount point is actually a mountpoint before trying to unmount it. And if it isn't let's wait for the notification to come in.

  (This race might be triggered not just by us on ourselves btw: there are other daemons that unmount stuff when stopping where the race also exists, but might simply be harder to trigger: if during service shutdown these services remove some mount then they might collide with us doing the same. After all, we have the rule to unmount everything mounted automatically for you during shutdown.)

  In the long run we should also start making use of this when it becomes available: https://github.com/util-linux/util-linux/issues/2132

  With that we can make issues like this go away entirely from our side of things at least.

  Fixes: #25527
* service: add ability to pin fd store | Lennart Poettering | 2023-04-13 | 1 | -0/+1
  Oftentimes it is useful to allow the per-service fd store to survive longer than for a restart. This is useful in various scenarios:

  1. An fd to some security relevant object needs to be stashed somewhere, that should not be cleaned automatically, because the security enforcement would be dropped then.

  2. A user namespace fd should be allocated on first invocation and be kept around until the user logs out (i.e. systemd --user ends), à la #16328 (This does not implement what #16318 asks for, but should solve the use-case discussed there.)

  3. There's interest in allowing a concept of "userspace reboots" where the kernel stays running, and userspace is swapped out (i.e. all services exit, and the rootfs transitioned into a new version of it) while keeping some select resources pinned, very similar to how we implement a switch root. Thus it is useful to allow services to exit, while leaving their fds around till the very end.

  This is exposed through a new FileDescriptorStorePreserve= setting that is closely modelled after RuntimeDirectoryPreserve= (in fact it reuses the same internal type), since we want similar behaviour in the end, and quite often they probably want to be used together.
* pid1: introduce new SERVICE_{DEAD|FAILED}_BEFORE_AUTO_RESTART service substates | Lennart Poettering | 2023-03-29 | 1 | -21/+23
  When a service deactivates and is then automatically restarted via Restart= we currently quickly transition through SERVICE_DEAD/SERVICE_FAILED, which is weird given it's not the normal ("permanent") dead/failed state, but a transitory one we immediately leave from again.

  We do this so that software that looks for failures/successes can take notice, even if we restart as a consequence of the deactivation.

  Let's clean this up a bit: let's introduce two new states, SERVICE_DEAD_BEFORE_AUTO_RESTART and SERVICE_FAILED_BEFORE_AUTO_RESTART, that are used for the transitory states. Both SERVICE_DEAD and SERVICE_DEAD_BEFORE_AUTO_RESTART will map to the high-level UNIT_INACTIVE state though (and similarly for the respective failed states). This means the high-level state machine won't change by this, only the low-level one.

  This clearly separates the substates, which makes the state engine cleaner, and allows clients to follow precisely whether we are in a transitory dead/failed state, or a permanent one, by looking at the service substate. Moreover it allows us to remove the 'n_keep_fd_store' which so far we used to ensure the fdstore was not released during this transitory dead/failed state but only during the permanent one. Since we can now distinguish these states properly we can just use that.

  This has been bugging me for a while. Let's clean this up.

  Note that the unit restart logic is already nicely covered in the testsuite, hence this adds no new tests for that.

  And yes, this could be considered a compat break, but so far we took the liberty to make changes to the low-level state machine (i.e. SERVICE_xyz states, sometimes called "substates") without considering this a bad breakage, while the high-level state machine (i.e. UNIT_xyz states) should be considered API that cannot be changed.
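The relationship between the low-level substates and the high-level state described above can be sketched as follows. The `Sk*` enums are hypothetical and cover only the four substates named in the message; the mapping of the failed substates to a failed high-level state is inferred from the "similarly for the respective failed states" remark.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
        SK_SERVICE_DEAD,
        SK_SERVICE_DEAD_BEFORE_AUTO_RESTART,
        SK_SERVICE_FAILED,
        SK_SERVICE_FAILED_BEFORE_AUTO_RESTART,
        _SK_SERVICE_STATE_MAX,
} SkServiceState;

typedef enum {
        SK_UNIT_INACTIVE,
        SK_UNIT_FAILED,
} SkUnitActiveState;

/* Both the permanent and the transitory substate collapse into the same
 * high-level state, so the high-level state machine is unchanged. */
static SkUnitActiveState sk_service_to_active_state(SkServiceState s) {
        assert((int) s >= 0 && s < _SK_SERVICE_STATE_MAX);

        if (s == SK_SERVICE_FAILED || s == SK_SERVICE_FAILED_BEFORE_AUTO_RESTART)
                return SK_UNIT_FAILED;
        return SK_UNIT_INACTIVE;
}

/* Only the substate tells a client whether an automatic restart will follow. */
static bool sk_service_state_is_transitory(SkServiceState s) {
        return s == SK_SERVICE_DEAD_BEFORE_AUTO_RESTART ||
               s == SK_SERVICE_FAILED_BEFORE_AUTO_RESTART;
}
```

A client that only watches the high-level state sees no difference, while one that inspects the substate can tell the transitory case from the permanent one.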
* pid1: add new Type=notify-reload service type | Lennart Poettering | 2023-01-10 | 1 | -0/+2
  Fixes: #6162
* scope: allow unprivileged delegation on scopes | Michal Sekletar | 2022-08-04 | 1 | -0/+1
  Previously it was possible to set the delegate property for a scope, but you were not able to allow an unprivileged process to manage the scope's cgroup hierarchy. This is useful when launching a manager process that will run unprivileged but is supposed to manage its own (scope) sub-hierarchy.

  Fixes #21683
* unit-def: align string tables | Yu Watanabe | 2022-07-11 | 1 | -125/+125
* core: implement Uphold= dependency type | Lennart Poettering | 2021-05-25 | 1 | -0/+2
  This is like a really strong version of Wants=, that keeps starting the specified unit if it is ever found inactive.

  This is an alternative to Restart= inside a unit, acknowledging the fact that whether to keep restarting the unit is sometimes not a property of the unit itself but the state of the system.

  This implements a part of what #4263 requests, i.e. there's no distinction between "always" and "opportunistic". We just dumbly implement "always" and become active whenever we see no job queued for an inactive unit that is supposed to be upheld.
* core: add new OnSuccess= dependency type | Lennart Poettering | 2021-05-25 | 1 | -0/+2
  This is similar to OnFailure= but is activated whenever a unit returns into inactive state successfully.

  I was always afraid of adding this, since it effectively allows building loops and makes our engine Turing complete, but it pretty much already was, it was just hidden.

  Given that we have per-unit ratelimits as well as an event loop global ratelimit I feel safe to add this finally, given it actually is useful.

  Fixes: #13386
* core: add new PropagateStopTo= dependency (and inverse) | Lennart Poettering | 2021-05-25 | 1 | -0/+2
  This takes inspiration from PropagatesReloadTo=, but propagates stop jobs instead of restart jobs.

  This is defined based on exactly two atoms: UNIT_ATOM_PROPAGATE_STOP + UNIT_ATOM_RETROACTIVE_STOP_ON_STOP. The former ensures that when the unit the dependency is originating from is stopped based on user request, we'll propagate the stop job to the target unit, too. In addition, when the originating unit suddenly stops from external causes the stopping is propagated too. Note that this does *not* include the UNIT_ATOM_CANNOT_BE_ACTIVE_WITHOUT atom (which is used by BoundBy=), i.e. this dependency is purely about propagating "edges" and not "levels", i.e. it's about propagating specific events, instead of continuous states.

  This is supposed to be useful for dependencies between .mount units and their backing .device units. So far we either placed a BindsTo= or Requires= dependency between them. The former gave a very clear binding of the two units together, however it was problematic if users established mounts manually with different block device sources than our configuration defines, as we might then come to the conclusion that the backing device was absent and thus we need to umount again what the user mounted. By combining Requires= with the new StopPropagatedFrom= (i.e. the inverse of PropagateStopTo=) we can get behaviour that matches BindsTo= in every single atom but one: UNIT_ATOM_CANNOT_BE_ACTIVE_WITHOUT is absent, and hence the level-triggered logic doesn't apply.

  Replaces: #11340
* core: add a reverse dep for OnFailure= | Lennart Poettering | 2021-05-25 | 1 | -0/+1
  Let's add an implicit reverse dep OnFailureOf=. This is exposed via the bus to make things more debuggable: you can now ask systemd for which units a specific unit is the failure handler.

  OnFailure= was the only dependency type that had no inverse, this fixes that.

  Now that deps are a bit cheaper, it should be OK to add deps that only serve debug purposes.
* core: convert Slice= into a proper dependency (and add a back dependency) | Lennart Poettering | 2021-05-25 | 1 | -0/+2
  The slice a unit is assigned to is currently a UnitRef reference. Let's turn it into a proper dependency, to simplify and clean up code a bit. Now that new dep types are cheaper, deps should generally be preferable over everything else, if the concept applies.

  This brings one major benefit: we often have to iterate through all units a slice contains. So far we iterated through all Before= dependencies of the slice unit to achieve that, filtering out unrelated units, and taking benefit of the fact that slice units are implicitly ordered Before= the units they contain. By making Slice= a proper dependency, and having an accompanying SliceOf= dependency type, this is much simpler and nicer as we can directly enumerate the units a slice contains.

  The forward dependency is actually called InSlice internally, since we already used the UNIT_SLICE name as UnitType field. However, since we don't intend to expose the dependency to users as a dep anyway (we already have the regular Slice D-Bus property for this) this shouldn't matter.

  The SliceOf= implicit dependency type (the reverse of Slice=/InSlice=) is exported over the bus, to make things a bit nicer to debug and discoverable.
* core: rework unit_active_state_to_glyph() to use a translation table | Lennart Poettering | 2021-04-08 | 1 | -18/+15
  Let's make this a bit more readable by implementing this via a translation table, indexed by the state.
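A translation table indexed by the state, as mentioned in this message, typically looks like the sketch below. The `Sk*` enum and the ASCII glyphs are placeholders chosen for illustration and are not the values used by the actual function.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
        SK_ACTIVE,
        SK_RELOADING,
        SK_INACTIVE,
        SK_FAILED,
        SK_ACTIVATING,
        SK_DEACTIVATING,
        SK_MAINTENANCE,
        _SK_ACTIVE_STATE_MAX,
} SkActiveState;

/* One entry per state; designated initializers keep the table readable and
 * independent of the order in which the entries are written. */
static const char *const sk_glyph_table[_SK_ACTIVE_STATE_MAX] = {
        [SK_ACTIVE]       = "*",
        [SK_RELOADING]    = "~",
        [SK_INACTIVE]     = "o",
        [SK_FAILED]       = "x",
        [SK_ACTIVATING]   = "+",
        [SK_DEACTIVATING] = "-",
        [SK_MAINTENANCE]  = "m",
};

static const char *sk_active_state_to_glyph(SkActiveState state) {
        assert((int) state >= 0 && state < _SK_ACTIVE_STATE_MAX);
        return sk_glyph_table[state];
}
```

Compared with a chain of conditionals, adding a state only requires one new table row, and a missing row shows up as a NULL entry that is easy to test for.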
* core: add Unit.Markers property | Zbigniew Jędrzejewski-Szmek | 2021-02-15 | 1 | -0/+7
  The property is never set by systemd, only reset after a stop or restart or reload. It may externally be set to mark the unit for a later restart/reload.

  I wasn't sure whether to configure the property only for the types where this makes sense (Service, Swap, etc). But the Restart() method is defined on the unit, and also having this always under the same property name is more convenient.
* feature: display status with a different shape depending on the status (#17728) | Jiehong | 2021-01-22 | 1 | -0/+21
* license: LGPL-2.1+ -> LGPL-2.1-or-later | Yu Watanabe | 2020-11-09 | 1 | -1/+1
* core: let user define start-/stop-timeout behaviour | Jan Klötzke | 2020-06-09 | 1 | -0/+1
  The usual behaviour when a timeout expires is to terminate/kill the service. This is what users usually want in production systems. To debug services that fail to start/stop (especially sporadic failures) it might be necessary to trigger the watchdog machinery and write core dumps, though. Likewise, it is usually just a waste of time to gracefully stop a stuck service. Instead it might save time to go directly into kill mode.

  This commit adds two new options to services: TimeoutStartFailureMode= and TimeoutStopFailureMode=. Both take the same values and tweak the behavior of systemd when a start/stop timeout expires:

  * 'terminate': is the default behaviour as it has always been,
  * 'abort': triggers the watchdog machinery and will send SIGABRT (unless WatchdogSignal was changed) and
  * 'kill' will directly send SIGKILL.

  To handle the stop failure mode in stop-post state too, a new final-watchdog state needs to be introduced.
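The three values listed above can be read as a small mapping from mode to signal. The sketch below uses hypothetical `sk_` names and assumes that 'terminate' means SIGTERM; the message only says it is the pre-existing default behaviour, so treat that choice as an assumption.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

typedef enum {
        SK_TIMEOUT_TERMINATE,
        SK_TIMEOUT_ABORT,
        SK_TIMEOUT_KILL,
        _SK_TIMEOUT_MODE_MAX,
        _SK_TIMEOUT_MODE_INVALID = -1,
} SkTimeoutFailureMode;

/* Parse the setting value; returns _SK_TIMEOUT_MODE_INVALID for unknown strings. */
static SkTimeoutFailureMode sk_timeout_failure_mode_from_string(const char *s) {
        static const char *const names[_SK_TIMEOUT_MODE_MAX] = {
                [SK_TIMEOUT_TERMINATE] = "terminate",
                [SK_TIMEOUT_ABORT]     = "abort",
                [SK_TIMEOUT_KILL]      = "kill",
        };

        assert(s);
        for (int i = 0; i < _SK_TIMEOUT_MODE_MAX; i++)
                if (strcmp(names[i], s) == 0)
                        return (SkTimeoutFailureMode) i;
        return _SK_TIMEOUT_MODE_INVALID;
}

/* Pick the signal to send when the timeout expires. watchdog_signal stands in
 * for the configured watchdog signal, which defaults to SIGABRT. */
static int sk_timeout_signal(SkTimeoutFailureMode mode, int watchdog_signal) {
        assert((int) mode >= 0 && mode < _SK_TIMEOUT_MODE_MAX);

        if (mode == SK_TIMEOUT_ABORT)
                return watchdog_signal;
        if (mode == SK_TIMEOUT_KILL)
                return SIGKILL;
        return SIGTERM; /* assumption: 'terminate' uses SIGTERM */
}
```

Passing the watchdog signal in as a parameter reflects the remark that 'abort' sends SIGABRT only if the watchdog signal was not changed.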
* core: introduce support for cgroup freezer | Michal Sekletár | 2020-04-30 | 1 | -0/+9
  With cgroup v2 the cgroup freezer is implemented as a cgroup attribute called cgroup.freeze. A cgroup can be frozen by writing "1" to the file and the kernel will send us a notification through "cgroup.events" after the operation is finished and processes in the cgroup entered quiescent state, i.e. they are not scheduled to run. Writing "0" to the attribute file does the inverse and process execution is resumed.

  This commit exposes the above low-level functionality through systemd's DBus API. Each unit type must provide specialized implementation for these methods, otherwise, we return an error. So far only service, scope, and slice unit types provide the support. It is possible to check if a given unit has the support using CanFreeze() DBus property.

  Note that DBus API has a synchronous behavior and we dispatch the reply to freeze/thaw requests only after the kernel has notified us that requested operation was completed.
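The low-level step described above, writing "1" or "0" to the `cgroup.freeze` attribute, can be sketched as a plain file write. `sk_cgroup_set_frozen()` is a hypothetical helper; it takes the cgroup directory as a parameter and does not wait for the `cgroup.events` notification, which a real caller would need to do before treating the operation as complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Write "1" (freeze) or "0" (thaw) to <cgroup_dir>/cgroup.freeze.
 * Returns 0 on success and -1 on any error. */
static int sk_cgroup_set_frozen(const char *cgroup_dir, bool freeze) {
        char path[4096];
        FILE *f;
        int n;

        assert(cgroup_dir);

        n = snprintf(path, sizeof(path), "%s/cgroup.freeze", cgroup_dir);
        if (n < 0 || (size_t) n >= sizeof(path))
                return -1;

        f = fopen(path, "w");
        if (!f)
                return -1;

        if (fputs(freeze ? "1" : "0", f) == EOF) {
                fclose(f);
                return -1;
        }

        /* fclose() flushes the buffer, so a failed write surfaces here. */
        return fclose(f) == 0 ? 0 : -1;
}
```

On a real system the directory would be a cgroup v2 path and the call would normally require appropriate privileges or delegation.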
* core/swap: support "systemctl clean" for swap units | Yu Watanabe | 2019-08-28 | 1 | -1/+2
* core/mount: support "systemctl clean" for mount units | Yu Watanabe | 2019-08-28 | 1 | -1/+2
* core/socket: support "systemctl clean" for socket units | Yu Watanabe | 2019-08-28 | 1 | -1/+2
* core: ExecCondition= for services | Anita Zhang | 2019-07-17 | 1 | -0/+1
  Closes #10596
* tree-wide: get rid of strappend() | Lennart Poettering | 2019-07-12 | 1 | -1/+1
  It's a special case of strjoin(), so no need to keep both. In particular as typing strjoin() is even shorter than strappend().
* core: hook up service unit type with the new clean operation | Lennart Poettering | 2019-07-11 | 1 | -0/+1
  The implementation is pretty straightforward: when we get a request to clean some type of resources we fork off a process doing that, and while it is running we are in the "cleaning" state.
* core: add generic "clean" operation to units | Lennart Poettering | 2019-07-11 | 1 | -1/+2
  This adds basic infrastructure to implement a "clean" operation for unit types. This "clean" operation is supposed to remove on-disk resources of units, and is supposed to be used in a later commit to clean our RuntimeDirectory=, StateDirectory= and so on of service units.

  Later commits will open this up to the bus, and hook up service units with this.

  This also adds a new generic ActiveState called UNIT_MAINTENANCE. It's supposed to cover all kinds of "maintenance" state of units. Specifically, this is supposed to cover the "cleaning" operations later added for service units which might take a bit of time. This high-level, generic, abstract state is called UNIT_MAINTENANCE instead of the more specific "UNIT_CLEANING", since I think this should be kept open for different operations possibly later on that could be nicely subsumed under this (for example, maybe a recursive chown()ing operation could be covered by this, and similar).
* Make Watchdog Signal Configurable | Anita Zhang | 2018-09-26 | 1 | -1/+1
  Allows configuring the watchdog signal (with a default of SIGABRT). This allows an alternative to SIGABRT when coredumps are not desirable.

  Appropriate references to SIGABRT or aborting were renamed to reflect more liberal watchdog signals.

  Closes #8658
* tree-wide: remove Lennart's copyright lines | Lennart Poettering | 2018-06-14 | 1 | -3/+0
  These lines are generally out-of-date, incomplete and unnecessary. With SPDX and the git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.
* tree-wide: drop 'This file is part of systemd' blurb | Lennart Poettering | 2018-06-14 | 1 | -2/+0
  This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html

  The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that.

  Hence, let's just get rid of this old cruft, and shorten our codebase a bit.
* core: introduce a new load state "bad-setting" | Lennart Poettering | 2018-06-11 | 1 | -0/+1
  Since bb28e68477a3a39796e4999a6cbc6ac6345a9159 parsing failures of certain unit file settings will result in load failures of units. This introduces a new load state "bad-setting" that is entered in precisely this case.

  With this addition error messages on bad settings should be a lot more explicit, as we don't have to show some generic "errno" error in that case, but can explicitly say that a bad setting is at fault.

  Internally this unit load state is entered as soon as any configuration loader call returns ENOEXEC. Hence: config parser calls should return ENOEXEC now for such essential unit file settings. Turns out, they generally already do.

  Fixes: #9107
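The rule that ENOEXEC from a configuration loader selects the new load state can be sketched as a small classifier. The `Sk*` names are hypothetical, and the negative-errno return convention is an assumption of this sketch rather than something stated in the message.

```c
#include <assert.h>
#include <errno.h>

typedef enum {
        SK_LOAD_LOADED,
        SK_LOAD_ERROR,
        SK_LOAD_BAD_SETTING,
} SkLoadState;

/* Classify the result of a loader call that returns 0 or more on success and
 * a negative errno value on failure. */
static SkLoadState sk_load_state_from_result(int r) {
        if (r >= 0)
                return SK_LOAD_LOADED;
        if (r == -ENOEXEC)
                return SK_LOAD_BAD_SETTING; /* a setting failed to parse */
        return SK_LOAD_ERROR;               /* any other failure stays generic */
}
```

Reserving one error code for parse failures is what allows the error message to name a bad setting instead of reporting a generic errno.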
* tree-wide: drop license boilerplate | Zbigniew Jędrzejewski-Szmek | 2018-04-06 | 1 | -13/+0
  Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt.

  I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.
* Add SPDX license identifiers to source files under the LGPL | Zbigniew Jędrzejewski-Szmek | 2017-11-19 | 1 | -0/+1
  This follows what the kernel is doing, cf. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
* basic: split unit-name.[ch] into two (#7065) | Lennart Poettering | 2017-10-11 | 1 | -0/+289
  It always bothered me a bit that unit-name.[ch] contains so many definitions that don't really have much to do with unit naming, for example all the unit state definitions.

  With this patch unit-name.[ch] is split into two: the file now contains only the unit naming related operations, and everything else is split out into a new set of files unit-def.[ch]. That's mostly unit state stuff as well as dbus path and interface name operations.

  No functional changes. This just moves code around.

  (Note as both .c files include each other's headers this doesn't make the build simpler or anything. All it does is make the C files a bit shorter, and medicate my pretend OCD)