path: root/src/oom
Commit message (Author, Age, Files, Lines)
* meson: move files' closing brace to separate line (Zbigniew Jędrzejewski-Szmek, 2022-03-03, 1 file, -1/+2)
|
* meson: do not use split() in file lists (Zbigniew Jędrzejewski-Szmek, 2022-03-02, 1 file, -12/+9)
|   The approach to use '''…'''.split() instead of a list of strings was initially used when converting from automake because it allowed identical blocks of lines to be used for both, making the conversion easier. But over the years we have been using normal lists more and more, especially when there were just a few filenames listed. This converts the rest.
|
|   No functional change.
* Merge pull request #22596 from yuwata/test-fix-fd-leaks (Yu Watanabe, 2022-02-22, 1 file, -1/+4)
|\
| |   test: fix fd leaks
| * test: fix file descriptor leak in test-oomd-util (Yu Watanabe, 2022-02-22, 1 file, -1/+4)
| |   Fixes an issue reported in #22576.
* | test-oomd-util: fix conditional jump on uninitialised value (Yu Watanabe, 2022-02-22, 1 file, -1/+1)
| |   Fixes #22577.
* | test-oomd-util: style fixlets (Yu Watanabe, 2022-02-22, 1 file, -4/+3)
|/
* oom: Cleanup of information dump code after kill (Benjamin Berg, 2022-02-07, 1 file, -3/+1)
|   This is a follow up to 29f4185a9cdc ("oomd: Dump top offenders after a kill action") to clean up the code a bit for review comments that happened after the code had been merged already.
* oomd: Dump top offenders after a kill action (Benjamin Berg, 2022-02-04, 2 files, -2/+41)
|   This hopefully makes it more transparent why a specific cgroup was killed by systemd-oomd.
* oomd: handle situations when no cgroups are killed (Anita Zhang, 2022-01-20, 2 files, -9/+12)
|   Currently if systemd-oomd doesn't kill anything in a selected cgroup, it selects a new candidate immediately. But if a selected cgroup wasn't killed, it is likely due to it disappearing or getting cleaned up between the time it was selected as a candidate and getting sent SIGKILL(s). We should handle it as though systemd-oomd did perform a kill so that it will check swap/pressure again before it tries to select a new candidate.
* oomd: fix race with path unavailability when killing cgroups (Anita Zhang, 2022-01-20, 1 file, -1/+8)
|   There can be a situation where systemd-oomd would kill all of the processes in a cgroup, pid1 would clean up that cgroup, and systemd-oomd would get ENODEV trying to iterate the cgroup a final time to ensure it was empty. systemd-oomd sees this as an error and immediately picks a new candidate even though pressure may have recovered. To counter this, check and handle path unavailability errnos specially.
|
|   Fixes: #22030
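The race described in the two commits above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function names, the result enum, and the exact set of errnos treated as "path unavailable" (ENOENT and ENODEV here) are assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: true if a negative errno-style return value means the
 * cgroup path vanished underneath us, for example because pid1 already
 * cleaned up the emptied cgroup. The errno set is an assumption. */
bool path_is_gone(int r) {
        return r == -ENOENT || r == -ENODEV;
}

/* Hypothetical result codes for one kill attempt. */
enum kill_result { KILL_FAILED = -1, KILL_DONE = 1 };

/* Classify the return value of the final "is the cgroup empty?" pass.
 * A vanished path is counted as a completed kill, so the caller goes back
 * to checking swap/pressure instead of picking a new candidate at once. */
enum kill_result classify_final_check(int r) {
        if (r >= 0 || path_is_gone(r))
                return KILL_DONE;
        return KILL_FAILED;
}
```

With this shape, only errors unrelated to the path disappearing (such as EIO) are reported as failures.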
* meson: Use files() for tests (Jan Janssen, 2022-01-11, 1 file, -3/+3)
|   Not having to provide the full path in the source tree is much nicer and the produced lists can also be used anywhere in the source tree.
* oomd: use type suffix instead of casting (Zbigniew Jędrzejewski-Szmek, 2021-11-30, 1 file, -4/+1)
|   The end result is the same.
* Make pager_open() return void (Zbigniew Jędrzejewski-Szmek, 2021-11-03, 1 file, -2/+2)
|
* test: use assert_se() instead of assert() (Yu Watanabe, 2021-10-12, 1 file, -1/+1)
|
* parse-util: prefix load average macros with LOAD_AVG_ (Luca Boccassi, 2021-09-27, 2 files, -9/+9)
|   Follow-up for #20839
* basic: delete loadavg.h copy (Luca Boccassi, 2021-09-25, 2 files, -9/+9)
|   loadavg.h is an internal header of the Linux source repository, and as such it is licensed as GPLv2-only, without syscall exception. We use it only for 4 macros, which are simply doing some math calculations that cannot thus be subject to copyright. Reimplement the same calculations in another internal header and delete loadavg.h from our tree.
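The "math calculations" mentioned above are fixed-point helpers for load-average style values. A minimal sketch, assuming the 11-bit fractional layout used by the kernel helpers; the `EX_` macro names are illustrative and are not claimed to match the names in the systemd tree:

```c
/* Illustrative fixed-point helpers for load-average style values.
 * Assumption: 11 fractional bits, so 1.0 is represented as 2048. */
#define EX_LOAD_PRECISION_BITS 11
#define EX_LOAD_FIXED_1 (1UL << EX_LOAD_PRECISION_BITS)

/* Integer part of a fixed-point value. */
#define EX_LOAD_INT(x) ((x) >> EX_LOAD_PRECISION_BITS)

/* Two-digit decimal part: scale the fractional bits by 100, then shift. */
#define EX_LOAD_FRAC(x) EX_LOAD_INT(((x) & (EX_LOAD_FIXED_1 - 1)) * 100)
```

For example, the value 3072 (1.5 in this representation) splits into an integer part of 1 and a decimal part of 50.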
* Merge pull request #20690 from DaanDeMeyer/oomd-user-services (Luca Boccassi, 2021-09-21, 3 files, -64/+167)
|\
| |   oom: Support for user services
| * oom: Add support for user unit ManagedOOM property updates (Daan De Meyer, 2021-09-20, 3 files, -13/+110)
| |   Compared to PID1 where systemd-oomd has to be the client to PID1 because PID1 is a more privileged process than systemd-oomd, systemd-oomd is the more privileged process compared to a user manager so we have user managers be the client whereas systemd-oomd is now the server.
| |
| |   The same varlink protocol is used between user managers and systemd-oomd to deliver ManagedOOM property updates. systemd-oomd now sets up a varlink server that user managers connect to to send ManagedOOM property updates.
| |
| |   We also add extra validation to make sure that non-root senders don't send updates for cgroups they don't own.
| |
| |   The integration test was extended to repeat the chill/bloat test using a user manager instead of PID1.
| * oom: Introduce process_managed_oom_message() (Daan De Meyer, 2021-09-16, 1 file, -52/+56)
| |   Gets rid of a few gotos, allows removing the extra ret variable and will also be used in a future commit by the codepath that receives cgroups from user instances of systemd.
| * oom: Add missing sd-bus.h include (Daan De Meyer, 2021-09-16, 1 file, -1/+3)
| |
* | tree-wide: mark set-but-not-used variables as unused to make LLVM happy (Frantisek Sumsal, 2021-09-15, 2 files, -2/+2)
|/
|   LLVM 13 introduced `-Wunused-but-set-variable` diagnostic flag, which trips over some intentionally set-but-not-used variables or variables attached to cleanup handlers with side effects (`_cleanup_umask_`, `_cleanup_(notify_on_cleanup)`, `_cleanup_(restore_sigsetp)`, etc.):
|
|   ```
|   ../src/basic/process-util.c:1257:46: error: variable 'saved_ssp' set but not used [-Werror,-Wunused-but-set-variable]
|           _cleanup_(restore_sigsetp) sigset_t *saved_ssp = NULL;
|                                                ^
|   1 error generated.
|   ```
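The pattern behind this warning can be sketched with the GCC/Clang `cleanup` and `unused` variable attributes. This is an illustration only: `note_cleanup`, `cleanup_calls` and `run_with_guard` are hypothetical names, and the sketch does not reproduce the macros used in the tree.

```c
/* Counter bumped by the cleanup handler so its side effect is observable. */
int cleanup_calls = 0;

/* Cleanup handler: called with a pointer to the variable going out of scope. */
void note_cleanup(int *p) {
        (void) p;
        cleanup_calls++;
}

/* The guard variable exists only for its cleanup side effect. The unused
 * attribute tells the compiler that never reading it is intentional. */
int run_with_guard(void) {
        __attribute__((cleanup(note_cleanup), unused)) int guard = 0;
        return 42;
}
```

Each call to `run_with_guard()` runs the handler exactly once when `guard` leaves scope, even though the function body never reads the variable.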
* test-oomd-util: skip tests if cgroup memory controller is not available (Yu Watanabe, 2021-09-12, 1 file, -0/+6)
|   Fixes #20593 and #20655.
* oomd: refuse to start if cgroup memory controller is not available (Yu Watanabe, 2021-09-12, 1 file, -0/+8)
|
* Drop the text argument from assert_not_reached() (Zbigniew Jędrzejewski-Szmek, 2021-08-03, 2 files, -2/+2)
|   In general we almost never hit those asserts in production code, so users see them very rarely, if ever. But either way, we just need something that users can pass to the developers.
|
|   We have quite a few of those asserts, and some have fairly nice messages, but many are like "WTF?" or "???" or "unexpected something". The error that is printed includes the file location, and function name. In almost all functions there's at most one assert, so the function name alone is enough to identify the failure for a developer. So we don't get much extra from the message, and we might just as well drop them.
|
|   Dropping them makes our code a tiny bit smaller, and most importantly, improves development experience by making it easy to insert such an assert in the code without thinking how to phrase the argument.
* Replace format_bytes_cgroup_protection with FORMAT_BYTES_CGROUP_PROTECTION (Zbigniew Jędrzejewski-Szmek, 2021-07-09, 1 file, -4/+2)
|
* tree-wide: add FORMAT_BYTES() (Zbigniew Jędrzejewski-Szmek, 2021-07-09, 1 file, -12/+7)
|
* tree-wide: add FORMAT_TIMESPAN() (Zbigniew Jędrzejewski-Szmek, 2021-07-09, 2 files, -12/+5)
|
* oomd: don't collect candidate stats on every interval (Anita Zhang, 2021-07-07, 1 file, -7/+0)
|   cb13961ada52c1b27f6d6c2c6e37a2901f01ed30 updated the oomd logic to collect candidate data when a kill was about to happen. However there was still a call left over in the main loop to collect candidate data on every interval. Remove this since it's unneeded.
|
|   Fixes #20122
* oomd: review follow ups to #20020 (Anita Zhang, 2021-07-02, 1 file, -7/+15)
|
* Merge pull request #20020 from anitazha/oomd_with_mem (Zbigniew Jędrzejewski-Szmek, 2021-06-30, 4 files, -58/+120)
|\
| |   oomd: check that memory use also exceeds threshold before doing a swap kill
| * oomd: check mem free and swap free before doing a swap-based kill (Anita Zhang, 2021-06-30, 1 file, -4/+11)
| |   https://bugzilla.redhat.com/show_bug.cgi?id=1974763
| * oomd: get memory total and free as part of system context (Anita Zhang, 2021-06-30, 3 files, -14/+54)
| |
| * oomd: switch system context parsing to use /proc/meminfo (Anita Zhang, 2021-06-30, 3 files, -49/+64)
| |   Makes it easier in the next commits to unify on one way to read swap and memory info.
* | basic: move acquire_data_fd() and fd_duplicate_data_fd() to new data-fd-util.c (Zbigniew Jędrzejewski-Szmek, 2021-06-24, 1 file, -0/+1)
|/
|   fd_duplicate_data_fd() is renamed to copy_data_fd(). This makes the two functions have nicely similar names.
|
|   Now fd-util.[ch] is again about low-level file descriptor manipulations. copy_data_fd() is a complex function that internally wraps the other functions in copy.c. I want to move copy.c and the whole cluster of related code from basic/ to shared/ later on, and this is a preparatory step for that.
* oom: log one-time warning if kernel doesn't provide memory.swap.current (Dan Streetman, 2021-05-20, 1 file, -1/+5)
|   The kernel can be compiled without support for any memory.swap.* files, or it can be disabled at boot time with the 'swapaccount=0' boot parameter, so if the file doesn't exist log warning indicating the kernel doesn't support the file and the user may need to try using the 'swapaccount=1' boot param.
|
|   Note that the actual error from the call to fopen() is ENOENT, but that is translated into ENODATA in cg_get_attribute_as_uint64()
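A one-time warning of the kind described above can be sketched like this. The wrapper name, the counter, the message text, and the choice to treat the missing file as non-fatal are all assumptions for illustration; only the ENODATA translation comes from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually emitted, kept only so the behaviour is observable. */
int warnings_logged = 0;

/* Hypothetical wrapper: r is the negative-errno result of reading
 * memory.swap.current for a cgroup. -ENODATA is taken to mean that the
 * kernel does not expose the file. Warn once, then keep going. */
int handle_swap_read_result(int r) {
        static bool warned = false;

        if (r == -ENODATA) {
                if (!warned) {
                        fprintf(stderr, "memory.swap.current is not available; "
                                        "the kernel may need swapaccount=1.\n");
                        warned = true;
                        warnings_logged++;
                }
                return 0; /* assumption: missing swap accounting is not fatal */
        }

        return r; /* success and all other errors pass through unchanged */
}
```

Repeated calls with `-ENODATA` return 0 each time but print the message only on the first call.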
* fix: point to the correct drop-ins subdirectory for confs (Jóhann B. Guðmundsson, 2021-04-22, 1 file, -1/+1)
|
* Merge pull request #19126 from anitazha/oomdimprovements (Zbigniew Jędrzejewski-Szmek, 2021-04-06, 6 files, -219/+274)
|\
| |   systemd-oomd post-test week improvements
| * oomd: threshold swap kill candidates to usages of more than 5% (Anita Zhang, 2021-04-05, 4 files, -8/+13)
| |   In some instances, particularly with swap on zram, swap used will be high while there is still a lot of memory available. FB OOMD handles this by thresholding kills to X% of total swap usage. Let's do the same thing here.
| |
| |   Anecdotally with these thresholds and my laptop which is exclusively swap on zram I can sit at 0K / 4G swap free with most of memory free and systemd-oomd doesn't kill anything.
| |
| |   Partially addresses aggressive kill behavior from https://bugzilla.redhat.com/show_bug.cgi?id=1941170
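The thresholding described above can be sketched as a simple eligibility check. This is a hedged illustration: the function and macro names are hypothetical, and it assumes the 5% is measured against the total amount of swap currently in use, with both values in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constant taken from the commit subject ("more than 5%"). */
#define EX_SWAP_CANDIDATE_PERCENT 5

/* Hypothetical check: a cgroup qualifies as a swap kill candidate only if
 * its own swap usage is strictly greater than 5% of all swap in use. */
bool swap_candidate_eligible(uint64_t cgroup_swap_usage, uint64_t total_swap_used) {
        uint64_t threshold = total_swap_used * EX_SWAP_CANDIDATE_PERCENT / 100;

        return cgroup_swap_usage > threshold;
}
```

With 4 GiB of swap in use, the integer threshold is 214748364 bytes, so a cgroup needs at least 214748365 bytes in swap to be considered.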
| * oomd: don't get pressure candidates on every interval (Anita Zhang, 2021-04-05, 1 file, -5/+43)
| |   Only start collecting candidates for a memory pressure kill when we're hitting the limit (but before the duration hitting that limit is exceeded). This brings CPU util from ~1% to 0.3%.
| |
| |   Addresses CPU util from https://bugzilla.redhat.com/show_bug.cgi?id=1941340 and https://bugzilla.redhat.com/show_bug.cgi?id=1944646
| * oomd: force DefaultMemoryPressureDurationSec= to be greater than or equal 1 sec (Anita Zhang, 2021-04-02, 1 file, -0/+3)
| |
| * oomd: delete unused variables (Anita Zhang, 2021-04-02, 2 files, -4/+0)
| |
| * oomd: rename last_hit_mem_pressure_limit -> mem_pressure_limit_hit_start (Anita Zhang, 2021-04-02, 3 files, -18/+18)
| |   Since this is only changed the first time the limit is hit (and remains set as long as the pressure remains over), I changed the name to better reflect that. Keeps consistent with "last_had_mem_reclaim" which is actually updated every time there is reclaim activity.
| * oomd: rework memory reclaim detection logic (Anita Zhang, 2021-04-02, 5 files, -125/+60)
| |   systemd-oomd only monitors and kills within a selected cgroup subtree. For memory pressure kills, this means it's unnecessary to get the pgscan rate across all the monitored memory pressure cgroups. The increase will show up whether we do a total sum or not, but since we only care about the increase in the subtree we're about to target for a kill, we can simplify the code a bit by not doing this total sum.
| * oomd: refactor pgscan_rate calculation into helper (Anita Zhang, 2021-04-02, 2 files, -17/+24)
| |
| * oomd: split swap and mem pressure event timers (Anita Zhang, 2021-04-02, 2 files, -56/+127)
| |   One thing that came out of the test week is that systemd-oomd needs to poll more frequently so as not to race with the kernel oom killer in situations where memory is eaten quickly. Memory pressure counters are lagging so it isn't worthwhile to change the current read rate; however swap is not lagging and can be checked more frequently.
| |
| |   So let's split these into 2 different timer events. As a result, swap now also doesn't have to be subject to the post-action (post-kill) delay that we need for memory pressure events.
| |
| |   Addresses some of slowness to kill discussed in https://bugzilla.redhat.com/show_bug.cgi?id=1941340
* | test-oomd-util: fix running in mkosi (Anita Zhang, 2021-04-02, 1 file, -2/+9)
|/
|   When this test is run in mkosi, the previously tested cgroup that we write xattrs into and the root cgroup are the same. Since the root cgroup is a live cgroup anyways (vs. the test cgroups which are remade each time) let's generate the expected preference values from reading the xattrs instead of assuming it will be NONE.
* Merge pull request #19149 from anitazha/oomdlogging (Luca Boccassi, 2021-03-30, 3 files, -32/+85)
|\
| |   oomd: make it more clear when a kill happens
| * oomd: fix iteration over candidates to kill (Zbigniew Jędrzejewski-Szmek, 2021-03-30, 1 file, -10/+10)
| |
| * oomd: make it more clear when a kill happens (Anita Zhang, 2021-03-30, 3 files, -24/+77)
| |   Improve the logging to only print if systemd-oomd killed something. And also print which cgroup was targeted. Demote general swap above/pressure above messages to debug.
| |
| |   [zjs: fix some issuelets found in review]
* | config files: recommend systemd-analyze cat-config (Zbigniew Jędrzejewski-Szmek, 2021-03-26, 1 file, -0/+2)
|/
|   This adds the same line to most of our .conf files. Not for systemd/user.conf though, since we can't correctly display it right now:
|
|   $ systemd-analyze cat-config --user systemd/user.conf
|   Option --user is not supported for cat-config right now.
|
|   For sysusers.d, tmpfiles.d, rules.d, etc, there is no single file. Maybe we should add short READMEs in /usr/lib/sysusers.d, /usr/lib/tmpfiles.d, etc.?
|
|   Inspired by #19118.