| Commit message | Author | Age | Files | Lines |
| |
| |
Running systemd with IP accounting enabled generates many bpf maps (two
per unit for accounting, another two if IPAddressAllow/Deny are used).
Systemd itself knows which maps belong to what unit and commands like
`systemctl status <unit>` can be used to query what service has which
map, but monitoring these values all the time costs 4 dbus requests
(calling the .IP{E,I}gress{Bytes,Packets} method for each unit) and
makes services like the Prometheus systemd_exporter[1] somewhat slow
when doing that for every unit, whereas less precise information could
be obtained quickly by looking directly at the maps.
Unfortunately, bpf map names are rather limited:
- only 15 characters in length (16, but the last byte must be 0)
- only characters accepted by isalnum(), plus _ and ., are allowed
If it wasn't for the length limit we could use the normal unit escape
functions, but I've opted to just turn any forbidden character into an
underscore for maximum brevity -- the map prefix is also rather short:
This isn't meant as a precise mapping, but as a hint for admins who want
to look at these.
(Note there is no problem if multiple maps have the same name)
Link: https://github.com/povilasv/systemd_exporter [1]
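The sanitizing rule described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: sketch_map_name(),
MAP_NAME_LEN and the "p_" prefix used in the usage note are hypothetical
names, not the ones used in systemd, and the prefix is assumed to contain
only allowed characters already.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumption: same value as the kernel's BPF_OBJ_NAME_LEN (16, including the trailing NUL). */
#define MAP_NAME_LEN 16

/* Hypothetical helper: writes "<prefix><unit>" into dst, turning every
 * character of the unit name that is not isalnum(), '_' or '.' into '_',
 * and truncating the result to 15 characters plus the trailing NUL. */
static void sketch_map_name(char dst[MAP_NAME_LEN], const char *prefix, const char *unit) {
        size_t n = 0;

        assert(dst && prefix && unit);

        /* The prefix is copied as-is; it is assumed to be valid already. */
        for (const char *s = prefix; *s && n < MAP_NAME_LEN - 1; s++)
                dst[n++] = *s;

        /* The unit name is sanitized character by character. */
        for (const char *s = unit; *s && n < MAP_NAME_LEN - 1; s++) {
                unsigned char c = (unsigned char) *s;
                dst[n++] = (isalnum(c) || c == '_' || c == '.') ? (char) c : '_';
        }

        dst[n] = '\0';
}
```

With a placeholder prefix "p_", the unit "foo-bar@1.service" would become
"p_foo_bar_1.ser": lossy and possibly ambiguous, which matches the stated
goal of a hint rather than a precise mapping.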
|
| |
bpf-firewall and bpf-devices do not have names. This complicates
debugging with bpftool(8).
Assign names starting with 'sd_' prefix:
* firewall program names are 'sd_fw_ingress' for ingress attach
point and 'sd_fw_egress' for egress.
* 'sd_devices' for devices prog
'sd_' prefix is already used in source-compiled programs, e.g.
sd_restrictif_i, sd_restrictif_e, sd_bind6.
The name must not be longer than 15 characters, i.e. BPF_OBJ_NAME_LEN - 1.
Assign names only to programs loaded to kernel by systemd since
programs pinned to bpffs are already loaded.
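A minimal sketch of the naming scheme, checking the names against the
length limit. The enum, the table and sketch_prog_name() are hypothetical
illustrations and not systemd's actual code; only the three name strings
and the limit of 16 bytes come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: same value as BPF_OBJ_NAME_LEN in linux/bpf.h. */
#define SKETCH_BPF_OBJ_NAME_LEN 16

/* Hypothetical stand-in for the programs systemd loads itself. */
typedef enum {
        SKETCH_PROG_FW_INGRESS,
        SKETCH_PROG_FW_EGRESS,
        SKETCH_PROG_DEVICES,
        SKETCH_PROG_MAX,
} SketchProgKind;

static const char *const sketch_prog_names[SKETCH_PROG_MAX] = {
        [SKETCH_PROG_FW_INGRESS] = "sd_fw_ingress",
        [SKETCH_PROG_FW_EGRESS]  = "sd_fw_egress",
        [SKETCH_PROG_DEVICES]    = "sd_devices",
};

/* Returns the name for a program kind, asserting that it fits into
 * BPF_OBJ_NAME_LEN - 1 characters plus the trailing NUL. */
static const char *sketch_prog_name(SketchProgKind k) {
        assert((size_t) k < SKETCH_PROG_MAX);
        assert(strlen(sketch_prog_names[k]) < SKETCH_BPF_OBJ_NAME_LEN);
        return sketch_prog_names[k];
}
```

The longest of the three names, 'sd_fw_ingress', is 13 characters long, so
all of them fit within the limit.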
|
| |
Currently the ref count of bpf-program is kept in user space. However, the
kernel already implements its own ref count, so the one we keep for
bpf-program is redundant.
This PR removes ref count for bpf program as part of a task to simplify
bpf-program and remove redundancies, which will make the switch to
code-compiled BPF programs easier.
Part of #19270
|
| |
| |
Alternative to #17495
|
| |
| |
We recently started making more use of malloc_usable_size() and rely on
it (see the string_erase() story). Given that we don't really support
systems where malloc_usable_size() cannot be trusted beyond statistics
anyway, let's go fully in and rework GREEDY_REALLOC() on top of it:
instead of passing around and maintaining the currently allocated size
everywhere, let's just derive it automatically from
malloc_usable_size().
I am mostly after this for the simplicity this brings. It also brings
minor efficiency improvements I guess, but things become so much nicer
to look at if we can avoid these allocation size variables everywhere.
Note that the malloc_usable_size() man page says relying on it wasn't
"good programming practice", but I think it does this for reasons that
don't apply here: the greedy realloc logic specifically doesn't rely on
the returned extra size, beyond the fact that it is equal or larger than
what was requested.
(This commit was supposed to be a quick patch btw, but apparently we use
the greedy realloc stuff quite a bit across the codebase, so this ends
up touching *a*lot* of code.)
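The idea can be sketched as below. This is a simplified illustration, not
the actual GREEDY_REALLOC() implementation: sketch_greedy_realloc() is a
hypothetical name, the doubling policy and the 64 byte minimum are
assumptions, and malloc_usable_size() from <malloc.h> is a glibc
extension.

```c
#include <assert.h>
#include <malloc.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: makes sure *p has room for at least `need` items
 * of `size` bytes each. Returns the (possibly moved) block, or NULL on
 * failure, in which case *p is left untouched. No separate "allocated"
 * counter is needed, the allocator is asked instead. */
static void *sketch_greedy_realloc(void **p, size_t need, size_t size) {
        size_t bytes, newalloc;
        void *q;

        assert(p);
        assert(size > 0);

        if (need > SIZE_MAX / size)
                return NULL; /* multiplication would overflow */
        bytes = need * size;

        /* Only relies on the usable size being at least what was requested. */
        if (*p && malloc_usable_size(*p) >= bytes)
                return *p;

        /* Grow greedily (assumed policy): twice the needed size, at least 64 bytes. */
        newalloc = bytes > SIZE_MAX / 2 ? bytes : bytes * 2;
        if (newalloc < 64)
                newalloc = 64;

        q = realloc(*p, newalloc);
        if (!q)
                return NULL;

        return *p = q;
}
```

A caller would keep only the pointer and the number of used items, e.g.
call sketch_greedy_realloc((void**) &a, n_used + 1, sizeof(a[0])) before
appending an element.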
|
| |
Introduce bpf_cgroup_attach_type_table with the customary attach type
names that are also used in bpftool.
Add bpf_cgroup_attach_type_{from|to}_string helpers to convert an attach
type from|to the name used in the string representation of a pinned bpf
program, e.g. "egress:/sys/fs/bpf/egress-hook" for the
/sys/fs/bpf/egress-hook path and the BPF_CGROUP_INET_EGRESS attach type.
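A rough sketch of such a table and of parsing the "<type>:<path>" form.
All identifiers here are hypothetical stand-ins: the local enum replaces
the kernel's enum bpf_attach_type so the example needs no kernel headers,
the table lists only three names as an assumed subset, and
sketch_parse_pinned_prog() is not a systemd function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a few values of enum bpf_attach_type. */
typedef enum {
        SKETCH_ATTACH_INGRESS,
        SKETCH_ATTACH_EGRESS,
        SKETCH_ATTACH_DEVICE,
        SKETCH_ATTACH_MAX,
        SKETCH_ATTACH_INVALID = -1,
} SketchAttachType;

/* Assumed subset of the attach type names. */
static const char *const sketch_attach_type_table[SKETCH_ATTACH_MAX] = {
        [SKETCH_ATTACH_INGRESS] = "ingress",
        [SKETCH_ATTACH_EGRESS]  = "egress",
        [SKETCH_ATTACH_DEVICE]  = "device",
};

static const char *sketch_attach_type_to_string(SketchAttachType t) {
        if (t < 0 || t >= SKETCH_ATTACH_MAX)
                return NULL;
        return sketch_attach_type_table[t];
}

/* Looks up the first n bytes of s in the table. */
static SketchAttachType sketch_attach_type_from_string_n(const char *s, size_t n) {
        for (int i = 0; i < SKETCH_ATTACH_MAX; i++)
                if (strlen(sketch_attach_type_table[i]) == n &&
                    memcmp(sketch_attach_type_table[i], s, n) == 0)
                        return (SketchAttachType) i;
        return SKETCH_ATTACH_INVALID;
}

/* Splits "<type>:<path>". Returns 0 on success, -1 if malformed.
 * *ret_path points into s, nothing is copied. */
static int sketch_parse_pinned_prog(const char *s, SketchAttachType *ret_type, const char **ret_path) {
        const char *colon;
        SketchAttachType t;

        assert(s && ret_type && ret_path);

        colon = strchr(s, ':');
        if (!colon || colon[1] == '\0')
                return -1;

        t = sketch_attach_type_from_string_n(s, (size_t) (colon - s));
        if (t == SKETCH_ATTACH_INVALID)
                return -1;

        *ret_type = t;
        *ret_path = colon + 1;
        return 0;
}
```

Parsing "egress:/sys/fs/bpf/egress-hook" this way would yield the egress
type and the path "/sys/fs/bpf/egress-hook".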
|
| |
Add helpers to:
- Create new BPFProgram instance from a path in bpf
filesystem and bpf attach type;
- Pin a program to bpf fs;
- Get BPF program ID by BPF program FD.
|
| |
| |
Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be
specified multiple times. An empty assignment resets all previous filters.
Closes https://github.com/systemd/systemd/issues/10227
|
|
This doesn't have much effect on the final build, because we link libbasic.a
into libsystemd-shared.so, so in the end, all the objects built from basic/
end up in libsystemd-shared. And when the static library is linked into binaries,
any objects that are included in it but are not used are trimmed. Hence, the
size of output artifacts doesn't change:
$ du -sb /var/tmp/inst*
54181861 /var/tmp/inst1 (old)
54207441 /var/tmp/inst1s (old split-usr)
54182477 /var/tmp/inst2 (new)
54208041 /var/tmp/inst2s (new split-usr)
(The negligible change in size is because libsystemd-shared.so is bigger
by a few hundred bytes. I guess it's because symbols are named differently
or something like that.)
The effect is on the build process, in particular partial builds. This change
effectively moves the requirements on some build steps toward the leaves of the
dependency tree. Two effects:
- when building items that do not depend on libsystemd-shared, we
build less stuff for libbasic.a (which wouldn't be used anyway,
so it's a net win).
- when building items that do depend on libshared, we reduce libbasic.a as a
synchronization point, possibly allowing better parallelism.
Method:
1. copy list of .h files from src/basic/meson.build to /tmp/basic
2. $ for i in $(grep '\.h$' /tmp/basic); do echo $i; git --no-pager grep "include \"$i\"" src/basic/ 'src/lib*' 'src/nss-*' 'src/journal/sd-journal.c' | grep -v "${i%.h}.c"; echo; done | less
|