author | David Vernet <void@manifault.com> | 2023-08-28 17:59:46 +0200 |
---|---|---|
committer | Daniel Borkmann <daniel@iogearbox.net> | 2023-08-30 16:35:44 +0200 |
commit | aee1720eeb87a3adc242eb07e5d4f7ba3eb8c736 (patch) | |
tree | c48a174f7c85b6b481d89468d4a3d4e5ec582c1e /Documentation/bpf/linux-notes.rst | |
parent | bpf, sockmap: Fix preempt_rt splat when using raw_spin_lock_t (diff) | |
bpf, docs: Move linux-notes.rst to root bpf docs tree
In commit 4d496be9ca05 ("bpf,docs: Create new standardization
subdirectory"), I added a standardization/ directory to the BPF
documentation, which will contain the docs that will be standardized
as part of the effort with the IETF.
I included linux-notes.rst in that directory, but I shouldn't have. It
doesn't contain anything that will be standardized. Let's move it back
to Documentation/bpf.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230828155948.123405-2-void@manifault.com
Diffstat (limited to 'Documentation/bpf/linux-notes.rst')
-rw-r--r-- | Documentation/bpf/linux-notes.rst | 84 |
1 files changed, 84 insertions, 0 deletions
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
new file mode 100644
index 000000000000..00d2693de025
--- /dev/null
+++ b/Documentation/bpf/linux-notes.rst
@@ -0,0 +1,84 @@
+.. contents::
+.. sectnum::
+
+==========================
+Linux implementation notes
+==========================
+
+This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
+
+Byte swap instructions
+======================
+
+``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively.
+
+Jump instructions
+=================
+
+``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function
+integer would be read from a specified register, is not currently supported
+by the verifier. Any programs with this instruction will fail to load
+until such support is added.
+
+Maps
+====
+
+Linux only supports the 'map_val(map)' operation on array maps with a single element.
+
+Linux uses an fd_array to store maps associated with a BPF program. Thus,
+map_by_idx(imm) uses the fd at that index in the array.
+
+Variables
+=========
+
+The following 64-bit immediate instruction specifies that a variable address,
+which corresponds to some integer stored in the 'imm' field, should be loaded:
+
+=========================  ======  ===  =========================================  ===========  ==============
+opcode construction        opcode  src  pseudocode                                 imm type     dst type
+=========================  ======  ===  =========================================  ===========  ==============
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
+=========================  ======  ===  =========================================  ===========  ==============
+
+On Linux, this integer is a BTF ID.
+
+Legacy BPF Packet access instructions
+=====================================
+
+As mentioned in the `ISA standard documentation
+<instruction-set.html#legacy-bpf-packet-access-instructions>`_,
+Linux has special eBPF instructions for access to packet data that have been
+carried over from classic BPF to retain the performance of legacy socket
+filters running in the eBPF interpreter.
+
+The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
+``BPF_IND | <size> | BPF_LD``.
+
+These instructions are used to access packet data and can only be used when
+the program context is a pointer to a networking packet. ``BPF_ABS``
+accesses packet data at an absolute offset specified by the immediate data
+and ``BPF_IND`` access packet data at an offset that includes the value of
+a register in addition to the immediate data.
+
+These instructions have seven implicit operands:
+
+* Register R6 is an implicit input that must contain a pointer to a
+  struct sk_buff.
+* Register R0 is an implicit output which contains the data fetched from
+  the packet.
+* Registers R1-R5 are scratch registers that are clobbered by the
+  instruction.
+
+These instructions have an implicit program exit condition as well. If an
+eBPF program attempts access data beyond the packet boundary, the
+program execution will be aborted.
+
+``BPF_ABS | BPF_W | BPF_LD`` (0x20) means::
+
+  R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + imm))
+
+where ``ntohl()`` converts a 32-bit value from network byte order to host byte order.
+
+``BPF_IND | BPF_W | BPF_LD`` (0x40) means::
+
+  R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + src + imm))
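
The semantics the moved notes give for ``BPF_ABS | BPF_W | BPF_LD`` and ``BPF_IND | BPF_W | BPF_LD`` can be sketched in user-space C. This is an illustration only and is not part of the patch. ``struct toy_packet`` and the ``toy_*`` functions are hypothetical names standing in for ``struct sk_buff`` and the kernel's load path. The sketch assumes a flat byte buffer, treats negative offsets as out of bounds, and reports an out-of-bounds access with a return value of -1 where the real instruction would abort the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode fields; the values agree with the opcodes quoted in the notes. */
enum {
    BPF_LD  = 0x00, /* instruction class: load */
    BPF_W   = 0x00, /* size: 32-bit word */
    BPF_ABS = 0x20, /* mode: absolute offset from imm */
    BPF_IND = 0x40, /* mode: offset from src register plus imm */
};

/* Hypothetical stand-in for the packet that R6 points to (not struct sk_buff). */
struct toy_packet {
    const uint8_t *data;
    size_t len;
};

/* Check that OR-ing the fields yields the opcodes stated in the notes. */
static void toy_check_opcodes(void)
{
    assert((BPF_ABS | BPF_W | BPF_LD) == 0x20);
    assert((BPF_IND | BPF_W | BPF_LD) == 0x40);
}

/*
 * Read a 32-bit value stored in network (big-endian) byte order at `off`
 * and return it in host order through `r0`, which is what ntohl() achieves.
 * Returns 0 on success and -1 when the read would cross the packet boundary.
 */
static int toy_load_word(const struct toy_packet *pkt, uint64_t off, uint32_t *r0)
{
    if (off > pkt->len || pkt->len - off < 4)
        return -1;

    const uint8_t *p = pkt->data + off;
    *r0 = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return 0;
}

/* BPF_ABS | BPF_W | BPF_LD: the offset is the immediate alone. */
static int toy_ld_abs_w(const struct toy_packet *pkt, int32_t imm, uint32_t *r0)
{
    if (imm < 0)
        return -1; /* simplification: this sketch rejects negative offsets */
    return toy_load_word(pkt, (uint64_t)imm, r0);
}

/* BPF_IND | BPF_W | BPF_LD: the offset is the src register plus the immediate. */
static int toy_ld_ind_w(const struct toy_packet *pkt, uint32_t src, int32_t imm,
                        uint32_t *r0)
{
    int64_t off = (int64_t)src + (int64_t)imm;

    if (off < 0)
        return -1; /* simplification: this sketch rejects negative offsets */
    return toy_load_word(pkt, (uint64_t)off, r0);
}
```

A caller would fill a ``struct toy_packet`` with a byte buffer and its length, then call ``toy_ld_abs_w()`` or ``toy_ld_ind_w()`` and check the return value before using the loaded word. The clobbering of R1-R5 is not modelled.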