| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add virtual-address based lskcipher interface
- Optimise ahash/shash performance in light of costly indirect calls
- Remove ahash alignmask attribute
Algorithms:
- Improve AES/XTS performance of 6-way unrolling for ppc
- Remove some uses of obsolete algorithms (md4, md5, sha1)
- Add FIPS 202 SHA-3 support in pkcs1pad
- Add fast path for single-page messages in adiantum
- Remove zlib-deflate
Drivers:
- Add support for S4 in meson RNG driver
- Add STM32MP13x support in stm32
- Add hwrng interface support in qcom-rng
- Add support for deflate algorithm in hisilicon/zip"
* tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits)
crypto: adiantum - flush destination page before unmapping
crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place
Documentation/module-signing.txt: bring up to date
module: enable automatic module signing with FIPS 202 SHA-3
crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures
crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support
crypto: FIPS 202 SHA-3 register in hash info for IMA
x509: Add OIDs for FIPS 202 SHA-3 hash and signatures
crypto: ahash - optimize performance when wrapping shash
crypto: ahash - check for shash type instead of not ahash type
crypto: hash - move "ahash wrapping shash" functions to ahash.c
crypto: talitos - stop using crypto_ahash::init
crypto: chelsio - stop using crypto_ahash::init
crypto: ahash - improve file comment
crypto: ahash - remove struct ahash_request_priv
crypto: ahash - remove crypto_ahash_alignmask
crypto: gcm - stop using alignmask of ahash
crypto: chacha20poly1305 - stop using alignmask of ahash
crypto: ccm - stop using alignmask of ahash
net: ipv6: stop checking crypto_ahash_alignmask
...
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add Kconfig options to use SHA-3 for kernel module signing. 256 size
for RSA only, and higher sizes for RSA and NIST P-384.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
sha224 does not provide enough security against collision attacks
relative to the default keys used for signing (RSA 4k & P-384). Also
sha224 never became popular, as sha256 got widely adopter ahead of
sha224 being introduced.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Removes support for sha1 signed kernel modules, importing sha1 signed
x.509 certificates.
rsa-pkcs1pad keeps sha1 padding support, which seems to be used by
virtio driver.
sha1 remains available as there are many drivers and subsystems using
it. Note only hmac(sha1) with secret keys remains cryptographically
secure.
In the kernel there are filesystems, IMA, tpm/pcr that appear to be
using sha1. Maybe they can all start to be slowly upgraded to
something else i.e. blake3, ParallelHash, SHAKE256 as needed.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We consistently switched from kmalloc() to vmalloc() in module
decompression to prevent potential memory allocation failures with large
modules, however vmalloc() is not as memory-efficient and fast as
kmalloc().
Since we don't know in general the size of the workspace required by the
decompression algorithm, it is more reasonable to use kvmalloc()
consistently, also considering that we don't have special memory
requirements here.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct module_notes_attrs.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Delete duplicated word in comment.
Signed-off-by: Zhu Mao <zhumao001@208suo.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use a similar approach as commit a419beac4a07 ("module/decompress: use
vmalloc() for zstd decompression workspace") and replace kmalloc() with
vmalloc() also for the gzip module decompression workspace.
In this case the workspace is represented by struct inflate_workspace
that can be fairly large for kmalloc() and it can potentially lead to
allocation errors on certain systems:
$ pahole inflate_workspace
struct inflate_workspace {
struct inflate_state inflate_state; /* 0 9544 */
/* --- cacheline 149 boundary (9536 bytes) was 8 bytes ago --- */
unsigned char working_window[32768]; /* 9544 32768 */
/* size: 42312, cachelines: 662, members: 2 */
/* last cacheline: 8 bytes */
};
Considering that there is no need to use continuous physical memory,
simply switch to vmalloc() to provide a more reliable in-kernel module
decompression.
Fixes: b1ae6dc41eaa ("module: add in-kernel support for decompressing")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using kmalloc() to allocate the decompression workspace for zstd may
trigger the following warning when large modules are loaded (i.e., xfs):
[ 2.961884] WARNING: CPU: 1 PID: 254 at mm/page_alloc.c:4453 __alloc_pages+0x2c3/0x350
...
[ 2.989033] Call Trace:
[ 2.989841] <TASK>
[ 2.990614] ? show_regs+0x6d/0x80
[ 2.991573] ? __warn+0x89/0x160
[ 2.992485] ? __alloc_pages+0x2c3/0x350
[ 2.993520] ? report_bug+0x17e/0x1b0
[ 2.994506] ? handle_bug+0x51/0xa0
[ 2.995474] ? exc_invalid_op+0x18/0x80
[ 2.996469] ? asm_exc_invalid_op+0x1b/0x20
[ 2.997530] ? module_zstd_decompress+0xdc/0x2a0
[ 2.998665] ? __alloc_pages+0x2c3/0x350
[ 2.999695] ? module_zstd_decompress+0xdc/0x2a0
[ 3.000821] __kmalloc_large_node+0x7a/0x150
[ 3.001920] __kmalloc+0xdb/0x170
[ 3.002824] module_zstd_decompress+0xdc/0x2a0
[ 3.003857] module_decompress+0x37/0xc0
[ 3.004688] init_module_from_file+0xd0/0x100
[ 3.005668] idempotent_init_module+0x11c/0x2b0
[ 3.006632] __x64_sys_finit_module+0x64/0xd0
[ 3.007568] do_syscall_64+0x59/0x90
[ 3.008373] ? ksys_read+0x73/0x100
[ 3.009395] ? exit_to_user_mode_prepare+0x30/0xb0
[ 3.010531] ? syscall_exit_to_user_mode+0x37/0x60
[ 3.011662] ? do_syscall_64+0x68/0x90
[ 3.012511] ? do_syscall_64+0x68/0x90
[ 3.013364] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
However, continuous physical memory does not seem to be required in
module_zstd_decompress(), so use vmalloc() instead, to prevent the
warning and avoid potential failures at loading compressed modules.
Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
module_init_layout_section() choses whether the core module loader
considers a section as init or not. This affects the placement of the
exit section when module unloading is disabled. This code will never run,
so it can be free()d once the module has been initialised.
arm and arm64 need to count the number of PLTs they need before applying
relocations based on the section name. The init PLTs are stored separately
so they can be free()d. arm and arm64 both use within_module_init() to
decide which list of PLTs to use when applying the relocation.
Because within_module_init()'s behaviour changes when module unloading
is disabled, both architecture would need to take this into account when
counting the PLTs.
Today neither architecture does this, meaning when module unloading is
disabled there are insufficient PLTs in the init section to load some
modules, resulting in warnings:
| WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc
| Modules linked in: crct10dif_common
| CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : module_emit_plt_entry+0x184/0x1cc
| lr : module_emit_plt_entry+0x94/0x1cc
| sp : ffffffc0803bba60
[...]
| Call trace:
| module_emit_plt_entry+0x184/0x1cc
| apply_relocate_add+0x2bc/0x8e4
| load_module+0xe34/0x1bd4
| init_module_from_file+0x84/0xc0
| __arm64_sys_finit_module+0x1b8/0x27c
| invoke_syscall.constprop.0+0x5c/0x104
| do_el0_svc+0x58/0x160
| el0_svc+0x38/0x110
| el0t_64_sync_handler+0xc0/0xc4
| el0t_64_sync+0x190/0x194
Instead of duplicating module_init_layout_section()s logic, expose it.
Reported-by: Adam Johnston <adam.johnston@arm.com>
Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()")
Cc: stable@vger.kernel.org
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It has recently come to my attention that nvidia is circumventing the
protection added in 262e6ae7081d ("modules: inherit
TAINT_PROPRIETARY_MODULE") by importing exports from their proprietary
modules into an allegedly GPL licensed module and then rexporting them.
Given that symbol_get was only ever intended for tightly cooperating
modules using very internal symbols it is logical to restrict it to
being used on EXPORT_SYMBOL_GPL and prevent nvidia from costly DMCA
Circumvention of Access Controls law suites.
All symbols except for four used through symbol_get were already exported
as EXPORT_SYMBOL_GPL, and the remaining four ones were switched over in
the preparation patches.
Fixes: 262e6ae7081d ("modules: inherit TAINT_PROPRIETARY_MODULE")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Trying to restrict the '$'-prefix change to RISC-V caused some fallout,
so let's just treat all those symbols as special.
Fixes: c05780ef3c190 ("module: Ignore RISC-V mapping symbols too")
Link: https://lore.kernel.org/all/20230712015747.77263-1-wangkefeng.wang@huawei.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RISC-V has an extended form of mapping symbols that we use to encode
the ISA when it changes in the middle of an ELF. This trips up modpost
as a build failure, I haven't yet verified it yet but I believe the
kallsyms difference should result in stacks looking sane again.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/all/9d9e2902-5489-4bf0-d9cb-556c8e5d71c2@infradead.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vegard Nossum pointed out two different problems with the error handling
in init_module_from_file():
(a) the idempotent loading code didn't clean up properly in some error
cases, leaving the on-stack 'struct idempotent' element still in
the hash table
(b) failure to read the module file would nonsensically update the
'invalid_kread_bytes' stat counter with the error value
The first error is quite nasty, in that it can then cause subsequent
idempotent loads of that same file to access stale stack contents of the
previous failure. The case may not happen in any normal situation
(explaining all the "Tested-by's on the original change), and requires
admin privileges, but syzkaller triggers random bad behavior as a
result:
BUG: soft lockup in sys_finit_module
BUG: unable to handle kernel paging request in init_module_from_file
general protection fault in init_module_from_file
INFO: task hung in init_module_from_file
KASAN: out-of-bounds Read in init_module_from_file
KASAN: slab-out-of-bounds Read in init_module_from_file
...
The second error is fairly benign and just leads to nonsensical stats
(and has been around since the debug stats were added).
Vegard also provided a patch for the idempotent loading issue, but I'd
rather re-organize the code and make it more legible using another level
of helper functions than add the usual "goto out" error handling.
Link: https://lore.kernel.org/lkml/20230704100852.23452-1-vegard.nossum@oracle.com/
Fixes: 9b9879fc0327 ("modules: catch concurrent module loads, treat them as idempotent")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzbot+9c2bdc9d24e4a7abe741@syzkaller.appspotmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove the deprecated rule to build *.dtbo from *.dts
- Refactor section mismatch detection in modpost
- Fix bogus ARM section mismatch detections
- Fix error of 'make gtags' with O= option
- Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error
with the latest LLVM version
- Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed
- Ignore more compiler-generated symbols for kallsyms
- Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles
- Enable more kernel-doc warnings with W=2
- Refactor <linux/export.h> by generating KSYMTAB data by modpost
- Deprecate <asm/export.h> and <asm-generic/export.h>
- Remove the EXPORT_DATA_SYMBOL macro
- Move the check for static EXPORT_SYMBOL back to modpost, which makes
the build faster
- Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm
- Warn missing MODULE_DESCRIPTION when building modules with W=1
- Make 'make clean' robust against too long argument error
- Exclude more objects from GCOV to fix CFI failures with GCOV
- Allow 'make modules_install' to install modules.builtin and
modules.builtin.modinfo even when CONFIG_MODULES is disabled
- Include modules.builtin and modules.builtin.modinfo in the
linux-image Debian package even when CONFIG_MODULES is disabled
- Revive "Entering directory" logging for the latest Make version
* tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits)
modpost: define more R_ARM_* for old distributions
kbuild: revive "Entering directory" for Make >= 4.4.1
kbuild: set correct abs_srctree and abs_objtree for package builds
scripts/mksysmap: Ignore prefixed KCFI symbols
kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb
kbuild: builddeb: always make modules_install, to install modules.builtin*
modpost: continue even with unknown relocation type
modpost: factor out Elf_Sym pointer calculation to section_rel()
modpost: factor out inst location calculation to section_rel()
kbuild: Disable GCOV for *.mod.o
kbuild: Fix CFI failures with GCOV
kbuild: make clean rule robust against too long argument error
script: modpost: emit a warning when the description is missing
kbuild: make modules_install copy modules.builtin(.modinfo)
linux/export.h: rename 'sec' argument to 'license'
modpost: show offset from symbol for section mismatch warnings
modpost: merge two similar section mismatch warnings
kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion
modpost: use null string instead of NULL pointer for default namespace
modpost: squash sym_update_namespace() into sym_add_exported()
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way
whether the EXPORT_SYMBOL() is placed in *.c or *.S.
For further cleanups, this commit applies a similar approach to the
entire data structure of EXPORT_SYMBOL().
The EXPORT_SYMBOL() compilation is split into two stages.
When a source file is compiled, EXPORT_SYMBOL() will be converted into
a dummy symbol in the .export_symbol section.
For example,
EXPORT_SYMBOL(foo);
EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE);
will be encoded into the following assembly code:
.section ".export_symbol","a"
__export_symbol_foo:
.asciz "" /* license */
.asciz "" /* name space */
.balign 8
.quad foo /* symbol reference */
.previous
.section ".export_symbol","a"
__export_symbol_bar:
.asciz "GPL" /* license */
.asciz "BAR_NAMESPACE" /* name space */
.balign 8
.quad bar /* symbol reference */
.previous
They are mere markers to tell modpost the name, license, and namespace
of the symbols. They will be dropped from the final vmlinux and modules
because the *(.export_symbol) will go into /DISCARD/ in the linker script.
Then, modpost extracts all the information about EXPORT_SYMBOL() from the
.export_symbol section, and generates the final C code:
KSYMTAB_FUNC(foo, "", "");
KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE");
KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct
kernel_symbol that will be linked to the vmlinux or a module.
With this change, EXPORT_SYMBOL() works in the same way for *.c and *.S
files, providing the following benefits.
[1] Deprecate EXPORT_DATA_SYMBOL()
In the old days, EXPORT_SYMBOL() was only available in C files. To export
a symbol in *.S, EXPORT_SYMBOL() was placed in a separate *.c file.
arch/arm/kernel/armksyms.c is one example written in the classic manner.
Commit 22823ab419d8 ("EXPORT_SYMBOL() for asm") removed this limitation.
Since then, EXPORT_SYMBOL() can be placed close to the symbol definition
in *.S files. It was a nice improvement.
However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL()
for data objects on some architectures.
In the new approach, modpost checks symbol's type (STT_FUNC or not),
and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly.
There are only two users of EXPORT_DATA_SYMBOL:
EXPORT_DATA_SYMBOL_GPL(empty_zero_page) (arch/ia64/kernel/head.S)
EXPORT_DATA_SYMBOL(ia64_ivt) (arch/ia64/kernel/ivt.S)
They are transformed as follows and output into .vmlinux.export.c
KSYMTAB_DATA(empty_zero_page, "_gpl", "");
KSYMTAB_DATA(ia64_ivt, "", "");
The other EXPORT_SYMBOL users in ia64 assembly are output as
KSYMTAB_FUNC().
EXPORT_DATA_SYMBOL() is now deprecated.
[2] merge <linux/export.h> and <asm-generic/export.h>
There are two similar header implementations:
include/linux/export.h for .c files
include/asm-generic/export.h for .S files
Ideally, the functionality should be consistent between them, but they
tend to diverge.
Commit 8651ec01daed ("module: add support for symbol namespaces.") did
not support the namespace for *.S files.
This commit shifts the essential implementation part to C, which supports
EXPORT_SYMBOL_NS() for *.S files.
<asm/export.h> and <asm-generic/export.h> will remain as a wrapper of
<linux/export.h> for a while.
They will be removed after #include <asm/export.h> directives are all
replaced with #include <linux/export.h>.
[3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit)
When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses
the directory tree to determine which EXPORT_SYMBOL to trim. If an
EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the
second traverse, where some source files are recompiled with their
EXPORT_SYMBOL() tuned into a no-op.
We can do this better now; modpost can selectively emit KSYMTAB entries
that are really used by modules.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The changes queued up for modules are pretty tame, mostly code removal
of moving of code.
Only two minor functional changes are made, the only one which stands
out is Sebastian Andrzej Siewior's simplification of module reference
counting by removing preempt_disable() and that has been tested on
linux-next for well over a month without no regressions.
I'm now, I guess, also a kitchen sink for some kallsyms changes"
[ There was a mis-communication about the concurrent module load changes
that I had expected to come through Luis despite me authoring the
patch. So some of the module updates were left hanging in the email
ether, and I just committed them separately.
It's my bad - I should have made it more clear that I expected my
own patches to come through the module tree too. Now they missed
linux-next, but hopefully that won't cause any issues - Linus ]
* tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
kallsyms: make kallsyms_show_value() as generic function
kallsyms: move kallsyms_show_value() out of kallsyms.c
kallsyms: remove unsed API lookup_symbol_attrs
kallsyms: remove unused arch_get_kallsym() helper
module: Remove preempt_disable() from module reference counting.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
with commit '7878c231dae0 ("slab: remove /proc/slab_allocators")'
lookup_symbol_attrs usage is removed.
Thus removing redundant API.
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The preempt_disable() section in module_put() was added in commit
e1783a240f491 ("module: Use this_cpu_xx to dynamically allocate counters")
while the per-CPU counter were switched to another API. The API requires
that during the RMW operation the CPU remained the same.
This counting API was later replaced with atomic_t in commit
2f35c41f58a97 ("module: Replace module_ref with atomic_t refcnt")
Since this atomic_t replacement there is no need to keep preemption
disabled while the reference counter is modified.
Remove preempt_disable() from module_put(), __module_get() and
try_module_get().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is the new-and-improved attempt at avoiding huge memory load spikes
when the user space boot sequence tries to load hundreds (or even
thousands) of redundant duplicate modules in parallel.
See commit 9828ed3f695a ("module: error out early on concurrent load of
the same module file") for background and an earlier failed attempt that
was reverted.
That earlier attempt just said "concurrently loading the same module is
silly, just open the module file exclusively and return -ETXTBSY if
somebody else is already loading it".
While it is true that concurrent module loads of the same module is
silly, the reason that earlier attempt then failed was that the
concurrently loaded module would often be a prerequisite for another
module.
Thus failing to load the prerequisite would then cause cascading
failures of the other modules, rather than just short-circuiting that
one unnecessary module load.
At the same time, we still really don't want to load the contents of the
same module file hundreds of times, only to then wait for an eventually
successful load, and have everybody else return -EEXIST.
As a result, this takes another approach, and treats concurrent module
loads from the same file as "idempotent" in the inode. So if one module
load is ongoing, we don't start a new one, but instead just wait for the
first one to complete and return the same return value as it did.
So unlike the first attempt, this does not return early: the intent is
not to speed up the boot, but to avoid a thundering herd problem in
allocating memory (both physical and virtual) for a module more than
once.
Also note that this does change behavior: it used to be that when you
had concurrent loads, you'd have one "winner" that would return success,
and everybody else would return -EEXIST.
In contrast, this idempotent logic goes all Oprah on the problem, and
says "You are a winner! And you are a winner! We are ALL winners". But
since there's no possible actual real semantic difference between "you
loaded the module" and "somebody else already loaded the module", this
is more of a feel-good change than an actual honest-to-goodness semantic
change.
Of course, any true Johnny-come-latelies that don't get caught in the
concurrency filter will still return -EEXIST. It's no different from
not even getting a seat at an Oprah taping. That's life.
See the long thread on the kernel mailing list about this all, which
includes some numbers for memory use before and after the patch.
Link: https://lore.kernel.org/lkml/20230524213620.3509138-1-mcgrof@kernel.org/
Reviewed-by: Johan Hovold <johan@kernel.org>
Tested-by: Johan Hovold <johan@kernel.org>
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Rudi Heitbaum <rudi@heitbaum..com>
Tested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This will simplify the next step, where we can then key off the inode to
do one idempotent module load.
Let's do the obvious re-organization in one step, and then the new code
in another.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
While implementing support for in-kernel decompression in kmod,
finit_module() was returning a very suspicious value:
finit_module(3, "", MODULE_INIT_COMPRESSED_FILE) = 18446744072717407296
It turns out the check for module_get_next_page() failing is wrong,
and hence the decompression was not really taking place. Invert
the condition to fix it.
Fixes: 169a58ad824d ("module/decompress: Support zstd in-kernel decompression")
Cc: stable@kernel.org
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Frank reported boot regression in ia64 as:
ELILO v3.16 for EFI/IA-64
..
Uncompressing Linux... done
Loading file AC100221.initrd.img...done
[ 0.000000] Linux version 6.4.0-rc3 (root@x4270) (ia64-linux-gcc
(GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP Thu May 25 15:52:20
CEST 2023
[ 0.000000] efi: EFI v1.1 by HP
[ 0.000000] efi: SALsystab=0x3ee7a000 ACPI 2.0=0x3fe2a000
ESI=0x3ee7b000 SMBIOS=0x3ee7c000 HCDP=0x3fe28000
[ 0.000000] PCDP: v3 at 0x3fe28000
[ 0.000000] earlycon: uart8250 at MMIO 0x00000000f4050000 (options
'9600n8')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x000000003FE2A000 000028 (v02 HP )
[ 0.000000] ACPI: XSDT 0x000000003FE2A02C 0000CC (v01 HP rx2620
00000000 HP 00000000)
[...]
[ 3.793350] Run /init as init process
Loading, please wait...
Starting systemd-udevd version 252.6-1
[ 3.951100] ------------[ cut here ]------------
[ 3.951100] WARNING: CPU: 6 PID: 140 at kernel/module/main.c:1547
__layout_sections+0x370/0x3c0
[ 3.949512] Unable to handle kernel paging request at virtual address
1000000000000000
[ 3.951100] Modules linked in:
[ 3.951100] CPU: 6 PID: 140 Comm: (udev-worker) Not tainted 6.4.0-rc3 #1
[ 3.956161] (udev-worker)[142]: Oops 11003706212352 [1]
[ 3.951774] Hardware name: hp server rx2620 , BIOS
04.29
11/30/2007
[ 3.951774]
[ 3.951774] Call Trace:
[ 3.958339] Unable to handle kernel paging request at virtual address
1000000000000000
[ 3.956161] Modules linked in:
[ 3.951774] [<a0000001000156d0>] show_stack.part.0+0x30/0x60
[ 3.951774] sp=e000000183a67b20
bsp=e000000183a61628
[ 3.956161]
[ 3.956161]
which bisect to module_memory change [1].
Debug showed that ia64 uses some special sections:
__layout_sections: section .got (sh_flags 10000002) matched to MOD_INVALID
__layout_sections: section .sdata (sh_flags 10000003) matched to MOD_INVALID
__layout_sections: section .sbss (sh_flags 10000003) matched to MOD_INVALID
All these sections are loaded to module core memory before [1].
Fix ia64 boot by loading these sections to MOD_DATA (core rw data).
[1] commit ac3b43283923 ("module: replace module_layout with module_memory")
Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
Reported-by: Frank Scheiner <frank.scheiner@web.de>
Closes: https://lists.debian.org/debian-ia64/2023/05/msg00010.html
Closes: https://marc.info/?l=linux-ia64&m=168509859125505
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
Smatch warns:
kernel/module/stats.c:394 read_file_mod_stats()
warn: passing freed memory 'buf'
We are passing 'buf' to simple_read_from_buffer() after freeing it.
Fix this by changing the order of 'simple_read_from_buffer' and 'kfree'.
Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two newly introduced functions are declared in a header that is not
included before the definition, causing a warning with sparse or
'make W=1':
kernel/module/dups.c:118:6: error: no previous prototype for 'kmod_dup_request_exists_wait' [-Werror=missing-prototypes]
118 | bool kmod_dup_request_exists_wait(char *module_name, bool wait, int *dup_ret)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/module/dups.c:220:6: error: no previous prototype for 'kmod_dup_request_announce' [-Werror=missing-prototypes]
220 | void kmod_dup_request_announce(char *module_name, int ret)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Add an explicit include to ensure the prototypes match.
Fixes: 8660484ed1cf ("module: add debugging auto-load duplicate module support")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304141440.DYO4NAzp-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"The summary of the changes for this pull requests is:
- Song Liu's new struct module_memory replacement
- Nick Alcock's MODULE_LICENSE() removal for non-modules
- My cleanups and enhancements to reduce the areas where we vmalloc
module memory for duplicates, and the respective debug code which
proves the remaining vmalloc pressure comes from userspace.
Most of the changes have been in linux-next for quite some time except
the minor fixes I made to check if a module was already loaded prior
to allocating the final module memory with vmalloc and the respective
debug code it introduces to help clarify the issue. Although the
functional change is small it is rather safe as it can only *help*
reduce vmalloc space for duplicates and is confirmed to fix a bootup
issue with over 400 CPUs with KASAN enabled. I don't expect stable
kernels to pick up that fix as the cleanups would have also had to
have been picked up. Folks on larger CPU systems with modules will
want to just upgrade if vmalloc space has been an issue on bootup.
Given the size of this request, here's some more elaborate details:
The functional change change in this pull request is the very first
patch from Song Liu which replaces the 'struct module_layout' with a
new 'struct module_memory'. The old data structure tried to put
together all types of supported module memory types in one data
structure, the new one abstracts the differences in memory types in a
module to allow each one to provide their own set of details. This
paves the way in the future so we can deal with them in a cleaner way.
If you look at changes they also provide a nice cleanup of how we
handle these different memory areas in a module. This change has been
in linux-next since before the merge window opened for v6.3 so to
provide more than a full kernel cycle of testing. It's a good thing as
quite a bit of fixes have been found for it.
Jason Baron then made dynamic debug a first class citizen module user
by using module notifier callbacks to allocate / remove module
specific dynamic debug information.
Nick Alcock has done quite a bit of work cross-tree to remove module
license tags from things which cannot possibly be module at my request
so to:
a) help him with his longer term tooling goals which require a
deterministic evaluation if a piece a symbol code could ever be
part of a module or not. But quite recently it is has been made
clear that tooling is not the only one that would benefit.
Disambiguating symbols also helps efforts such as live patching,
kprobes and BPF, but for other reasons and R&D on this area is
active with no clear solution in sight.
b) help us inch closer to the now generally accepted long term goal
of automating all the MODULE_LICENSE() tags from SPDX license tags
In so far as a) is concerned, although module license tags are a no-op
for non-modules, tools which would want create a mapping of possible
modules can only rely on the module license tag after the commit
8b41fc4454e ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf").
Nick has been working on this *for years* and AFAICT I was the only
one to suggest two alternatives to this approach for tooling. The
complexity in one of my suggested approaches lies in that we'd need a
possible-obj-m and a could-be-module which would check if the object
being built is part of any kconfig build which could ever lead to it
being part of a module, and if so define a new define
-DPOSSIBLE_MODULE [0].
A more obvious yet theoretical approach I've suggested would be to
have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
well but that means getting kconfig symbol names mapping to modules
always, and I don't think that's the case today. I am not aware of
Nick or anyone exploring either of these options. Quite recently Josh
Poimboeuf has pointed out that live patching, kprobes and BPF would
benefit from resolving some part of the disambiguation as well but for
other reasons. The function granularity KASLR (fgkaslr) patches were
mentioned but Joe Lawrence has clarified this effort has been dropped
with no clear solution in sight [1].
In the meantime removing module license tags from code which could
never be modules is welcomed for both objectives mentioned above. Some
developers have also welcomed these changes as it has helped clarify
when a module was never possible and they forgot to clean this up, and
so you'll see quite a bit of Nick's patches in other pull requests for
this merge window. I just picked up the stragglers after rc3. LWN has
good coverage on the motivation behind this work [2] and the typical
cross-tree issues he ran into along the way. The only concrete blocker
issue he ran into was that we should not remove the MODULE_LICENSE()
tags from files which have no SPDX tags yet, even if they can never be
modules. Nick ended up giving up on his efforts due to having to do
this vetting and backlash he ran into from folks who really did *not
understand* the core of the issue nor were providing any alternative /
guidance. I've gone through his changes and dropped the patches which
dropped the module license tags where an SPDX license tag was missing,
it only consisted of 11 drivers. To see if a pull request deals with a
file which lacks SPDX tags you can just use:
./scripts/spdxcheck.py -f \
$(git diff --name-only commid-id | xargs echo)
You'll see a core module file in this pull request for the above, but
that's not related to his changes. WE just need to add the SPDX
license tag for the kernel/module/kmod.c file in the future but it
demonstrates the effectiveness of the script.
Most of Nick's changes were spread out through different trees, and I
just picked up the slack after rc3 for the last kernel was out. Those
changes have been in linux-next for over two weeks.
The cleanups, debug code I added and final fix I added for modules
were motivated by David Hildenbrand's report of boot failing on a
systems with over 400 CPUs when KASAN was enabled due to running out
of virtual memory space. Although the functional change only consists
of 3 lines in the patch "module: avoid allocation if module is already
present and ready", proving that this was the best we can do on the
modules side took quite a bit of effort and new debug code.
The initial cleanups I did on the modules side of things has been in
linux-next since around rc3 of the last kernel, the actual final fix
for and debug code however have only been in linux-next for about a
week or so but I think it is worth getting that code in for this merge
window as it does help fix / prove / evaluate the issues reported with
larger number of CPUs. Userspace is not yet fixed as it is taking a
bit of time for folks to understand the crux of the issue and find a
proper resolution. Worst come to worst, I have a kludge-of-concept [3]
of how to make kernel_read*() calls for modules unique / converge
them, but I'm currently inclined to just see if userspace can fix this
instead"
Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]
* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
module: add debugging auto-load duplicate module support
module: stats: fix invalid_mod_bytes typo
module: remove use of uninitialized variable len
module: fix building stats for 32-bit targets
module: stats: include uapi/linux/module.h
module: avoid allocation if module is already present and ready
module: add debug stats to help identify memory pressure
module: extract patient module check into helper
modules/kmod: replace implementation with a semaphore
Change DEFINE_SEMAPHORE() to take a number argument
module: fix kmemleak annotations for non init ELF sections
module: Ignore L0 and rename is_arm_mapping_symbol()
module: Move is_arm_mapping_symbol() to module_symbol.h
module: Sync code of is_arm_mapping_symbol()
scripts/gdb: use mem instead of core_layout to get the module address
interconnect: remove module-related code
interconnect: remove MODULE_LICENSE in non-modules
zswap: remove MODULE_LICENSE in non-modules
zpool: remove MODULE_LICENSE in non-modules
x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The finit_module() system call can in the worst case use up to more than
twice of a module's size in virtual memory. Duplicate finit_module()
system calls are non fatal, however they unnecessarily strain virtual
memory during bootup and in the worst case can cause a system to fail
to boot. This is only known to currently be an issue on systems with
larger number of CPUs.
To help debug this situation we need to consider the different sources for
finit_module(). Requests from the kernel that rely on module auto-loading,
ie, the kernel's *request_module() API, are one source of calls. Although
modprobe checks to see if a module is already loaded prior to calling
finit_module() there is a small race possible allowing userspace to
trigger multiple modprobe calls racing against modprobe and this not
seeing the module yet loaded.
This adds debugging support to the kernel module auto-loader (*request_module()
calls) to easily detect duplicate module requests. To aid with possible bootup
failure issues incurred by this, it will converge duplicates requests to a
single request. This avoids any possible strain on virtual memory during
bootup which could be incurred by duplicate module autoloading requests.
Folks debugging virtual memory abuse on bootup can and should enable
this to see what pr_warn()s come on, to see if module auto-loading is to
blame for their wores. If they see duplicates they can further debug this
by enabling the module.enable_dups_trace kernel parameter or by enabling
CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE.
Current evidence seems to point to only a few duplicates for module
auto-loading. And so the source for other duplicates creating heavy
virtual memory pressure due to larger number of CPUs should becoming
from another place (likely udev).
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This was caught by randconfig builds but does not show up in
build testing without CONFIG_MODULE_DECOMPRESS:
kernel/module/stats.c: In function 'mod_stat_bump_invalid':
kernel/module/stats.c:229:42: error: 'invalid_mod_byte' undeclared (first use in this function); did you mean 'invalid_mod_bytes'?
229 | atomic_long_add(info->compressed_len, &invalid_mod_byte);
| ^~~~~~~~~~~~~~~~
| invalid_mod_bytes
Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
clang build reports
kernel/module/stats.c:307:34: error: variable
'len' is uninitialized when used here [-Werror,-Wuninitialized]
len = scnprintf(buf + 0, size - len,
^~~
At the start of this sequence, neither the '+ 0', nor the '- len' are needed.
So remove them and fix using 'len' uninitalized.
Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The new module statistics code mixes 64-bit types and wordsized 'long'
variables, which leads to build failures on 32-bit architectures:
kernel/module/stats.c: In function 'read_file_mod_stats':
kernel/module/stats.c:291:29: error: passing argument 1 of 'atomic64_read' from incompatible pointer type [-Werror=incompatible-pointer-types]
291 | total_size = atomic64_read(&total_mod_size);
x86_64-linux-ld: kernel/module/stats.o: in function `read_file_mod_stats':
stats.c:(.text+0x2b2): undefined reference to `__udivdi3'
To fix this, the code has to use one of the two types consistently.
Change them all to word-size types here.
Fixes: df3e764d8e5c ("module: add debug stats to help identify memory pressure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
MODULE_INIT_COMPRESSED_FILE is defined in the uapi header, which
is not included indirectly from the normal linux/module.h, but
has to be pulled in explicitly:
kernel/module/stats.c: In function 'mod_stat_bump_invalid':
kernel/module/stats.c:227:14: error: 'MODULE_INIT_COMPRESSED_FILE' undeclared (first use in this function)
227 | if (flags & MODULE_INIT_COMPRESSED_FILE)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The finit_module() system call can create unnecessary virtual memory
pressure for duplicate modules. This is because load_module() can in
the worse case allocate more than twice the size of a module in virtual
memory. This saves at least a full size of the module in wasted vmalloc
space memory by trying to avoid duplicates as soon as we can validate
the module name in the read module structure.
This can only be an issue if a system is getting hammered with userspace
loading modules. There are two ways to load modules typically on systems,
one is the kernel moduile auto-loading (*request_module*() calls in-kernel)
and the other is things like udev. The auto-loading is in-kernel, but that
pings back to userspace to just call modprobe. We already have a way to
restrict the amount of concurrent kernel auto-loads in a given time, however
that still allows multiple requests for the same module to go through
and force two threads in userspace racing to call modprobe for the same
exact module. Even though libkmod which both modprobe and udev does check
if a module is already loaded prior calling finit_module() races are
still possible and this is clearly evident today when you have multiple
CPUs.
To avoid memory pressure for such stupid cases put a stop gap for them.
The *earliest* we can detect duplicates from the modules side of things
is once we have blessed the module name, sadly after the first vmalloc
allocation. We can check for the module being present *before* a secondary
vmalloc() allocation.
There is a linear relationship between wasted virtual memory bytes and
the number of CPU counts. The reason is that udev ends up racing to call
tons of the same modules for each of the CPUs.
We can see the different linear relationships between wasted virtual
memory and CPU count during after boot in the following graph:
+----------------------------------------------------------------------------+
14GB |-+ + + + + *+ +-|
| **** |
| *** |
| ** |
12GB |-+ ** +-|
| ** |
| ** |
| ** |
| ** |
10GB |-+ ** +-|
| ** |
| ** |
| ** |
8GB |-+ ** +-|
waste | ** ### |
| ** #### |
| ** ####### |
6GB |-+ **** #### +-|
| * #### |
| * #### |
| ***** #### |
4GB |-+ ** #### +-|
| ** #### |
| ** #### |
| ** #### |
2GB |-+ ** ##### +-|
| * #### |
| * #### Before ******* |
| **## + + + + After ####### |
+----------------------------------------------------------------------------+
0 50 100 150 200 250 300
CPUs count
On the y-axis we can see gigabytes of wasted virtual memory during boot
due to duplicate module requests which just end up failing. Trying to
infer the slope this ends up being about ~463 MiB per CPU lost prior
to this patch. After this patch we only loose about ~230 MiB per CPU, for
a total savings of about ~233 MiB per CPU. This is all *just on bootup*!
On a 8vcpu 8 GiB RAM system using kdevops and testing against selftests
kmod.sh -t 0008 I see a saving in the *highest* side of memory
consumption of up to ~ 84 MiB with the Linux kernel selftests kmod
test 0008. With the new stress-ng module test I see a 145 MiB difference
in max memory consumption with 100 ops. The stress-ng module ops tests can be
pretty pathalogical -- it is not realistic, however it was used to
finally successfully reproduce issues which are only reported to happen on
system with over 400 CPUs [0] by just usign 100 ops on a 8vcpu 8 GiB RAM
system. Running out of virtual memory space is no surprise given the
above graph, since at least on x86_64 we're capped at 128 MiB, eventually
we'd hit a series of errors and once can use the above graph to
guestimate when. This of course will vary depending on the features
you have enabled. So for instance, enabling KASAN seems to make this
much worse.
The results with kmod and stress-ng can be observed and visualized below.
The time it takes to run the test is also not affected.
The kmod tests 0008:
The gnuplot is set to a range from 400000 KiB (390 Mib) - 580000 (566 Mib)
given the tests peak around that range.
cat kmod.plot
set term dumb
set output fileout
set yrange [400000:580000]
plot filein with linespoints title "Memory usage (KiB)"
Before:
root@kmod ~ # /data/linux-next/tools/testing/selftests/kmod/kmod.sh -t 0008
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > log-0008-before.txt ^C
root@kmod ~ # sort -n -r log-0008-before.txt | head -1
528732
So ~516.33 MiB
After:
root@kmod ~ # /data/linux-next/tools/testing/selftests/kmod/kmod.sh -t 0008
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > log-0008-after.txt ^C
root@kmod ~ # sort -n -r log-0008-after.txt | head -1
442516
So ~432.14 MiB
That's about 84 ~MiB in savings in the worst case. The graphs:
root@kmod ~ # gnuplot -e "filein='log-0008-before.txt'; fileout='graph-0008-before.txt'" kmod.plot
root@kmod ~ # gnuplot -e "filein='log-0008-after.txt'; fileout='graph-0008-after.txt'" kmod.plot
root@kmod ~ # cat graph-0008-before.txt
580000 +-----------------------------------------------------------------+
| + + + + + + + |
560000 |-+ Memory usage (KiB) ***A***-|
| |
540000 |-+ +-|
| |
| *A *AA*AA*A*AA *A*AA A*A*A *AA*A*AA*A A |
520000 |-+A*A*AA *AA*A *A*AA*A*AA *A*A A *A+-|
|*A |
500000 |-+ +-|
| |
480000 |-+ +-|
| |
460000 |-+ +-|
| |
| |
440000 |-+ +-|
| |
420000 |-+ +-|
| + + + + + + + |
400000 +-----------------------------------------------------------------+
0 5 10 15 20 25 30 35 40
root@kmod ~ # cat graph-0008-after.txt
580000 +-----------------------------------------------------------------+
| + + + + + + + |
560000 |-+ Memory usage (KiB) ***A***-|
| |
540000 |-+ +-|
| |
| |
520000 |-+ +-|
| |
500000 |-+ +-|
| |
480000 |-+ +-|
| |
460000 |-+ +-|
| |
| *A *A*A |
440000 |-+A*A*AA*A A A*A*AA A*A*AA*A*AA*A*AA*A*AA*AA*A*AA*A*AA-|
|*A *A*AA*A |
420000 |-+ +-|
| + + + + + + + |
400000 +-----------------------------------------------------------------+
0 5 10 15 20 25 30 35 40
The stress-ng module tests:
This is used to run the test to try to reproduce the vmap issues
reported by David:
echo 0 > /proc/sys/vm/oom_dump_tasks
./stress-ng --module 100 --module-name xfs
Prior to this commit:
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > baseline-stress-ng.txt
root@kmod ~ # sort -n -r baseline-stress-ng.txt | head -1
5046456
After this commit:
root@kmod ~ # free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > after-stress-ng.txt
root@kmod ~ # sort -n -r after-stress-ng.txt | head -1
4896972
5046456 - 4896972
149484
149484/1024
145.98046875000000000000
So this commit using stress-ng reveals saving about 145 MiB in memory
using 100 ops from stress-ng which reproduced the vmap issue reported.
cat kmod.plot
set term dumb
set output fileout
set yrange [4700000:5070000]
plot filein with linespoints title "Memory usage (KiB)"
root@kmod ~ # gnuplot -e "filein='baseline-stress-ng.txt'; fileout='graph-stress-ng-before.txt'" kmod-simple-stress-ng.plot
root@kmod ~ # gnuplot -e "filein='after-stress-ng.txt'; fileout='graph-stress-ng-after.txt'" kmod-simple-stress-ng.plot
root@kmod ~ # cat graph-stress-ng-before.txt
+---------------------------------------------------------------+
5.05e+06 |-+ + A + + + + + + +-|
| * Memory usage (KiB) ***A*** |
| * A |
5e+06 |-+ ** ** +-|
| ** * * A |
4.95e+06 |-+ * * A * A* +-|
| * * A A * * * * A |
| * * * * * * *A * * * A * |
4.9e+06 |-+ * * * A*A * A*AA*A A *A **A **A*A *+-|
| A A*A A * A * * A A * A * ** |
| * ** ** * * * * * * * |
4.85e+06 |-+ A A A ** * * ** *-|
| * * * * ** * |
| * A * * * * |
4.8e+06 |-+ * * * A A-|
| * * * |
4.75e+06 |-+ * * * +-|
| * ** |
| * + + + + + + ** + |
4.7e+06 +---------------------------------------------------------------+
0 5 10 15 20 25 30 35 40
root@kmod ~ # cat graph-stress-ng-after.txt
+---------------------------------------------------------------+
5.05e+06 |-+ + + + + + + + +-|
| Memory usage (KiB) ***A*** |
| |
5e+06 |-+ +-|
| |
4.95e+06 |-+ +-|
| |
| |
4.9e+06 |-+ *AA +-|
| A*AA*A*A A A*AA*AA*A*AA*A A A A*A *AA*A*A A A*AA*AA |
| * * ** * * * ** * *** * |
4.85e+06 |-+* *** * * * * *** A * * +-|
| * A * * ** * * A * * |
| * * * * ** * * |
4.8e+06 |-+* * * A * * * +-|
| * * * A * * |
4.75e+06 |-* * * * * +-|
| * * * * * |
| * + * *+ + + + + * *+ |
4.7e+06 +---------------------------------------------------------------+
0 5 10 15 20 25 30 35 40
[0] https://lkml.kernel.org/r/20221013180518.217405-1-david@redhat.com
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Loading modules with finit_module() can end up using vmalloc(), vmap()
and vmalloc() again, for a total of up to 3 separate allocations in the
worst case for a single module. We always kernel_read*() the module,
that's a vmalloc(). Then vmap() is used for the module decompression,
and if so the last read buffer is freed as we use the now decompressed
module buffer to stuff data into our copy module. The last allocation is
specific to each architectures but pretty much that's generally a series
of vmalloc() calls or a variation of vmalloc to handle ELF sections with
special permissions.
Evaluation with new stress-ng module support [1] with just 100 ops
is proving that you can end up using GiBs of data easily even with all
care we have in the kernel and userspace today in trying to not load modules
which are already loaded. 100 ops seems to resemble the sort of pressure a
system with about 400 CPUs can create on module loading. Although issues
relating to duplicate module requests due to each CPU inucurring a new
module reuest is silly and some of these are being fixed, we currently lack
proper tooling to help diagnose easily what happened, when it happened
and who likely is to blame -- userspace or kernel module autoloading.
Provide an initial set of stats which use debugfs to let us easily scrape
post-boot information about failed loads. This sort of information can
be used on production worklaods to try to optimize *avoiding* redundant
memory pressure using finit_module().
There's a few examples that can be provided:
A 255 vCPU system without the next patch in this series applied:
Startup finished in 19.143s (kernel) + 7.078s (userspace) = 26.221s
graphical.target reached after 6.988s in userspace
And 13.58 GiB of virtual memory space lost due to failed module loading:
root@big ~ # cat /sys/kernel/debug/modules/stats
Mods ever loaded 67
Mods failed on kread 0
Mods failed on decompress 0
Mods failed on becoming 0
Mods failed on load 1411
Total module size 11464704
Total mod text size 4194304
Failed kread bytes 0
Failed decompress bytes 0
Failed becoming bytes 0
Failed kmod bytes 14588526272
Virtual mem wasted bytes 14588526272
Average mod size 171115
Average mod text size 62602
Average fail load bytes 10339140
Duplicate failed modules:
module-name How-many-times Reason
kvm_intel 249 Load
kvm 249 Load
irqbypass 8 Load
crct10dif_pclmul 128 Load
ghash_clmulni_intel 27 Load
sha512_ssse3 50 Load
sha512_generic 200 Load
aesni_intel 249 Load
crypto_simd 41 Load
cryptd 131 Load
evdev 2 Load
serio_raw 1 Load
virtio_pci 3 Load
nvme 3 Load
nvme_core 3 Load
virtio_pci_legacy_dev 3 Load
virtio_pci_modern_dev 3 Load
t10_pi 3 Load
virtio 3 Load
crc32_pclmul 6 Load
crc64_rocksoft 3 Load
crc32c_intel 40 Load
virtio_ring 3 Load
crc64 3 Load
The following screen shot, of a simple 8vcpu 8 GiB KVM guest with the
next patch in this series applied, shows 226.53 MiB are wasted in virtual
memory allocations which due to duplicate module requests during boot.
It also shows an average module memory size of 167.10 KiB and an an
average module .text + .init.text size of 61.13 KiB. The end shows all
modules which were detected as duplicate requests and whether or not
they failed early after just the first kernel_read*() call or late after
we've already allocated the private space for the module in
layout_and_allocate(). A system with module decompression would reveal
more wasted virtual memory space.
We should put effort now into identifying the source of these duplicate
module requests and trimming these down as much possible. Larger systems
will obviously show much more wasted virtual memory allocations.
root@kmod ~ # cat /sys/kernel/debug/modules/stats
Mods ever loaded 67
Mods failed on kread 0
Mods failed on decompress 0
Mods failed on becoming 83
Mods failed on load 16
Total module size 11464704
Total mod text size 4194304
Failed kread bytes 0
Failed decompress bytes 0
Failed becoming bytes 228959096
Failed kmod bytes 8578080
Virtual mem wasted bytes 237537176
Average mod size 171115
Average mod text size 62602
Avg fail becoming bytes 2758544
Average fail load bytes 536130
Duplicate failed modules:
module-name How-many-times Reason
kvm_intel 7 Becoming
kvm 7 Becoming
irqbypass 6 Becoming & Load
crct10dif_pclmul 7 Becoming & Load
ghash_clmulni_intel 7 Becoming & Load
sha512_ssse3 6 Becoming & Load
sha512_generic 7 Becoming & Load
aesni_intel 7 Becoming
crypto_simd 7 Becoming & Load
cryptd 3 Becoming & Load
evdev 1 Becoming
serio_raw 1 Becoming
nvme 3 Becoming
nvme_core 3 Becoming
t10_pi 3 Becoming
virtio_pci 3 Becoming
crc32_pclmul 6 Becoming & Load
crc64_rocksoft 3 Becoming
crc32c_intel 3 Becoming
virtio_pci_modern_dev 2 Becoming
virtio_pci_legacy_dev 1 Becoming
crc64 2 Becoming
virtio 2 Becoming
virtio_ring 2 Becoming
[0] https://github.com/ColinIanKing/stress-ng.git
[1] echo 0 > /proc/sys/vm/oom_dump_tasks
./stress-ng --module 100 --module-name xfs
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The patient module check inside add_unformed_module() is large
enough as we need it. It is a bit hard to read too, so just
move it to a helper and do the inverse checks first to help
shift the code and make it easier to read. The new helper then
is module_patient_check_exists().
To make this work we need to mvoe the finished_loading() up,
we do that without making any functional changes to that routine.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Simplify the concurrency delimiter we use for kmod with the semaphore.
I had used the kmod strategy to try to implement a similar concurrency
delimiter for the kernel_read*() calls from the finit_module() path
so to reduce vmalloc() memory pressure. That effort didn't provide yet
conclusive results, but one thing that became clear is we can use
the suggested alternative solution with semaphores which Linus hinted
at instead of using the atomic / wait strategy.
I've stress tested this with kmod test 0008:
time /data/linux-next/tools/testing/selftests/kmod/kmod.sh -t 0008
And I get only a *slight* delay. That delay however is small, a few
seconds for a full test loop run that runs 150 times, for about ~30-40
seconds. The small delay is worth the simplfication IMHO.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit ac3b43283923 ("module: replace module_layout with module_memory")
reworked the way to handle memory allocations to make it clearer. But it
lost in translation how we handled kmemleak_ignore() or kmemleak_not_leak()
for different ELF sections.
Fix this and clarify the comments a bit more. Contrary to the old way
of using kmemleak_ignore() for init.* ELF sections we stick now only to
kmemleak_not_leak() as per suggestion by Catalin Marinas so to avoid
any false positives and simplify the code.
Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
Reported-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Song Liu <song@kernel.org>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The L0 symbol is generated when build module on LoongArch, ignore it in
modpost and when looking at module symbols, otherwise we can not see the
expected call trace.
Now is_arm_mapping_symbol() is not only for ARM, in order to reflect the
reality, rename is_arm_mapping_symbol() to is_mapping_symbol().
This is related with commit c17a2538704f ("mksysmap: Fix the mismatch of
'L0' symbols in System.map").
(1) Simple test case
[loongson@linux hello]$ cat hello.c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>
static void test_func(void)
{
pr_info("This is a test\n");
dump_stack();
}
static int __init hello_init(void)
{
pr_warn("Hello, world\n");
test_func();
return 0;
}
static void __exit hello_exit(void)
{
pr_warn("Goodbye\n");
}
module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
[loongson@linux hello]$ cat Makefile
obj-m:=hello.o
ccflags-y += -g -Og
all:
make -C /lib/modules/$(shell uname -r)/build/ M=$(PWD) modules
clean:
make -C /lib/modules/$(shell uname -r)/build/ M=$(PWD) clean
(2) Test environment
system: LoongArch CLFS 5.5
https://github.com/sunhaiyong1978/CLFS-for-LoongArch/releases/tag/5.0
It needs to update grub to avoid booting error "invalid magic number".
kernel: 6.3-rc1 with loongson3_defconfig + CONFIG_DYNAMIC_FTRACE=y
(3) Test result
Without this patch:
[root@linux hello]# insmod hello.ko
[root@linux hello]# dmesg
...
Hello, world
This is a test
...
Call Trace:
[<9000000000223728>] show_stack+0x68/0x18c
[<90000000013374cc>] dump_stack_lvl+0x60/0x88
[<ffff800002050028>] L0\x01+0x20/0x2c [hello]
[<ffff800002058028>] L0\x01+0x20/0x30 [hello]
[<900000000022097c>] do_one_initcall+0x88/0x288
[<90000000002df890>] do_init_module+0x54/0x200
[<90000000002e1e18>] __do_sys_finit_module+0xc4/0x114
[<90000000013382e8>] do_syscall+0x7c/0x94
[<9000000000221e3c>] handle_syscall+0xbc/0x158
With this patch:
[root@linux hello]# insmod hello.ko
[root@linux hello]# dmesg
...
Hello, world
This is a test
...
Call Trace:
[<9000000000223728>] show_stack+0x68/0x18c
[<90000000013374cc>] dump_stack_lvl+0x60/0x88
[<ffff800002050028>] test_func+0x28/0x34 [hello]
[<ffff800002058028>] hello_init+0x28/0x38 [hello]
[<900000000022097c>] do_one_initcall+0x88/0x288
[<90000000002df890>] do_init_module+0x54/0x200
[<90000000002e1e18>] __do_sys_finit_module+0xc4/0x114
[<90000000013382e8>] do_syscall+0x7c/0x94
[<9000000000221e3c>] handle_syscall+0xbc/0x158
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Youling Tang <tangyouling@loongson.cn> # for LoongArch
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In order to avoid duplicated code, move is_arm_mapping_symbol() to
include/linux/module_symbol.h, then remove is_arm_mapping_symbol()
in the other places.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After commit 2e3a10a1551d ("ARM: avoid ARM binutils leaking ELF local
symbols") and commit d6b732666a1b ("modpost: fix undefined behavior of
is_arm_mapping_symbol()"), many differences of is_arm_mapping_symbol()
exist in kernel/module/kallsyms.c and scripts/mod/modpost.c, just sync
the code to keep consistent.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
already_uses() is unnecessarily chatty.
`modprobe i915` yields 491 messages like:
[ 64.108744] i915 uses drm!
This is a normal situation, and isn't worth all the log entries.
NOTE: I've preserved the "does not use %s" messages, which happens
less often, but does happen. Its not clear to me what it tells a
reader, or what info might improve the pr_debug's utility.
[ 6847.584999] main:already_uses:569: amdgpu does not use ttm!
[ 6847.585001] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585014] main:already_uses:569: amdgpu does not use drm!
[ 6847.585016] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585024] main:already_uses:569: amdgpu does not use drm_display_helper!
[ 6847.585025] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585084] main:already_uses:569: amdgpu does not use drm_kms_helper!
[ 6847.585086] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585175] main:already_uses:569: amdgpu does not use drm_buddy!
[ 6847.585176] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585202] main:already_uses:569: amdgpu does not use i2c_algo_bit!
[ 6847.585204] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585249] main:already_uses:569: amdgpu does not use gpu_sched!
[ 6847.585250] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585314] main:already_uses:569: amdgpu does not use video!
[ 6847.585315] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585409] main:already_uses:569: amdgpu does not use iommu_v2!
[ 6847.585410] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6847.585816] main:already_uses:569: amdgpu does not use drm_ttm_helper!
[ 6847.585818] main:add_module_usage:584: Allocating new usage for amdgpu.
[ 6848.762268] dyndbg: add-module: amdgpu.2533 sites
no functional changes.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
move_module() pr_debug's "Final section addresses for $modname".
Add section addresses to the message, for anyone looking at these.
no functional changes.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The pr_debug("Absolute symbol" ..) reports value, (which is usually
0), but not the name, which is more informative. So add it.
no functional changes
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
layout_sections() and move_module() each issue ~50 messages for each
module loaded. Add mod-name into their 2 header lines, to help the
reader find his module.
no functional changes.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The kernel/kmod.c is already only built if we enabled modules, so
just stuff it under kernel/module/kmod.c and unify the MAINTAINERS
file for it.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The setup_load_info() was actually had ELF validation checks of its
own. To later cache useful variables as an secondary step just means
looping again over the ELF sections we just validated. We can simply
keep tabs of the key sections of interest as we validate the module
ELF section in one swoop, so do that and merge the two routines
together.
Expand a bit on the documentation / intent / goals.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The symbol and strings section validation currently happen in
setup_load_info() but since they are also doing validity checks
move this to elf_validity_check().
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The integrity of the struct module we load is important, and although
our ELF validator already checks that the module section must match
struct module, add a stop-gap check before we memcpy() the final minted
module. This also makes those inspecting the code what the goal is.
While at it, clarify the goal behind updating the sh_addr address.
The current comment is pretty misleading.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ELF ".gnu.linkonce.this_module" section is special, it is what we
use to construct the struct module __this_module, which THIS_MODULE
points to. When userspace loads a module we always deal first with a
copy of the userspace buffer, and twiddle with the userspace copy's
version of the struct module. Eventually we allocate memory to do a
memcpy() of that struct module, under the assumption that the module
size is right. But we have no validity checks against the size or
the requirements for the section.
Add some validity checks for the special module section early and while
at it, cache the module section index early, so we don't have to do that
later.
While at it, just move over the assigment of the info->mod to make the
code clearer. The validity checker also adds an explicit size check to
ensure the module section size matches the kernel's run time size for
sizeof(struct module). This should prevent sloppy loads of modules
which are built today *without* actually increasing the size of
the struct module. A developer today can for example expand the size
of struct module, rebuild a directoroy 'make fs/xfs/' for example and
then try to insmode the driver there. That module would in effect have
an incorrect size. This new size check would put a stop gap against such
mistakes.
This also makes the entire goal of ".gnu.linkonce.this_module" pretty
clear. Before this patch verification of the goal / intent required some
Indian Jones whips, torches and cleaning up big old spider webs.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
check_export_symbol_versions()
This makes the routine easier to understand what the check its checking for.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Converge on a compromise: so long as we have a module hit our linked
list of modules we taint. That is, the module was about to become live.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|