| Commit message (Collapse) | Author | Files | Lines |
|
Don't reissue requests from io_iopoll_reap_events(), the task may not
have mm, which ends up with NULL. It's better to kill everything off on
exit anyway.
[ 677.734670] RIP: 0010:io_iopoll_complete+0x27e/0x630
...
[ 677.734679] Call Trace:
[ 677.734695] ? __send_signal+0x1f2/0x420
[ 677.734698] ? _raw_spin_unlock_irqrestore+0x24/0x40
[ 677.734699] ? send_signal+0xf5/0x140
[ 677.734700] io_iopoll_getevents+0x12f/0x1a0
[ 677.734702] io_iopoll_reap_events.part.0+0x5e/0xa0
[ 677.734703] io_ring_ctx_wait_and_kill+0x132/0x1c0
[ 677.734704] io_uring_release+0x20/0x30
[ 677.734706] __fput+0xcd/0x230
[ 677.734707] ____fput+0xe/0x10
[ 677.734709] task_work_run+0x67/0xa0
[ 677.734710] do_exit+0x35d/0xb70
[ 677.734712] do_group_exit+0x43/0xa0
[ 677.734713] get_signal+0x140/0x900
[ 677.734715] do_signal+0x37/0x780
[ 677.734717] ? enqueue_hrtimer+0x41/0xb0
[ 677.734718] ? recalibrate_cpu_khz+0x10/0x10
[ 677.734720] ? ktime_get+0x3e/0xa0
[ 677.734721] ? lapic_next_deadline+0x26/0x30
[ 677.734723] ? tick_program_event+0x4d/0x90
[ 677.734724] ? __hrtimer_get_next_event+0x4d/0x80
[ 677.734726] __prepare_exit_to_usermode+0x126/0x1c0
[ 677.734741] prepare_exit_to_usermode+0x9/0x40
[ 677.734742] idtentry_exit_cond_rcu+0x4c/0x60
[ 677.734743] sysvec_reschedule_ipi+0x92/0x160
[ 677.734744] ? asm_sysvec_reschedule_ipi+0xa/0x20
[ 677.734745] asm_sysvec_reschedule_ipi+0x12/0x20
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_do_iopoll() won't do anything with a request unless
req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
it, otherwise io_do_iopoll() will poll a file again and again even
though the request of interest was completed long time ago.
Also, remove -EAGAIN check from io_issue_sqe() as it races with
the changed lines. The request will take the long way and be
resubmitted from io_iopoll*().
io_kiocb's result and iopoll_completed")
Fixes: bbde017a32b3 ("io_uring: add memory barrier to synchronize
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When the user consumes and generates sqe at a fast rate,
io_sqring_entries can always get sqe, and ret will not be equal to -EBUSY,
so that io_sq_thread will never call cond_resched or schedule, and then
we will get the following system error prompt:
rcu: INFO: rcu_sched self-detected stall on CPU
or
watchdog: BUG: soft lockup-CPU#23 stuck for 112s! [io_uring-sq:1863]
This patch checks whether need to call cond_resched() by checking
the need_resched() function every cycle.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.
I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
This reverts commit e0b250b57dcf403529081e5898a9de717f96b76b,
which broke build systems that need to install files to a certain
path, but do not set INSTALL_MOD_PATH when invoking 'make install'.
$ make INSTALL_PATH=/tmp/destdir install
mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied
Makefile:1342: recipe for target '_builtin_inst_' failed
make: *** [_builtin_inst_] Error 1
While modules.builtin is useful also for CONFIG_MODULES=n, this change
in the behavior is quite unexpected. Maybe "make modules_install"
can install modules.builtin irrespective of CONFIG_MODULES as Jonas
originally suggested.
Anyway, that commit should be reverted ASAP.
Reported-by: Douglas Anderson <dianders@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
|
|
Use the correct the function name in the documentation for
"pcs_parse_one_pinctrl_entry()".
"smux_parse_one_pinctrl_entry()" appears to be an artifact from the
development of a prior patch series ("simple pinmux driver") which
transformed into pinctrl-single.
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Link: https://lore.kernel.org/r/20200612112758.GA3407886@x1
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The patch adds missing qpic data pins to qpic pingroup. These pins are
necessary for the qpic nand to work.
Fixes: ef1ea54eab0e ("pinctrl: qcom: Add ipq6018 pinctrl driver")
Signed-off-by: Sivaprakash Murugesan <sivaprak@codeaurora.org>
Link: https://lore.kernel.org/r/1592541089-17700-1-git-send-email-sivaprak@codeaurora.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
leak in case of error in 'imx_pinctrl_probe()'"
This reverts commit ba403242615c2c99e27af7984b1650771a2cc2c9.
After commit 26d8cde5260b ("pinctrl: freescale: imx: add shared
input select reg support"). i.MX7D has two iomux controllers
iomuxc and iomuxc-lpsr which share select_input register for
daisy chain settings.
If use 'devm_of_iomap()', when probe the iomuxc-lpsr, will call
devm_request_mem_region() for the region <0x30330000-0x3033ffff>
for the first time. Then, next time when probe the iomuxc, API
devm_platform_ioremap_resource() will also use the API
devm_request_mem_region() for the share region <0x30330000-0x3033ffff>
again, then cause issue, log like below:
[ 0.179561] imx7d-pinctrl 302c0000.iomuxc-lpsr: initialized IMX pinctrl driver
[ 0.191742] imx7d-pinctrl 30330000.pinctrl: can't request region for resource [mem 0x30330000-0x3033ffff]
[ 0.191842] imx7d-pinctrl: probe of 30330000.pinctrl failed with error -16
Fixes: ba403242615c ("pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/1591673223-1680-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The fileserver probe timer, net->fs_probe_timer, isn't cancelled when
the kafs module is being removed and so the count it holds on
net->servers_outstanding doesn't get dropped..
This causes rmmod to wait forever. The hung process shows a stack like:
afs_purge_servers+0x1b5/0x23c [kafs]
afs_net_exit+0x44/0x6e [kafs]
ops_exit_list+0x72/0x93
unregister_pernet_operations+0x14c/0x1ba
unregister_pernet_subsys+0x1d/0x2a
afs_exit+0x29/0x6f [kafs]
__do_sys_delete_module.isra.0+0x1a2/0x24b
do_syscall_64+0x51/0x95
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by:
(1) Attempting to cancel the probe timer and, if successful, drop the
count that the timer was holding.
(2) Make the timer function just drop the count and not schedule the
prober if the afs portion of net namespace is being destroyed.
Also, whilst we're at it, make the following changes:
(3) Initialise net->servers_outstanding to 1 and decrement it before
waiting on it so that it doesn't generate wake up events by being
decremented to 0 until we're cleaning up.
(4) Switch the atomic_dec() on ->servers_outstanding for ->fs_timer in
afs_purge_servers() to use the helper function for that.
Fixes: f6cbb368bcb0 ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix afs_do_lookup()'s fallback case for when FS.InlineBulkStatus isn't
supported by the server.
In the fallback, it calls FS.FetchStatus for the specific vnode it's
meant to be looking up. Commit b6489a49f7b7 broke this by renaming one
of the two identically-named afs_fetch_status_operation descriptors to
something else so that one of them could be made non-static. The site
that used the renamed one, however, wasn't renamed and didn't produce
any warning because the other was declared in a header.
Fix this by making afs_do_lookup() use the renamed variant.
Note that there are two variants of the success method because one is
called from ->lookup() where we may or may not have an inode, but can't
call iget until after we've talked to the server - whereas the other is
called from within iget where we have an inode, but it may or may not be
initialised.
The latter variant expects there to be an inode, but because it's being
called from there former case, there might not be - resulting in an oops
like the following:
BUG: kernel NULL pointer dereference, address: 00000000000000b0
...
RIP: 0010:afs_fetch_status_success+0x27/0x7e
...
Call Trace:
afs_wait_for_operation+0xda/0x234
afs_do_lookup+0x2fe/0x3c1
afs_lookup+0x3c5/0x4bd
__lookup_slow+0xcd/0x10f
walk_component+0xa2/0x10c
path_lookupat.isra.0+0x80/0x110
filename_lookup+0x81/0x104
vfs_statx+0x76/0x109
__do_sys_newlstat+0x39/0x6b
do_syscall_64+0x4c/0x78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: b6489a49f7b7 ("afs: Fix silly rename")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
READ_ONCE() now enforces atomic read, which leads to:
CC mm/gup.o
In file included from ./include/linux/kernel.h:11:0,
from mm/gup.c:2:
In function 'gup_hugepte.constprop',
inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
compiletime_assert_rwonce_type(x); \
^
mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
pte = READ_ONCE(*ptep);
^
In function 'gup_get_pte',
inlined from 'gup_pte_range' at mm/gup.c:2228:9,
inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
inlined from 'gup_pud_range' at mm/gup.c:2641:15,
inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
compiletime_assert_rwonce_type(x); \
^
mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
return READ_ONCE(*ptep);
^
make[2]: *** [mm/gup.o] Error 1
Define ptep_get() on 8xx when using 16k pages.
Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/341688399c1b102756046d19ea6ce39db1ae4742.1592225558.git.christophe.leroy@csgroup.eu
|
|
Since commit 9e343b467c70 ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses") it is not possible anymore to
use READ_ONCE() to access complex page table entries like the one
defined for powerpc 8xx with 16k size pages.
Define a ptep_get() helper that architectures can override instead
of performing a READ_ONCE() on the page table entry pointer.
Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu
|
|
gup_hugepte() reads hugepage table entries, it can't read
them directly, huge_ptep_get() must be used.
Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ffc3714334c3bfaca6f13788ad039e8759ae413f.1592225558.git.christophe.leroy@csgroup.eu
|
|
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
[wsa: kept sorting]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Fix spelling mistake in the comments with help of `codespell`.
seperate ==> separate
Signed-off-by: Keyur Patel <iamkeyur96@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Just like all other I2C/SMBus commands, the start signal for the SMBus
Quick Command is S, not A.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Daniel Schaefer <git@danielschaefer.me>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
All in-tree users have been converted to the new i2c_new_client_device
function, so remove this deprecated one.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Move away from the deprecated API and advertise the new one.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Move away from the deprecated API and return the shiny new ERRPTR where
useful.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Move away from the deprecated API and return the shiny new ERRPTR where
useful.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
i2c_new_client() is deprecated, use the replacement
i2c_new_client_device(). Also, we have a helper to check if a driver is
bound. Use it to simplify the code. Note that this changes the errno for
a failed device creation from ENOMEM to ENODEV. No callers currently
interpret this errno, though, so we use this condensed error check.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
module_put() balances try_module_get(), not request_module(). Fix the
error path to match that.
Fixes: 2066facca4c7 ("drm/kms: slave encoder interface.")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
As per walk_page_range documentation, mmap lock should be acquired by the
caller before invoking walk_page_range. mmap_assert_locked gets triggered
without that. The details can be found here.
http://lists.infradead.org/pipermail/linux-riscv/2020-June/010335.html
Fixes: 395a21ff859c(riscv: add ARCH_HAS_SET_DIRECT_MAP support)
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
As per the table 4.4 of version "20190608-Priv-MSU-Ratified" of the
RISC-V instruction set manual[0], the PTE permission bit combination of
"write+exec only" is reserved for future use. Hence, don't allow such
mapping request in mmap call.
An issue is been reported by David Abdurachmanov, that while running
stress-ng with "sysbadaddr" argument, RCU stalls are observed on RISC-V
specific kernel.
This issue arises when the stress-sysbadaddr request for pages with
"write+exec only" permission bits and then passes the address obtain
from this mmap call to various system call. For the riscv kernel, the
mmap call should fail for this particular combination of permission bits
since it's not valid.
[0]: http://dabbelt.com/~palmer/keep/riscv-isa-manual/riscv-privileged-20190608-1.pdf
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Reported-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
[Palmer: Refer to the latest ISA specification at the only link I could
find, and update the terminology.]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
Now that we've renamed probe_kernel_address() to get_kernel_nofault()
and made it look and behave more in line with get_user(), some of the
subtle type behavior differences end up being more obvious and possibly
dangerous.
When you do
get_user(val, user_ptr);
the type of the access comes from the "user_ptr" part, and the above
basically acts as
val = *user_ptr;
by design (except, of course, for the fact that the actual dereference
is done with a user access).
Note how in the above case, the type of the end result comes from the
pointer argument, and then the value is cast to the type of 'val' as
part of the assignment.
So the type of the pointer is ultimately the more important type both
for the access itself.
But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
differently. When you do
get_kernel_nofault(val, kernel_ptr);
it behaves like
val = *(typeof(val) *)kernel_ptr;
except, of course, for the fact that the actual dereference is done with
exception handling so that a faulting access is suppressed and returned
as the error code.
But note how different the casting behavior of the two superficially
similar accesses are: one does the actual access in the size of the type
the pointer points to, while the other does the access in the size of
the target, and ignores the pointer type entirely.
Actually changing get_kernel_nofault() to act like get_user() is almost
certainly the right thing to do eventually, but in the meantime this
patch adds logit to at least verify that the pointer type is compatible
with the type of the result.
In many cases, this involves just casting the pointer to 'void *' to
make it obvious that the type of the pointer is not the important part.
It's not how 'get_user()' acts, but at least the behavioral difference
is now obvious and explicit.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Better describe what this helper does, and match the naming of
copy_from_kernel_nofault.
Also switch the argument order around, so that it acts and looks
like get_user().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, address spaces in warnings are displayed as '<asn:X>' with
'X' being the address space's arbitrary number.
But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
define the address spaces using an identifier instead of a number. This
identifier is then directly used in the warnings.
So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
the corresponding address spaces. The default address space, __kernel,
being not displayed in warnings, stays defined as '0'.
With this change, warnings that used to be displayed as:
cast removes address space '<asn:1>' of expression
... void [noderef] <asn:2> *
will now be displayed as:
cast removes address space '__user' of expression
... void [noderef] __iomem *
This also moves the __kernel annotation to be the first one, since it is
quite different from the others because it's the default one, and so:
- it's never displayed
- it's normally not needed, nor in type annotations, nor in cast
between address spaces. The only time it's needed is when it's
combined with a typeof to express "the same type as this one but
without the address space"
- it can't be defined with a name, '0' must be used.
So, it seemed strange to me to have it in the middle of the other
ones.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
kill_bdev does not have any external user, so make it static.
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When a filesystem is mounted on a loop device and on a loop ioctl
LOOP_SET_STATUS64, because of kill_bdev, buffer_head mappings are getting
destroyed.
kill_bdev
truncate_inode_pages
truncate_inode_pages_range
do_invalidatepage
block_invalidatepage
discard_buffer -->clear BH_Mapped flag
sb_bread
__bread_gfp
bh = __getblk_gfp
-->discard_buffer clear BH_Mapped flag
__bread_slow
submit_bh
submit_bh_wbc
BUG_ON(!buffer_mapped(bh)) --> hit this BUG_ON
Fixes: 5db470e229e2 ("loop: drop caches if offset or block_size are changed")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit 130f4caf145c ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.
Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causes a circular dependency.
Instead of using big hammer like async_synchronize_full(), use async
cookie to make sure port probe are synced, without affecting other
scheduled PM callbacks.
Fixes: 130f4caf145c ("libata: Ensure ata_port probe has completed before detach")
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: John Garry <john.garry@huawei.com>
BugLink: https://bugs.launchpad.net/bugs/1867983
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is a specific API to treat raw data as UUID, i.e. import_uuid().
Use it instead of uuid_copy() with explicit casting.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In io_read() or io_write(), when io request is submitted successfully,
it'll go through the below sequence:
kfree(iovec);
req->flags &= ~REQ_F_NEED_CLEANUP;
return ret;
But clearing REQ_F_NEED_CLEANUP might be unsafe. The io request may
already have been completed, and then io_complete_rw_iopoll()
and io_complete_rw() will be called, both of which will also modify
req->flags if needed. This causes a race condition, with concurrent
non-atomic modification of req->flags.
To eliminate this race, in io_read() or io_write(), if io request is
submitted successfully, we don't remove REQ_F_NEED_CLEANUP flag. If
REQ_F_NEED_CLEANUP is set, we'll leave __io_req_aux_free() to the
iovec cleanup work correspondingly.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
required libraries
When build perf with ASan or UBSan, if libasan or libubsan can not find,
the feature-glibc is 0 and there exists the following error log which is
wrong, because we can find gnu/libc-version.h in /usr/include,
glibc-devel is also installed.
[yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
BUILD: Doing 'make -j4' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
<stdin>:1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
<stdin>:1:0: warning: -fsanitize=address not supported for this target
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
Makefile.perf:224: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2
[yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
/usr/include/gnu/libc-version.h
After install libasan and libubsan, the feature-glibc is 1 and the build
process is success, so the cause is related with libasan or libubsan, we
should check them and print an error log to reflect the reality.
Committer testing:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:401: *** No libasan found, please install libasan. Stop.
make[1]: *** [Makefile.perf:231: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$
$ sudo dnf install libasan
<SNIP>
Installed:
libasan-9.3.1-2.fc31.x86_64
$
$
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
<SNIP>
CC /tmp/build/perf/util/pmu-flex.o
FLEX /tmp/build/perf/util/expr-flex.c
CC /tmp/build/perf/util/expr-bison.o
CC /tmp/build/perf/util/expr.o
CC /tmp/build/perf/util/expr-flex.o
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/parse-events.o
LD /tmp/build/perf/util/intel-pt-decoder/perf-in.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
INSTALL python-scripts
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
libasan.so.5 => /lib64/libasan.so.5 (0x00007f0904164000)
$
And if we rebuild without -fsanitize-address:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
GEN /tmp/build/perf/common-cmds.h
CC /tmp/build/perf/exec-cmd.o
<SNIP>
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
$
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tiezhu yang <yangtiezhu@loongson.cn>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1592445961-28044-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In order to move pointer checks like IS_ERR_VALUE() out of the hotpath
and into the reader path of a trace event, user space tools need to be
able to parse that. IS_ERR_VALUE() is defined as:
#define IS_ERR_VALUE() unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
Which eventually turns into:
__builtin_expect(!!((unsigned long)(void *)(x) >= (unsigned long)-4095), 0)
Now the traceevent parser can handle most of that except for the
__builtin_expect(), which needs to be added.
Link: https://lore.kernel.org/linux-mm/20200320055823.27089-3-jaewon31.kim@samsung.com/
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324200956.821799393@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit c61f13eaa1ee1 ("gcc-plugins: Add structleak for more stack
initialization") added "__attribute__((user))" to the user when
stackleak detector is enabled. This now appears in the field format of
system call trace events for system calls that have user buffers. The
"__attribute__((user))" breaks the parsing in libtraceevent. That needs
to be handled.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324200956.663647256@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's several locations that open code realloc and strcat() to append
text to strings. Add an append() function that takes a delimiter and a
string to append to another string.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Lim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lore.kernel.org/lkml/20200324200956.515118403@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Unprivileged memory accesses generated by the so-called "translated"
instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
unexpectedly if kernel debugging is enabled. In such cases, the
hw_breakpoint logic will invoke the user overflow handler which will
typically raise a SIGTRAP back to the current task. This is futile when
returning back to the kernel because (a) the signal won't have been
delivered and (b) userspace can't handle the thing anyway.
Avoid invoking the user overflow handler for watchpoints triggered by
kernel uaccess routines, and instead single-step over the faulting
instruction as we would if no overflow handler had been installed.
(Fixes tag identifies the introduction of unprivileged memory accesses,
which exposed this latent bug in the hw_breakpoint code)
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Fixes: 57f4959bad0a ("arm64: kernel: Add support for User Access Override")
Reported-by: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle and, audited and
fixed manually.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200617213407.GA1385@embeddedor
Signed-off-by: Will Deacon <will@kernel.org>
|
|
hugetlb_cma_reserve() is called at the wrong place. numa_init has not been
done yet. so all reserved memory will be located at node0.
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200617215828.25296-1-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There is an issue when tune the number for read and write queues,
if the total queue count was not changed. The hctx->type cannot
be updated, since __blk_mq_update_nr_hw_queues will return directly
if the total queue count has not been changed.
Reproduce:
dmesg | grep "default/read/poll"
[ 2.607459] nvme nvme0: 48/0/0 default/read/poll queues
cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c
48 default
tune the write queues to 24:
echo 24 > /sys/module/nvme/parameters/write_queues
echo 1 > /sys/block/nvme0n1/device/reset_controller
dmesg | grep "default/read/poll"
[ 433.547235] nvme nvme0: 24/24/0 default/read/poll queues
cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c
48 default
The driver's hardware queue mapping is not same as block layer.
Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add rename the gpu busy percentage for consistency and
add the mem busy percentage documentation.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Vega10 and previous asics use one interface, vega20 and newer
use another.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The existing code used the major version number of the DRM driver
instead of the device major number of the DRM subsystem for
validating access for a devices cgroup.
This meant that accesses allowed by the devices cgroup weren't
permitted and certain accesses denied by the devices cgroup were
permitted (if they matched the wrong major device number).
Signed-off-by: Lorenz Brun <lorenz@brun.one>
Fixes: 6b855f7b83d2f ("drm/amdkfd: Check against device cgroup")
Reviewed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
clang static analysis reports an undefined return
security/selinux/ss/conditional.c:79:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
return s[0];
^~~~~~~~~~~
static int cond_evaluate_expr( ...
{
u32 i;
int s[COND_EXPR_MAXDEPTH];
for (i = 0; i < expr->len; i++)
...
return s[0];
When expr->len is 0, the loop which sets s[0] never runs.
So return -1 if the loop never runs.
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
During build compiler reports some 'false positive' warnings about
variables {'seq_ops', 'filtered_pids', 'other_pids'} may be used
uninitialized. This patch silences these warnings.
Also delete some useless spaces
Link: https://lkml.kernel.org/r/20200529141214.37648-1-pilgrimtao@gmail.com
Signed-off-by: Kaitao Cheng <pilgrimtao@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
It is possible that a platform that is capable of 'namespace labels'
comes up without the labels properly initialized. In this case, the
region's 'align' attribute is hidden. Howerver, once the user does
initialize he labels, the 'align' attribute still stays hidden, which is
unexpected.
The sysfs_update_group() API is meant to address this, and could be
called during region probe, but it has entanglements with the device
'lockdep_mutex'. Therefore, simply make the 'align' attribute always
visible. It doesn't matter what it says for label-less namespaces, since
it is not possible to change their allocation anyway.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20200520225026.29426-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
If we're doing polled IO and end up having requests being submitted
async, then completions can come in while we're waiting for refs to
drop. We need to reap these manually, as nobody else will be looking
for them.
Break the wait into 1/20th of a second time waits, and check for done
poll completions if we time out. Otherwise we can have done poll
completions sitting in ctx->poll_list, which needs us to reap them but
we're just waiting for them.
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If both the tracer and the tracee are compat processes, and gprs[2]
is assigned a value by __poke_user_compat, then the higher 32 bits
of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and
syscall_get_error() always returns 0.
Fix the implementation by sign-extending the value for compat processes
the same way as x86 implementation does.
The bug was exposed to user space by commit 201766a20e30f ("ptrace: add
PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite.
This change fixes strace syscall tampering on s390.
Link: https://lkml.kernel.org/r/20200602180051.GA2427@altlinux.org
Fixes: 753c4dd6a2fa2 ("[S390] ptrace changes")
Cc: Elvira Khabirova <lineprinter@altlinux.org>
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
The way we produce SBALs to the device (first update q->nr_buf_used,
then update the SLSB) should ensure that we never see some of the
SLSB states when scanning the queue for progress.
So make some noise if we do, this implies a bug in our SBAL tracking.
Also tweak the WARN msg to provide more information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|