| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to determine whether a caller holds privilege over a given
inode the capability framework exposes the two helpers
privileged_wrt_inode_uidgid() and capable_wrt_inode_uidgid(). The former
verifies that the inode has a mapping in the caller's user namespace and
the latter additionally verifies that the caller has the requested
capability in their current user namespace.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped inodes. If the initial user namespace is passed all
operations are a nop so non-idmapped mounts will not see a change in
behavior.
Link: https://lore.kernel.org/r/20210121131959.646623-5-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add two simple helpers to check permissions on a file and path
respectively and convert over some callers. It simplifies quite a few
codepaths and also reduces the churn in later patches quite a bit.
Christoph also correctly points out that this makes codepaths (e.g.
ioctls) way easier to follow that would otherwise have to do more
complex argument passing than necessary.
Link: https://lore.kernel.org/r/20210121131959.646623-4-christian.brauner@ubuntu.com
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Blacklist properly on all archs.
The code to blacklist notrace functions for kprobes was not using the
right kconfig option, which caused some archs (powerpc) to possibly
not blacklist them"
* tag 'trace-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Do the notrace functions check without kprobes on ftrace
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Enable the notrace function check on the architecture which doesn't
support kprobes on ftrace but support dynamic ftrace. This notrace
function check is not only for the kprobes on ftrace but also
sw-breakpoint based kprobes.
Thus there is no reason to limit this check for the arch which
supports kprobes on ftrace.
This also changes the dependency of Kconfig. Because kprobe event
uses the function tracer's address list for identifying notrace
function, if the CONFIG_DYNAMIC_FTRACE=n, it can not check whether
the target function is notrace or not.
Link: https://lkml.kernel.org/r/20210105065730.2634785-1-naveen.n.rao@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/161007957862.114704.4512260007555399463.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: 45408c4f92506 ("tracing: kprobes: Prohibit probing on notrace function")
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.11-rc3. Nothing major,
just resolving some reported issues:
- cleanup some remaining mentions of the ION drivers that were
removed in 5.11-rc1
- comedi driver bugfix
- two error path memory leak fixes
All have been in linux-next for a while with no reported issues"
* tag 'staging-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ION: remove some references to CONFIG_ION
staging: mt7621-dma: Fix a resource leak in an error handling path
Staging: comedi: Return -EFAULT if copy_to_user() fails
staging: spmi: hisi-spmi-controller: Fix some error handling paths
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With commit e722a295cf49 ("staging: ion: remove from the tree"), ION and
its corresponding config CONFIG_ION is gone. Remove stale references
from drivers/staging/media/atomisp/pci and from the recommended Android
kernel config.
Fixes: e722a295cf49 ("staging: ion: remove from the tree")
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Rob Herring <robh@kernel.org>
Cc: linux-media@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Matthias Maennich <maennich@google.com>
Link: https://lore.kernel.org/r/20210106155201.2845319-1-maennich@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Alexei Starovoitov says:
====================
pull-request: bpf 2021-01-07
We've added 4 non-merge commits during the last 10 day(s) which contain
a total of 4 files changed, 14 insertions(+), 7 deletions(-).
The main changes are:
1) Fix task_iter bug caused by the merge conflict resolution, from Yonghong.
2) Fix resolve_btfids for multiple type hierarchies, from Jiri.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpftool: Fix compilation failure for net.o with older glibc
tools/resolve_btfids: Warn when having multiple IDs for single type
bpf: Fix a task_iter bug caused by a merge conflict resolution
selftests/bpf: Fix a compile error for BPF_F_BPRM_SECUREEXEC
====================
Link: https://lore.kernel.org/r/20210107221555.64959-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Latest bpf tree has a bug for bpf_iter selftest:
$ ./test_progs -n 4/25
test_bpf_sk_storage_get:PASS:bpf_iter_bpf_sk_storage_helpers__open_and_load 0 nsec
test_bpf_sk_storage_get:PASS:socket 0 nsec
...
do_dummy_read:PASS:read 0 nsec
test_bpf_sk_storage_get:FAIL:bpf_map_lookup_elem map value wasn't set correctly
(expected 1792, got -1, err=0)
#4/25 bpf_sk_storage_get:FAIL
#4 bpf_iter:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 2 FAILED
When doing merge conflict resolution, Commit 4bfc4714849d missed to
save curr_task to seq_file private data. The task pointer in seq_file
private data is passed to bpf program. This caused NULL-pointer task
passed to bpf program which will immediately return upon checking
whether task pointer is NULL.
This patch added back the assignment of curr_task to seq_file private
data and fixed the issue.
Fixes: 4bfc4714849d ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201231052418.577024-1-yhs@fb.com
|
|\| | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from netfilter, wireless and bpf
trees.
Current release - regressions:
- mt76: fix NULL pointer dereference in mt76u_status_worker and
mt76s_process_tx_queue
- net: ipa: fix interconnect enable bug
Current release - always broken:
- netfilter: fixes possible oops in mtype_resize in ipset
- ath11k: fix number of coding issues found by static analysis tools
and spurious error messages
Previous releases - regressions:
- e1000e: re-enable s0ix power saving flows for systems with the
Intel i219-LM Ethernet controllers to fix power use regression
- virtio_net: fix recursive call to cpus_read_lock() to avoid a
deadlock
- ipv4: ignore ECN bits for fib lookups in fib_compute_spec_dst()
- sysfs: take the rtnl lock around XPS configuration
- xsk: fix memory leak for failed bind and rollback reservation at
NETDEV_TX_BUSY
- r8169: work around power-saving bug on some chip versions
Previous releases - always broken:
- dcb: validate netlink message in DCB handler
- tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
to prevent unnecessary retries
- vhost_net: fix ubuf refcount when sendmsg fails
- bpf: save correct stopping point in file seq iteration
- ncsi: use real net-device for response handler
- neighbor: fix div by zero caused by a data race (TOCTOU)
- bareudp: fix use of incorrect min_headroom size and a false
positive lockdep splat from the TX lock
- mvpp2:
- clear force link UP during port init procedure in case
bootloader had set it
- add TCAM entry to drop flow control pause frames
- fix PPPoE with ipv6 packet parsing
- fix GoP Networking Complex Control config of port 3
- fix pkt coalescing IRQ-threshold configuration
- xsk: fix race in SKB mode transmit with shared cq
- ionic: account for vlan tag len in rx buffer len
- stmmac: ignore the second clock input, current clock framework does
not handle exclusive clock use well, other drivers may reconfigure
the second clock
Misc:
- ppp: change PPPIOCUNBRIDGECHAN ioctl request number to follow
existing scheme"
* tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
net: lapb: Decrease the refcount of "struct lapb_cb" in lapb_device_event
r8169: work around power-saving bug on some chip versions
net: usb: qmi_wwan: add Quectel EM160R-GL
selftests: mlxsw: Set headroom size of correct port
net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag
ibmvnic: fix: NULL pointer dereference.
docs: networking: packet_mmap: fix old config reference
docs: networking: packet_mmap: fix formatting for C macros
vhost_net: fix ubuf refcount incorrectly when sendmsg fails
bareudp: Fix use of incorrect min_headroom size
bareudp: set NETIF_F_LLTX flag
net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
atlantic: remove architecture depends
erspan: fix version 1 check in gre_parse_header()
net: hns: fix return value check in __lb_other_process()
net: sched: prevent invalid Scell_log shift count
net: neighbor: fix a crash caused by mod zero
ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
...
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Daniel Borkmann says:
====================
pull-request: bpf 2020-12-28
The following pull-request contains BPF updates for your *net* tree.
There is a small merge conflict between bpf tree commit 69ca310f3416
("bpf: Save correct stopping point in file seq iteration") and net tree
commit 66ed594409a1 ("bpf/task_iter: In task_file_seq_get_next use
task_lookup_next_fd_rcu"). The get_files_struct() does not exist anymore
in net, so take the hunk in HEAD and add the `info->tid = curr_tid` to
the error path:
[...]
curr_task = task_seq_get_next(ns, &curr_tid, true);
if (!curr_task) {
info->task = NULL;
info->tid = curr_tid;
return NULL;
}
/* set info->task and info->tid */
[...]
We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 11 files changed, 75 insertions(+), 20 deletions(-).
The main changes are:
1) Various AF_XDP fixes such as fill/completion ring leak on failed bind and
fixing a race in skb mode's backpressure mechanism, from Magnus Karlsson.
2) Fix latency spikes on lockdep enabled kernels by adding a rescheduling
point to BPF hashtab initialization, from Eric Dumazet.
3) Fix a splat in task iterator by saving the correct stopping point in the
seq file iteration, from Jonathan Lemon.
4) Fix BPF maps selftest by adding retries in case hashtab returns EBUSY
errors on update/deletes, from Andrii Nakryiko.
5) Fix BPF selftest error reporting to something more user friendly if the
vmlinux BTF cannot be found, from Kamal Mostafa.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Instead of directly comparing task->tgid and task->pid, use the
thread_group_leader() helper. This helps with readability, and
there should be no functional change.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-3-jonathan.lemon@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
On some systems, some variant of the following splat is
repeatedly seen. The common factor in all traces seems
to be the entry point to task_file_seq_next(). With the
patch, all warnings go away.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: \x0926-....: (20992 ticks this GP) idle=d7e/1/0x4000000000000002 softirq=81556231/81556231 fqs=4876
\x09(t=21033 jiffies g=159148529 q=223125)
NMI backtrace for cpu 26
CPU: 26 PID: 2015853 Comm: bpftool Kdump: loaded Not tainted 5.6.13-0_fbk4_3876_gd8d1f9bf80bb #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A12 10/08/2018
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x13/0x50
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b4/0x3aa
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:get_pid_task+0x38/0x80
Code: 89 f6 48 8d 44 f7 08 48 8b 00 48 85 c0 74 2b 48 83 c6 55 48 c1 e6 04 48 29 f0 74 19 48 8d 78 20 ba 01 00 00 00 f0 0f c1 50 20 <85> d2 74 27 78 11 83 c2 01 78 0c 48 83 c4 08 c3 31 c0 48 83 c4 08
RSP: 0018:ffffc9000d293dc8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: ffff888637c05600 RBX: ffffc9000d293e0c RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000550 RDI: ffff888637c05620
RBP: ffffffff8284eb80 R08: ffff88831341d300 R09: ffff88822ffd8248
R10: ffff88822ffd82d0 R11: 00000000003a93c0 R12: 0000000000000001
R13: 00000000ffffffff R14: ffff88831341d300 R15: 0000000000000000
? find_ge_pid+0x1b/0x20
task_seq_get_next+0x52/0xc0
task_file_seq_get_next+0x159/0x220
task_file_seq_next+0x4f/0xa0
bpf_seq_read+0x159/0x390
vfs_read+0x8a/0x140
ksys_read+0x59/0xd0
do_syscall_64+0x42/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f95ae73e76e
Code: Bad RIP value.
RSP: 002b:00007ffc02c1dbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000170faa0 RCX: 00007f95ae73e76e
RDX: 0000000000001000 RSI: 00007ffc02c1dc30 RDI: 0000000000000007
RBP: 00007ffc02c1ec70 R08: 0000000000000005 R09: 0000000000000006
R10: fffffffffffff20b R11: 0000000000000246 R12: 00000000019112a0
R13: 0000000000000000 R14: 0000000000000007 R15: 00000000004283c0
If unable to obtain the file structure for the current task,
proceed to the next task number after the one returned from
task_seq_get_next(), instead of the next task number from the
original iterator.
Also, save the stopping task number from task_seq_get_next()
on failure in case of restarts.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-2-jonathan.lemon@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We noticed that with a LOCKDEP enabled kernel,
allocating a hash table with 65536 buckets would
use more than 60ms.
htab_init_buckets() runs from process context,
it is safe to schedule to avoid latency spikes.
Fixes: c50eb518e262 ("bpf: Use separate lockdep class for each hashtab")
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201221192506.707584-1-eric.dumazet@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Remove including <linux/version.h> that don't need it.
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1608086835-54523-1-git-send-email-tiantao6@hisilicon.com
|
|\ \ \ \ \
| |_|_|_|/
|/| | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This is a fix for a regression in the v5.10 merge window, but it was
reported quite late in the v5.10 process, plus generating and testing
the fix took some time.
The regression is due to commit 36dadef23fcc ("kprobes: Init kprobes
in early_initcall") which on powerpc can use RCU Tasks before
initialization, resulting in boot failures.
The fix is straightforward, simply moving initialization of RCU Tasks
before the early_initcall()s. The fix has been exposed to -next and
kbuild test robot testing, and has been tested by the PowerPC guys"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu-tasks: Move RCU-tasks initialization to before early_initcall()
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
PowerPC testing encountered boot failures due to RCU Tasks not being
fully initialized until core_initcall() time. This commit therefore
initializes RCU Tasks (along with Rude RCU and RCU Tasks Trace) just
before early_initcall() time, thus allowing waiting on RCU Tasks grace
periods from early_initcall() handlers.
Link: https://lore.kernel.org/rcu/87eekfh80a.fsf@dja-thinkpad.axtens.net/
Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
Tested-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Pull io_uring fixes from Jens Axboe:
"A few fixes that should go into 5.11, all marked for stable as well:
- Fix issue around identity COW'ing and users that share a ring
across processes
- Fix a hang associated with unregistering fixed files (Pavel)
- Move the 'process is exiting' cancelation a bit earlier, so
task_works aren't affected by it (Pavel)"
* tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
kernel/io_uring: cancel io_uring before task works
io_uring: fix io_sqe_files_unregister() hangs
io_uring: add a helper for setting a ref node
io_uring: don't assume mm is constant across submits
|
| | |_|_|/
| |/| | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
For cancelling io_uring requests it needs either to be able to run
currently enqueued task_works or having it shut down by that moment.
Otherwise io_uring_cancel_files() may be waiting for requests that won't
ever complete.
Go with the first way and do cancellations before setting PF_EXITING and
so before putting the task_work infrastructure into a transition state
where task_work_run() would better not be called.
Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Pull workqueue update from Tejun Heo:
"The same as the cgroup tree - one commit which was scheduled for the
5.11 merge window.
All the commit does is avoding spurious worker wakeups from workqueue
allocation / config change path to help cpuisol use cases"
* 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Kick a worker based on the actual activation of delayed works
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
In realtime scenario, We do not want to have interference on the
isolated cpu cores. but when invoking alloc_workqueue() for percpu wq
on the housekeeping cpu, it kick a kworker on the isolated cpu.
alloc_workqueue
pwq_adjust_max_active
wake_up_worker
The comment in pwq_adjust_max_active() said:
"Need to kick a worker after thawed or an unbound wq's
max_active is bumped"
So it is unnecessary to kick a kworker for percpu's wq when invoking
alloc_workqueue(). this patch only kick a worker based on the actual
activation of delayed works.
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|\ \ \ \ \ \
| |_|/ / / /
|/| | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"These three patches were scheduled for the merge window but I forgot
to send them out. Sorry about that.
None of them are significant and they fit well in a fix pull request
too - two are cosmetic and one fixes a memory leak in the mount option
parsing path"
* 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Fix memory leak when parsing multiple source parameters
cgroup/cgroup.c: replace 'of->kn->priv' with of_cft()
kernel: cgroup: Mundane spelling fixes throughout the file
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
A memory leak is found in cgroup1_parse_param() when multiple source
parameters overwrite fc->source in the fs_context struct without free.
unreferenced object 0xffff888100d930e0 (size 16):
comm "mount", pid 520, jiffies 4303326831 (age 152.783s)
hex dump (first 16 bytes):
74 65 73 74 6c 65 61 6b 00 00 00 00 00 00 00 00 testleak........
backtrace:
[<000000003e5023ec>] kmemdup_nul+0x2d/0xa0
[<00000000377dbdaa>] vfs_parse_fs_string+0xc0/0x150
[<00000000cb2b4882>] generic_parse_monolithic+0x15a/0x1d0
[<000000000f750198>] path_mount+0xee1/0x1820
[<0000000004756de2>] do_mount+0xea/0x100
[<0000000094cafb0a>] __x64_sys_mount+0x14b/0x1f0
Fix this bug by permitting a single source parameter and rejecting with
an error all subsequent ones.
Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
we have supplied the inline function: of_cft() in cgroup.h.
So replace the direct use 'of->kn->priv' with inline func
of_cft(), which is more readable.
Signed-off-by: Hui Su <sh_def@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
| |/ / / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Few spelling fixes throughout the file.
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Misc fixes/updates:
- Fix static keys usage in module __init sections
- Add separate MAINTAINERS entry for static branches/calls
- Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing"
* tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
softirq: Avoid bad tracing / lockdep interaction
jump_label/static_call: Add MAINTAINERS
jump_label: Fix usage in module __init
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Similar to commit:
1a63dcd8765b ("softirq: Reorder trace_softirqs_on to prevent lockdep splat")
__local_bh_enable_ip() can also call into tracing with inconsistent
state. Unlike that commit we don't need to bother about the tracepoint
because 'cnt-1' never matches preempt_count() (by construction).
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lkml.kernel.org/r/20201218154519.GW3092@hirez.programming.kicks-ass.net
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When the static_key is part of the module, and the module calls
static_key_inc/enable() from it's __init section *AND* has a
static_branch_*() user in that very same __init section, things go
wobbly.
If the static_key lives outside the module, jump_label_add_module()
would append this module's sites to the key and jump_label_update()
would take the static_key_linked() branch and all would be fine.
If all the sites are outside of __init, then everything will be fine
too.
However, when all is aligned just as described above,
jump_label_update() calls __jump_label_update(.init = false) and we'll
not update sites in __init text.
Fixes: 19483677684b ("jump_label: Annotate entries that operate on __init code earlier")
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Jessica Yu <jeyu@kernel.org>
Link: https://lkml.kernel.org/r/20201216135435.GV3092@hirez.programming.kicks-ass.net
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"Update/fix two CPU sanity checks in the hotplug and the boot code, and
fix a typo in the Kconfig help text.
[ Context: the first two commits are the result of an ongoing
annotation+review work of (intentional) tick_do_timer_cpu() data
races reported by KCSAN, but the annotations aren't fully cooked
yet ]"
* tag 'timers-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"
tick/sched: Remove bogus boot "safety" check
tick: Remove pointless cpu valid check in hotplug code
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
There is a spelling mistake in the Kconfig help text. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20201217171705.57586-1-colin.king@canonical.com
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
can_stop_idle_tick() checks whether the do_timer() duty has been taken over
by a CPU on boot. That's silly because the boot CPU always takes over with
the initial clockevent device.
But even if no CPU would have installed a clockevent and taken over the
duty then the question whether the tick on the current CPU can be stopped
or not is moot. In that case the current CPU would have no clockevent
either, so there would be nothing to keep ticking.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
tick_handover_do_timer() which is invoked when a CPU is unplugged has a
check for cpumask_first(cpu_online_mask) when it tries to hand over the
tick update duty.
Checking the result of cpumask_first() there is pointless because if the
online mask is empty at this point, then this would be the last CPU in the
system going offline, which is impossible. There is always at least one CPU
remaining. If online mask would be really empty then the timer duty would
be the least of the resulting problems.
Remove the well meant check simply because it is pointless and confusing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201206212002.582579516@linutronix.de
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a context switch performance regression"
* tag 'sched-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Optimize finish_lock_switch()
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The kernel test robot measured a -1.6% performance regression on
will-it-scale/sched_yield due to commit:
2558aacff858 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
Even though we were careful to replace a single load with another
single load from the same cacheline.
Restore finish_lock_switch() to the exact state before the offending
patch and solve the problem differently.
Fixes: 2558aacff858 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201210161408.GX3021@hirez.programming.kicks-ass.net
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Commit 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()") removed
the export of irq_to_desc() unless powerpc KVM is being built, because
there is still a use of irq_to_desc() in modular code there.
However it used:
#ifdef CONFIG_KVM_BOOK3S_64_HV
Which doesn't work when that symbol is =m, leading to a build failure:
ERROR: modpost: "irq_to_desc" [arch/powerpc/kvm/kvm-hv.ko] undefined!
Fix it by checking for the definedness of the correct symbol which is
CONFIG_KVM_BOOK3S_64_HV_MODULE.
Fixes: 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"This is the second attempt after the first one failed miserably and
got zapped to unblock the rest of the interrupt related patches.
A treewide cleanup of interrupt descriptor (ab)use with all sorts of
racy accesses, inefficient and disfunctional code. The goal is to
remove the export of irq_to_desc() to prevent these things from
creeping up again"
* tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
genirq: Restrict export of irq_to_desc()
xen/events: Implement irq distribution
xen/events: Reduce irq_info:: Spurious_cnt storage size
xen/events: Only force affinity mask for percpu interrupts
xen/events: Use immediate affinity setting
xen/events: Remove disfunct affinity spreading
xen/events: Remove unused bind_evtchn_to_irq_lateeoi()
net/mlx5: Use effective interrupt affinity
net/mlx5: Replace irq_to_desc() abuse
net/mlx4: Use effective interrupt affinity
net/mlx4: Replace irq_to_desc() abuse
PCI: mobiveil: Use irq_data_get_irq_chip_data()
PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
NTB/msi: Use irq_has_action()
mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc
pinctrl: nomadik: Use irq_has_action()
drm/i915/pmu: Replace open coded kstat_irqs() copy
drm/i915/lpe_audio: Remove pointless irq_to_desc() usage
s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()
parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts()
...
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
No more (ab)use in drivers finally. There is still the modular build of
PPC/KVM which needs it, so restrict it to this case which still makes it
unavailable for most drivers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194045.551428291@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Most users of kstat_irqs_cpu() have the irq descriptor already. No point in
calling into the core code and looking it up once more.
Use it in per_cpu_count_show() to start with.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194043.362094758@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
No more users outside the core code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194043.268774449@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Both the per cpu stats and the accumulated count are accessed lockless and
can be concurrently modified. That's intentional and the stats are a rough
estimate anyway. Annotate them with data_race().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194043.067097663@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
irq_set_lockdep_class() is used from modules and requires irq_to_desc() to
be exported. Move it into the core code which lifts another requirement for
the export.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194042.860029489@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
These checks are used by modules and prevent the removal of the export of
irq_to_desc(). Move the accessor into the core.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194042.703779349@linutronix.de
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This function uses irq_to_desc() and is going to be used by modules to
replace the open coded irq_to_desc() (ab)usage. The final goal is to remove
the export of irq_to_desc() so driver cannot fiddle with it anymore.
Move it into the core code and fixup the usage sites to include the proper
header.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194042.548936472@linutronix.de
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the CPPC cpufreq driver and intel_pstate (which involves
updating the cpufreq core and the schedutil governor) and make
janitorial changes in the ACPI code handling processor objects.
Specifics:
- Rework the passive-mode "fast switch" path in the intel_pstate
driver to allow it receive the minimum (required) and target
(desired) performance information from the schedutil governor so as
to avoid running some workloads too fast (Rafael Wysocki).
- Make the intel_pstate driver allow the policy max limit to be
increased after the guaranteed performance value for the given CPU
has increased (Rafael Wysocki).
- Clean up the handling of CPU coordination types in the CPPC cpufreq
driver and make it export frequency domains information to user
space via sysfs (Ionela Voinescu).
- Fix the ACPI code handling processor objects to use a correct
coordination type when it fails to map frequency domains and drop a
redundant CPU map initialization from it (Ionela Voinescu, Punit
Agrawal)"
* tag 'pm-5.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Use most recent guaranteed performance values
cpufreq: intel_pstate: Implement the ->adjust_perf() callback
cpufreq: Add special-purpose fast-switching callback for drivers
cpufreq: schedutil: Add util to struct sg_cpu
cppc_cpufreq: replace per-cpu data array with a list
cppc_cpufreq: expose information on frequency domains
cppc_cpufreq: clarify support for coordination types
cppc_cpufreq: use policy->cpu as driver of frequency setting
ACPI: processor: fix NONE coordination for domain mapping failure
|
| |\ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
* pm-cpufreq:
cpufreq: intel_pstate: Use most recent guaranteed performance values
cpufreq: intel_pstate: Implement the ->adjust_perf() callback
cpufreq: Add special-purpose fast-switching callback for drivers
cpufreq: schedutil: Add util to struct sg_cpu
cppc_cpufreq: replace per-cpu data array with a list
cppc_cpufreq: expose information on frequency domains
cppc_cpufreq: clarify support for coordination types
cppc_cpufreq: use policy->cpu as driver of frequency setting
ACPI: processor: fix NONE coordination for domain mapping failure
ACPI: processor: Drop duplicate setting of shared_cpu_map
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
First off, some cpufreq drivers (eg. intel_pstate) can pass hints
beyond the current target frequency to the hardware and there are no
provisions for doing that in the cpufreq framework. In particular,
today the driver has to assume that it should not allow the frequency
to fall below the one requested by the governor (or the required
capacity may not be provided) which may not be the case and which may
lead to excessive energy usage in some scenarios.
Second, the hints passed by these drivers to the hardware need not be
in terms of the frequency, so representing the utilization numbers
coming from the scheduler as frequency before passing them to those
drivers is not really useful.
Address the two points above by adding a special-purpose replacement
for the ->fast_switch callback, called ->adjust_perf, allowing the
governor to pass abstract performance level (rather than frequency)
values for the minimum (required) and target (desired) performance
along with the CPU capacity to compare them to.
Also update the schedutil governor to use the new callback instead
of ->fast_switch if present and if the utilization mertics are
frequency-invariant (that is requisite for the direct mapping
between the utilization and the CPU performance levels to be a
reasonable approximation).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
Instead of passing util and max between functions while computing the
utilization and capacity, store the former in struct sg_cpu (along
with the latter and bw_dl).
This will allow the current utilization value to be compared with the
one obtained previously (which is requisite for some code changes to
follow this one), but also it causes the code to look slightly more
consistent and cleaner.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|\ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Merge KASAN updates from Andrew Morton.
This adds a new hardware tag-based mode to KASAN. The new mode is
similar to the existing software tag-based KASAN, but relies on arm64
Memory Tagging Extension (MTE) to perform memory and pointer tagging
(instead of shadow memory and compiler instrumentation).
By Andrey Konovalov and Vincenzo Frascino.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (60 commits)
kasan: update documentation
kasan, mm: allow cache merging with no metadata
kasan: sanitize objects when metadata doesn't fit
kasan: clarify comment in __kasan_kfree_large
kasan: simplify assign_tag and set_tag calls
kasan: don't round_up too much
kasan, mm: rename kasan_poison_kfree
kasan, mm: check kasan_enabled in annotations
kasan: add and integrate kasan boot parameters
kasan: inline (un)poison_range and check_invalid_free
kasan: open-code kasan_unpoison_slab
kasan: inline random_tag for HW_TAGS
kasan: inline kasan_reset_tag for tag-based modes
kasan: remove __kasan_unpoison_stack
kasan: allow VMAP_STACK for HW_TAGS mode
kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK
kasan: introduce set_alloc_info
kasan: rename get_alloc/free_info
kasan: simplify quarantine_put call site
kselftest/arm64: check GCR_EL1 after context switch
...
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
This is a preparatory commit for the upcoming addition of a new hardware
tag-based (MTE-based) KASAN mode.
The new mode won't be using shadow memory. Rename external annotation
kasan_unpoison_shadow() to kasan_unpoison_range(), and introduce internal
functions (un)poison_range() (without kasan_ prefix).
Co-developed-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/fccdcaa13dc6b2211bf363d6c6d499279a54fe3a.1606161801.git.andreyknvl@google.com
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \ \ \ \ \ \ \
| |/ / / / / / / / / / /
|/| | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Pull dma-mapping updates from Christoph Hellwig:
- support for a partial IOMMU bypass (Alexey Kardashevskiy)
- add a DMA API benchmark (Barry Song)
- misc fixes (Tiezhu Yang, tangjianqiang)
* tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mapping:
selftests/dma: add test application for DMA_MAP_BENCHMARK
dma-mapping: add benchmark support for streaming DMA APIs
dma-contiguous: fix a typo error in a comment
dma-pool: no need to check return value of debugfs_create functions
powerpc/dma: Fallback to dma_ops when persistent memory present
dma-mapping: Allow mixing bypass and mapped DMA operation
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Nowadays, there are increasing requirements to benchmark the performance
of dma_map and dma_unmap particually while the device is attached to an
IOMMU.
This patch enables the support. Users can run specified number of threads
to do dma_map_page and dma_unmap_page on a specific NUMA node with the
specified duration. Then dma_map_benchmark will calculate the average
latency for map and unmap.
A difficulity for this benchmark is that dma_map/unmap APIs must run on
a particular device. Each device might have different backend of IOMMU or
non-IOMMU.
So we use the driver_override to bind dma_map_benchmark to a particual
device by:
For platform devices:
echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
echo xxx > /sys/bus/platform/drivers/xxx/unbind
echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
For PCI devices:
echo dma_map_benchmark > /sys/bus/pci/devices/0000:00:01.0/driver_override
echo 0000:00:01.0 > /sys/bus/pci/drivers/xxx/unbind
echo 0000:00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
Cc: Will Deacon <will@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
[hch: folded in two fixes from Colin Ian King <colin.king@canonical.com>]
Signed-off-by: Christoph Hellwig <hch@lst.de>
|