| Commit message (Collapse) | Author | Files | Lines |
|
On success, mmap should return the begin address of newly mapped area,
but patch "mm: mmap: merge vma after call_mmap() if possible" set
vm_start of newly merged vma to return value addr. Users of mmap will
get wrong address if vma is merged after call_mmap(). We fix this by
moving the assignment to addr before merging vma.
We have a driver which changes vm_flags, and this bug is found by our
testcases.
Fixes: d70cec898324 ("mm: mmap: merge vma after call_mmap() if possible")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hongxiang Lou <louhongxiang@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20201203085350.22624-1-liuzixian4@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Adrian Moreno was ruuning a kubernetes 1.19 + containerd/docker workload
using hugetlbfs. In this environment the issue is reproduced by:
- Start a simple pod that uses the recently added HugePages medium
feature (pod yaml attached)
- Start a DPDK app. It doesn't need to run successfully (as in transfer
packets) nor interact with real hardware. It seems just initializing
the EAL layer (which handles hugepage reservation and locking) is
enough to trigger the issue
- Delete the Pod (or let it "Complete").
This would result in a kworker thread going into a tight loop (top output):
1425 root 20 0 0 0 0 R 99.7 0.0 5:22.45 kworker/28:7+cgroup_destroy
'perf top -g' reports:
- 63.28% 0.01% [kernel] [k] worker_thread
- 49.97% worker_thread
- 52.64% process_one_work
- 62.08% css_killed_work_fn
- hugetlb_cgroup_css_offline
41.52% _raw_spin_lock
- 2.82% _cond_resched
rcu_all_qs
2.66% PageHuge
- 0.57% schedule
- 0.57% __schedule
We are spinning in the do-while loop in hugetlb_cgroup_css_offline.
Worse yet, we are holding the master cgroup lock (cgroup_mutex) while
infinitely spinning. Little else can be done on the system as the
cgroup_mutex can not be acquired.
Do note that the issue can be reproduced by simply offlining a hugetlb
cgroup containing pages with reservation counts.
The loop in hugetlb_cgroup_css_offline is moving page counts from the
cgroup being offlined to the parent cgroup. This is done for each
hstate, and is repeated until hugetlb_cgroup_have_usage returns false.
The routine moving counts (hugetlb_cgroup_move_parent) is only moving
'usage' counts. The routine hugetlb_cgroup_have_usage is checking for
both 'usage' and 'reservation' counts. Discussion about what to do with
reservation counts when reparenting was discussed here:
https://lore.kernel.org/linux-kselftest/CAHS8izMFAYTgxym-Hzb_JmkTK1N_S9tGN71uS6MFV+R7swYu5A@mail.gmail.com/
The decision was made to leave a zombie cgroup for with reservation
counts. Unfortunately, the code checking reservation counts was
incorrectly added to hugetlb_cgroup_have_usage.
To fix the issue, simply remove the check for reservation counts. While
fixing this issue, a related bug in hugetlb_cgroup_css_offline was
noticed. The hstate index is not reinitialized each time through the
do-while loop. Fix this as well.
Fixes: 1adc4d419aa2 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations")
Reported-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201203220242.158165-1-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Link: https://lkml.kernel.org/r/1604661895-5495-1-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The error handling in hugetlb_allocate_area() was incorrect for the
hugetlb_shared test case.
Previously the behavior was:
- mmap a hugetlb area
- If this fails, set the pointer to NULL, and carry on
- mmap an alias of the same hugetlb fd
- If this fails, munmap the original area
If the original mmap failed, it's likely the second one did too. If
both failed, we'd blindly try to munmap a NULL pointer, causing a
SIGSEGV. Instead, "goto fail" so we return before trying to mmap the
alias.
This issue can be hit "in real life" by forgetting to set
/proc/sys/vm/nr_hugepages (leaving it at 0), and then trying to run the
hugetlb_shared test.
Another small improvement is, when the original mmap fails, don't just
print "it failed": perror(), so we can see *why*. :)
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Alan Gilbert <dgilbert@redhat.com>
Link: https://lkml.kernel.org/r/20201204203443.2714693-1-axelrasmussen@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Only x86 and PowerPC implement the pkey-xxx.h, and an error was reported
when compiling protection_keys.c.
Add a Arch judgment to compile "protection_keys" in the Makefile.
If other arch implement this, add the arch name to the Makefile.
eg:
ifneq (,$(findstring $(ARCH),powerpc mips ... ))
Following build errors:
pkey-helpers.h:93:2: error: #error Architecture not supported
#error Architecture not supported
pkey-helpers.h:96:20: error: `PKEY_DISABLE_ACCESS' undeclared
#define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)
^
protection_keys.c:218:45: error: `PKEY_DISABLE_WRITE' undeclared
pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
^
Signed-off-by: Xingxing Su <suxingxing@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Link: https://lkml.kernel.org/r/1606826876-30656-1-git-send-email-suxingxing@loongson.cn
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This fixes attribution for the commits (among others)
- d4097456cd1d ("video/framebuffer: move the probe func into
.devinit.text in Blackfin LCD driver")
- 0312e024d6cd ("mfd: mc13xxx: Add support for mc34708")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201127213358.3440830-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We can't call kvfree() with a spin lock held, so defer it. Fixes a
might_sleep() runtime warning.
Fixes: 873d7bcfd066 ("mm/swapfile.c: use kvzalloc for swap_info_struct allocation")
Signed-off-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201202151549.10350-1-qcai@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While I was doing zram testing, I found sometimes decompression failed
since the compression buffer was corrupted. With investigation, I found
below commit calls cond_resched unconditionally so it could make a
problem in atomic context if the task is reschedule.
BUG: sleeping function called from invalid context at mm/vmalloc.c:108
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
3 locks held by memhog/946:
#0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
#1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
#2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
Call Trace:
unmap_kernel_range_noflush+0x2eb/0x350
unmap_kernel_range+0x14/0x30
zs_unmap_object+0xd5/0xe0
zram_bvec_rw.isra.0+0x38c/0x8e0
zram_rw_page+0x90/0x101
bdev_write_page+0x92/0xe0
__swap_writepage+0x94/0x4a0
pageout+0xe3/0x3a0
shrink_page_list+0xb94/0xd60
shrink_inactive_list+0x158/0x460
We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which
contains the offending calling code) from zsmalloc.
Even though this option showed some amount improvement(e.g., 30%) in
some arm32 platforms, it has been headache to maintain since it have
abused APIs[1](e.g., unmap_kernel_range in atomic context).
Since we are approaching to deprecate 32bit machines and already made
the config option available for only builtin build since v5.8, lastly it
has been not default option in zsmalloc, it's time to drop the option
for better maintenance.
[1] http://lore.kernel.org/linux-mm/20201105170249.387069-1-minchan@kernel.org
Fixes: e47110e90584 ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Harish Sriram <harish@linux.ibm.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201117202916.GA3856507@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When investigating a slab cache bloat problem, significant amount of
negative dentry cache was seen, but confusingly they neither got shrunk
by reclaimer (the host has very tight memory) nor be shrunk by dropping
cache. The vmcore shows there are over 14M negative dentry objects on
lru, but tracing result shows they were even not scanned at all.
Further investigation shows the memcg's vfs shrinker_map bit is not set.
So the reclaimer or dropping cache just skip calling vfs shrinker. So
we have to reboot the hosts to get the memory back.
I didn't manage to come up with a reproducer in test environment, and
the problem can't be reproduced after rebooting. But it seems there is
race between shrinker map bit clear and reparenting by code inspection.
The hypothesis is elaborated as below.
The memcg hierarchy on our production environment looks like:
root
/ \
system user
The main workloads are running under user slice's children, and it
creates and removes memcg frequently. So reparenting happens very often
under user slice, but no task is under user slice directly.
So with the frequent reparenting and tight memory pressure, the below
hypothetical race condition may happen:
CPU A CPU B
reparent
dst->nr_items == 0
shrinker:
total_objects == 0
add src->nr_items to dst
set_bit
return SHRINK_EMPTY
clear_bit
child memcg offline
replace child's kmemcg_id with
parent's (in memcg_offline_kmem())
list_lru_del() between shrinker runs
see parent's kmemcg_id
dec dst->nr_items
reparent again
dst->nr_items may go negative
due to concurrent list_lru_del()
The second run of shrinker:
read nr_items without any
synchronization, so it may
see intermediate negative
nr_items then total_objects
may return 0 coincidently
keep the bit cleared
dst->nr_items != 0
skip set_bit
add scr->nr_item to dst
After this point dst->nr_item may never go zero, so reparenting will not
set shrinker_map bit anymore. And since there is no task under user
slice directly, so no new object will be added to its lru to set the
shrinker map bit either. That bit is kept cleared forever.
How does list_lru_del() race with reparenting? It is because reparenting
replaces children's kmemcg_id to parent's without protecting from
nlru->lock, so list_lru_del() may see parent's kmemcg_id but actually
deleting items from child's lru, but dec'ing parent's nr_items, so the
parent's nr_items may go negative as commit 2788cf0c401c ("memcg:
reparent list_lrus and free kmemcg_id on css offline") says.
Since it is impossible that dst->nr_items goes negative and
src->nr_items goes zero at the same time, so it seems we could set the
shrinker map bit iff src->nr_items != 0. We could synchronize
list_lru_count_one() and reparenting with nlru->lock, but it seems
checking src->nr_items in reparenting is the simplest and avoids lock
contention.
Fixes: fae91d6d8be5 ("mm/list_lru.c: set bit in memcg shrinker bitmap on first list_lru item appearance")
Suggested-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org> [4.19]
Link: https://lkml.kernel.org/r/20201202171749.264354-1-shy828301@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches
for all allocations") introduced a regression into the handling of the
obj_cgroup_charge() return value. If a non-zero value is returned
(indicating of exceeding one of memory.max limits), the allocation
should fail, instead of falling back to non-accounted mode.
To make the code more readable, move memcg_slab_pre_alloc_hook() and
memcg_slab_post_alloc_hook() calling conditions into bodies of these
hooks.
Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations")
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201127161828.GD840171@carbon.dhcp.thefacebook.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
'format_corename()' will splite 'core_pattern' on spaces when it is in
pipe mode, and take helper_argv[0] as the path to usermode executable.
It works fine in most cases.
However, if there is a space between '|' and '/file/path', such as
'| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will
be parsed as '', and users will get a 'Core dump to | disabled'.
It is not friendly to users, as the pattern above was valid previously.
Fix this by ignoring the spaces between '|' and '/file/path'.
Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Wise <pabs3@bonedaddy.net>
Cc: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398]
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/5fb62870.1c69fb81.8ef5d.af76@mx.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix build errors when ZLIB_INFLATE=m and ZLIB_DEFLATE=m and ZLIB_DFLTCC=y
by exporting the 2 needed symbols in dfltcc_inflate.c.
Fixes these build errors:
ERROR: modpost: "dfltcc_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined!
ERROR: modpost: "dfltcc_can_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined!
Fixes: 126196100063 ("lib/zlib: add s390 hardware support for kernel zlib_inflate")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lkml.kernel.org/r/20201123191712.4882-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
"xargs echo" is not a safe way to remove line breaks because the input
may exceed the command line limit and xargs may break it up into
multiple invocations of echo. This should never happen because
scripts/gen_autoksyms.sh expects all undefined symbols are placed in
the second line of .mod files.
One possible way is to replace "xargs echo" with
"sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by
using awk because it is more readable.
This issue was reported by Sami Tolvanen; in his Clang LTO patch set,
$(multi-used-m) is no longer an ELF object, but a thin archive that
contains LLVM bitcode files. llvm-nm prints out symbols for each
archive member separately, which results a lot of dupications, in some
places, beyond the system-defined limit.
This problem must be fixed irrespective of LTO, and we must ensure
zero possibility of having this issue.
Link: https://lkml.org/lkml/2020/12/1/1658
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
|
|
This reverts commit d162219c655c8cf8003128a13840d6c1e183fb80.
The device uses a VIRTIO device ID out of a not-for-production range.
Releasing Linux using an ID out of this range will make it conflict with
development setups. An official request to reserve an ID for an MEI
device is yet to be submitted to the virtio TC, thus there's no chance
it will be reserved and fixed in time before the next release.
Once requested it usually takes 2-3 weeks to land in the spec, which
means the device can be supported with the official ID in the next Linux
version if contributors act quickly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Cc: Wang Yu <yu1.wang@intel.com>
Cc: Liu Shuo <shuo.a.liu@intel.com>
Link: https://lore.kernel.org/r/20201205193625.469773-1-mst@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper
check must be:
insn.prefixes.bytes[i] != 0 and i < 4
instead of using insn.prefixes.nbytes. Use the new
for_each_insn_prefix() macro which does it correctly.
Debugged by Kees Cook <keescook@chromium.org>.
[ bp: Massage commit message. ]
Fixes: 25189d08e516 ("x86/sev-es: Add support for handling IOIO exceptions")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/160697106089.3146288.2052422845039649176.stgit@devnote2
|
|
Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper check must
be
insn.prefixes.bytes[i] != 0 and i < 4
instead of using insn.prefixes.nbytes. Use the new
for_each_insn_prefix() macro which does it correctly.
Debugged by Kees Cook <keescook@chromium.org>.
[ bp: Massage commit message. ]
Fixes: 32d0b95300db ("x86/insn-eval: Add utility functions to get segment selector")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/160697104969.3146288.16329307586428270032.stgit@devnote2
|
|
Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper check must
be
insn.prefixes.bytes[i] != 0 and i < 4
instead of using insn.prefixes.nbytes.
Introduce a for_each_insn_prefix() macro for this purpose. Debugged by
Kees Cook <keescook@chromium.org>.
[ bp: Massage commit message, sync with the respective header in tools/
and drop "we". ]
Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/160697103739.3146288.7437620795200799020.stgit@devnote2
|
|
devm_ioremap and ioremap may return NULL which cannot be checked
by IS_ERR.
Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
If this is not enabled, the interfaces used in this driver do not work:
drivers/i2c/busses/i2c-mlxbf.c:1888:3: error: implicit declaration of function 'i2c_slave_event' [-Werror,-Wimplicit-function-declaration]
i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value);
^
drivers/i2c/busses/i2c-mlxbf.c:1888:26: error: use of undeclared identifier 'I2C_SLAVE_WRITE_REQUESTED'
i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value);
^
drivers/i2c/busses/i2c-mlxbf.c:1890:32: error: use of undeclared identifier 'I2C_SLAVE_WRITE_RECEIVED'
ret = i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED,
^
drivers/i2c/busses/i2c-mlxbf.c:1892:26: error: use of undeclared identifier 'I2C_SLAVE_STOP'
i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
^
Fixes: b5b5b32081cd ("i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
If non-zero 'chunk_sectors' is passed in to blk_max_size_offset() that
override will be incorrectly ignored.
Old blk_max_size_offset() branching, prior to commit 3ee16db390b4,
must be used only if passed 'chunk_sectors' override is zero.
Fixes: 3ee16db390b4 ("dm: fix IO splitting")
Cc: stable@vger.kernel.org # 5.9
Reported-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Fixes sparse warnings:
drivers/md/dm.c:508:12: warning: context imbalance in 'dm_prepare_ioctl' - wrong count at exit
drivers/md/dm.c:543:13: warning: context imbalance in 'dm_unprepare_ioctl' - wrong count at exit
Fixes: 971888c46993f ("dm: hold DM table for duration of ioctl rather than use blkdev_get")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error
path to fix sparse warning:
drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock
Fixes: cdf6cdcd3b99a ("dm,dax: Add dax zero_page_range operation")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account
for target-specific splitting") caused a couple regressions:
1) Using lcm_not_zero() when stacking chunk_sectors was a bug because
chunk_sectors must reflect the most limited of all devices in the
IO stack.
2) DM targets that set max_io_len but that do _not_ provide an
.iterate_devices method no longer had there IO split properly.
And commit 5091cdec56fa ("dm: change max_io_len() to use
blk_max_size_offset()") also caused a regression where DM no longer
supported varied (per target) IO splitting. The implication being the
potential for severely reduced performance for IO stacks that use a DM
target like dm-cache to hide performance limitations of a slower
device (e.g. one that requires 4K IO splitting).
Coming full circle: Fix all these issues by discontinuing stacking
chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(),
add optional chunk_sectors override argument to blk_max_size_offset()
and update DM's max_io_len() to pass ti->max_io_len to its
blk_max_size_offset() call.
Passing in an optional chunk_sectors override to blk_max_size_offset()
allows for code reuse of block's centralized calculation for max IO
size based on provided offset and split boundary.
Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()")
Cc: stable@vger.kernel.org
Reported-by: John Dorminy <jdorminy@redhat.com>
Reported-by: Bruce Johnston <bjohnsto@redhat.com>
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, locking of ->session is very inconsistent; most places
protect it using the legacy tty mutex, but disassociate_ctty(),
__do_SAK(), tiocspgrp() and tiocgsid() don't.
Two of the writers hold the ctrl_lock (because they already need it for
->pgrp), but __proc_set_tty() doesn't do that yet.
On a PREEMPT=y system, an unprivileged user can theoretically abuse
this broken locking to read 4 bytes of freed memory via TIOCGSID if
tiocgsid() is preempted long enough at the right point. (Other things
might also go wrong, especially if root-only ioctls are involved; I'm
not sure about that.)
Change the locking on ->session such that:
- tty_lock() is held by all writers: By making disassociate_ctty()
hold it. This should be fine because the same lock can already be
taken through the call to tty_vhangup_session().
The tricky part is that we need to shorten the area covered by
siglock to be able to take tty_lock() without ugly retry logic; as
far as I can tell, this should be fine, since nothing in the
signal_struct is touched in the `if (tty)` branch.
- ctrl_lock is held by all writers: By changing __proc_set_tty() to
hold the lock a little longer.
- All readers that aren't holding tty_lock() hold ctrl_lock: By
adding locking to tiocgsid() and __do_SAK(), and expanding the area
covered by ctrl_lock in tiocspgrp().
Cc: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
tiocspgrp() takes two tty_struct pointers: One to the tty that userspace
passed to ioctl() (`tty`) and one to the TTY being changed (`real_tty`).
These pointers are different when ioctl() is called with a master fd.
To properly lock real_tty->pgrp, we must take real_tty->ctrl_lock.
This bug makes it possible for racing ioctl(TIOCSPGRP, ...) calls on
both sides of a PTY pair to corrupt the refcount of `struct pid`,
leading to use-after-free errors.
Fixes: 47f86834bbd4 ("redo locking of tty->pgrp")
CC: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The function may be unbound causing the ffs_ep and its descriptors
to be freed while userspace is in the middle of an ioctl requesting
the same descriptors. Avoid dangling pointer reference by first
making a local copy of desctiptors before releasing the spinlock.
Fixes: c559a3534109 ("usb: gadget: f_fs: add ioctl returning ep descriptor")
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Vamsi Krishna Samavedam <vskrishn@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201130203453.28154-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There were a bunch of issues with the patch converting the
OMAP1 OSK board to use descriptors for controlling the USB
host:
- The chip label was incorrect
- The GPIO offset was off-by-one
- The code should use sleeping accessors
This patch tries to fix all issues at the same time.
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Fixes: 15d157e87443 ("usb: ohci-omap: Convert to use GPIO descriptors")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20201130083033.29435-1-linus.walleij@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 0154012f8018bba4d9971d1007c12ffd48539ddb as Hans
reports it causes problems on some systems. Until a "real" fix for this
can be found, revert this change to get normal functionality back.
Link: https://lore.kernel.org/r/70ca74c2-4a80-e25b-eca9-a63a75516673@redhat.com
Cc: Tom Yan <tom.ty89@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 558033c2828f832ab3b68c6f8b8710e0de6faef0 as Hans
reports it causes problems on some systems. Until a "real" fix for this
can be found, revert this change to get normal functionality back.
Link: https://lore.kernel.org/r/70ca74c2-4a80-e25b-eca9-a63a75516673@redhat.com
Cc: Tom Yan <tom.ty89@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 5df7ef7d32fec1d6d1c34dbec019b461a12ce870 as Hans
reports it causes problems on some systems. Until a "real" fix for this
can be found, revert this change to get normal functionality back.
Link: https://lore.kernel.org/r/70ca74c2-4a80-e25b-eca9-a63a75516673@redhat.com
Cc: Tom Yan <tom.ty89@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix memory leak of control-message transfer buffer on successful open().
Fixes: 6774d5f53271 ("USB: serial: kl5kusb105: fix open error path")
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
When mounting with "idsfromsid" mount option, Azure
corrupted the owner SIDs due to excessive padding
caused by placing the owner fields at the end of the
security descriptor on create. Placing owners at the
front of the security descriptor (rather than the end)
is also safer, as the number of ACEs (that follow it)
are variable.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Suggested-by: Rohith Surabattula <rohiths@microsoft.com>
CC: Stable <stable@vger.kernel.org> # v5.8
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In some scenarios (DFS and BAD_NETWORK_NAME) set_root_set() can be
called with a NULL ses->tcon_ipc.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
For an operation compounded with an SMB2 CREATE request, client must set
COMPOUND_FID(0xFFFFFFFFFFFFFFFF) to FileID field of smb2 ioctl.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Fixes: 2e4564b31b645 ("smb3: add support stat of WSL reparse points for special file types")
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
STEs format for Connect-X5 and Connect-X6DX different. Currently, on
Connext-X6DX the SW steering would break at some point when building STEs
w/o giving a proper error message. Fix this by checking the STE format of
the current device when initializing domain: add mlx5_ifc definitions for
Connect-X6DX SW steering, read FW capability to get the current format
version, and check this version when domain is being created.
Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Checksum calculation cannot be done in SW for TX kTLS HW offloaded
packets.
Offload it to the device, disregard the declared state of the TX
csum offload feature.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix build when CONFIG_IPV6 is not enabled by making a function
be built conditionally.
Fixes these build errors and warnings:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c: In function 'accel_fs_tcp_set_ipv6_flow':
../include/net/sock.h:380:34: error: 'struct sock_common' has no member named 'skc_v6_daddr'; did you mean 'skc_daddr'?
380 | #define sk_v6_daddr __sk_common.skc_v6_daddr
| ^~~~~~~~~~~~
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c:55:14: note: in expansion of macro 'sk_v6_daddr'
55 | &sk->sk_v6_daddr, 16);
| ^~~~~~~~~~~
At top level:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c:47:13: warning: 'accel_fs_tcp_set_ipv6_flow' defined but not used [-Wunused-function]
47 | static void accel_fs_tcp_set_ipv6_flow(struct mlx5_flow_spec *spec, struct sock *sk)
Fixes: 5229a96e59ec ("net/mlx5e: Accel, Expose flow steering API for rules add/del")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When command interface is down, driver to reclaim all 4K page chucks that
were hold by the Firmeware. Fix a bug for 64K page size systems, where
driver repeatedly released only the first chunk of the page.
Define helper function to fill 4K chunks for a given Firmware pages.
Iterate over all unreleased Firmware pages and call the hepler per each.
Fixes: 5adff6a08862 ("net/mlx5: Fix incorrect page count when in internal error")
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
when 'act_mpls' is used to mangle the LSE, the current value is read from
the packet dereferencing 4 bytes at mpls_hdr(): ensure that the label is
contained in the skb "linear" area.
Found by code inspection.
v2:
- use MPLS_HLEN instead of sizeof(new_lse), thanks to Jakub Kicinski
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/3243506cba43d14858f3bd21ee0994160e44d64a.1606987058.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
when openvswitch is configured to mangle the LSE, the current value is
read from the packet dereferencing 4 bytes at mpls_hdr(): ensure that
the label is contained in the skb "linear" area.
Found by code inspection.
Fixes: d27cf5c59a12 ("net: core: add MPLS update core helper and use in OvS")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/aa099f245d93218b84b5c056b67b6058ccf81a66.1606987185.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
skb_mpls_dec_ttl() reads the LSE without ensuring that it is contained in
the skb "linear" area. Fix this calling pskb_may_pull() before reading the
current ttl.
Found by code inspection.
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/53659f28be8bc336c113b5254dc637cc76bbae91.1606987074.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix to return negative error code -ENOENT from invalid configuration
error handling case instead of 0, as done elsewhere in this function.
Fixes: 4bb043262878 ("net: mvpp2: phylink support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201203141806.37966-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The "skb" is freed by the transmit code in cxgb4_ofld_send() and we
shouldn't use it again. But in the current code, if we hit an error
later on in the function then the clean up code will call kfree_skb(skb)
and so it causes a double free.
Set the "skb" to NULL and that makes the kfree_skb() a no-op.
Fixes: d25f2f71f653 ("crypto: chtls - Program the TLS session Key")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X8ilb6PtBRLWiSHp@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
STATX_ATTR_MOUNT_ROOT and STATX_ATTR_DAX got merged with the same value,
so one of them needs fixing. Move STATX_ATTR_DAX.
While we're in here, clarify the value-matching scheme for some of the
attributes, and explain why the value for DAX does not match.
Fixes: 80340fe3605c ("statx: add mount_root")
Fixes: 712b2698e4c0 ("fs/stat: Define DAX statx attribute")
Link: https://lore.kernel.org/linux-fsdevel/7027520f-7c79-087e-1d00-743bdefa1a1e@redhat.com/
Link: https://lore.kernel.org/lkml/20201202214629.1563760-1-ira.weiny@intel.com/
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: <stable@vger.kernel.org> # 5.8
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently .get_state() and .apply() use dev_get_drvdata() on the struct
device related to the pwm chip. This only works after .probe() called
platform_set_drvdata() which in this driver happens only after
pwmchip_add() and so comes possibly too late.
Instead of setting the driver data earlier use the traditional
container_of approach as this way the driver data is conceptually and
computational nearer.
Fixes: 9db33d221efc ("pwm: Add support for sl28cpld PWM controller")
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Lilith >_> and Claudio Bozzato of Cisco Talos security team reported
that collect_syscall() improperly casts the syscall registers to 64-bit
values leaking the uninitialized last 24 bytes on 32-bit platforms, that
are visible in /proc/self/syscall.
The cause is that info->data.args are u64 while syscall_get_arguments()
uses longs, as hinted by the bogus pointer cast in the function.
Let's just proceed like the other call places, by retrieving the
registers into an array of longs before assigning them to the caller's
array. This was successfully tested on x86_64, i386 and ppc32.
Reference: CVE-2020-28588, TALOS-2020-1211
Fixes: 631b7abacd02 ("ptrace: Remove maxargs from task_current_syscall()")
Cc: Greg KH <greg@kroah.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (ppc32)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, UV4 is incorrectly identified as UV4A and UV4A as UV5. Hub
chip starts with revision 1, fix it.
[ bp: Massage commit message. ]
Fixes: 647128f1536e ("x86/platform/uv: Update UV MMRs for UV5")
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Link: https://lkml.kernel.org/r/20201203152252.371199-1-mike.travis@hpe.com
|
|
This code does not ensure that the whole buffer is initialized and none
of the callers check for errors so potentially none of the buffer is
initialized. Add a memset to eliminate this bug.
Fixes: e3037485c68e ("rtw88: new Realtek 802.11ac driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/X8ilOfVz3pf0T5ec@mwanda
|
|
Keep the device-id entries sorted to make it easier to add new ones in
the right spot.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
Add PID for CH340 that's found on a ch341 based Programmer made by keeyees.
The specific device that contains the serial converter is described
here: http://www.keeyees.com/a/Products/ej/36.html
The driver works flawlessly as soon as the new PID (0x5512) is added to
it.
Signed-off-by: Jan-Niklas Burfeind <kernel@aiyionpri.me>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
|