summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'alloc-args-v4.19-rc8' of ↵Greg Kroah-Hartman2018-10-1110-19/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Kees writes: "Fix open-coded multiplication arguments to allocators - Fixes several new open-coded multiplications added in the 4.19 merge window." * tag 'alloc-args-v4.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: treewide: Replace more open-coded allocation size multiplications
| * treewide: Replace more open-coded allocation size multiplicationsKees Cook2018-10-0610-19/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | As done treewide earlier, this catches several more open-coded allocation size calculations that were added to the kernel during the merge window. This performs the following mechanical transformations using Coccinelle: kvmalloc(a * b, ...) -> kvmalloc_array(a, b, ...) kvzalloc(a * b, ...) -> kvcalloc(a, b, ...) devm_kzalloc(..., a * b, ...) -> devm_kcalloc(..., a, b, ...) Signed-off-by: Kees Cook <keescook@chromium.org>
* | Merge branch 'x86-urgent-for-linus' of ↵Greg Kroah-Hartman2018-10-114-27/+45
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Ingo writes: "x86 fixes An intel_rdt memory access fix and a VLA fix in pgd_alloc()." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Avoid VLA in pgd_alloc() x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
| * | x86/mm: Avoid VLA in pgd_alloc()Kees Cook2018-10-091-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arnd Bergmann reported that turning on -Wvla found a new (unintended) VLA usage: arch/x86/mm/pgtable.c: In function 'pgd_alloc': include/linux/build_bug.h:29:45: error: ISO C90 forbids variable length array 'u_pmds' [-Werror=vla] arch/x86/mm/pgtable.c:190:34: note: in expansion of macro 'static_cpu_has' #define PREALLOCATED_USER_PMDS (static_cpu_has(X86_FEATURE_PTI) ? \ ^~~~~~~~~~~~~~ arch/x86/mm/pgtable.c:431:16: note: in expansion of macro 'PREALLOCATED_USER_PMDS' pmd_t *u_pmds[PREALLOCATED_USER_PMDS]; ^~~~~~~~~~~~~~~~~~~~~~ Use the actual size of the array that is used for X86_FEATURE_PTI, which is known at build time, instead of the variable size. [ mingo: Squashed original fix with followup fix to avoid bisection breakage, wrote new changelog. ] Reported-by: Arnd Bergmann <arnd@arndb.de> Original-written-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Fixes: 1be3f247c288 ("x86/mm: Avoid VLA in pgd_alloc()") Link: http://lkml.kernel.org/r/20181008235434.GA35035@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/intel_rdt: Fix out-of-bounds memory access in CBM testsReinette Chatre2018-10-093-25/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While the DOC at the beginning of lib/bitmap.c explicitly states that "The number of valid bits in a given bitmap does _not_ need to be an exact multiple of BITS_PER_LONG.", some of the bitmap operations do indeed access BITS_PER_LONG portions of the provided bitmap no matter the size of the provided bitmap. For example, if bitmap_intersects() is provided with an 8 bit bitmap the operation will access BITS_PER_LONG bits from the provided bitmap. While the operation ensures that these extra bits do not affect the result, the memory is still accessed. The capacity bitmasks (CBMs) are typically stored in u32 since they can never exceed 32 bits. A few instances exist where a bitmap_* operation is performed on a CBM by simply pointing the bitmap operation to the stored u32 value. The consequence of this pattern is that some bitmap_* operations will access out-of-bounds memory when interacting with the provided CBM. This is confirmed with a KASAN test that reports: BUG: KASAN: stack-out-of-bounds in __bitmap_intersects+0xa2/0x100 and BUG: KASAN: stack-out-of-bounds in __bitmap_weight+0x58/0x90 Fix this by moving any CBM provided to a bitmap operation needing BITS_PER_LONG to an 'unsigned long' variable. [ tglx: Changed related function arguments to unsigned long and got rid of the _cbm extra step ] Fixes: 72d505056604 ("x86/intel_rdt: Add utilities to test pseudo-locked region possibility") Fixes: 49f7b4efa110 ("x86/intel_rdt: Enable setting of exclusive mode") Fixes: d9b48c86eb38 ("x86/intel_rdt: Display resource groups' allocations' size in bytes") Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/69a428613a53f10e80594679ac726246020ff94f.1538686926.git.reinette.chatre@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'sched-urgent-for-linus' of ↵Greg Kroah-Hartman2018-10-112-14/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Ingo writes: "scheduler fix: Cleanup of dead code left over from the recent sched/numa fixes." * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm, sched/numa: Remove remaining traces of NUMA rate-limiting
| * | | mm, sched/numa: Remove remaining traces of NUMA rate-limitingSrikar Dronamraju2018-10-092-14/+0
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the leftover pglist_data::numabalancing_migrate_lock and its initialization, we stopped using this lock with: efaffc5e40ae ("mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration") [ mingo: Rewrote the changelog. ] Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux-MM <linux-mm@kvack.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1538824999-31230-1-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'perf-urgent-for-linus' of ↵Greg Kroah-Hartman2018-10-114-5/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Ingo, a man of few words, writes: "perf fixes: misc perf tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf record: Use unmapped IP for inline callchain cursors perf python: Use -Wno-redundant-decls to build with PYTHON=python3 perf report: Don't try to map ip to invalid map perf script python: Fix export-to-sqlite.py sample columns perf script python: Fix export-to-postgresql.py occasional failure
| * \ \ Merge tag 'perf-urgent-for-mingo-4.19-20181005' of ↵Ingo Molnar2018-10-054-5/+20
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix the build on Clear Linux, coping with redundant declarations of function prototypes in python3 header files by adding -Wno-redundant-decls to build with PYTHON=python3 (Arnaldo Carvalho de Melo) - Fixes for processing inline frames in backtraces using DWARF based unwinding (Milian Wolff) - Cope with bad DWARF info for function names for inline frames,not trying to demangle this symbol. Problem reported with rust but reproduced as well with C++. Problem reported to the libbpf maintainers (Milian Wolff) - Fix python export to postgresql and sqlite code (Adrian Hunter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | perf record: Use unmapped IP for inline callchain cursorsMilian Wolff2018-10-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only use the mapped IP to find inline frames, but keep using the unmapped IP for the callchain cursor. This ensures we properly show the unmapped IP when displaying a frame we received via the dso__parse_addr_inlines API for a module which does not contain sufficient debug symbols to show the srcline. This is another follow-up to commit 19610184693c ("perf script: Show virtual addresses instead of offsets"). Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets") Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com [ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | perf python: Use -Wno-redundant-decls to build with PYTHON=python3Arnaldo Carvalho de Melo2018-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building in ClearLinux using 'make PYTHON=python3' with gcc 8.2.1 it fails with: GEN /tmp/build/perf/python/perf.so In file included from /usr/include/python3.7m/Python.h:126, from /git/linux/tools/perf/util/python.c:2: /usr/include/python3.7m/import.h:58:24: error: redundant redeclaration of ‘_PyImport_AddModuleObject’ [-Werror=redundant-decls] PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *, PyObject *); ^~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/python3.7m/import.h:47:24: note: previous declaration of ‘_PyImport_AddModuleObject’ was here PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *name, ^~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors error: command 'gcc' failed with exit status 1 And indeed there is a redundant declaration in that Python.h file, one with parameter names and the other without, so just add -Wno-error=redundant-decls to the python setup instructions. Now perf builds with gcc in ClearLinux with the following Dockerfile: # docker.io/acmel/linux-perf-tools-build-clearlinux:latest FROM docker.io/clearlinux:latest MAINTAINER Arnaldo Carvalho de Melo <acme@kernel.org> RUN swupd update && \ swupd bundle-add sysadmin-basic-dev RUN mkdir -m 777 -p /git /tmp/build/perf /tmp/build/objtool /tmp/build/linux && \ groupadd -r perfbuilder && \ useradd -m -r -g perfbuilder perfbuilder && \ chown -R perfbuilder.perfbuilder /tmp/build/ /git/ USER perfbuilder COPY rx_and_build.sh / ENV EXTRA_MAKE_ARGS=PYTHON=python3 ENTRYPOINT ["/rx_and_build.sh"] Now to figure out why the build fails with clang, that is present in the above container as detected by the rx_and_build.sh script: clang version 6.0.1 (tags/RELEASE_601/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/sbin make: Entering directory '/git/linux/tools/perf' BUILD: Doing 'make -j4' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep Auto-detecting system features: ... dwarf: [ OFF ] ... dwarf_getlocations: [ OFF ] ... glibc: [ OFF ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ OFF ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ OFF ] ... libcrypto: [ OFF ] ... libunwind: [ OFF ] ... libdw-dwarf-unwind: [ OFF ] ... zlib: [ OFF ] ... lzma: [ OFF ] ... get_cpuid: [ OFF ] ... bpf: [ OFF ] Makefile.config:331: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop. make[1]: *** [Makefile.perf:206: sub-make] Error 2 make: *** [Makefile:70: all] Error 2 make: Leaving directory '/git/linux/tools/perf' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thiago Macieira <thiago.macieira@intel.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-c3khb9ac86s00qxzjrueomme@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | perf report: Don't try to map ip to invalid mapMilian Wolff2018-09-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a crash when the report encounters an address that could not be associated with an mmaped region: #0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 #1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 #2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 #3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 #4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 #5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 #6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 #7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085 Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | perf script python: Fix export-to-sqlite.py sample columnsAdrian Hunter2018-09-251-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the "branches" export option, not all sample columns are exported. However the unwanted columns are not at the end of the tuple, as assumed by the code. Fix by taking the first 15 and last 3 values, instead of the first 18. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | perf script python: Fix export-to-postgresql.py occasional failureAdrian Hunter2018-09-251-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Occasional export failures were found to be caused by truncating 64-bit pointers to 32-bits. Fix by explicitly setting types for all ctype arguments and results. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | | Merge tag 'for-4.19/dm-fixes-4' of ↵Greg Kroah-Hartman2018-10-112-2/+4
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Mike writes: "device mapper fix for 4.19 final - Fix for earlier 4.19 final DM linear change that incorrectly checked for CONFIG_DM_ZONED rather than CONFIG_BLK_DEV_ZONED." * tag 'for-4.19/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm linear: fix linear_end_io conditional definition
| * | | | | dm linear: fix linear_end_io conditional definitionDamien Le Moal2018-10-112-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dm-linear target is independent of the dm-zoned target. For code requiring support for zoned block devices, use CONFIG_BLK_DEV_ZONED instead of CONFIG_DM_ZONED. While at it, similarly to dm linear, also enable the DM_TARGET_ZONED_HM feature in dm-flakey only if CONFIG_BLK_DEV_ZONED is defined. Fixes: beb9caac211c1 ("dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled") Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | | | | | Merge tag 'xfs-fixes-for-4.19-rc7' of ↵Greg Kroah-Hartman2018-10-111-35/+165
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Dave writes: "xfs: fixes for 4.19-rc7 Update for 4.19-rc7 to fix numerous file clone and deduplication issues." * tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix data corruption w/ unaligned reflink ranges xfs: fix data corruption w/ unaligned dedupe ranges xfs: update ctime and remove suid before cloning files xfs: zero posteof blocks when cloning above eof xfs: refactor clonerange preparation into a separate helper
| * | | | | | xfs: fix data corruption w/ unaligned reflink rangesDave Chinner2018-10-061-13/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reflinking sub-file ranges, a data corruption can occur when the source file range includes a partial EOF block. This shares the unknown data beyond EOF into the second file at a position inside EOF, exposing stale data in the second file. XFS only supports whole block sharing, but we still need to support whole file reflink correctly. Hence if the reflink request includes the last block of the souce file, only proceed with the reflink operation if it lands at or past the destination file's current EOF. If it lands within the destination file EOF, reject the entire request with -EINVAL and make the caller go the hard way. This avoids the data corruption vector, but also avoids disruption of returning EINVAL to userspace for the common case of whole file cloning. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | | xfs: fix data corruption w/ unaligned dedupe rangesDave Chinner2018-10-061-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A deduplication data corruption is Exposed by fstests generic/505 on XFS. It is caused by extending the block match range to include the partial EOF block, but then allowing unknown data beyond EOF to be considered a "match" to data in the destination file because the comparison is only made to the end of the source file. This corrupts the destination file when the source extent is shared with it. XFS only supports whole block dedupe, but we still need to appear to support whole file dedupe correctly. Hence if the dedupe request includes the last block of the souce file, don't include it in the actual XFS dedupe operation. If the rest of the range dedupes successfully, then report the partial last block as deduped, too, so that userspace sees it as a successful dedupe rather than return EINVAL because we can't dedupe unaligned blocks. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | | xfs: update ctime and remove suid before cloning filesDarrick J. Wong2018-10-051-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before cloning into a file, update the ctime and remove sensitive attributes like suid, just like we'd do for a regular file write. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | | xfs: zero posteof blocks when cloning above eofDarrick J. Wong2018-10-051-8/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we're reflinking between two files and the destination file range is well beyond the destination file's EOF marker, zero any posteof speculative preallocations in the destination file so that we don't expose stale disk contents. The previous strategy of trying to clear the preallocations does not work if the destination file has the PREALLOC flag set. Uncovered by shared/010. Reported-by: Zorro Lang <zlang@redhat.com> Bugzilla-id: https://bugzilla.kernel.org/show_bug.cgi?id=201259 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | | | xfs: refactor clonerange preparation into a separate helperDarrick J. Wong2018-10-051-27/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor all the reflink preparation steps into a separate helper that we'll use to land all the upcoming fixes for insufficient input checks. This rework also moves the invalidation of the destination range to the prep function so that it is done before the range is remapped. This ensures that nobody can access the data in range being remapped until the remap is complete. [dgc: fix xfs_reflink_remap_prep() return value and caller check to handle vfs_clone_file_prep_inodes() returning 0 to mean "nothing to do". ] [dgc: make sure length changed by vfs_clone_file_prep_inodes() gets propagated back to XFS code that does the remapping. ] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | | | | | Merge tag 'for-4.19/dm-fixes-3' of ↵Greg Kroah-Hartman2018-10-103-11/+29
|\ \ \ \ \ \ \ | | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Mike writes: "device mapper fixes for 4.19 final - Fix a DM cache module init error path bug that doesn't properly cleanup a KMEM_CACHE if target registration fails. - Two stable@ fixes for DM zoned target; 4.20 will have changes that eliminate this code entirely but <= 4.19 needs these changes." * tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled dm: fix report zone remapping to account for partition offset dm cache: destroy migration_cache if cache target registration failed
| * | | | | | dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabledMike Snitzer2018-10-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is best to avoid any extra overhead associated with bio completion. DM core will indirectly call a DM target's .end_io if it is defined. In the case of DM linear, there is no need to do so (for every bio that completes) if CONFIG_DM_ZONED is not enabled. Avoiding an extra indirect call for every bio completion is very important for ensuring DM linear doesn't incur more overhead that further widens the performance gap between dm-linear and raw block devices. Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | | | | | dm: fix report zone remapping to account for partition offsetDamien Le Moal2018-10-091-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If dm-linear or dm-flakey are layered on top of a partition of a zoned block device, remapping of the start sector and write pointer position of the zones reported by a report zones BIO must be modified to account for the target table entry mapping (start offset within the device and entry mapping with the dm device). If the target's backing device is a partition of a whole disk, the start sector on the physical device of the partition must also be accounted for when modifying the zone information. However, dm_remap_zone_report() was not considering this last case, resulting in incorrect zone information remapping with targets using disk partitions. Fix this by calculating the target backing device start sector using the position of the completed report zones BIO and the unchanged position and size of the original report zone BIO. With this value calculated, the start sector and write pointer position of the target zones can be correctly remapped. Fixes: 10999307c14e ("dm: introduce dm_remap_zone_report()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | | | | | dm cache: destroy migration_cache if cache target registration failedShenghui Wang2018-10-091-3/+2
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created") inadvertently introduced this bug when it moved dm_register_target() after the call to KMEM_CACHE(). Fixes: 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created") Cc: stable@vger.kernel.org Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | | | | | Merge tag 'trace-v4.19-rc5' of ↵Greg Kroah-Hartman2018-10-101-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Steven writes: "vsprint fix: It was reported that trace_printk() was not reporting properly values that came after a dereference pointer. trace_printk() utilizes vbin_printf() and bstr_printf() to keep the overhead of tracing down. vbin_printf() does not do any conversions and just stors the string format and the raw arguments into the buffer. bstr_printf() is used to read the buffer and does the conversions to complete the printf() output. This can be troublesome with dereferenced pointers because the reference may be different from the time vbin_printf() is called to the time bstr_printf() is called. To fix this, a prior commit changed vbin_printf() to convert dereferenced pointers into strings and load the converted string into the buffer. But the change to bstr_printf() had an off-by-one error and didn't account for the nul character at the end of the string and this corrupted the rest of the values in the format that came after a dereferenced pointer." * tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
| * | | | | | vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointersSteven Rostedt (VMware)2018-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The functions vbin_printf() and bstr_printf() are used by trace_printk() to try to keep the overhead down during printing. trace_printk() uses vbin_printf() at the time of execution, as it only scans the fmt string to record the printf values into the buffer, and then uses vbin_printf() to do the conversions to print the string based on the format and the saved values in the buffer. This is an issue for dereferenced pointers, as before commit 841a915d20c7b, the processing of the pointer could happen some time after the pointer value was recorded (reading the trace buffer). This means the processing of the value at a later time could show different results, or even crash the system, if the pointer no longer existed. Commit 841a915d20c7b addressed this by processing dereferenced pointers at the time of execution and save the result in the ring buffer as a string. The bstr_printf() would then treat these pointers as normal strings, and print the value. But there was an off-by-one bug here, where after processing the argument, it move the pointer only "strlen(arg)" which made the arg pointer not point to the next argument in the ring buffer, but instead point to the nul character of the last argument. This causes any values after a dereferenced pointer to be corrupted. Cc: stable@vger.kernel.org Fixes: 841a915d20c7b ("vsprintf: Do not have bprintf dereference pointers") Reported-by: Nikolay Borisov <nborisov@suse.com> Tested-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | | | | | | Merge tag 'devicetree-fixes-for-4.19-3' of ↵Greg Kroah-Hartman2018-10-101-8/+18
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Rob writes: "Devicetree fixes for 4.19, part 3: - Fix DT unittest on Oldworld MAC systems" * tag 'devicetree-fixes-for-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: unittest: Disable interrupt node tests for old world MAC systems
| * | | | | | | of: unittest: Disable interrupt node tests for old world MAC systemsGuenter Roeck2018-10-101-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems with OF_IMAP_OLDWORLD_MAC set in of_irq_workarounds, the devicetree interrupt parsing code is different, causing unit tests of devicetree interrupt nodes to fail. Due to a bug in unittest code, which tries to dereference an uninitialized pointer, this results in a crash. OF: /testcase-data/phandle-tests/consumer-a: arguments longer than property Unable to handle kernel paging request for data at address 0x00bc616e Faulting instruction address: 0xc08e9468 Oops: Kernel access of bad area, sig: 11 [#1] BE PREEMPT PowerMac Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.72-rc1-yocto-standard+ #1 task: cf8e0000 task.stack: cf8da000 NIP: c08e9468 LR: c08ea5bc CTR: c08ea5ac REGS: cf8dbb50 TRAP: 0300 Not tainted (4.14.72-rc1-yocto-standard+) MSR: 00001032 <ME,IR,DR,RI> CR: 82004044 XER: 00000000 DAR: 00bc616e DSISR: 40000000 GPR00: c08ea5bc cf8dbc00 cf8e0000 c13ca517 c13ca517 c13ca8a0 00000066 00000002 GPR08: 00000063 00bc614e c0b05865 000affff 82004048 00000000 c00047f0 00000000 GPR16: c0a80000 c0a9cc34 c13ca517 c0ad1134 05ffffff 000affff c0b05860 c0abeef8 GPR24: cecec278 cecec278 c0a8c4d0 c0a885e0 c13ca8a0 05ffffff c13ca8a0 c13ca517 NIP [c08e9468] device_node_gen_full_name+0x30/0x15c LR [c08ea5bc] device_node_string+0x190/0x3c8 Call Trace: [cf8dbc00] [c007f670] trace_hardirqs_on_caller+0x118/0x1fc (unreliable) [cf8dbc40] [c08ea5bc] device_node_string+0x190/0x3c8 [cf8dbcb0] [c08eb794] pointer+0x25c/0x4d0 [cf8dbd00] [c08ebcbc] vsnprintf+0x2b4/0x5ec [cf8dbd60] [c08ec00c] vscnprintf+0x18/0x48 [cf8dbd70] [c008e268] vprintk_store+0x4c/0x22c [cf8dbda0] [c008ecac] vprintk_emit+0x94/0x130 [cf8dbdd0] [c008ff54] printk+0x5c/0x6c [cf8dbe10] [c0b8ddd4] of_unittest+0x2220/0x26f8 [cf8dbea0] [c0004434] do_one_initcall+0x4c/0x184 [cf8dbf00] [c0b4534c] kernel_init_freeable+0x13c/0x1d8 [cf8dbf30] [c0004814] kernel_init+0x24/0x118 [cf8dbf40] [c0013398] ret_from_kernel_thread+0x5c/0x64 The problem was observed when running a qemu test for the g3beige machine with devicetree unittests enabled. Disable interrupt node tests on affected systems to avoid both false unittest failures and the crash. With this patch in place, unittest on the affected system passes with the following message. dt-test ### end of unittest - 144 passed, 0 failed Fixes: 53a42093d96ef ("of: Add device tree selftests") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org>
* | | | | | | | Merge tag 'tag-chrome-platform-fixes-for-v4.19-rc8' of ↵Greg Kroah-Hartman2018-10-101-1/+1
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Benson writes: "chrome-platform fix for v4.19-rc8 This contains a fix to 57e94c8b974d ("mfd: cros-ec: Increase maximum mkbp event size"), which caused cros_ec based chromebooks to truncate an entire column of their built-in keyboard." * tag 'tag-chrome-platform-fixes-for-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: mfd: cros-ec: copy the whole event in get_next_event_xfer
| * | | | | | | | mfd: cros-ec: copy the whole event in get_next_event_xferEmil Karlson2018-10-101-1/+1
| | |_|/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 57e94c8b974db2d83c60e1139c89a70806abbea0 caused cros-ec keyboard events be truncated on many chromebooks so that Left and Right keys on Column 12 were always 0. Use ret as memcpy len to fix this. The old code was using ec_dev->event_size, which is the event payload/data size excluding event_type header, for the length of the memcpy operation. Use ret as memcpy length to avoid the off by one and copy the whole msg->data. Fixes: 57e94c8b974d ("mfd: cros-ec: Increase maximum mkbp event size") Acked-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Tested-by: Emil Renner Berthing <kernel@esmil.dk> Signed-off-by: Emil Karlson <jekarlson@gmail.com> Signed-off-by: Benson Leung <bleung@chromium.org>
* | | | | | | | Merge branch 'for-4.19-fixes' of ↵Greg Kroah-Hartman2018-10-101-0/+1
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu Dennis writes: "percpu fixes for-4.19-rc8 The new percpu allocator introduced in 4.14 had a missing free for the percpu metadata. This caused a memory leak when percpu memory is being churned resulting in the allocation and deallocation of percpu memory chunks" * 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: percpu: stop leaking bitmap metadata blocks
| * | | | | | | | percpu: stop leaking bitmap metadata blocksMike Rapoport2018-10-071-0/+1
| |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit ca460b3c9627 ("percpu: introduce bitmap metadata blocks") introduced bitmap metadata blocks. These metadata blocks are allocated whenever a new chunk is created, but they are never freed. Fix it. Fixes: ca460b3c9627 ("percpu: introduce bitmap metadata blocks") Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Dennis Zhou <dennis@kernel.org>
* | | | | | | | Merge tag 'gfs2-4.19.fixes2' of ↵Greg Kroah-Hartman2018-10-101-0/+4
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Andreas writes: "gfs2 4.19 fix: This fixes a regression introduced in commit 64bc06bb32ee "gfs2: iomap buffered write support"" * tag 'gfs2-4.19.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix iomap buffered write support for journaled files
| * | | | | | | | gfs2: Fix iomap buffered write support for journaled filesAndreas Gruenbacher2018-10-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 64bc06bb32ee broke buffered writes to journaled files (chattr +j): we'll try to journal the buffer heads of the page being written to in gfs2_iomap_journaled_page_done. However, the iomap code no longer creates buffer heads, so we'll BUG() in gfs2_page_add_databufs. Fix that by creating buffer heads ourself when needed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | | | | | | | | Merge tag 's390-4.19-4' of ↵Greg Kroah-Hartman2018-10-109-18/+44
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Martin writes: "s390 fixes for 4.19-rc8 Four more patches for 4.19: - Fix resume after suspend-to-disk if resume-CPU != suspend-CPU - Fix vfio-ccw check for pinned pages - Two patches to avoid a usercopy-whitelist warning in vfio-ccw" * tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: Fix how vfio-ccw checks pinned pages s390/cio: Refactor alloc of ccw_io_region s390/cio: Convert ccw_io_region to pointer s390/hibernate: fix error handling when suspend cpu != resume cpu
| * \ \ \ \ \ \ \ \ Merge tag 'vfio-ccw-20181002' of ↵Martin Schwidefsky2018-10-081-1/+1
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into fixes Pull vfio-ccw from Cornelia Huck with the following changes: - Another fix for vfio-ccw: make sure it accesses the correct entries in the pfn_array_table arrays when checking pinned pages.
| | * | | | | | | | | s390/cio: Fix how vfio-ccw checks pinned pagesEric Farman2018-10-021-1/+1
| |/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have two nested loops to check the entries within the pfn_array_table arrays. But we mistakenly use the outer array as an index in our check, and completely ignore the indexing performed by the inner loop. Cc: stable@vger.kernel.org Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20181002010235.42483-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | | | | | | | | Merge tag 'vfio-ccw-20181001' of ↵Martin Schwidefsky2018-10-014-7/+29
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into fixes Pull vfio-ccw from Cornelia Huck with the following changes: - Change allocation of ccw_io_region so that the usercopy hardening code can figure out that everything is fine.
| | * | | | | | | | | s390/cio: Refactor alloc of ccw_io_regionEric Farman2018-09-271-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If I attach a vfio-ccw device to my guest, I get the following warning on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y [250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)! [250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8 [250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod [250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2 [250757.595426] Hardware name: IBM 3906 M05 780 (LPAR) ...snip regs... [250757.595523] Call Trace: [250757.595529] ([<0000000000349210>] usercopy_warn+0xa8/0xd8) [250757.595535] [<000000000032daaa>] __check_heap_object+0xfa/0x160 [250757.595540] [<0000000000349396>] __check_object_size+0x156/0x1d0 [250757.595547] [<000003ff80332d04>] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw] [250757.595552] [<000000000034ed12>] __vfs_write+0x3a/0x188 [250757.595556] [<000000000034f040>] vfs_write+0xa8/0x1b8 [250757.595559] [<000000000034f4e6>] ksys_pwrite64+0x86/0xc0 [250757.595568] [<00000000008959a0>] system_call+0xdc/0x2b0 [250757.595570] Last Breaking-Event-Address: [250757.595573] [<0000000000349210>] usercopy_warn+0xa8/0xd8 While vfio_ccw_mdev_{write|read} validates that the input position/count does not run over the ccw_io_region struct, the usercopy code that does copy_{to|from}_user doesn't necessarily know this. It sees the variable length and gets worried that it's affecting a normal kmalloc'd struct, and generates the above warning. Adjust how the ccw_io_region is alloc'd with a whitelist to remove this warning. The boundary checking will continue to do its thing. Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20180921204013.95804-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| | * | | | | | | | | s390/cio: Convert ccw_io_region to pointerEric Farman2018-09-274-7/+17
| |/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the event that we want to change the layout of the ccw_io_region in the future[1], it might be easier to work with it as a pointer within the vfio_ccw_private struct rather than an embedded struct. [1] https://patchwork.kernel.org/comment/22228541/ Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20180921204013.95804-2-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | | | | | | | | s390/hibernate: fix error handling when suspend cpu != resume cpuGerald Schaefer2018-09-204-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The resume code checks if the resume cpu is the same as the suspend cpu. If not, and if it is also not possible to switch to the suspend cpu, an error message should be printed and the resume process should be stopped by loading a disabled wait psw. The current logic is broken in multiple ways, the message is never printed, and the disabled wait psw never loaded because the kernel panics before that: - sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break on the first 64bit instruction in sclp_early_printk(). - The init stack should be used, but the stack pointer is not set up correctly (missing aghi %r15,-STACK_FRAME_OVERHEAD). - __sclp_early_printk() checks the sclp_init_state. If it is not sclp_init_state_uninitialized, it simply returns w/o printing anything. In the resumed kernel however, sclp_init_state will never be uninitialized. This patch fixes those issues by removing the sam31/ESA logic, adding a correct init stack pointer, and also introducing sclp_early_printk_force() to allow using sclp_early_printk() even when sclp_init_state is not uninitialized. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | | | | | Merge tag 'mips_fixes_4.19_2' of ↵Greg Kroah-Hartman2018-10-106-28/+80
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Paul writes: "A few MIPS fixes for 4.19: - Avoid suboptimal placement of our VDSO when using the legacy mmap layout, which can prevent statically linked programs that were able to allocate large amounts of memory using the brk syscall prior to the introduction of our VDSO from functioning correctly. - Fix up CONFIG_CMDLINE handling for platforms which ought to ignore DT arguments but have incorrectly used them & lost other arguments since v3.16. - Fix a path in MAINTAINERS to use valid wildcards. - Fixup a regression from v4.17 in memset() for systems using CPU_DADDI_WORKAROUNDS." * tag 'mips_fixes_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: memset: Fix CPU_DADDI_WORKAROUNDS `small_fixup' regression MAINTAINERS: MIPS/LOONGSON2 ARCHITECTURE - Use the normal wildcard style MIPS: Fix CONFIG_CMDLINE handling MIPS: VDSO: Always map near top of user memory
| * | | | | | | | | | MIPS: memset: Fix CPU_DADDI_WORKAROUNDS `small_fixup' regressionMaciej W. Rozycki2018-10-051-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a commit 8a8158c85e1e ("MIPS: memset.S: EVA & fault support for small_memset") regression and remove assembly warnings: arch/mips/lib/memset.S: Assembler messages: arch/mips/lib/memset.S:243: Warning: Macro instruction expanded into multiple instructions in a branch delay slot triggering with the CPU_DADDI_WORKAROUNDS option set and this code: PTR_SUBU a2, t1, a0 jr ra PTR_ADDIU a2, 1 This is because with that option in place the DADDIU instruction, which the PTR_ADDIU CPP macro expands to, becomes a GAS macro, which in turn expands to an LI/DADDU (or actually ADDIU/DADDU) sequence: 13c: 01a4302f dsubu a2,t1,a0 140: 03e00008 jr ra 144: 24010001 li at,1 148: 00c1302d daddu a2,a2,at ... Correct this by switching off the `noreorder' assembly mode and letting GAS schedule this jump's delay slot, as there is nothing special about it that would require manual scheduling. With this change in place correct code is produced: 13c: 01a4302f dsubu a2,t1,a0 140: 24010001 li at,1 144: 03e00008 jr ra 148: 00c1302d daddu a2,a2,at ... Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 8a8158c85e1e ("MIPS: memset.S: EVA & fault support for small_memset") Patchwork: https://patchwork.linux-mips.org/patch/20833/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: stable@vger.kernel.org # 4.17+
| * | | | | | | | | | MAINTAINERS: MIPS/LOONGSON2 ARCHITECTURE - Use the normal wildcard styleJoe Perches2018-10-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Neither git nor get_maintainer understands the curly brace style. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20821/ Cc: Huacai Chen <chenhc@lemote.com> Cc: linux-mips <linux-mips@linux-mips.org> Cc: LKML <linux-kernel@vger.kernel.org>
| * | | | | | | | | | MIPS: Fix CONFIG_CMDLINE handlingPaul Burton2018-09-281-20/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8ce355cf2e38 ("MIPS: Setup boot_command_line before plat_mem_setup") fixed a problem for systems which have CONFIG_CMDLINE_BOOL=y & use a DT with a chosen node that has either no bootargs property or an empty one. In this configuration early_init_dt_scan_chosen() copies CONFIG_CMDLINE into boot_command_line, but the MIPS code doesn't know this so it appends CONFIG_CMDLINE (via builtin_cmdline) to boot_command_line again. The result is that boot_command_line contains the arguments from CONFIG_CMDLINE twice. That commit took the approach of simply setting up boot_command_line from the MIPS code before early_init_dt_scan_chosen() runs, causing it not to copy CONFIG_CMDLINE to boot_command_line if a chosen node with no bootargs property is found. Unfortunately this is problematic for systems which do have a non-empty bootargs property & CONFIG_CMDLINE_BOOL=y. There early_init_dt_scan_chosen() will overwrite boot_command_line with the arguments from DT, which means we lose those from CONFIG_CMDLINE entirely. This breaks CONFIG_MIPS_CMDLINE_DTB_EXTEND. If we have CONFIG_MIPS_CMDLINE_FROM_BOOTLOADER or CONFIG_MIPS_CMDLINE_BUILTIN_EXTEND selected and the DT has a bootargs property which we should ignore, it will instead be honoured breaking those configurations too. Fix this by reverting commit 8ce355cf2e38 ("MIPS: Setup boot_command_line before plat_mem_setup") to restore the former behaviour, and fixing the CONFIG_CMDLINE duplication issue by initializing boot_command_line to a non-empty string that early_init_dt_scan_chosen() will not overwrite with CONFIG_CMDLINE. This is a little ugly, but cleanup in this area is on its way. In the meantime this is at least easy to backport & contains the ugliness within arch/mips/. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 8ce355cf2e38 ("MIPS: Setup boot_command_line before plat_mem_setup") References: https://patchwork.linux-mips.org/patch/18804/ Patchwork: https://patchwork.linux-mips.org/patch/20813/ Cc: Frank Rowand <frowand.list@gmail.com> Cc: Jaedon Shin <jaedon.shin@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.16+
| * | | | | | | | | | MIPS: VDSO: Always map near top of user memoryPaul Burton2018-09-283-6/+47
| | |_|_|_|/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using the legacy mmap layout, for example triggered using ulimit -s unlimited, get_unmapped_area() fills memory from bottom to top starting from a fairly low address near TASK_UNMAPPED_BASE. This placement is suboptimal if the user application wishes to allocate large amounts of heap memory using the brk syscall. With the VDSO being located low in the user's virtual address space, the amount of space available for access using brk is limited much more than it was prior to the introduction of the VDSO. For example: # ulimit -s unlimited; cat /proc/self/maps 00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils 004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils 004fd000-0050f000 rwxp 00000000 00:00 0 00cc3000-00ce4000 rwxp 00000000 00:00 0 [heap] 2ab96000-2ab98000 r--p 00000000 00:00 0 [vvar] 2ab98000-2ab99000 r-xp 00000000 00:00 0 [vdso] 2ab99000-2ab9d000 rwxp 00000000 00:00 0 ... Resolve this by adjusting STACK_TOP to reserve space for the VDSO & providing an address hint to get_unmapped_area() causing it to use this space even when using the legacy mmap layout. We reserve enough space for the VDSO, plus 1MB or 256MB for 32 bit & 64 bit systems respectively within which we randomize the VDSO base address. Previously this randomization was taken care of by the mmap base address randomization performed by arch_mmap_rnd(). The 1MB & 256MB sizes are somewhat arbitrary but chosen such that we have some randomization without taking up too much of the user's virtual address space, which is often in short supply for 32 bit systems. With this the VDSO is always mapped at a high address, leaving lots of space for statically linked programs to make use of brk: # ulimit -s unlimited; cat /proc/self/maps 00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils 004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils 004fd000-0050f000 rwxp 00000000 00:00 0 00c28000-00c49000 rwxp 00000000 00:00 0 [heap] ... 7f67c000-7f69d000 rwxp 00000000 00:00 0 [stack] 7f7fc000-7f7fd000 rwxp 00000000 00:00 0 7fcf1000-7fcf3000 r--p 00000000 00:00 0 [vvar] 7fcf3000-7fcf4000 r-xp 00000000 00:00 0 [vdso] Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Huacai Chen <chenhc@lemote.com> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Cc: Huacai Chen <chenhc@lemote.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.4+
* | | | | | | | | | Merge tag 'arc-4.19-rc8' of ↵Greg Kroah-Hartman2018-10-093-25/+23
|\ \ \ \ \ \ \ \ \ \ | |_|_|/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Vineet writes: "ARC updates for 4.19-rc8 - Fix clone syscall to update Thread pointer register - Make/build updates (needed for AGL/OE builds) [Alexey] - Typo fix [Colin Ian King]" * tag 'arc-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: clone syscall to setp r25 as thread pointer ARC: build: Don't set CROSS_COMPILE in arch's Makefile ARC: fix spelling mistake "entires" -> "entries" ARC: build: Get rid of toolchain check ARCv2: build: use mcpu=hs38 iso generic mcpu=archs
| * | | | | | | | | ARC: clone syscall to setp r25 as thread pointerVineet Gupta2018-10-051-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per ARC TLS ABI, r25 is designated TP (thread pointer register). However so far kernel didn't do any special treatment, like setting up usermode r25, even for CLONE_SETTLS. We instead relied on libc runtime to do this, in say clone libc wrapper [1]. This was deliberate to keep kernel ABI agnostic (userspace could potentially change TP, specially for different ARC ISA say ARCompact vs. ARCv2 with different spare registers etc) However userspace setting up r25, after clone syscall opens a race, if child is not scheduled and gets a signal instead. It starts off in userspace not in clone but in a signal handler and anything TP sepcific there such as pthread_self() fails which showed up with uClibc testsuite nptl/tst-kill6 [2] Fix this by having kernel populate r25 to TP value. So this locks in ABI, but it was not going to change anyways, and fwiw is same for both ARCompact (arc700 core) and ARCvs (HS3x cores) [1] https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/libc/sysdeps/linux/arc/clone.S [2] https://github.com/wbx-github/uclibc-ng-test/blob/master/test/nptl/tst-kill6.c Fixes: ARC STAR 9001378481 Cc: stable@vger.kernel.org Reported-by: Nikita Sobolev <sobolev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>