summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* libnvdimm: Move nvdimm_bus_attribute_group to device_typeDan Williams2019-11-198-25/+14
| | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_bus_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Move nvdimm_attribute_group to device_typeDan Williams2019-11-194-27/+20
| | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903201.1582359.10966209746585062329.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Move nd_mapping_attribute_group to device_typeDan Williams2019-11-194-10/+2
| | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_mapping_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902686.1582359.6749533709859492704.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Move nd_region_attribute_group to device_typeDan Williams2019-11-196-17/+2
| | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_region_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902169.1582359.16828508538444551337.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Move nd_numa_attribute_group to device_typeDan Williams2019-11-196-5/+3
| | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_numa_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm: Move nd_device_attribute_group to device_typeDan Williams2019-11-179-15/+22
| | | | | | | | | | | | | | | | | | A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_device_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. For regions this creates a new nd_region_attribute_groups[] added to the per-region device-type instances. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309901138.1582359.12909354140826530394.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm: Move region attribute group definitionDan Williams2019-11-171-104/+104
| | | | | | | | | | | | | In preparation for moving region attributes from device attribute groups to the region device-type, reorder the declaration so that it can be referenced by the device-type definition without forward declarations. No functional changes are intended to result from this change. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309900624.1582359.6929998072035982264.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Move attribute groups to device typeDan Williams2019-11-175-76/+73
| | | | | | | | | | | | | | Statically initialize the attribute groups for each libnvdimm device_type. This is a preparation step for removing unnecessary exports of attributes that can be included in the device_type by default. Also take the opportunity to mark 'struct device_type' instances const. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309900111.1582359.2445687530383470348.stgit@dwillia2-desk3.amr.corp.intel.com
* libnvdimm: Remove prototypes for nonexistent functionsAlastair D'Silva2019-11-171-4/+0
| | | | | | | | | | These functions don't exist, so remove the prototypes for them. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Link: https://lore.kernel.org/r/20191025044721.16617-3-alastair@au1.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm/btt: fix variable 'rc' set but not usedQian Cai2019-11-171-4/+4
| | | | | | | | | | | | | | | | drivers/nvdimm/btt.c: In function 'btt_read_pg': drivers/nvdimm/btt.c:1264:8: warning: variable 'rc' set but not used [-Wunused-but-set-variable] int rc; ^~ Add a ratelimited message in case a storm of errors is encountered. Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/1572530719-32161-1-git-send-email-cai@lca.pw Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm/pmem: Delete include of nd-core.hDan Williams2019-11-171-1/+0
| | | | | | | | | | | | The entire point of nd-core.h is to hide functionality that no leaf driver should touch. In fact, the commit that added it had no need to include it. Fixes: 06e8ccdab15f ("acpi: nfit: Add support for detect platform...") Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm/namespace: Differentiate between probe mapping and runtime mappingAneesh Kumar K.V2019-11-158-37/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvdimm core currently maps the full namespace to an ioremap range while probing the namespace mode. This can result in probe failures on architectures that have limited ioremap space. For example, with a large btt namespace that consumes most of I/O remap range, depending on the sequence of namespace initialization, the user can find a pfn namespace initialization failure due to unavailable I/O remap space which nvdimm core uses for temporary mapping. nvdimm core can avoid this failure by only mapping the reserved info block area to check for pfn superblock type and map the full namespace resource only before using the namespace. Given that personalities like BTT can be layered on top of any namespace type create a generic form of devm_nsio_enable (devm_namespace_enable) and use it inside the per-personality attach routines. Now devm_namespace_enable() is always paired with disable unless the mapping is going to be used for long term runtime access. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20191017073308.32645-1-aneesh.kumar@linux.ibm.com [djbw: reworks to move devm_namespace_{en,dis}able into *attach helpers] Reported-by: kbuild test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20191031105741.102793-2-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm/pfn_dev: Don't clear device memmap area during generic namespace probeAneesh Kumar K.V2019-11-151-1/+7
| | | | | | | | | | | | | | | nvdimm core use nd_pfn_validate when looking for devdax or fsdax namespace. In this case device resources are allocated against nd_namespace_io dev. In-order to allow remap of range in nd_pfn_clear_memmap_error(), move the device memmap area clearing while initializing pfn namespace. With this device resource are allocated against nd_pfn and we can use nd_pfn->dev for remapping. This also avoids calling nd_pfn_clear_mmap_errors twice. Once while probing the namespace and second while initializing a pfn namespace. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20191101032728.113001-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm: Trivial comment fixIra Weiny2019-11-151-1/+1
| | | | | | Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20190918211933.13213-1-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* libnvdimm/namsepace: Don't set claim_class on errorIra Weiny2019-11-151-10/+10
| | | | | | | | | | | | | | | | Don't leave claim_class set to an invalid value if an error occurs in btt_claim_class(). While we are here change the return type of __holder_class_store() to be clear about the values it is returning. This was found via code inspection. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20190925211348.14082-1-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Linux 5.4-rc5v5.4-rc5Linus Torvalds2019-10-271-2/+2
|
* Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2019-10-271-5/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Two fixes for the VMWare guest support: - Unbreak VMWare platform detection which got wreckaged by converting an integer constant to a string constant. - Fix the clang build of the VMWAre hypercall by explicitely specifying the ouput register for INL instead of using the short form" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/vmware: Fix platform detection VMWARE_PORT macro x86/cpu/vmware: Use the full form of INL in VMWARE_HYPERCALL, for clang/llvm
| * x86/cpu/vmware: Fix platform detection VMWARE_PORT macroThomas Hellstrom2019-10-221-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The platform detection VMWARE_PORT macro uses the VMWARE_HYPERVISOR_PORT definition, but expects it to be an integer. However, when it was moved to the new vmware.h include file, it was changed to be a string to better fit into the VMWARE_HYPERCALL set of macros. This obviously breaks the platform detection VMWARE_PORT functionality. Change the VMWARE_HYPERVISOR_PORT and VMWARE_HYPERVISOR_PORT_HB definitions to be integers, and use __stringify() for their stringified form when needed. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: b4dd4f6e3648 ("Add a header file for hypercall definitions") Link: https://lkml.kernel.org/r/20191021172403.3085-3-thomas_os@shipmail.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/cpu/vmware: Use the full form of INL in VMWARE_HYPERCALL, for clang/llvmThomas Hellstrom2019-10-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM's assembler doesn't accept the short form INL instruction: inl (%%dx) but instead insists on the output register to be explicitly specified. This was previously fixed for the VMWARE_PORT macro. Fix it also for the VMWARE_HYPERCALL macro. Suggested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: clang-built-linux@googlegroups.com Fixes: b4dd4f6e3648 ("Add a header file for hypercall definitions") Link: https://lkml.kernel.org/r/20191021172403.3085-2-thomas_os@shipmail.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2019-10-273-7/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of fixes for time(keeping): - Add a missing include to prevent compiler warnings. - Make the VDSO implementation of clock_getres() POSIX compliant again. A recent change dropped the NULL pointer guard which is required as NULL is a valid pointer value for this function. - Fix two function documentation typos" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Fix two trivial comments timers/sched_clock: Include local timekeeping.h for missing declarations lib/vdso: Make clock_getres() POSIX compliant again
| * | posix-cpu-timers: Fix two trivial commentsYi Wang2019-10-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes modified the function arguments of thread_group_sample_cputime() and task_cputimers_expired(), but forgot to update the comments. Fix it up. [ tglx: Changed the argument name of task_cputimers_expired() as the pointer points to an array of samples. ] Fixes: b7be4ef1365d ("posix-cpu-timers: Switch thread group sampling to array") Fixes: 001f7971433a ("posix-cpu-timers: Make expiry checks array based") Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1571643852-21848-1-git-send-email-wang.yi59@zte.com.cn
| * | timers/sched_clock: Include local timekeeping.h for missing declarationsBen Dooks (Codethink)2019-10-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Include the timekeeping.h header to get the declaration of the sched_clock_{suspend,resume} functions. Fixes the following sparse warnings: kernel/time/sched_clock.c:275:5: warning: symbol 'sched_clock_suspend' was not declared. Should it be static? kernel/time/sched_clock.c:286:6: warning: symbol 'sched_clock_resume' was not declared. Should it be static? Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191022131226.11465-1-ben.dooks@codethink.co.uk
| * | lib/vdso: Make clock_getres() POSIX compliant againThomas Gleixner2019-10-231-4/+5
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent commit removed the NULL pointer check from the clock_getres() implementation causing a test case to fault. POSIX requires an explicit NULL pointer check for clock_getres() aside of the validity check of the clock_id argument for obscure reasons. Add it back for both 32bit and 64bit. Note, this is only a partial revert of the offending commit which does not bring back the broken fallback invocation in the the 32bit compat implementations of clock_getres() and clock_gettime(). Fixes: a9446a906f52 ("lib/vdso/32: Remove inconsistent NULL pointer checks") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1910211202260.1904@nanos.tec.linutronix.de
* | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2019-10-2714-21/+71
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of perf fixes: kernel: - Unbreak the tracking of auxiliary buffer allocations which got imbalanced causing recource limit failures. - Fix the fallout of splitting of ToPA entries which missed to shift the base entry PA correctly. - Use the correct context to lookup the AUX event when unmapping the associated AUX buffer so the event can be stopped and the buffer reference dropped. tools: - Fix buildiid-cache mode setting in copyfile_mode_ns() when copying /proc/kcore - Fix freeing id arrays in the event list so the correct event is closed. - Sync sched.h anc kvm.h headers with the kernel sources. - Link jvmti against tools/lib/ctype.o to have weak strlcpy(). - Fix multiple memory and file descriptor leaks, found by coverity in perf annotate. - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found by a static analysis tool" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/aux: Fix AUX output stopping perf/aux: Fix tracking of auxiliary trace buffer allocation perf/x86/intel/pt: Fix base for single entry topa perf kmem: Fix memory leak in compact_gfp_flags() tools headers UAPI: Sync sched.h with the kernel tools headers kvm: Sync kvm.h headers with the kernel sources tools headers kvm: Sync kvm headers with the kernel sources tools headers kvm: Sync kvm headers with the kernel sources perf c2c: Fix memory leak in build_cl_output() perf tools: Fix mode setting in copyfile_mode_ns() perf annotate: Fix multiple memory and file descriptor leaks perf tools: Fix resource leak of closedir() on the error paths perf evlist: Fix fix for freed id arrays perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
| * | perf/aux: Fix AUX output stoppingAlexander Shishkin2019-10-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit: 8a58ddae2379 ("perf/core: Fix exclusive events' grouping") allows CAP_EXCLUSIVE events to be grouped with other events. Since all of those also happen to be AUX events (which is not the case the other way around, because arch/s390), this changes the rules for stopping the output: the AUX event may not be on its PMU's context any more, if it's grouped with a HW event, in which case it will be on that HW event's context instead. If that's the case, munmap() of the AUX buffer can't find and stop the AUX event, potentially leaving the last reference with the atomic context, which will then end up freeing the AUX buffer. This will then trip warnings: Fix this by using the context's PMU context when looking for events to stop, instead of the event's PMU context. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20191022073940.61814-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | perf/aux: Fix tracking of auxiliary trace buffer allocationThomas Richter2019-10-211-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following commit from the v5.4 merge window: d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()") ... breaks auxiliary trace buffer tracking. If I run command 'perf record -e rbd000' to record samples and saving them in the **auxiliary** trace buffer then the value of 'locked_vm' becomes negative after all trace buffers have been allocated and released: During allocation the values increase: [52.250027] perf_mmap user->locked_vm:0x87 pinned_vm:0x0 ret:0 [52.250115] perf_mmap user->locked_vm:0x107 pinned_vm:0x0 ret:0 [52.250251] perf_mmap user->locked_vm:0x188 pinned_vm:0x0 ret:0 [52.250326] perf_mmap user->locked_vm:0x208 pinned_vm:0x0 ret:0 [52.250441] perf_mmap user->locked_vm:0x289 pinned_vm:0x0 ret:0 [52.250498] perf_mmap user->locked_vm:0x309 pinned_vm:0x0 ret:0 [52.250613] perf_mmap user->locked_vm:0x38a pinned_vm:0x0 ret:0 [52.250715] perf_mmap user->locked_vm:0x408 pinned_vm:0x2 ret:0 [52.250834] perf_mmap user->locked_vm:0x408 pinned_vm:0x83 ret:0 [52.250915] perf_mmap user->locked_vm:0x408 pinned_vm:0x103 ret:0 [52.251061] perf_mmap user->locked_vm:0x408 pinned_vm:0x184 ret:0 [52.251146] perf_mmap user->locked_vm:0x408 pinned_vm:0x204 ret:0 [52.251299] perf_mmap user->locked_vm:0x408 pinned_vm:0x285 ret:0 [52.251383] perf_mmap user->locked_vm:0x408 pinned_vm:0x305 ret:0 [52.251544] perf_mmap user->locked_vm:0x408 pinned_vm:0x386 ret:0 [52.251634] perf_mmap user->locked_vm:0x408 pinned_vm:0x406 ret:0 [52.253018] perf_mmap user->locked_vm:0x408 pinned_vm:0x487 ret:0 [52.253197] perf_mmap user->locked_vm:0x408 pinned_vm:0x508 ret:0 [52.253374] perf_mmap user->locked_vm:0x408 pinned_vm:0x589 ret:0 [52.253550] perf_mmap user->locked_vm:0x408 pinned_vm:0x60a ret:0 [52.253726] perf_mmap user->locked_vm:0x408 pinned_vm:0x68b ret:0 [52.253903] perf_mmap user->locked_vm:0x408 pinned_vm:0x70c ret:0 [52.254084] perf_mmap user->locked_vm:0x408 pinned_vm:0x78d ret:0 [52.254263] perf_mmap user->locked_vm:0x408 pinned_vm:0x80e ret:0 The value of user->locked_vm increases to a limit then the memory is tracked by pinned_vm. During deallocation the size is subtracted from pinned_vm until it hits a limit. Then a larger value is subtracted from locked_vm leading to a large number (because of type unsigned): [64.267797] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x78d [64.267826] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x70c [64.267848] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x68b [64.267869] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x60a [64.267891] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x589 [64.267911] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x508 [64.267933] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x487 [64.267952] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x406 [64.268883] perf_mmap_close mmap_user->locked_vm:0x307 pinned_vm:0x406 [64.269117] perf_mmap_close mmap_user->locked_vm:0x206 pinned_vm:0x406 [64.269433] perf_mmap_close mmap_user->locked_vm:0x105 pinned_vm:0x406 [64.269536] perf_mmap_close mmap_user->locked_vm:0x4 pinned_vm:0x404 [64.269797] perf_mmap_close mmap_user->locked_vm:0xffffffffffffff84 pinned_vm:0x303 [64.270105] perf_mmap_close mmap_user->locked_vm:0xffffffffffffff04 pinned_vm:0x202 [64.270374] perf_mmap_close mmap_user->locked_vm:0xfffffffffffffe84 pinned_vm:0x101 [64.270628] perf_mmap_close mmap_user->locked_vm:0xfffffffffffffe04 pinned_vm:0x0 This value sticks for the user until system is rebooted, causing follow-on system calls using locked_vm resource limit to fail. Note: There is no issue using the normal trace buffer. In fact the issue is in perf_mmap_close(). During allocation auxiliary trace buffer memory is either traced as 'extra' and added to 'pinned_vm' or trace as 'user_extra' and added to 'locked_vm'. This applies for normal trace buffers and auxiliary trace buffer. However in function perf_mmap_close() all auxiliary trace buffer is subtraced from 'locked_vm' and never from 'pinned_vm'. This breaks the ballance. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: gor@linux.ibm.com Cc: hechaol@fb.com Cc: heiko.carstens@de.ibm.com Cc: linux-perf-users@vger.kernel.org Cc: songliubraving@fb.com Fixes: d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()") Link: https://lkml.kernel.org/r/20191021083354.67868-1-tmricht@linux.ibm.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | Merge tag 'perf-urgent-for-mingo-5.4-20191017' of ↵Ingo Molnar2019-10-2112-17/+65
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: perf buildid-cache: Adrian Hunter: - Fix mode setting in copyfile_mode_ns() when copying /proc/kcore. perf evlist: Andi Kleen: - Fix freeing id arrays. tools headers: - Sync sched.h anc kvm.h headers with the kernel sources. perf jvmti: Thomas Richter: - Link against tools/lib/ctype.o to have weak strlcpy(). perf annotate: Gustavo A. R. Silva: - Fix multiple memory and file descriptor leaks, found by coverity. perf c2c/kmem: Yunfeng Ye: - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found by internal static analysis tool. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | perf kmem: Fix memory leak in compact_gfp_flags()Yunfeng Ye2019-10-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memory @orig_flags is allocated by strdup(), it is freed on the normal path, but leak to free on the error path. Fix this by adding free(orig_flags) on the error path. Fixes: 0e11115644b3 ("perf kmem: Print gfp flags in human readable string") Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | tools headers UAPI: Sync sched.h with the kernelArnaldo Carvalho de Melo2019-10-151-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To get the changes in: 78f6face5af3 ("sched: add kernel-doc for struct clone_args") f14c234b4bc5 ("clone3: switch to copy_struct_from_user()") This file gets rebuilt, but no changes ensues: CC /tmp/build/perf/trace/beauty/clone.o This addresses this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from latest version at 'include/uapi/linux/sched.h' diff -u tools/include/uapi/linux/sched.h include/uapi/linux/sched.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xqruu8wohwlbc57udg1g0xzx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | tools headers kvm: Sync kvm.h headers with the kernel sourcesArnaldo Carvalho de Melo2019-10-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To pick the changes in: 344c6c804703 ("KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH") dee04eee9182 ("KVM: RISC-V: Add KVM_REG_RISCV for ONE_REG interface") These trigger the rebuild of this object: CC /tmp/build/perf/trace/beauty/ioctl.o But do not result in any change in tooling, as the additions are not being used in any table generatator. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anup Patel <Anup.Patel@wdc.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lkml.kernel.org/n/tip-d1v48a0qfoe98u5v9tn3mu5u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | tools headers kvm: Sync kvm headers with the kernel sourcesArnaldo Carvalho de Melo2019-10-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To pick the changes in: 0cb8410b90e7 ("kvm: svm: Intercept RDPRU") That trigger a rebuild in too in tooling: CC /tmp/build/perf/arch/x86/util/kvm-stat.o But this time around no changes in tooling results, as SVM_EXIT_RDPRU wasn't added to SVM_EXIT_REASONS, that is used in kvm-stat.c. And addresses this perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h' diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jim Mattson <jmattson@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/n/tip-pqzkt1hmfpqph3ts8i6zzmim@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | tools headers kvm: Sync kvm headers with the kernel sourcesArnaldo Carvalho de Melo2019-10-151-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To pick the changes in: bf653b78f960 ("KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit") That trigger these changes in tooling: CC /tmp/build/perf/arch/x86/util/kvm-stat.o INSTALL GTK UI DESCEND plugins make[3]: Nothing to be done for '/tmp/build/perf/plugins/libtraceevent-dynamic-list'. INSTALL trace_plugins LD /tmp/build/perf/arch/x86/util/perf-in.o LD /tmp/build/perf/arch/x86/perf-in.o LD /tmp/build/perf/arch/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf And this is not just because that header is included, kvm-stat.c uses the VMX_EXIT_REASONS define and it got changed by the above cset. And addresses this perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h' diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Tao Xu <tao3.xu@intel.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-gr1eel0hckmi5l3p2ewdpfxh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf c2c: Fix memory leak in build_cl_output()Yunfeng Ye2019-10-151-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a memory leak problem in the failure paths of build_cl_output(), so fix it. Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/4d3c0178-5482-c313-98e1-f82090d2d456@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf tools: Fix mode setting in copyfile_mode_ns()Adrian Hunter2019-10-151-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | slow_copyfile() opens the file by name, so "write" permissions must not be removed in copyfile_mode_ns() before calling slow_copyfile(). Example: Before: $ sudo chmod +r /proc/kcore $ sudo setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_sys_rawio=ep" tools/perf/perf $ tools/perf/perf buildid-cache -k /proc/kcore Couldn't add /proc/kcore After: $ sudo chmod +r /proc/kcore $ sudo setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_sys_rawio=ep" tools/perf/perf $ tools/perf/perf buildid-cache -v -k /proc/kcore kcore added to build-id cache directory /home/ahunter/.debug/[kernel.kcore]/37e340b1b5a7cf4f57ba8de2bc777359588a957f/2019100709562289 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191007070221.11158-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf annotate: Fix multiple memory and file descriptor leaksGustavo A. R. Silva2019-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF in variable *ret*, instead of returning in the middle of the function and leaking multiple resources: prog_linfo, btf, s and bfdf. Addresses-Coverity-ID: 1454832 ("Structurally dead code") Fixes: 11aad897f6d1 ("perf annotate: Don't return -1 for error when doing BPF disassembly") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191014171047.GA30850@embeddedor Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf tools: Fix resource leak of closedir() on the error pathsYunfeng Ye2019-10-152-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both build_mem_topology() and rm_rf_depth_pat() have resource leaks of closedir() on the error paths. Fix this by calling closedir() before function returns. Fixes: e2091cedd51b ("perf tools: Add MEM_TOPOLOGY feature to perf data file") Fixes: cdb6b0235f17 ("perf tools: Add pattern name checking to rm_rf") Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/cd5f7cd2-b80d-6add-20a1-32f4f43e0744@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf evlist: Fix fix for freed id arraysAndi Kleen2019-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the earlier fix for the memory overrun of id arrays I managed to typo the wrong event in the fix. Of course we need to close the current event in the loop, not the original failing event. The same test case as in the original patch still passes. Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays") Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()Thomas Richter2019-10-151-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The build of file libperf-jvmti.so succeeds but the resulting object fails to load: # ~/linux/tools/perf/perf record -k mono -- java \ -XX:+PreserveFramePointer \ -agentpath:/root/linux/tools/perf/libperf-jvmti.so \ hog 100000 123450 Error occurred during initialization of VM Could not find agent library /root/linux/tools/perf/libperf-jvmti.so in absolute path, with error: /root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype Add the missing _ctype symbol into the build script. Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/20191008093841.59387-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf/x86/intel/pt: Fix base for single entry topaJiri Olsa2019-10-201-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jan reported failing ltp test for PT: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/tracing/pt_test/pt_test.c It looks like the reason is this new commit added in this v5.4 merge window: 38bb8d77d0b9 ("perf/x86/intel/pt: Split ToPA metadata and page layout") which did not keep the TOPA_SHIFT for entry base. Add it back. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 38bb8d77d0b9 ("perf/x86/intel/pt: Split ToPA metadata and page layout") Link: https://lkml.kernel.org/r/20191019220726.12213-1-jolsa@kernel.org [ Minor changelog edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2019-10-272-5/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two fixes for interrupt controller drivers: - Skip IRQ_M_EXT entries in the device tree when initializing the RISCV PLIC controller to avoid a double init attempt. - Use the correct ITS list when issuing the VMOVP synchronization command so the operation works only on the ITS instances which are associated to a VM" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Skip contexts except supervisor in plic_init() irqchip/gic-v3-its: Use the exact ITSList for VMOVP
| * \ \ Merge tag 'irqchip-fixes-5.4-2' of ↵Thomas Gleixner2019-10-252-5/+20
| |\ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull the second lot of irqchip updates for 5.4 from Marc Zyngier: - Sifive PLIC: force driver to skip non-relevant contexts - GICv4: Don't send VMOVP commands to ITSs that don't have this vPE mapped
| | * | irqchip/sifive-plic: Skip contexts except supervisor in plic_init()Alan Mikhak2019-10-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify plic_init() to skip .dts interrupt contexts other than supervisor external interrupt. The .dts entry for plic may specify multiple interrupt contexts. For example, it may assign two entries IRQ_M_EXT and IRQ_S_EXT, in that order, to the same interrupt controller. This patch modifies plic_init() to skip the IRQ_M_EXT context since IRQ_S_EXT is currently the only supported context. If IRQ_M_EXT is not skipped, plic_init() will report "handler already present for context" when it comes across the IRQ_S_EXT context in the next iteration of its loop. Without this patch, .dts would have to be edited to replace the value of IRQ_M_EXT with -1 for it to be skipped. Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # arch/riscv Link: https://lkml.kernel.org/r/1571933503-21504-1-git-send-email-alan.mikhak@sifive.com
| | * | irqchip/gic-v3-its: Use the exact ITSList for VMOVPZenghui Yu2019-10-241-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0), we will map vPEs only on ITSs that will actually control interrupts for the given VM. And when moving a vPE, the VMOVP command will be issued only for those ITSs. But when issuing VMOVPs we seemed fail to present the exact ITSList to ITSs who are actually included in the synchronization operation. The its_list_map we're currently using includes all ITSs in the system, even though some of them don't have the corresponding vPE mapping at all. Introduce get_its_list() to get the per-VM its_list_map, to indicate which ITSs have vPE mappings for the given VM, and use this map as the expected ITSList when building VMOVP. This is hopefully a performance gain not to do some synchronization with those unsuspecting ITSs. And initialize the whole command descriptor to zero at beginning, since the seq_num and its_list should be RES0 when GITS_TYPER.VMOVP == 1. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1571802386-2680-1-git-send-email-yuzenghui@huawei.com
* | | | Merge tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2019-10-279-37/+75
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "Seven cifs/smb3 fixes, including three for stable" * tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs CIFS: Fix use after free of file info structures CIFS: Fix retry mid list corruption on reconnects cifs: Fix missed free operations CIFS: avoid using MID 0xFFFF cifs: clarify comment about timestamp granularity for old servers cifs: Handle -EINPROGRESS only when noblockcnt is set
| * | | | cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occursDave Wysochanski2019-10-254-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a deadlock that is possible and can easily be seen with a test where multiple readers open/read/close of the same file and a disruption occurs causing reconnect. The deadlock is due a reader thread inside cifs_strict_readv calling down_read and obtaining lock_sem, and then after reconnect inside cifs_reopen_file calling down_read a second time. If in between the two down_read calls, a down_write comes from another process, deadlock occurs. CPU0 CPU1 ---- ---- cifs_strict_readv() down_read(&cifsi->lock_sem); _cifsFileInfo_put OR cifs_new_fileinfo down_write(&cifsi->lock_sem); cifs_reopen_file() down_read(&cifsi->lock_sem); Fix the above by changing all down_write(lock_sem) calls to down_write_trylock(lock_sem)/msleep() loop, which in turn makes the second down_read call benign since it will never block behind the writer while holding lock_sem. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed--by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | | CIFS: Fix use after free of file info structuresPavel Shilovsky2019-10-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the code assumes that if a file info entry belongs to lists of open file handles of an inode and a tcon then it has non-zero reference. The recent changes broke that assumption when putting the last reference of the file info. There may be a situation when a file is being deleted but nothing prevents another thread to reference it again and start using it. This happens because we do not hold the inode list lock while checking the number of references of the file info structure. Fix this by doing the proper locking when doing the check. Fixes: 487317c99477d ("cifs: add spinlock for the openFileList to cifsInodeInfo") Fixes: cb248819d209d ("cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic") Cc: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | CIFS: Fix retry mid list corruption on reconnectsPavel Shilovsky2019-10-252-20/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the client hits reconnect it iterates over the mid pending queue marking entries for retry and moving them to a temporary list to issue callbacks later without holding GlobalMid_Lock. In the same time there is no guarantee that mids can't be removed from the temporary list or even freed completely by another thread. It may cause a temporary list corruption: [ 430.454897] list_del corruption. prev->next should be ffff98d3a8f316c0, but was 2e885cb266355469 [ 430.464668] ------------[ cut here ]------------ [ 430.466569] kernel BUG at lib/list_debug.c:51! [ 430.468476] invalid opcode: 0000 [#1] SMP PTI [ 430.470286] CPU: 0 PID: 13267 Comm: cifsd Kdump: loaded Not tainted 5.4.0-rc3+ #19 [ 430.473472] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 430.475872] RIP: 0010:__list_del_entry_valid.cold+0x31/0x55 ... [ 430.510426] Call Trace: [ 430.511500] cifs_reconnect+0x25e/0x610 [cifs] [ 430.513350] cifs_readv_from_socket+0x220/0x250 [cifs] [ 430.515464] cifs_read_from_socket+0x4a/0x70 [cifs] [ 430.517452] ? try_to_wake_up+0x212/0x650 [ 430.519122] ? cifs_small_buf_get+0x16/0x30 [cifs] [ 430.521086] ? allocate_buffers+0x66/0x120 [cifs] [ 430.523019] cifs_demultiplex_thread+0xdc/0xc30 [cifs] [ 430.525116] kthread+0xfb/0x130 [ 430.526421] ? cifs_handle_standard+0x190/0x190 [cifs] [ 430.528514] ? kthread_park+0x90/0x90 [ 430.530019] ret_from_fork+0x35/0x40 Fix this by obtaining extra references for mids being retried and marking them as MID_DELETED which indicates that such a mid has been dequeued from the pending list. Also move mid cleanup logic from DeleteMidQEntry to _cifs_mid_q_entry_release which is called when the last reference to a particular mid is put. This allows to avoid any use-after-free of response buffers. The patch needs to be backported to stable kernels. A stable tag is not mentioned below because the patch doesn't apply cleanly to any actively maintained stable kernel. Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-and-tested-by: David Wysochanski <dwysocha@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | cifs: Fix missed free operationsChuhong Yuan2019-10-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs_setattr_nounix has two paths which miss free operations for xid and fullpath. Use goto cifs_setattr_exit like other paths to fix them. CC: Stable <stable@vger.kernel.org> Fixes: aa081859b10c ("cifs: flush before set-info if we have writeable handles") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | | CIFS: avoid using MID 0xFFFFRoberto Bergantinos Corpas2019-10-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to MS-CIFS specification MID 0xFFFF should not be used by the CIFS client, but we actually do. Besides, this has proven to cause races leading to oops between SendReceive2/cifs_demultiplex_thread. On SMB1, MID is a 2 byte value easy to reach in CurrentMid which may conflict with an oplock break notification request coming from server Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
| * | | | cifs: clarify comment about timestamp granularity for old serversSteve French2019-10-211-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It could be confusing why we set granularity to 1 seconds rather than 2 seconds (1 second is the max the VFS allows) for these mounts to very old servers ... Signed-off-by: Steve French <stfrench@microsoft.com>