summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Documentation: core-api/cpuhotplug: Rewrite the API sectionThomas Gleixner2021-09-112-121/+590
| | | | | | | | | | | | | Dave stumbled over the incomplete and confusing documentation of the CPU hotplug API. Rewrite it, add the missing function documentations and correct the existing ones. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210909123212.489059409@linutronix.de
* cpu/hotplug: Remove deprecated CPU-hotplug functions.Sebastian Andrzej Siewior2021-09-111-6/+0
| | | | | | | | | | | No users in tree use the deprecated CPU-hotplug functions anymore. Remove them. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-39-bigeasy@linutronix.de
* thermal: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior2021-09-111-2/+2
| | | | | | | | | | | | | | The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-20-bigeasy@linutronix.de
* Merge branch 'linus' into smp/urgentThomas Gleixner2021-09-118966-226476/+530193
|\ | | | | | | | | | | Ensure that all usage sites of get/put_online_cpus() except for the struggler in drivers/thermal are gone. So the last user and the deprecated inlines can be removed.
| * Merge tag 'acpi-5.15-rc1-3' of ↵Linus Torvalds2021-09-103-3/+9
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These prevent a confusing PRMT-related message from being printed, drop an unnecessary header file include and update the list of ACPICA maintainers. Specifics: - Prevent a message about missing PRMT from being printed on systems that do not support PRM, which are the majority now (Aubrey Li). - Drop unnecessary header include from scan.c (Kari Argillander). - Update the list of ACPICA maintainers after recent departure of one of them (Rafael Wysocki)" * tag 'acpi-5.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: Update the list of maintainers ACPI: PRM: Find PRMT table before parsing it ACPI: scan: Remove unneeded header linux/nls.h
| | *-. Merge branches 'acpi-scan' and 'acpi-prm'Rafael J. Wysocki2021-09-102-2/+9
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * acpi-scan: ACPI: scan: Remove unneeded header linux/nls.h * acpi-prm: ACPI: PRM: Find PRMT table before parsing it
| | | | * ACPI: PRM: Find PRMT table before parsing itAubrey Li2021-09-081-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Find and verify PRMT before parsing it, which eliminates a warning on machines without PRMT: [ 7.197173] ACPI: PRMT not present Fixes: cefc7ca46235 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Cc: 5.14+ <stable@vger.kernel.org> # 5.14+ [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | * | ACPI: scan: Remove unneeded header linux/nls.hKari Argillander2021-09-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code that use linux/nls.h was moved to device_sysfs.c by commit c2efefb33abf ("ACPI / scan: Move sysfs-related device code to a separate file") Remove this include so that complier has easier times and it would be easier to grep where nls code is used. Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | * | | ACPICA: Update the list of maintainersRafael J. Wysocki2021-09-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Erik Kaneda will not be maintaining ACPICA any more, so drop his address (which doesn't work any more anyway) from the maintainer list. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * | | | Merge tag 'pm-5.15-rc1-3' of ↵Linus Torvalds2021-09-108-155/+138
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These improve hybrid processors support in intel_pstate, fix an issue in the core devices PM code, clean up the handling of dedicated wake IRQs, update the Energy Model documentation and update MAINTAINERS. Specifics: - Make the HWP performance levels calibration on hybrid processors in intel_pstate more straightforward (Rafael Wysocki). - Prevent the PM core from leaving devices in suspend after a failing system-wide suspend transition in some cases when driver PM flags are used (Prasad Sodagudi). - Drop unused function argument from the dedicated wake IRQs handling code (Sergey Shtylyov). - Fix up Energy Model kerneldoc comments and include them in the Energy Model documentation (Lukasz Luba). - Use my kernel.org address in MAINTAINERS insead of the personal one (Rafael Wysocki)" * tag 'pm-5.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: MAINTAINERS: Change Rafael's e-mail address PM: sleep: core: Avoid setting power.must_resume to false Documentation: power: include kernel-doc in Energy Model doc PM: EM: fix kernel-doc comments cpufreq: intel_pstate: hybrid: Rework HWP calibration ACPI: CPPC: Introduce cppc_get_nominal_perf() PM: sleep: wakeirq: drop useless parameter from dev_pm_attach_wake_irq()
| | | \ \ \
| | | \ \ \
| | | \ \ \
| | | \ \ \
| | *---. \ \ \ Merge branches 'pm-cpufreq', 'pm-sleep' and 'pm-em'Rafael J. Wysocki2021-09-107-145/+128
| | |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-cpufreq: cpufreq: intel_pstate: hybrid: Rework HWP calibration ACPI: CPPC: Introduce cppc_get_nominal_perf() * pm-sleep: PM: sleep: core: Avoid setting power.must_resume to false PM: sleep: wakeirq: drop useless parameter from dev_pm_attach_wake_irq() * pm-em: Documentation: power: include kernel-doc in Energy Model doc PM: EM: fix kernel-doc comments
| | | | | * | | | Documentation: power: include kernel-doc in Energy Model docLukasz Luba2021-09-071-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the existing documentation of the Energy Model and add kernel-doc comments. This extends an API description and helps better understanding the usage. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | | | * | | | PM: EM: fix kernel-doc commentsLukasz Luba2021-09-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the kernel-doc comments for the improved Energy Model documentation. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | | * | | | | PM: sleep: core: Avoid setting power.must_resume to falsePrasad Sodagudi2021-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are variables(power.may_skip_resume and dev->power.must_resume) and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after a system wide suspend transition. Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows its "noirq" and "early" resume callbacks to be skipped if the device can be left in suspend after a system-wide transition into the working state. PM core determines that the driver's "noirq" and "early" resume callbacks should be skipped or not with dev_pm_skip_resume() function by checking power.may_skip_resume variable. power.must_resume variable is getting set to false in __device_suspend() function without checking device's DPM_FLAG_MAY_SKIP_RESUME settings. In problematic scenario, where all the devices in the suspend_late stage are successful and some device can fail to suspend in suspend_noirq phase. So some devices successfully suspended in suspend_late stage are not getting chance to execute __device_suspend_noirq() to set dev->power.must_resume variable to true and not getting resumed in early_resume phase. Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before setting power.must_resume variable in __device_suspend function. Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase") Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | | * | | | | PM: sleep: wakeirq: drop useless parameter from dev_pm_attach_wake_irq()Sergey Shtylyov2021-09-071-7/+4
| | | | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function has the 'irq' parameter which isn't ever used, so drop it. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | * | | | | cpufreq: intel_pstate: hybrid: Rework HWP calibrationRafael J. Wysocki2021-09-071-114/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current HWP calibration for hybrid processors in intel_pstate is fragile, because it depends too much on the information provided by the platform firmware via CPPC which may not be reliable enough. It also need not be so complicated. In order to improve that mechanism and make it more resistant to platform firmware issues, make it only use the CPPC nominal_perf values to compute the HWP-to-frequency scaling factors for all CPUs and possibly use the HWP_CAP highest_perf values to recompute them if the ones derived from the CPPC nominal_perf values alone appear to be too high. Namely, fetch CPC.nominal_perf for all CPUs present in the system, find the minimum one and use it as a reference for computing all of the CPUs' scaling factors (using the observation that for the CPUs having the minimum CPC.nominal_perf the HWP range of available performance levels should be the same as the range of available "legacy" P-states and so the HWP-to-frequency scaling factor for them should be the same as the corresponding scaling factor used for representing the P-state values in kHz). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com>
| | | * | | | | ACPI: CPPC: Introduce cppc_get_nominal_perf()Rafael J. Wysocki2021-09-072-16/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some systems the nominal_perf value retrieved via CPPC is just a constant and fetching it doesn't require accessing any registers, so if it is the only CPPC capability that's needed, it is wasteful to run cppc_get_perf_caps() in order to get just that value alone, especially when this is done for CPUs other than the one running the code. For this reason, introduce cppc_get_nominal_perf() allowing nominal_perf to be obtained individually, by generalizing the existing cppc_get_desired_perf() (and renaming it) so it can be used to retrieve any specific CPPC capability value. While at it, clean up the cppc_get_desired_perf() kerneldoc comment a bit. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | * | | | | | MAINTAINERS: Change Rafael's e-mail addressRafael J. Wysocki2021-09-101-10/+10
| | | |_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have been slow to respond to messages going to rjw@rjwysocki.net recently, so change it to rafael@kernel.org (which works better for me) in MAINTAINERS. Signed-off-by: Rafael J. Wysocki <rafael@kernel.org>
| * | | | | | Merge tag 'arm64-fixes' of ↵Linus Torvalds2021-09-101-1/+21
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Limit the linear region to 51-bit when KVM is running in nVHE mode. Otherwise, depending on the placement of the ID map, kernel-VA to hyp-VA translations may produce addresses that either conflict with other HYP mappings or generate addresses outside of the 52-bit addressable range. - Instruct kmemleak not to scan the memory reserved for kdump as this range is removed from the kernel linear map and therefore not accessible. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: kdump: Skip kmemleak scan reserved memory for kdump arm64: mm: limit linear region to 51 bits for KVM in nVHE mode
| | * | | | | | arm64: kdump: Skip kmemleak scan reserved memory for kdumpChen Wandun2021-09-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trying to boot with kdump + kmemleak, command will result in a crash: "echo scan > /sys/kernel/debug/kmemleak" crashkernel reserved: 0x0000000007c00000 - 0x0000000027c00000 (512 MB) Kernel command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.14.0-rc5-next-20210809+ root=/dev/mapper/ao-root ro rd.lvm.lv=ao/root rd.lvm.lv=ao/swap crashkernel=512M Unable to handle kernel paging request at virtual address ffff000007c00000 Mem abort info: ESR = 0x96000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 64k pages, 48-bit VAs, pgdp=00002024f0d80000 [ffff000007c00000] pgd=1800205ffffd0003, p4d=1800205ffffd0003, pud=1800205ffffd0003, pmd=1800205ffffc0003, pte=0068000007c00f06 Internal error: Oops: 96000007 [#1] SMP pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : scan_block+0x98/0x230 lr : scan_block+0x94/0x230 sp : ffff80008d6cfb70 x29: ffff80008d6cfb70 x28: 0000000000000000 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000001 x24: 0000000000000000 x23: ffffa88a6b18b398 x22: ffff000007c00ff9 x21: ffffa88a6ac7fc40 x20: ffffa88a6af6a830 x19: ffff000007c00000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff x14: ffffffff00000000 x13: ffffffffffffffff x12: 0000000000000020 x11: 0000000000000000 x10: 0000000001080000 x9 : ffffa88a6951c77c x8 : ffffa88a6a893988 x7 : ffff203ff6cfb3c0 x6 : ffffa88a6a52b3c0 x5 : ffff203ff6cfb3c0 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff20226cb56a40 x0 : 0000000000000000 Call trace: scan_block+0x98/0x230 scan_gray_list+0x120/0x270 kmemleak_scan+0x3a0/0x648 kmemleak_write+0x3ac/0x4c8 full_proxy_write+0x6c/0xa0 vfs_write+0xc8/0x2b8 ksys_write+0x70/0xf8 __arm64_sys_write+0x24/0x30 invoke_syscall+0x4c/0x110 el0_svc_common+0x9c/0x190 do_el0_svc+0x30/0x98 el0_svc+0x28/0xd8 el0t_64_sync_handler+0x90/0xb8 el0t_64_sync+0x180/0x184 The reserved memory for kdump will be looked up by kmemleak, this area will be set invalid when kdump service is bring up. That will result in crash when kmemleak scan this area. Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210910064844.3827813-1-chenwandun@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| | * | | | | | arm64: mm: limit linear region to 51 bits for KVM in nVHE modeArd Biesheuvel2021-09-091-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM in nVHE mode divides up its VA space into two equal halves, and picks the half that does not conflict with the HYP ID map to map its linear region. This worked fine when the kernel's linear map itself was guaranteed to cover precisely as many bits of VA space, but this was changed by commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations"). The result is that, depending on the placement of the ID map, kernel-VA to hyp-VA translations may produce addresses that either conflict with other HYP mappings (including the ID map itself) or generate addresses outside of the 52-bit addressable range, neither of which is likely to lead to anything useful. Given that 52-bit capable cores are guaranteed to implement VHE, this only affects configurations such as pKVM where we opt into non-VHE mode even if the hardware is VHE capable. So just for these configurations, let's limit the kernel linear map to 51 bits and work around the problem. Fixes: f4693c2716b3 ("arm64: mm: extend linear region for 52-bit VA configurations") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210826165613.60774-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | | | Merge tag 'for-5.15/parisc-3' of ↵Linus Torvalds2021-09-1014-179/+102
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Build warning fixes in Makefile and Dino PCI driver - Fix when sched_clock is marked unstable - Drop strnlen_user() in favour of generic version - Prevent kernel to write outside userspace signal stack - Remove CONFIG_SET_FS including KERNEL_DS and USER_DS from parisc and switch to __get/put_kernel_nofault() * tag 'for-5.15/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Implement __get/put_kernel_nofault() parisc: Mark sched_clock unstable only if clocks are not syncronized parisc: Move pci_dev_is_behind_card_dino to where it is used parisc: Reduce sigreturn trampoline to 3 instructions parisc: Check user signal stack trampoline is inside TASK_SIZE parisc: Drop useless debug info and comments from signal.c parisc: Drop strnlen_user() in favour of generic version parisc: Add missing FORCE prerequisite in Makefile
| | * | | | | | | parisc: Implement __get/put_kernel_nofault()Helge Deller2021-09-096-86/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove CONFIG_SET_FS from parisc, so we need to add __get_kernel_nofault() and __put_kernel_nofault(), define HAVE_GET_KERNEL_NOFAULT and remove set_fs(), get_fs(), load_sr2(), thread_info->addr_limit, KERNEL_DS and USER_DS. The nice side-effect of this patch is that we now can directly access userspace via sr3 without the need to use a temporary sr2 which is either copied from sr3 or set to zero (for kernel space). Signed-off-by: Helge Deller <deller@gmx.de> Suggested-by: Arnd Bergmann <arnd@kernel.org>
| | * | | | | | | parisc: Mark sched_clock unstable only if clocks are not syncronizedHelge Deller2021-09-092-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We check at runtime if the cr16 clocks are stable across CPUs. Only mark the sched_clock unstable by calling clear_sched_clock_stable() if we know that we run on a system which isn't syncronized across CPUs. Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Move pci_dev_is_behind_card_dino to where it is usedGuenter Roeck2021-09-091-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | parisc build test images fail to compile with the following error. drivers/parisc/dino.c:160:12: error: 'pci_dev_is_behind_card_dino' defined but not used Move the function just ahead of its only caller to avoid the error. Fixes: 5fa1659105fa ("parisc: Disable HP HSC-PCI Cards to prevent kernel crash") Cc: Helge Deller <deller@gmx.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Reduce sigreturn trampoline to 3 instructionsHelge Deller2021-09-093-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can move the INSN_LDI_R20 instruction into the branch delay slot. Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Check user signal stack trampoline is inside TASK_SIZEHelge Deller2021-09-091-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add some additional checks to ensure the signal stack is inside userspace bounds. Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Drop useless debug info and comments from signal.cHelge Deller2021-09-091-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Drop strnlen_user() in favour of generic versionHelge Deller2021-09-094-38/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As suggested by Arnd Bergmann, drop the parisc version of strnlen_user() and switch to the generic version. Suggested-by: Arnd Bergmann <arnd@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Helge Deller <deller@gmx.de>
| | * | | | | | | parisc: Add missing FORCE prerequisite in MakefileHelge Deller2021-09-091-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * | | | | | | | Merge tag 'iommu-fixes-v5.15-rc0' of ↵Linus Torvalds2021-09-103-24/+41
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Intel VT-d: - PASID leakage in intel_svm_unbind_mm() - Deadlock in intel_svm_drain_prq() - AMD IOMMU: Fixes for an unhandled page-fault bug when AVIC is used for a KVM guest. - Make CONFIG_IOMMU_DEFAULT_DMA_LAZY architecture instead of IOMMU driver dependent * tag 'iommu-fixes-v5.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Clarify default domain Kconfig iommu/vt-d: Fix a deadlock in intel_svm_drain_prq() iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm() iommu/amd: Remove iommu_init_ga() iommu/amd: Relocate GAMSup check to early_enable_iommus
| | * | | | | | | | iommu: Clarify default domain KconfigRobin Murphy2021-09-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although strictly it is the AMD and Intel drivers which have an existing expectation of lazy behaviour by default, it ends up being rather unintuitive to describe this literally in Kconfig. Express it instead as an architecture dependency, to clarify that it is a valid config-time decision. The end result is the same since virtio-iommu doesn't support lazy mode and thus falls back to strict at runtime regardless. The per-architecture disparity is a matter of historical expectations: the AMD and Intel drivers have been lazy by default since 2008, and changing that gets noticed by people asking where their I/O throughput has gone. Conversely, Arm-based systems with their wider assortment of IOMMU drivers mostly only support strict mode anyway; only the Arm SMMU drivers have later grown support for passthrough and lazy mode, for users who wanted to explicitly trade off isolation for performance. These days, reducing the default level of isolation in a way which may go unnoticed by users who expect otherwise hardly seems worth risking for the sake of one line of Kconfig, so here's where we are. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/69a0c6f17b000b54b8333ee42b3124c1d5a869e2.1631105737.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | * | | | | | | | iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()Fenghua Yu2021-09-091-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); *** DEADLOCK *** 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | * | | | | | | | iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()Fenghua Yu2021-09-091-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | * | | | | | | | iommu/amd: Remove iommu_init_ga()Suravee Suthikulpanit2021-09-091-13/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the function has been simplified and only call iommu_init_ga_log(), remove the function and replace with iommu_init_ga_log() instead. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | * | | | | | | | iommu/amd: Relocate GAMSup check to early_enable_iommusWei Huang2021-09-091-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, iommu_init_ga() checks and disables IOMMU VAPIC support (i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set. However it forgets to clear IRQ_POSTING_CAP from the previously set amd_iommu_irq_ops.capability. This triggers an invalid page fault bug during guest VM warm reboot if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is incorrectly set, and crash the system with the following kernel trace. BUG: unable to handle page fault for address: 0000000000400dd8 RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc Call Trace: svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd] ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm] kvm_request_apicv_update+0x10c/0x150 [kvm] svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd] svm_enable_irq_window+0x26/0xa0 [kvm_amd] vcpu_enter_guest+0xbbe/0x1560 [kvm] ? avic_vcpu_load+0xd5/0x120 [kvm_amd] ? kvm_arch_vcpu_load+0x76/0x240 [kvm] ? svm_get_segment_base+0xa/0x10 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm] kvm_vcpu_ioctl+0x22a/0x5d0 [kvm] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes by moving the initializing of AMD IOMMU interrupt remapping mode (amd_iommu_guest_ir) earlier before setting up the amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag. [joro: Squashed the two patches and limited check_features_on_all_iommus() to CONFIG_IRQ_REMAP to fix a compile warning.] Signed-off-by: Wei Huang <wei.huang2@amd.com> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <jroedel@suse.de>
| * | | | | | | | | Merge tag 'char-misc-5.15-rc1-2' of ↵Linus Torvalds2021-09-1028-790/+3874
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull habanalabs updates from Greg KH: "Here is another round of misc driver patches for 5.15-rc1. In here is only updates for the Habanalabs driver. This request is late because the previously-objected-to dma-buf patches are all removed and some fixes that you and others found are now included in here as well. All of these have been in linux-next for well over a week with no reports of problems, and they are all self-contained to only this one driver. Full details are in the shortlog" * tag 'char-misc-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (61 commits) habanalabs/gaudi: hwmon default card name habanalabs: add support for f/w reset habanalabs/gaudi: block ICACHE_BASE_ADDERESS_HIGH in TPC habanalabs: cannot sleep while holding spinlock habanalabs: never copy_from_user inside spinlock habanalabs: remove unnecessary device status check habanalabs: disable IRQ in user interrupts spinlock habanalabs: add "in device creation" status habanalabs/gaudi: invalidate PMMU mem cache on init habanalabs/gaudi: size should be printed in decimal habanalabs/gaudi: define DC POWER for secured PMC habanalabs/gaudi: unmask out of bounds SLM access interrupt habanalabs: add userptr_lookup node in debugfs habanalabs/gaudi: fetch TPC/MME ECC errors from F/W habanalabs: modify multi-CS to wait on stream masters habanalabs/gaudi: add monitored SOBs to state dump habanalabs/gaudi: restore user registers when context opens habanalabs/gaudi: increase boot fit timeout habanalabs: update to latest firmware headers habanalabs/gaudi: minimize number of register reads ...
| | * \ \ \ \ \ \ \ \ Merge tag 'misc-habanalabs-next-2021-09-01' of ↵Greg Kroah-Hartman2021-09-0128-790/+3874
| | |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next Oded writes: This tag contains habanalabs driver changes for v5.15: - Add a new uAPI (under the cs ioctl) to enable to user to reserve signals and signal them from within its workloads, while the driver performs the waiting. This allows finer granularity of pipelining between the different engines and resource utilization. - Add a new uAPI (under the wait_for_cs ioctl) to allow waiting on multiple command submissions (workloads) at the same time. This is an optimization for the user process so it won't need to call multiple times to the wait_for_cs ioctl. - Add new feature of "state dump", which can be triggered through new debugfs node. This is a similar concept to the kernel panic dump. This new mechanism retrieves information from the device in case one of the workloads that was sent by the user got stuck. This is very helpful for debugging the hang. - Add a new debugfs node to perform lookup of user pointers that are mapped to habana device's pmmu. - Fix to the tracking of user process when running inside a container. - Allow user to map more than 4GB of memory to the device MMU in single IOCTL call. - Minimize number of register reads done in GAUDI during user operation. - Allow user to retrieve the device's server type that the device is connected to. - Several fixes to the code of waiting on interrupts on behalf of the user. - Fixes and improvements to the hint mechanism in our VA allocation. - Update the firmware header files to the latest version while maintaining backward compatibility with older firmware versions. - Multiple fixes to various bugs. * tag 'misc-habanalabs-next-2021-09-01' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (61 commits) habanalabs/gaudi: hwmon default card name habanalabs: add support for f/w reset habanalabs/gaudi: block ICACHE_BASE_ADDERESS_HIGH in TPC habanalabs: cannot sleep while holding spinlock habanalabs: never copy_from_user inside spinlock habanalabs: remove unnecessary device status check habanalabs: disable IRQ in user interrupts spinlock habanalabs: add "in device creation" status habanalabs/gaudi: invalidate PMMU mem cache on init habanalabs/gaudi: size should be printed in decimal habanalabs/gaudi: define DC POWER for secured PMC habanalabs/gaudi: unmask out of bounds SLM access interrupt habanalabs: add userptr_lookup node in debugfs habanalabs/gaudi: fetch TPC/MME ECC errors from F/W habanalabs: modify multi-CS to wait on stream masters habanalabs/gaudi: add monitored SOBs to state dump habanalabs/gaudi: restore user registers when context opens habanalabs/gaudi: increase boot fit timeout habanalabs: update to latest firmware headers habanalabs/gaudi: minimize number of register reads ...
| | | * | | | | | | | | habanalabs/gaudi: hwmon default card nameRajaravi Krishna Katta2021-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit corrects CARD NAME for Gaudi as "HL205" Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: add support for f/w resetOded Gabbay2021-09-015-35/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the f/w runs in secured mode, it can reset the ASIC when certain events occur. In unsecured mode, the driver asks the f/w to reset the ASIC for those events. We need to perform the entire reset procedure but without accessing the ASIC. i.e. without halting the engines and without sending messages to the f/w. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs/gaudi: block ICACHE_BASE_ADDERESS_HIGH in TPCOded Gabbay2021-09-011-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This register shouldn't be modified by user. Prefetch is disabled in Gaudi. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: cannot sleep while holding spinlockfarah kassabri2021-09-012-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix 2 areas in the code where it's possible the code will go to sleep while holding a spinlock. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: never copy_from_user inside spinlockOded Gabbay2021-09-011-23/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | copy_from_user might sleep so we can never call it when we have a spinlock. Moreover, it is not necessary in waiting for user interrupt, because if multiple threads will call this function on the same interrupt, each one will have it's own fence object inside the driver. The user address might be the same, but it doesn't really matter to us, as we only read from it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: remove unnecessary device status checkOded Gabbay2021-09-011-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking if the device is operational when entering the function to wait for user interrupt is not something that is useful or necessary. It is not done in any other wait_for_cs ioctl path. If the device becomes non-operational during the wait, the reset function will make sure the process wait is interrupted. Instead, move the check to the beginning of hl_wait_ioctl(). It will block any attempt to wait on CS or user interrupt once the device is already marked as non-operational. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: disable IRQ in user interrupts spinlockOded Gabbay2021-09-011-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because this spinlock is taken in an interrupt handler, we must use the spin_lock_irqsave/irqrestore version to disable the interrupts on the local CPU. Otherwise, we can have a potential deadlock (if the interrupt handler is scheduled to run on the same cpu that the code who took the lock was running on). Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs: add "in device creation" statusOmer Shpigelman2021-09-015-17/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On init, the disabled state is cleared right before hw_init and that causes the device to report on "Operational" state before the device initialization is finished. Although the char device is not yet exposed to the user at this stage, the sysfs entries are exposed. This can cause errors in monitoring applications that use the sysfs entries. In order to avoid this, a new state "in device creation" is introduced to ne reported when the device is not disabled but is still in init flow. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs/gaudi: invalidate PMMU mem cache on initOded Gabbay2021-09-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This must be done to clear the internal mem cache so we won't get ecc errors on the first invalidation. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs/gaudi: size should be printed in decimalOded Gabbay2021-09-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's more readable for the size to be in decimal. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs/gaudi: define DC POWER for secured PMCOded Gabbay2021-09-012-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In secured mode, the CGM is disabled. Therefore, the DC power is higher. Without taking it into consideration, the utilization is 12-15% at idle. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
| | | * | | | | | | | | habanalabs/gaudi: unmask out of bounds SLM access interruptTomer Tayar2021-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The out of bounds SLM access TPC interrupt indicates a severe compiler bug and needs to be informed to user. This interrupt is currently masked so unmask it. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>