summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* x86/boot: Add error_putdec() helperH. Peter Anvin2024-02-062-10/+25
| | | | | | | | | | | Add a helper to print decimal numbers to early console. Suggested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Jun'ichi Nomura <junichi.nomura@nec.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/lkml/20240123112624.GBZa-iYP1l9SSYtr-V@fat_crate.local/ Link: https://lore.kernel.org/r/20240202035052.17963-1-junichi.nomura@nec.com
* x86/startup_64: Drop long return to initial_code pointerArd Biesheuvel2024-01-311-32/+3
| | | | | | | | | | | | | | | | Since 866b556efa12 ("x86/head/64: Install startup GDT") the primary startup sequence sets the code segment register (CS) to __KERNEL_CS before calling into the startup code shared between primary and secondary boot. This means a simple indirect call is sufficient here. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240129180502.4069817-24-ardb+git@google.com
* Linux 6.8-rc2v6.8-rc2Linus Torvalds2024-01-291-1/+1
|
* Merge tag 'cxl-fixes-6.8-rc2' of ↵Linus Torvalds2024-01-285-13/+23
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "A build regression fix, a device compatibility fix, and an original bug preventing creation of large (16 device) interleave sets: - Fix unit test build regression fallout from global "missing-prototypes" change - Fix compatibility with devices that do not support interrupts - Fix overflow when calculating the capacity of large interleave sets" * tag 'cxl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region:Fix overflow issue in alloc_hpa() cxl/pci: Skip irq features if MSI/MSI-X are not supported tools/testing/nvdimm: Disable "missing prototypes / declarations" warnings tools/testing/cxl: Disable "missing prototypes / declarations" warnings
| * cxl/region:Fix overflow issue in alloc_hpa()Quanquan Cao2024-01-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Creating a region with 16 memory devices caused a problem. The div_u64_rem function, used for dividing an unsigned 64-bit number by a 32-bit one, faced an issue when SZ_256M * p->interleave_ways. The result surpassed the maximum limit of the 32-bit divisor (4G), leading to an overflow and a remainder of 0. note: At this point, p->interleave_ways is 16, meaning 16 * 256M = 4G To fix this issue, I replaced the div_u64_rem function with div64_u64_rem and adjusted the type of the remainder. Signed-off-by: Quanquan Cao <caoqq@fujitsu.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Fixes: 23a22cd1c98b ("cxl/region: Allocate HPA capacity to regions") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * cxl/pci: Skip irq features if MSI/MSI-X are not supportedIra Weiny2024-01-221-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CXL 3.1 Section 3.1.1 states: "A Function on a CXL device must not generate INTx messages if that Function participates in CXL.cache protocol or CXL.mem protocols." The generic CXL memory driver only supports devices which use the CXL.mem protocol. The current driver attempts to allocate MSI/MSI-X vectors in anticipation of their need for mailbox interrupts or event processing. However, the above requirement does not require a device to support interrupts, only that they use MSI/MSI-X. For example, a device may disable mailbox interrupts and either be configured for firmware first or skip event processing and function. Dave Larsen reported that the following Intel / Agilex card does not support interrupts on function 0. CXL: Intel Corporation Device 0ddb (rev 01) (prog-if 10 [CXL Memory Device (CXL 2.x)]) Rather than fail device probe if interrupts are not supported; flag that irqs are not enabled and avoid features which require interrupts. Emit messages appropriate for the situation to aid in debugging should device behavior be unexpected due to a failure to allocate vectors. Note that it is possible for a device to have host based event processing through polling. However, the driver does not support polling and it is not anticipated to be generally required. Leave that functionality to a future patch if such a device comes along. Reported-by: Dave Larsen <davelarsen58@gmail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-and-tested-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20240117-dont-fail-irq-v2-1-f33f26b0e365@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * tools/testing/nvdimm: Disable "missing prototypes / declarations" warningsDan Williams2024-01-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prevent warnings of the form: tools/testing/nvdimm/config_check.c:4:6: error: no previous prototype for ‘check’ [-Werror=missing-prototypes] ...by locally disabling some warnings. It turns out that: Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally") ...in addition to expanding in-tree coverage, also impacts out-of-tree module builds like those in tools/testing/nvdimm/. Filter out the warning options on unit test code that does not effect mainline builds. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/170543984331.460832.1780246477583036191.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * tools/testing/cxl: Disable "missing prototypes / declarations" warningsDan Williams2024-01-222-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prevent warnings of the form: tools/testing/cxl/test/mock.c:44:6: error: no previous prototype for ‘__wrap_is_acpi_device_node’ [-Werror=missing-prototypes] tools/testing/cxl/test/mock.c:63:5: error: no previous prototype for ‘__wrap_acpi_table_parse_cedt’ [-Werror=missing-prototypes] tools/testing/cxl/test/mock.c:81:13: error: no previous prototype for ‘__wrap_acpi_evaluate_integer’ [-Werror=missing-prototypes] ...by locally disabling some warnings. It turns out that: Commit 0fcb70851fbf ("Makefile.extrawarn: turn on missing-prototypes globally") ...in addition to expanding in-tree coverage, also impacts out-of-tree module builds like those in tools/testing/cxl/. Filter out the warning options on unit test code that does not effect mainline builds. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | Merge tag 'mips-fixes_6.8_1' of ↵Linus Torvalds2024-01-2834-230/+83
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix boot issue on single core Lantiq Danube devices - fix boot issue on Loongson64 platforms - fix improper FPU setup - fix missing prototypes issues * tag 'mips-fixes_6.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan MIPS: loongson64: set nid for reserved memblock region Revert "MIPS: loongson64: set nid for reserved memblock region" MIPS: lantiq: register smp_ops on non-smp platforms MIPS: loongson64: set nid for reserved memblock region MIPS: reserve exception vector space ONLY ONCE MIPS: BCM63XX: Fix missing prototypes MIPS: sgi-ip32: Fix missing prototypes MIPS: sgi-ip30: Fix missing prototypes MIPS: fw arc: Fix missing prototypes MIPS: sgi-ip27: Fix missing prototypes MIPS: Alchemy: Fix missing prototypes MIPS: Cobalt: Fix missing prototypes
| * | mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nanXi Ruoyao2024-01-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we still own the FPU after initializing fcr31, when we are preempted the dirty value in the FPU will be read out and stored into fcr31, clobbering our setting. This can cause an improper floating-point environment after execve(). For example: zsh% cat measure.c #include <fenv.h> int main() { return fetestexcept(FE_INEXACT); } zsh% cc measure.c -o measure -lm zsh% echo $((1.0/3)) # raising FE_INEXACT 0.33333333333333331 zsh% while ./measure; do ; done (stopped in seconds) Call lose_fpu(0) before setting fcr31 to prevent this. Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: loongson64: set nid for reserved memblock regionHuang Pei2024-01-272-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | Revert "MIPS: loongson64: set nid for reserved memblock region"Thomas Bogendoerfer2024-01-272-4/+0
| | | | | | | | | | | | | | | | | | This reverts commit ce7b1b97776ec0b068c4dd6b6dbb48ae09a23519. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: lantiq: register smp_ops on non-smp platformsAleksander Jan Bajkowski2024-01-261-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lantiq uses a common kernel config for devices with 24Kc and 34Kc cores. The changes made previously to add support for interrupts on all cores work on 24Kc platforms with SMP disabled and 34Kc platforms with SMP enabled. This patch fixes boot issues on Danube (single core 24Kc) with SMP enabled. Fixes: 730320fd770d ("MIPS: lantiq: enable all hardware interrupts on second VPE") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: loongson64: set nid for reserved memblock regionHuang Pei2024-01-262-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: reserve exception vector space ONLY ONCEHuang Pei2024-01-261-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "cpu_probe" is called both by BP and APs, but reserving exception vector (like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is too late since memblock is unavailable at that time. So, reserve exception vector ONLY by BP. Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: BCM63XX: Fix missing prototypesFlorian Fainelli2024-01-267-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the symbols for which we do not have a prototype can actually be made static and for the few that cannot, there is already a declaration in a header for it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: sgi-ip32: Fix missing prototypesThomas Bogendoerfer2024-01-227-8/+27
| | | | | | | | | | | | | | | | | | | | | | | | Fix interrupt function prototypes, move all prototypes into a new file ip32-common.h and include it where needed. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| * | MIPS: sgi-ip30: Fix missing prototypesThomas Bogendoerfer2024-01-222-0/+2
| | | | | | | | | | | | | | | | | | | | | Include needed header files. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| * | MIPS: fw arc: Fix missing prototypesThomas Bogendoerfer2024-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | Make ArcGetMemoryDescriptor() static since it's only needed internally. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| * | MIPS: sgi-ip27: Fix missing prototypesThomas Bogendoerfer2024-01-227-204/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix missing prototypes by making not shared functions static and adding others to ip27-common.h. Also drop ip27-hubio.c as it's not used for a long time. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| * | MIPS: Alchemy: Fix missing prototypesFlorian Fainelli2024-01-223-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a number of missing prototypes warnings for board_setup(), alchemy_set_lpj() and prom_init_cmdline(), prom_getenv() and prom_get_ethernet_addr(). Fix those by providing definitions for the first two functions in au1000.h which is included everywhere relevant, and including prom.h for the last three. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
| * | MIPS: Cobalt: Fix missing prototypesFlorian Fainelli2024-01-222-3/+3
| |/ | | | | | | | | | | | | | | | | Fix missing prototypes warnings for cobalt_machine_halt() and cobalt_machine_restart() by moving their prototypes to cobalt.h which is included by setup.c. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
* | Merge tag 'locking_urgent_for_v6.8_rc2' of ↵Linus Torvalds2024-01-282-6/+20
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: - Prevent an inconsistent futex operation leading to stale state exposure * tag 'locking_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Prevent the reuse of stale pi_state
| * | futex: Prevent the reuse of stale pi_stateSebastian Andrzej Siewior2024-01-192-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Slaby reported a futex state inconsistency resulting in -EINVAL during a lock operation for a PI futex. It requires that the a lock process is interrupted by a timeout or signal: T1 Owns the futex in user space. T2 Tries to acquire the futex in kernel (futex_lock_pi()). Allocates a pi_state and attaches itself to it. T2 Times out and removes its rt_waiter from the rt_mutex. Drops the rtmutex lock and tries to acquire the hash bucket lock to remove the futex_q. The lock is contended and T2 schedules out. T1 Unlocks the futex (futex_unlock_pi()). Finds a futex_q but no rt_waiter. Unlocks the futex (do_uncontended) and makes it available to user space. T3 Acquires the futex in user space. T4 Tries to acquire the futex in kernel (futex_lock_pi()). Finds the existing futex_q of T2 and tries to attach itself to the existing pi_state. This (attach_to_pi_state()) fails with -EINVAL because uval contains the TID of T3 but pi_state points to T1. It's incorrect to unlock the futex and make it available for user space to acquire as long as there is still an existing state attached to it in the kernel. T1 cannot hand over the futex to T2 because T2 already gave up and started to clean up and is blocked on the hash bucket lock, so T2's futex_q with the pi_state pointing to T1 is still queued. T2 observes the futex_q, but ignores it as there is no waiter on the corresponding rt_mutex and takes the uncontended path which allows the subsequent caller of futex_lock_pi() (T4) to observe that stale state. To prevent this the unlock path must dequeue all futex_q entries which point to the same pi_state when there is no waiter on the rt mutex. This requires obviously to make the dequeue conditional in the locking path to prevent a double dequeue. With that it's guaranteed that user space cannot observe an uncontended futex which has kernel state attached. Fixes: fbeb558b0dd0d ("futex/pi: Fix recursive rt_mutex waiter state") Reported-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20240118115451.0TkD_ZhB@linutronix.de Closes: https://lore.kernel.org/all/4611bcf2-44d0-4c34-9b84-17406f881003@kernel.org
* | | Merge tag 'irq_urgent_for_v6.8_rc2' of ↵Linus Torvalds2024-01-281-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Borislav Petkov: - Initialize the resend node of each IRQ descriptor, not only the first one * tag 'irq_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Initialize resend_node hlist for all interrupt descriptors
| * | | genirq: Initialize resend_node hlist for all interrupt descriptorsDawei Li2024-01-241-1/+1
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a CONFIG_SPARSE_IRQ=n kernel, early_irq_init() is supposed to initialize all interrupt descriptors. It does except for irq_desc::resend_node, which ia only initialized for the first descriptor. Use the indexed decriptor and not the base pointer to address that. Fixes: bc06a9e08742 ("genirq: Use hlist for managing resend handlers") Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122085716.2999875-5-dawei.li@shingroup.cn
* | | Merge tag 'timers_urgent_for_v6.8_rc2' of ↵Linus Torvalds2024-01-282-1/+29
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Preserve the number of idle calls and sleep entries across CPU hotplug events in order to be able to compute correct averages - Limit the duration of the clocksource watchdog checking interval as too long intervals lead to wrongly marking the TSC as unstable * tag 'timers_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/sched: Preserve number of idle sleeps across CPU hotplug events clocksource: Skip watchdog check for large watchdog intervals
| * | | tick/sched: Preserve number of idle sleeps across CPU hotplug eventsTim Chen2024-01-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug") preserved total idle sleep time and iowait sleeptime across CPU hotplug events. Similar reasoning applies to the number of idle calls and idle sleeps to get the proper average of sleep time per idle invocation. Preserve those fields too. Fixes: 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug") Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122233534.3094238-1-tim.c.chen@linux.intel.com
| * | | clocksource: Skip watchdog check for large watchdog intervalsJiri Wiesner2024-01-251-1/+24
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been reports of the watchdog marking clocksources unstable on machines with 8 NUMA nodes: clocksource: timekeeping watchdog on CPU373: Marking clocksource 'tsc' as unstable because the skew is too large: clocksource: 'hpet' wd_nsec: 14523447520 clocksource: 'tsc' cs_nsec: 14524115132 The measured clocksource skew - the absolute difference between cs_nsec and wd_nsec - was 668 microseconds: cs_nsec - wd_nsec = 14524115132 - 14523447520 = 667612 The kernel used 200 microseconds for the uncertainty_margin of both the clocksource and watchdog, resulting in a threshold of 400 microseconds (the md variable). Both the cs_nsec and the wd_nsec value indicate that the readout interval was circa 14.5 seconds. The observed behaviour is that watchdog checks failed for large readout intervals on 8 NUMA node machines. This indicates that the size of the skew was directly proportinal to the length of the readout interval on those machines. The measured clocksource skew, 668 microseconds, was evaluated against a threshold (the md variable) that is suited for readout intervals of roughly WATCHDOG_INTERVAL, i.e. HZ >> 1, which is 0.5 second. The intention of 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold") was to tighten the threshold for evaluating skew and set the lower bound for the uncertainty_margin of clocksources to twice WATCHDOG_MAX_SKEW. Later in c37e85c135ce ("clocksource: Loosen clocksource watchdog constraints"), the WATCHDOG_MAX_SKEW constant was increased to 125 microseconds to fit the limit of NTP, which is able to use a clocksource that suffers from up to 500 microseconds of skew per second. Both the TSC and the HPET use default uncertainty_margin. When the readout interval gets stretched the default uncertainty_margin is no longer a suitable lower bound for evaluating skew - it imposes a limit that is far stricter than the skew with which NTP can deal. The root causes of the skew being directly proportinal to the length of the readout interval are: * the inaccuracy of the shift/mult pairs of clocksources and the watchdog * the conversion to nanoseconds is imprecise for large readout intervals Prevent this by skipping the current watchdog check if the readout interval exceeds 2 * WATCHDOG_INTERVAL. Considering the maximum readout interval of 2 * WATCHDOG_INTERVAL, the current default uncertainty margin (of the TSC and HPET) corresponds to a limit on clocksource skew of 250 ppm (microseconds of skew per second). To keep the limit imposed by NTP (500 microseconds of skew per second) for all possible readout intervals, the margins would have to be scaled so that the threshold value is proportional to the length of the actual readout interval. As for why the readout interval may get stretched: Since the watchdog is executed in softirq context the expiration of the watchdog timer can get severely delayed on account of a ksoftirqd thread not getting to run in a timely manner. Surely, a system with such belated softirq execution is not working well and the scheduling issue should be looked into but the clocksource watchdog should be able to deal with it accordingly. Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold") Suggested-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Jiri Wiesner <jwiesner@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Feng Tang <feng.tang@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122172350.GA740@incl
* | | Merge tag 'x86_urgent_for_v6.8_rc2' of ↵Linus Torvalds2024-01-286-12/+50
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure 32-bit syscall registers are properly sign-extended - Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater Forest CPU model number - Make a stub function export non-GPL because it is part of the paravirt alternatives and that can be used by non-GPL code * tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5 x86/entry/ia32: Ensure s32 is sign extended to s64 x86/cpu: Add model number for Intel Clearwater Forest processor x86/CPU/AMD: Add X86_FEATURE_ZEN5 x86/paravirt: Make BUG_func() usable by non-GPL modules
| * | | x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5Mario Limonciello2024-01-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add model ranges starting at 0x20, 0x40 and 0x70 to the synthetic feature flag X86_FEATURE_ZEN5. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240124220749.2983-1-mario.limonciello@amd.com
| * | | x86/entry/ia32: Ensure s32 is sign extended to s64Richard Palethorpe2024-01-242-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Presently ia32 registers stored in ptregs are unconditionally cast to unsigned int by the ia32 stub. They are then cast to long when passed to __se_sys*, but will not be sign extended. This takes the sign of the syscall argument into account in the ia32 stub. It still casts to unsigned int to avoid implementation specific behavior. However then casts to int or unsigned int as necessary. So that the following cast to long sign extends the value. This fixes the io_pgetevents02 LTP test when compiled with -m32. Presently the systemcall io_pgetevents_time64() unexpectedly accepts -1 for the maximum number of events. It doesn't appear other systemcalls with signed arguments are effected because they all have compat variants defined and wired up. Fixes: ebeb8c82ffaf ("syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32") Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com> Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240110130122.3836513-1-nik.borisov@suse.com Link: https://lore.kernel.org/ltp/20210921130127.24131-1-rpalethorpe@suse.com/
| * | | x86/cpu: Add model number for Intel Clearwater Forest processorTony Luck2024-01-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Server product based on the Atom Darkmont core. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240117191844.56180-1-tony.luck@intel.com
| * | | x86/CPU/AMD: Add X86_FEATURE_ZEN5Borislav Petkov (AMD)2024-01-232-7/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a synthetic feature flag for Zen5. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240104201138.5072-1-bp@alien8.de
| * | | x86/paravirt: Make BUG_func() usable by non-GPL modulesJuergen Gross2024-01-221-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several inlined functions subject to paravirt patching are referencing BUG_func() after the recent switch to the alternative patching mechanism. As those functions can legally be used by non-GPL modules, BUG_func() must be usable by those modules, too. So use EXPORT_SYMBOL() when exporting BUG_func(). Fixes: 9824b00c2b58 ("x86/paravirt: Move some functions and defines to alternative.c") Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240109082232.22657-1-jgross@suse.com
* | | Merge tag 'fixes-2024-01-28' of ↵Linus Torvalds2024-01-281-0/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix crash when reserved memory is not added to memory. When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, the initialization of reserved pages may cause access of NODE_DATA() with invalid nid and crash. Add a fall back to early_pfn_to_nid() in memmap_init_reserved_pages() to ensure a valid node id is always passed to init_reserved_page()" * tag 'fixes-2024-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock: fix crash when reserved memory is not added to memory
| * | | memblock: fix crash when reserved memory is not added to memoryYajun Deng2024-01-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") nid of a reserved region is used by init_reserved_page() (with CONFIG_DEFERRED_STRUCT_PAGE_INIT=y) to access node strucure. In many cases the nid of the reserved memory is not set and this causes a crash. When the nid of a reserved region is not set, fall back to early_pfn_to_nid(), so that nid of the first_online_node will be passed to init_reserved_page(). Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Link: https://lore.kernel.org/r/20240118061853.2652295-1-yajun.deng@linux.dev [rppt: massaged the commit message] Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
* | | | Merge tag 'platform-drivers-x86-v6.8-2' of ↵Linus Torvalds2024-01-2714-177/+460
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - WMI bus driver fixes - Second attempt (previously reverted) at P2SB PCI rescan deadlock fix - AMD PMF driver improvements - MAINTAINERS updates - Misc other small fixes and hw-id additions * tag 'platform-drivers-x86-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: touchscreen_dmi: Add info for the TECLAST X16 Plus tablet platform/x86/intel/ifs: Call release_firmware() when handling errors. platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data() platform/x86/amd/pmf: Get ambient light information from AMD SFH driver platform/x86/amd/pmf: Get Human presence information from AMD SFH driver platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events platform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full MAINTAINERS: remove defunct acpi4asus project info from asus notebooks section MAINTAINERS: add Luke Jones as maintainer for asus notebooks MAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainer platform/x86: silicom-platform: Add missing "Description:" for power_cycle sysfs attr platform/x86: intel-wmi-sbl-fw-update: Fix function name in error message platform/x86: p2sb: Use pci_resource_n() in p2sb_read_bar0() platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe platform/x86: intel-uncore-freq: Fix types in sysfs callbacks platform/x86: wmi: Fix wmi_dev_probe() platform/x86: wmi: Fix notify callback locking platform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list platform/x86: wmi: Return immediately if an suitable WMI event is found platform/x86: wmi: Fix error handling in legacy WMI notify handler functions
| * | | | platform/x86: touchscreen_dmi: Add info for the TECLAST X16 Plus tabletPhoenix Chen2024-01-261-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add touch screen info for TECLAST X16 Plus tablet. Signed-off-by: Phoenix Chen <asbeltogf@gmail.com> Link: https://lore.kernel.org/r/20240126095308.5042-1-asbeltogf@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/x86/intel/ifs: Call release_firmware() when handling errors.Jithu Joseph2024-01-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Missing release_firmware() due to error handling blocked any future image loading. Fix the return code and release_fiwmare() to release the bad image. Fixes: 25a76dbb36dd ("platform/x86/intel/ifs: Validate image size") Reported-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240125082254.424859-2-ashok.raj@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data()Cong Liu2024-01-261-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | amd_pmf_get_pb_data() will allocate memory for the policy buffer, but does not free it if copy_from_user() fails. This leads to a memory leak. Fixes: 10817f28e533 ("platform/x86/amd/pmf: Add capability to sideload of policy binary") Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Link: https://lore.kernel.org/r/20240124012939.6550-1-liucong2@kylinos.cn Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/x86/amd/pmf: Get ambient light information from AMD SFH driverShyam Sundar S K2024-01-261-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AMD SFH driver has APIs defined to export the ambient light information; use this within the PMF driver to send inputs to the PMF TA, so that PMF driver can enact to the actions coming from the TA. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240123141458.3715211-2-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/x86/amd/pmf: Get Human presence information from AMD SFH driverShyam Sundar S K2024-01-262-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AMD SFH driver has APIs defined to export the human presence information; use this within the PMF driver to send inputs to the PMF TA, so that PMF driver can enact to the actions coming from the TA. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240123141458.3715211-1-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace eventsShravan Kumar Ramani2024-01-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The event selector fields for 2 counters are contained in one 32-bit register and the current logic does not account for this. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/8834cfa496c97c7c2fcebcfca5a2aa007e20ae96.1705485095.git.shravankr@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is fullLiming Sun2024-01-221-0/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting from Linux 5.16 kernel, Tx timeout mechanism was added in the virtio_net driver which prints the "Tx timeout" warning message when a packet stays in Tx queue for too long. Below is an example of the reported message: "[494105.316739] virtio_net virtio1 tmfifo_net0: TX timeout on queue: 0, sq: output.0, vq: 0×1, name: output.0, usecs since last trans: 3079892256". This issue could happen when external host driver which drains the FIFO is restared, stopped or upgraded. To avoid such confusing "Tx timeout" messages, this commit adds logic to drop the outstanding Tx packet if it's not able to transmit in two seconds due to Tx FIFO full, which can be considered as congestion or out-of-resource drop. This commit also handles the special case that the packet is half- transmitted into the Tx FIFO. In such case, the packet is discarded with remaining length stored in vring->rem_padding. So paddings with zeros can be sent out when Tx space is available to maintain the integrity of the packet format. The padded packet will be dropped on the receiving side. Signed-off-by: Liming Sun <limings@nvidia.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240111173106.96958-1-limings@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | MAINTAINERS: remove defunct acpi4asus project info from asus notebooks sectionHans de Goede2024-01-221-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The acpi4asus project appears to be defunct, according to: https://sourceforge.net/p/acpi4asus/mailman/acpi4asus-user/ the last posts to the list were done in May 2020 and even then they were mostly spam. And the http://acpi4asus.sf.net website still talks about 2.6.x kernels. Drop the defunct mailing-list and update the W: entry to point to the new up2date https://asus-linux.org/ site. Cc: Corentin Chary <corentin.chary@gmail.com> Cc: Luke D. Jones <luke@ljones.dev> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | MAINTAINERS: add Luke Jones as maintainer for asus notebooksLuke D. Jones2024-01-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add myself as maintainer for "ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS" as suggested by Hans de Goede based on my history of contributions. Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20240115211829.48251-1-luke@ljones.dev Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | MAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainerHeiner Kallweit2024-01-221-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent mails to his Dell address bounced with "user unknown". So remove him as maintainer. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/c9757d0a-2046-464b-93e1-a2d9ab0ce36b@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
| * | | | platform/x86: silicom-platform: Add missing "Description:" for power_cycle ↵Hans de Goede2024-01-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysfs attr The Documentation/ABI/testing/sysfs-platform-silicom entry for the power_cycle sysfs attr is missing the "Description:" keyword, add this. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240108140655.547261-1-hdegoede@redhat.com
| * | | | platform/x86: intel-wmi-sbl-fw-update: Fix function name in error messageArmin Wolf2024-01-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since when the driver was converted to use the bus-based WMI interface, the old GUID-based WMI functions are not used anymore. Update the error message to avoid confusing users. Compile-tested only. Fixes: 75c487fcb69c ("platform/x86: intel-wmi-sbl-fw-update: Use bus-based WMI interface") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240106224126.13803-1-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>