Commit message (Author, Age, Files, Lines)
* Merge tag 'timers-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (Linus Torvalds, 2023-02-21, 27 files, -77/+252)
|\
    Pull timer updates from Thomas Gleixner:

    "Updates for timekeeping, timers and clockevent/source drivers:

    Core:

    - Yet another round of improvements to make the clocksource watchdog more robust:
      - Relax the clocksource-watchdog skew criteria to match the NTP criteria.
      - Temporarily skip the watchdog when high memory latencies are detected which can lead to false-positives.
      - Provide an option to enable TSC skew detection even on systems where TSC is marked as reliable.

      Sigh!

    - Initialize the restart block in the nanosleep syscalls to be directed to the no restart function instead of doing a partial setup on entry. This prevents an erroneous restart_syscall() invocation from corrupting user space data. While such a situation is clearly a user space bug, preventing this is a correctness issue and caters to the least surprise principle.
    - Ignore the hrtimer slack for realtime tasks in schedule_hrtimeout() to align it with the nanosleep semantics.

    Drivers:

    - The obligatory new driver bindings for Mediatek, Rockchip and RISC-V variants.
    - Add support for the C3STOP misfeature to the RISC-V timer to handle the case where the timer stops in deeper idle state.
    - Set up a static key in the RISC-V timer correctly before first use.
    - The usual small improvements and fixes all over the place"

    * tag 'timers-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
      clocksource/drivers/timer-sun4i: Add CLOCK_EVT_FEAT_DYNIRQ
      clocksource/drivers/em_sti: Mark driver as non-removable
      clocksource/drivers/sh_tmu: Mark driver as non-removable
      clocksource/drivers/riscv: Patch riscv_clock_next_event() jump before first use
      clocksource/drivers/timer-microchip-pit64b: Add delay timer
      clocksource/drivers/timer-microchip-pit64b: Select driver only on ARM
      dt-bindings: timer: sifive,clint: add comaptibles for T-Head's C9xx
      dt-bindings: timer: mediatek,mtk-timer: add MT8365
      clocksource/drivers/riscv: Get rid of clocksource_arch_init() callback
      clocksource/drivers/sh_cmt: Mark driver as non-removable
      clocksource/drivers/timer-microchip-pit64b: Drop obsolete dependency on COMPILE_TEST
      clocksource/drivers/riscv: Increase the clock source rating
      clocksource/drivers/timer-riscv: Set CLOCK_EVT_FEAT_C3STOP based on DT
      dt-bindings: timer: Add bindings for the RISC-V timer device
      RISC-V: time: initialize hrtimer based broadcast clock event device
      dt-bindings: timer: rk-timer: Add rktimer for rv1126
      time/debug: Fix memory leak with using debugfs_lookup()
      clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested
      posix-timers: Use atomic64_try_cmpxchg() in __update_gt_cputime()
      clocksource: Verify HPET and PMTMR when TSC unverified
      ...
| * Merge tag 'clocksource.2023.02.06b' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/core (Thomas Gleixner, 2023-02-13, 7 files, -29/+123)
| |\
    Pull clocksource watchdog changes from Paul McKenney:

    o Improvements to clocksource-watchdog console messages.

    o Loosening of the clocksource-watchdog skew criteria to match those of NTP (500 parts per million, relaxed from 400 parts per million). If it is good enough for NTP, it is good enough for the clocksource watchdog.

    o Suspend clocksource-watchdog checking temporarily when high memory latencies are detected. This avoids the false-positive clock-skew events that have been seen on production systems running memory-intensive workloads.

    o On systems where the TSC is deemed trustworthy, use it as the watchdog timesource, but only when specifically requested using the tsc=watchdog kernel boot parameter. This permits clock-skew events to be detected, but avoids forcing workloads to use the slow HPET and ACPI PM timers. These last two timers are slow enough to cause systems to be needlessly marked bad on the one hand, and real skew does sometimes happen on production systems running production workloads on the other. And sometimes it is the fault of the TSC, or at least of the firmware that told the kernel to program the TSC with the wrong frequency.

    o Add a tsc=recalibrate kernel boot parameter to allow the kernel to diagnose cases where the TSC hardware works fine, but was told by firmware to tick at the wrong frequency. Such cases are rare, but they really have happened on production systems.

    Link: https://lore.kernel.org/r/20230210193640.GA3325193@paulmck-ThinkPad-P17-Gen-1
| | * clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested (Paul E. McKenney, 2023-02-07, 2 files, -2/+22)
    Unconditionally enabling TSC watchdog checking of the HPET and PMTMR clocksources can degrade latency and performance. Therefore, provide a new "watchdog" option to the tsc= boot parameter that opts into such checking. Note that tsc=watchdog is overridden by a tsc=nowatchdog regardless of their relative positions in the list of boot parameters.

    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    Reported-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Acked-by: Waiman Long <longman@redhat.com>
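The precedence rule described above can be modeled in a few lines. This is only an illustrative sketch in Python, not the kernel's parser: the function name `tsc_watchdog_requested` is hypothetical, and it assumes each option arrives as its own `tsc=<value>` token on the command line.

```python
def tsc_watchdog_requested(cmdline: str) -> bool:
    """Model of the rule: tsc=watchdog opts in, tsc=nowatchdog always wins.

    Hypothetical helper for illustration; the order of the tokens is
    deliberately ignored, mirroring "regardless of their relative positions".
    """
    values = {
        token[len("tsc="):]
        for token in cmdline.split()
        if token.startswith("tsc=")
    }
    return "watchdog" in values and "nowatchdog" not in values
```

Because the values are collected into a set before the decision is made, `tsc=watchdog tsc=nowatchdog` and `tsc=nowatchdog tsc=watchdog` give the same answer in this model.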
| | * clocksource: Verify HPET and PMTMR when TSC unverified (Paul E. McKenney, 2023-02-02, 4 files, -2/+12)
    On systems with two or fewer sockets, when the boot CPU has CONSTANT_TSC, NONSTOP_TSC, and TSC_ADJUST, clocksource watchdog verification of the TSC is disabled. This works well much of the time, but there is the occasional production-level system that meets all of these criteria, but which still has a TSC that skews significantly from atomic-clock time. This is usually attributed to a firmware or hardware fault. Yes, the various NTP daemons do express their opinions of userspace-to-atomic-clock time skew, but they put them in various places, depending on the daemon and distro in question. It would therefore be good for the kernel to have some clue that there is a problem.

    The old behavior of marking the TSC unstable is a non-starter because a great many workloads simply cannot tolerate the overheads and latencies of the various non-TSC clocksources. In addition, NTP-corrected systems sometimes can tolerate significant kernel-space time skew as long as the userspace time sources are within epsilon of atomic-clock time.

    Therefore, when watchdog verification of TSC is disabled, enable it for HPET and PMTMR (AKA ACPI PM timer). This provides the needed in-kernel time-skew diagnostic without degrading the system's performance.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
    Cc: Waiman Long <longman@redhat.com>
    Cc: <x86@kernel.org>
    Tested-by: Feng Tang <feng.tang@intel.com>
| | * x86/tsc: Add option to force frequency recalibration with HW timer (Feng Tang, 2023-02-02, 2 files, -4/+34)
    The kernel assumes that the TSC frequency which is provided by the hardware / firmware via MSRs or CPUID(0x15) is correct after applying a few basic consistency checks. This disables the TSC recalibration against HPET or PM timer.

    As a result there is no mechanism to validate that frequency in cases where a firmware or hardware defect is suspected. In one case, a user measured the TSC frequency against an atomic clock and reported an inaccuracy, which was later fixed in firmware.

    Add an option 'recalibrate' for the 'tsc' kernel parameter to force TSC frequency recalibration with HPET or PM timer, and warn if the deviation from the previous value is more than about 500 PPM, which provides a way to verify the data from hardware / firmware.

    There is no functional change to the existing work flow.

    Recently there was a real-world case: "The 40ms/s divergence between TSC and HPET was observed on hardware that is quite recent" [1]. On that platform the TSC frequency of 1896 MHz was obtained from CPUID(0x15), while forced recalibration against both HPET and PMTIMER yielded 1975 MHz, which also matched the check from 'chronyd', indicating a BIOS or firmware problem.

    [Thanks tglx for helping improve the commit log]
    [ paulmck: Wordsmith Kconfig help text. ]

    [1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: <x86@kernel.org>
    Cc: <linux-doc@vger.kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
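To put the reported numbers next to the 500 PPM warning threshold, the deviation can be worked out directly. The sketch below is plain arithmetic in Python, not kernel code; the helper name `deviation_ppm` is hypothetical, and expressing the deviation relative to the firmware-reported frequency is an assumption.

```python
def deviation_ppm(reported_hz: float, measured_hz: float) -> float:
    """Relative deviation of the measured frequency, in parts per million.

    Hypothetical helper; the reference value is assumed to be the
    firmware-reported frequency.
    """
    return abs(measured_hz - reported_hz) / reported_hz * 1e6


# Figures quoted in the commit message: CPUID(0x15) vs. forced recalibration.
ppm = deviation_ppm(1896e6, 1975e6)
ms_per_second = ppm / 1000.0  # 1000 ppm corresponds to 1 ms of drift per second
```

Under these assumptions the deviation comes out at roughly 41,700 PPM, or about 42 ms per second, which is in line with the quoted "40ms/s divergence" and far beyond the 500 PPM warning threshold.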
| | * clocksource: Suspend the watchdog temporarily when high read latency detected (Feng Tang, 2023-01-25, 1 file, -13/+32)
    Bugs have been reported on 8-socket x86 machines in which the TSC was wrongly disabled when the system is under heavy workload.

    [ 818.380354] clocksource: timekeeping watchdog on CPU336: hpet wd-wd read-back delay of 1203520ns
    [ 818.436160] clocksource: wd-tsc-wd read-back delay of 181880ns, clock-skew test skipped!
    [ 819.402962] clocksource: timekeeping watchdog on CPU338: hpet wd-wd read-back delay of 324000ns
    [ 819.448036] clocksource: wd-tsc-wd read-back delay of 337240ns, clock-skew test skipped!
    [ 819.880863] clocksource: timekeeping watchdog on CPU339: hpet read-back delay of 150280ns, attempt 3, marking unstable
    [ 819.936243] tsc: Marking TSC unstable due to clocksource watchdog
    [ 820.068173] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
    [ 820.092382] sched_clock: Marking unstable (818769414384, 1195404998)
    [ 820.643627] clocksource: Checking clocksource tsc synchronization from CPU 267 to CPUs 0,4,25,70,126,430,557,564.
    [ 821.067990] clocksource: Switched to clocksource hpet

    This can be reproduced by running memory intensive 'stream' tests, or some of the stress-ng subcases such as 'ioport'.

    The reason for these issues is that when the system is under heavy load, the read latency of the clocksources can be very high. Even lightweight TSC reads can show high latencies, and latencies are much worse for external clocksources such as HPET or the ACPI PM timer. These latencies can result in false-positive clocksource-unstable determinations.

    These issues were initially reported by a customer running on a production system, and this problem was reproduced on several generations of Xeon servers, especially when running the stress-ng test. These Xeon servers were not production systems, but they did have the latest steppings and firmware.

    Given that the clocksource watchdog is a continual diagnostic check with frequency of twice a second, there is no need to rush it when the system is under heavy load. Therefore, when high clocksource read latencies are detected, suspend the watchdog timer for 5 minutes.

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Acked-by: Waiman Long <longman@redhat.com>
    Cc: John Stultz <jstultz@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Stephen Boyd <sboyd@kernel.org>
    Cc: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| | * clocksource: Improve "skew is too large" messages (Paul E. McKenney, 2023-01-05, 1 file, -0/+8)
    When clocksource_watchdog() detects excessive clocksource skew compared to the watchdog clocksource, it marks the clocksource under test as unstable and prints several lines worth of message. But that message is unclear to anyone unfamiliar with the code:

    clocksource: timekeeping watchdog on CPU2: Marking clocksource 'wdtest-ktime' as unstable because the skew is too large:
    clocksource: 'kvm-clock' wd_nsec: 400744390 wd_now: 612625c2c wd_last: 5fa7f7c66 mask: ffffffffffffffff
    clocksource: 'wdtest-ktime' cs_nsec: 600744034 cs_now: 173081397a292d4f cs_last: 17308139565a8ced mask: ffffffffffffffff
    clocksource: 'kvm-clock' (not 'wdtest-ktime') is current clocksource.

    Therefore, add the following line near the end of that message:

    Clocksource 'wdtest-ktime' skewed 199999644 ns (199 ms) over watchdog 'kvm-clock' interval of 400744390 ns (400 ms)

    This new line clearly indicates the amount of skew between the two clocksources, along with the duration of the time interval over which the skew occurred, both in nanoseconds and milliseconds.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: John Stultz <jstultz@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Stephen Boyd <sboyd@kernel.org>
    Cc: Feng Tang <feng.tang@intel.com>
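The figures in the new line follow from the `cs_nsec` and `wd_nsec` values already printed in the older lines. As a worked check, here is a small Python sketch that rebuilds the line from those two numbers; the function name `skew_summary` is hypothetical, and the truncating division to whole milliseconds is an assumption inferred from the quoted output.

```python
def skew_summary(cs_name: str, cs_nsec: int, wd_name: str, wd_nsec: int) -> str:
    """Rebuild the summary line from the two measured intervals (illustrative only)."""
    skew_ns = abs(cs_nsec - wd_nsec)
    return (
        f"Clocksource '{cs_name}' skewed {skew_ns} ns ({skew_ns // 1_000_000} ms) "
        f"over watchdog '{wd_name}' interval of {wd_nsec} ns ({wd_nsec // 1_000_000} ms)"
    )


# Values taken from the sample message in the commit log.
line = skew_summary("wdtest-ktime", 600744034, "kvm-clock", 400744390)
```

With the sample values, 600744034 - 400744390 = 199999644 ns, which truncates to 199 ms, over an interval of 400744390 ns (400 ms).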
| | * clocksource: Improve read-back-delay message (Paul E. McKenney, 2023-01-04, 1 file, -2/+2)
    When cs_watchdog_read() is unable to get a qualifying clocksource read within the limit set by max_cswd_read_retries, it prints a message and marks the clocksource under test as unstable. But that message is unclear to anyone unfamiliar with the code:

    clocksource: timekeeping watchdog on CPU13: wd-tsc-wd read-back delay 1000614ns, attempt 3, marking unstable

    Therefore, add some context so that the message appears as follows:

    clocksource: timekeeping watchdog on CPU13: wd-tsc-wd excessive read-back delay of 1000614ns vs. limit of 125000ns, wd-wd read-back delay only 27ns, attempt 3, marking tsc unstable

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: John Stultz <jstultz@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Stephen Boyd <sboyd@kernel.org>
    Cc: Feng Tang <feng.tang@intel.com>
| | * clocksource: Loosen clocksource watchdog constraints (Paul E. McKenney, 2023-01-04, 2 files, -7/+14)
    Currently, MAX_SKEW_USEC is set to 100 microseconds, which has worked reasonably well. However, NTP is willing to tolerate 500 microseconds of skew per second, and a clocksource that is good enough for NTP should be good enough for the clocksource watchdog. The watchdog's skew is controlled by MAX_SKEW_USEC and the CLOCKSOURCE_WATCHDOG_MAX_SKEW_US Kconfig option. However, these values are doubled before being associated with a clocksource's ->uncertainty_margin, and the ->uncertainty_margin values of the pair of clocksources being compared are summed before checking against the skew.

    Therefore, set both MAX_SKEW_USEC and the default for the CLOCKSOURCE_WATCHDOG_MAX_SKEW_US Kconfig option to 125 microseconds of skew per second, resulting in 500 microseconds of skew per second in the clocksource watchdog's skew comparison.

    Suggested-by: Rik van Riel <riel@surriel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
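The step from 125 microseconds to 500 microseconds is the doubling and summing described above. The Python sketch below spells out that arithmetic; the function name `comparison_threshold_us` is hypothetical, and it assumes both clocksources in the pair carry the default margin derived from the same maximum-skew value.

```python
def comparison_threshold_us(max_skew_usec: int) -> int:
    """Effective skew threshold when both clocksources use the default margin.

    Illustrative arithmetic only: the configured value is doubled to form each
    clocksource's uncertainty margin, and the two margins are then summed.
    """
    uncertainty_margin = 2 * max_skew_usec          # doubled per clocksource
    return uncertainty_margin + uncertainty_margin  # summed over the pair


old_threshold = comparison_threshold_us(100)  # previous MAX_SKEW_USEC
new_threshold = comparison_threshold_us(125)  # value set by this commit
```

Under that assumption the old setting yields 400 microseconds per second and the new one yields 500, matching the NTP figure quoted in the commit message.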
| | * clocksource: Print clocksource name when clocksource is tested unstable (Yunying Sun, 2023-01-04, 1 file, -2/+2)
    Some "TSC fall back to HPET" messages appear on systems having more than 2 NUMA nodes:

    clocksource: timekeeping watchdog on CPU168: hpet read-back delay of 4296200ns, attempt 4, marking unstable

    The "hpet" here is misleading: the clocksource watchdog is really doing repeated reads of "hpet" in order to check for unrelated delays. Therefore, print the name of the clocksource under test, prefixed by "wd-" and suffixed by "-wd", for example, "wd-tsc-wd".

    Signed-off-by: Yunying Sun <yunying.sun@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| * | Merge tag 'timers-v6.3-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/core (Thomas Gleixner, 2023-02-13, 13 files, -35/+103)
| |\ \
    Pull clocksource/event changes from Daniel Lezcano:

    - Add rktimer for rv1126 Rockchip based board (Jagan Teki)
    - Initialize hrtimer based broadcast clock event device on RISC-V before C3STOP can be used (Conor Dooley)
    - Add DT binding for RISC-V timer and add the C3STOP flag if the DT tells the timer can not wake up the CPU (Anup Patel)
    - Increase the RISC-V timer rating as it is more efficient than mmio timers (Samuel Holland)
    - Drop obsolete dependency on COMPILE_TEST on microchip-pit64b as the OF is already depending on it (Jean Delvare)
    - Mark sh_cmt, sh_tmu, em_sti drivers as non-removable (Uwe Kleine-König)
    - Add binding description for mediatek,mt8365-systimer (Bernhard Rosenkränzer)
    - Add compatibles for T-Head's C9xx (Icenowy Zheng)
    - Restrict the microchip-pit64b compilation to the ARM architecture and add the delay timer (Claudiu Beznea)
    - Set the static key to select the SBI or Sstc timer sooner to prevent the first call to use the SBI while Sstc must be used (Matt Evans)
    - Add the CLOCK_EVT_FEAT_DYNIRQ flag to optimize the timer wake up on the sun4i platform (Yangtao Li)

    Link: https://lore.kernel/org/r/b7d1d982-d717-2930-b353-19b92cbe390f@linaro.org
| | * | clocksource/drivers/timer-sun4i: Add CLOCK_EVT_FEAT_DYNIRQ (Yangtao Li, 2023-02-13, 1 file, -1/+2)
    Add CLOCK_EVT_FEAT_DYNIRQ so that the IRQ affinity can be set at runtime to the cores that need to be woken up; otherwise, core0, say, has to send an IPI to wake up core1. With CLOCK_EVT_FEAT_DYNIRQ set, the broadcast timer can wake up the cores directly and no IPI is needed.

    Systems with cpuidle enabled benefit from this feature in particular.

    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Link: https://lore.kernel.org/r/20230209040239.24710-1-frank.li@vivo.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/em_sti: Mark driver as non-removable (Uwe Kleine-König, 2023-02-13, 1 file, -6/+1)
    The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind an em_sti device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module.

    Also drop the useless remove callback.

    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20230207193010.469495-1-u.kleine-koenig@pengutronix.de
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/sh_tmu: Mark driver as non-removable (Uwe Kleine-König, 2023-02-13, 1 file, -6/+1)
    The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind a sh_tmu device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module.

    Also drop the useless remove callback.

    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20230207193614.472060-1-u.kleine-koenig@pengutronix.de
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/riscv: Patch riscv_clock_next_event() jump before first use (Matt Evans, 2023-02-13, 1 file, -5/+5)
    A static key is used to select between SBI and Sstc timer usage in riscv_clock_next_event(), but currently the direction is resolved after cpuhp_setup_state() is called (which sets the next event). The first event will therefore fall through the sbi_set_timer() path; this breaks Sstc-only systems. So, apply the jump patching before first use.

    Fixes: 9f7a8ff6391f ("RISC-V: Prefer sstc extension if available")
    Signed-off-by: Matt Evans <mev@rivosinc.com>
    Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Reviewed-by: Anup Patel <anup@brainfault.org>
    Link: https://lore.kernel.org/r/CDDAB2D0-264E-42F3-8E31-BA210BEB8EC1@rivosinc.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/timer-microchip-pit64b: Add delay timer (Claudiu Beznea, 2023-02-13, 1 file, -0/+12)
    Add delay timer.

    Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
    Link: https://lore.kernel.org/r/20230203130537.1921608-3-claudiu.beznea@microchip.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/timer-microchip-pit64b: Select driver only on ARM (Claudiu Beznea, 2023-02-13, 1 file, -1/+1)
    Microchip PIT64B is currently available on ARM based devices. Thus select it only for ARM. This allows implementing delay timer.

    Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
    Link: https://lore.kernel.org/r/20230203130537.1921608-2-claudiu.beznea@microchip.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | dt-bindings: timer: sifive,clint: add comaptibles for T-Head's C9xx (Icenowy Zheng, 2023-02-13, 1 file, -0/+8)
    The T-Head C906/C910 CLINT is not compliant with the SiFive ones (nor with the upcoming ACLINT spec) because it lacks the mtime register.

    Add a compatible string formatted like the C9xx-specific PLIC compatible, and do not allow a SiFive one as fallback because they're not really compliant.

    Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
    Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Samuel Holland <samuel@sholland.org>
    Acked-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20230202072814.319903-1-uwu@icenowy.me
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | dt-bindings: timer: mediatek,mtk-timer: add MT8365 (Bernhard Rosenkränzer, 2023-02-13, 1 file, -0/+1)
    Add binding description for mediatek,mt8365-systimer

    Signed-off-by: Bernhard Rosenkränzer <bero@baylibre.com>
    Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20230125143503.1015424-8-bero@baylibre.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/riscv: Get rid of clocksource_arch_init() callback (Lad Prabhakar, 2023-02-13, 3 files, -10/+5)
    Having a clocksource_arch_init() callback always sets vdso_clock_mode to VDSO_CLOCKMODE_ARCHTIMER if GENERIC_GETTIMEOFDAY is enabled; this is required for the riscv-timer.

    This works for platforms where just the riscv-timer clocksource is present. On platforms where other clock sources are available we want them to register with vdso_clock_mode set to VDSO_CLOCKMODE_NONE.

    On the Renesas RZ/Five SoC, the OSTM block can be used as a clocksource [0]. To avoid multiple clock sources being registered as VDSO_CLOCKMODE_ARCHTIMER, move the setting of vdso_clock_mode into the riscv-timer driver instead of doing it in the clocksource_arch_init() callback, as is done for the ARM/64 architecture.

    [0] drivers/clocksource/renesas-ostm.c

    Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Tested-by: Samuel Holland <samuel@sholland.org>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Samuel Holland <samuel@sholland.org>
    Link: https://lore.kernel.org/r/20221229224601.103851-1-prabhakar.mahadev-lad.rj@bp.renesas.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/sh_cmt: Mark driver as non-removable (Uwe Kleine-König, 2023-02-13, 1 file, -6/+1)
    The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind a sh_cmt device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module.

    Also drop the useless remove callback.

    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20230123220221.48164-1-u.kleine-koenig@pengutronix.de
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/timer-microchip-pit64b: Drop obsolete dependency on COMPILE_TEST (Jean Delvare, 2023-02-13, 1 file, -1/+1)
    Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it is possible to test-build any driver which depends on OF on any architecture by explicitly selecting OF. Therefore depending on COMPILE_TEST as an alternative is no longer needed.

    Signed-off-by: Jean Delvare <jdelvare@suse.de>
    Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
    Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
    Link: https://lore.kernel.org/r/20230121182911.4e47a5ff@endymion.delvare
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * | clocksource/drivers/riscv: Increase the clock source rating (Samuel Holland, 2023-02-13, 1 file, -1/+1)
    RISC-V provides an architectural clock source via the time CSR. This clock source exposes a 64-bit counter synchronized across all CPUs. Because it is accessed using a CSR, it is much more efficient to read than MMIO clock sources. For example, on the Allwinner D1, reading the sun4i timer in a loop takes 131 cycles/iteration, while reading the RISC-V time CSR takes only 5 cycles/iteration.

    Adjust the RISC-V clock source rating so it is preferred over the various platform-specific MMIO clock sources.

    Signed-off-by: Samuel Holland <samuel@sholland.org>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
    Reviewed-by: Anup Patel <anup@brainfault.org>
    Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Link: https://lore.kernel.org/r/20221228004444.61568-1-samuel@sholland.org
    Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
| | * | clocksource/drivers/timer-riscv: Set CLOCK_EVT_FEAT_C3STOP based on DT (Anup Patel, 2023-02-13, 1 file, -0/+10)
    We should set CLOCK_EVT_FEAT_C3STOP for a clock_event_device only when riscv,timer-cannot-wake-cpu DT property is present in the RISC-V timer DT node.

    This way CLOCK_EVT_FEAT_C3STOP feature is set for clock_event_device based on RISC-V platform capabilities rather than having it set for all RISC-V platforms.

    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230103141102.772228-4-apatel@ventanamicro.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
| | * | dt-bindings: timer: Add bindings for the RISC-V timer device (Anup Patel, 2023-02-13, 1 file, -0/+52)
    We add DT bindings for a separate RISC-V timer DT node which can be used to describe implementation specific behaviour (such as timer interrupt not triggered during non-retentive suspend).

    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Rob Herring <robh@kernel.org>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230103141102.772228-3-apatel@ventanamicro.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
| | * | RISC-V: time: initialize hrtimer based broadcast clock event device (Conor Dooley, 2023-02-13, 1 file, -0/+3)
    Similarly to commit 022eb8ae8b5e ("ARM: 8938/1: kernel: initialize broadcast hrtimer based clock event device"), RISC-V needs to initiate hrtimer based broadcast clock event device before C3STOP can be used. Otherwise, the introduction of C3STOP for the RISC-V arch timer in commit 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") leaves us without any broadcast timer registered. This prevents the kernel from entering oneshot mode, which breaks timer behaviour, for example clock_nanosleep().

    With a test app that sleeps each CPU for 6, 5, 4, 3 ms respectively, with HZ=250 and C3STOP enabled, the sleep times are rounded up to the next jiffy:

    == CPU: 1 ==        == CPU: 2 ==        == CPU: 3 ==        == CPU: 4 ==
    Mean: 7.974992      Mean: 7.976534      Mean: 7.962591      Mean: 3.952179
    Std Dev: 0.154374   Std Dev: 0.156082   Std Dev: 0.171018   Std Dev: 0.076193
    Hi: 9.472000        Hi: 10.495000       Hi: 8.864000        Hi: 4.736000
    Lo: 6.087000        Lo: 6.380000        Lo: 4.872000        Lo: 3.403000
    Samples: 521        Samples: 521        Samples: 521        Samples: 521

    Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
    Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
    Suggested-by: Samuel Holland <samuel@sholland.org>
    Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Reviewed-by: Samuel Holland <samuel@sholland.org>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230103141102.772228-2-apatel@ventanamicro.com
    Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
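The reported means line up with a 4 ms jiffy at HZ=250. The Python sketch below is a simplified model chosen to reproduce those means, not the kernel's timer code: the function name `jiffy_rounded_sleep_ms` is hypothetical, and the rule "wake at the first jiffy boundary strictly after the requested duration" is an assumption inferred from the 4 ms request measuring close to 8 ms.

```python
def jiffy_rounded_sleep_ms(requested_ms: int, hz: int = 250) -> int:
    """Model a sleep that can only end on a jiffy boundary.

    Assumption for illustration: the wakeup lands on the first boundary
    strictly after the requested duration.
    """
    jiffy_ms = 1000 // hz  # 4 ms per jiffy at HZ=250
    return (requested_ms // jiffy_ms + 1) * jiffy_ms


# Requested sleeps for CPUs 1 to 4 in the test described above.
modeled = [jiffy_rounded_sleep_ms(ms) for ms in (6, 5, 4, 3)]
```

Under this model the four requests map to 8, 8, 8 and 4 ms, which is close to the measured means of roughly 7.97, 7.98, 7.96 and 3.95 ms.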
| | * | dt-bindings: timer: rk-timer: Add rktimer for rv1126  Jagan Teki  2023-02-13  1  -0/+1
| |/ /
| | |     Add rockchip timer compatible string for rockchip rv1126.
| | |
| | |     Signed-off-by: Jagan Teki <jagan@edgeble.ai>
| | |     Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
| | |     Link: https://lore.kernel.org/r/20221123183124.6911-3-jagan@edgeble.ai
| | |     Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
| * | time/debug: Fix memory leak with using debugfs_lookup()  Greg Kroah-Hartman  2023-02-09  1  -1/+1
| | |
| | |     When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once.
| | |
| | |     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20230202151214.2306822-1-gregkh@linuxfoundation.org
| * | posix-timers: Use atomic64_try_cmpxchg() in __update_gt_cputime()  Uros Bizjak  2023-02-06  1  -7/+6
| | |
| | |     Use atomic64_try_cmpxchg() instead of atomic64_cmpxchg() in __update_gt_cputime(). The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after cmpxchg() (and the related move instruction in front of cmpxchg()).
| | |
| | |     Also, atomic64_try_cmpxchg() implicitly assigns the old *ptr value to "old" when cmpxchg() fails, so there is no need to re-read the value in the loop.
| | |
| | |     No functional change intended.
| | |
| | |     Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20230116165337.5810-1-ubizjak@gmail.com
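The try_cmpxchg loop described in this commit can be sketched in userspace with C11 atomics. This is only an analogue under stated assumptions: `update_gt()` is a hypothetical name, and `atomic_compare_exchange_weak()` stands in for the kernel's `atomic64_try_cmpxchg()`. Both write the observed value back into the "expected" variable on failure, which is why the loop never re-reads the shared counter.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue, not the kernel code: raise the shared
 * counter to sum_cputime unless it is already at least that large.
 */
static void update_gt(_Atomic uint64_t *cputime, uint64_t sum_cputime)
{
    /* Read the shared value once, before the loop. */
    uint64_t curr = atomic_load(cputime);

    do {
        if (sum_cputime <= curr)
            return; /* counter is already large enough */
        /*
         * On failure the compare-exchange stores the value it saw into
         * curr, so the next iteration compares against fresh data
         * without an explicit reload.
         */
    } while (!atomic_compare_exchange_weak(cputime, &curr, sum_cputime));
}
```

Calling `update_gt(&v, 20)` on a counter holding 10 leaves it at 20, while a later `update_gt(&v, 5)` leaves it unchanged.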
| * | vdso/bits.h: Add BIT_ULL() for the sake of consistency  Andy Shevchenko  2023-01-31  2  -1/+1
| | |
| | |     The minimization done in 3945ff37d2f4 ("linux/bits.h: Extract common header for vDSO") was required to isolate the VDSO build from the larger kernel header impact. The split added some inconsistency since BIT() and BIT_ULL() are now defined in different files, which confuses the unprepared reader.
| | |
| | |     Move BIT_ULL() to vdso/bits.h. No functional change.
| | |
| | |     Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20221128141003.77929-1-andriy.shevchenko@linux.intel.com
| * | hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()  Davidlohr Bueso  2023-01-31  1  -3/+11
| | |
| | |     While in theory the timer can be triggered before expires + delta, for the cases of RT tasks they really have no business giving any lenience for extra slack time, so override any passed value by the user and always use zero for schedule_hrtimeout_range() calls. Furthermore, this is similar to what the nanosleep(2) family already does with current->timer_slack_ns.
| | |
| | |     Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20230123173206.6764-3-dave@stgolabs.net
| * | hrtimer: Rely on rt_task() for DL tasks too  Davidlohr Bueso  2023-01-31  1  -1/+1
| | |
| | |     Checking dl_task() is redundant as rt_task() returns true for deadline tasks too.
| | |
| | |     Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20230123173206.6764-2-dave@stgolabs.net
| * | timers: Prevent union confusion from unexpected restart_syscall()  Jann Horn  2023-01-11  3  -0/+6
| | |
| | |     The nanosleep syscalls use the restart_block mechanism, with a quirk: The `type` and `rmtp`/`compat_rmtp` fields are set up unconditionally on syscall entry, while the rest of the restart_block is only set up in the unlikely case that the syscall is actually interrupted by a signal (or pseudo-signal) that doesn't have a signal handler.
| | |
| | |     If the restart_block was set up by a previous syscall (futex(..., FUTEX_WAIT, ...) or poll()) and hasn't been invalidated somehow since then, this will clobber some of the union fields used by futex_wait_restart() and do_restart_poll(). If userspace afterwards wrongly calls the restart_syscall syscall, futex_wait_restart()/do_restart_poll() will read struct fields that have been clobbered.
| | |
| | |     This doesn't actually lead to anything particularly interesting because none of the union fields contain trusted kernel data, and futex(..., FUTEX_WAIT, ...) and poll() aren't syscalls where it makes much sense to apply seccomp filters to their arguments. So the current consequences are just of the "if userspace does bad stuff, it can damage itself, and that's not a problem" flavor.
| | |
| | |     But still, it seems like a hazard for future developers, so invalidate the restart_block when partly setting it up in the nanosleep syscalls.
| | |
| | |     Signed-off-by: Jann Horn <jannh@google.com>
| | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | |     Link: https://lore.kernel.org/r/20230105134403.754986-1-jannh@google.com
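The union confusion and its fix can be illustrated with a small userspace model. Everything here is a hypothetical sketch: the struct layout, `futex_restart()`, `nanosleep_entry()` and `restart_syscall_sketch()` are simplified stand-ins, not the kernel's definitions. The point is only that pointing the restart function at a "no restart" handler on nanosleep entry makes a stray restart harmless, even though the union has been partly overwritten.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical model of a restart block with a shared union. */
struct restart_block_sketch {
    long (*fn)(struct restart_block_sketch *);
    union {
        struct { unsigned int *uaddr; unsigned int val; } futex;
        struct { int type; void *rmtp; } nanosleep;
    };
};

/* Stand-in for the "no restart" function: refuse to restart anything. */
static long no_restart(struct restart_block_sketch *rb)
{
    (void)rb;
    return -EINTR;
}

/* Stand-in for a futex restart handler that trusts the futex fields. */
static long futex_restart(struct restart_block_sketch *rb)
{
    return (long)rb->futex.val;
}

/* Nanosleep entry: partly fills the union, so it also invalidates fn. */
static void nanosleep_entry(struct restart_block_sketch *rb, void *rmtp)
{
    rb->fn = no_restart;        /* the fix: a stale handler can no longer run */
    rb->nanosleep.type = 1;     /* overwrites memory shared with futex fields */
    rb->nanosleep.rmtp = rmtp;
}

/* Stand-in for restart_syscall(): blindly calls whatever fn is installed. */
static long restart_syscall_sketch(struct restart_block_sketch *rb)
{
    return rb->fn(rb);
}
```

Without the `rb->fn = no_restart` line, a stray restart after `nanosleep_entry()` would run `futex_restart()` against fields that the nanosleep setup had already overwritten.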
* | | Merge tag 'x86-cleanups-2023-02-20' of ↵  Linus Torvalds  2023-02-21  8  -28/+18
|\ \ \
| | | |     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
| | | |
| | | |     Pull miscellaneous x86 cleanups from Thomas Gleixner:
| | | |
| | | |      - Correct the common copy and pasted mishandling of kstrtobool() in the strict_sas_size() setup function
| | | |      - Make recalibrate_cpu_khz() a GPL-only export
| | | |      - Check the TSC feature before doing anything else, which avoids pointless code execution if TSC is not available
| | | |      - Remove or fix up stale and misleading comments
| | | |      - Remove unused or pointlessly duplicated variables
| | | |      - Spelling and typo fixes
| | | |
| | | |     * tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | | |       x86/hotplug: Remove incorrect comment about mwait_play_dead()
| | | |       x86/tsc: Do feature check as the very first thing
| | | |       x86/tsc: Make recalibrate_cpu_khz() export GPL only
| | | |       x86/cacheinfo: Remove unused trace variable
| | | |       x86/Kconfig: Fix spellos & punctuation
| | | |       x86/signal: Fix the value returned by strict_sas_size()
| | | |       x86/cpu: Remove misleading comment
| | | |       x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery
| | | |       x86/boot/e820: Fix typo in e820.c comment
| * | | x86/hotplug: Remove incorrect comment about mwait_play_dead()  Srivatsa S. Bhat (VMware)  2023-02-14  1  -1/+1
| | | |
| | | |     The comment that says mwait_play_dead() returns only on failure is a bit misleading because mwait_play_dead() could actually return for valid reasons (such as mwait not being supported by the platform) that do not indicate a failure of the CPU offline operation. So, remove the comment.
| | | |
| | | |     Suggested-by: Thomas Gleixner <tglx@linutronix.de>
| | | |     Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
| | | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | |     Link: https://lore.kernel.org/r/20230128003751.141317-1-srivatsa@csail.mit.edu
| * | | x86/tsc: Do feature check as the very first thing  Borislav Petkov (AMD)  2023-02-11  1  -5/+5
| | | |
| | | |     Do the feature check as the very first thing in the function. Everything else comes after that and is meaningless work if the TSC CPUID bit is not even set. Switch to cpu_feature_enabled() too, while at it.
| | | |
| | | |     No functional changes.
| | | |
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/Y5990CUCuWd5jfBH@zn.tnic
| * | | x86/tsc: Make recalibrate_cpu_khz() export GPL only  Borislav Petkov (AMD)  2023-02-11  1  -2/+1
| | | |
| | | |     A quick search doesn't reveal any use outside of the kernel - which would be questionable to begin with anyway - so make the export GPL only.
| | | |
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/Y599miBzWRAuOwhg@zn.tnic
| * | | x86/cacheinfo: Remove unused trace variable  Borislav Petkov (AMD)  2023-02-11  1  -4/+1
| | | |
| | | |     15cd8812ab2c ("x86: Remove the CPU cache size printk's") removed the last use of the trace local var. Remove it too and the useless trace cache case.
| | | |
| | | |     No functional changes.
| | | |
| | | |     Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/20230210234541.9694-1-bp@alien8.de
| | | |     Link: http://lore.kernel.org/r/20220705073349.1512-1-jiapeng.chong@linux.alibaba.com
| * | | x86/Kconfig: Fix spellos & punctuation  Randy Dunlap  2023-01-25  1  -3/+3
| | | |
| | | |     Fix spelling (reported by codespell) & punctuation in arch/x86/Kconfig.
| | | |
| | | |     Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
| | | |     Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | |     Link: https://lore.kernel.org/r/20230124181753.19309-1-rdunlap@infradead.org
| * | | x86/signal: Fix the value returned by strict_sas_size()  Christophe JAILLET  2023-01-15  1  -1/+1
| | | |
| | | |     Functions used with __setup() return 1 when the argument has been successfully parsed. Reverse the returned value so that 1 is returned when kstrtobool() is successful (i.e. returns 0).
| | | |
| | | |     My understanding of these __setup() functions is that returning 1 or 0 does not change much anyway - so this is more of a cleanup than a functional fix.
| | | |
| | | |     I spotted it and found it spurious while looking at something else. Even if the output is not perfect, you'll get the idea with:
| | | |
| | | |        $ git grep -B2 -A10 retu.*kstrtobool | grep __setup -B10
| | | |
| | | |     Fixes: 3aac3ebea08f ("x86/signal: Implement sigaltstack size validation")
| | | |     Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
| | | |     Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | |     Link: https://lore.kernel.org/r/73882d43ebe420c9d8fb82d0560021722b243000.1673717552.git.christophe.jaillet@wanadoo.fr
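The return-value convention at issue can be shown with a userspace sketch. The names `parse_bool()`, `strict_flag` and `strict_flag_setup()` are hypothetical; `parse_bool()` merely mimics the "0 on success, negative errno on failure" convention the commit attributes to kstrtobool(), and accepts only a few spellings for brevity. The handler converts that into the __setup()-style "1 when the argument was parsed" result.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for kstrtobool(): 0 on success, -EINVAL otherwise. */
static int parse_bool(const char *s, bool *res)
{
    if (!s)
        return -EINVAL;
    if (!strcmp(s, "1") || !strcmp(s, "y") || !strcmp(s, "on")) {
        *res = true;
        return 0;
    }
    if (!strcmp(s, "0") || !strcmp(s, "n") || !strcmp(s, "off")) {
        *res = false;
        return 0;
    }
    return -EINVAL;
}

/* Hypothetical option storage. */
static bool strict_flag;

/*
 * __setup()-style handler: returns 1 when the argument was parsed and
 * 0 otherwise. Returning the parser's result directly would invert this,
 * because the parser reports success as 0.
 */
static int strict_flag_setup(char *str)
{
    return parse_bool(str, &strict_flag) == 0;
}
```

With this shape, `strict_flag_setup("on")` reports 1 and sets the flag, while an unparsable argument reports 0.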
| * | | x86/cpu: Remove misleading comment  Juergen Gross  2023-01-13  1  -1/+1
| | | |
| | | |     The comment of the "#endif" after setup_disable_pku() is wrong. As the related #ifdef is only a few lines above, just remove the comment.
| | | |
| | | |     Signed-off-by: Juergen Gross <jgross@suse.com>
| | | |     Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | |     Link: https://lore.kernel.org/r/20230113130126.1966-1-jgross@suse.com
| * | | x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery  Yuntao Wang  2023-01-11  2  -10/+4
| | | |
| | | |     Both the if and else blocks define the exact same boot_cpu_data variable, so move the duplicate variable definition out of the if/else block.
| | | |
| | | |     In addition, do some other minor cleanups.
| | | |
| | | |       [ bp: Massage. ]
| | | |
| | | |     Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/20220601122914.820890-1-ytcoode@gmail.com
| * | | x86/boot/e820: Fix typo in e820.c comment  Wang Yong  2023-01-11  1  -1/+1
| |/ /
| | |     Change "itsmain" to "its main".
| | |
| | |     Fixes: 544a0f47e780 ("x86/boot/e820: Rename e820_table_saved to e820_table_firmware and improve the description")
| | |     Signed-off-by: Wang Yong <yongw.kernel@gmail.com>
| | |     Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | |     Link: https://lore.kernel.org/r/20221211103849.173870-1-yongw.kernel@gmail.com
* | | Merge tag 'x86_vdso_for_v6.3_rc1' of ↵  Linus Torvalds  2023-02-21  16  -68/+57
|\ \ \
| | | |     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
| | | |
| | | |     Pull x86 vdso updates from Borislav Petkov:
| | | |
| | | |      - Add getcpu support for the 32-bit version of the vDSO
| | | |      - Some smaller fixes
| | | |
| | | |     * tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | | |       x86/vdso: Fix -Wmissing-prototypes warnings
| | | |       x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu
| | | |       selftests: Emit a warning if getcpu() is missing on 32bit
| | | |       x86/vdso: Provide getcpu for x86-32.
| | | |       x86/cpu: Provide the full setup for getcpu() on x86-32
| | | |       x86/vdso: Move VDSO image init to vdso2c generated code
| * | | x86/vdso: Fix -Wmissing-prototypes warnings  Borislav Petkov (AMD)  2023-02-07  2  -0/+6
| | | |
| | | |     Fix those:
| | | |
| | | |       In file included from arch/x86/entry/vdso/vdso32/vclock_gettime.c:4:
| | | |       arch/x86/entry/vdso/vdso32/../vclock_gettime.c:70:5: warning: no previous prototype for ‘__vdso_clock_gettime64’ [-Wmissing-prototypes]
| | | |          70 | int __vdso_clock_gettime64(clockid_t clock, struct __kernel_timespec *ts)
| | | |             |
| | | |       In file included from arch/x86/entry/vdso/vdso32/vgetcpu.c:3:
| | | |       arch/x86/entry/vdso/vdso32/../vgetcpu.c:13:1: warning: no previous prototype for ‘__vdso_getcpu’ [-Wmissing-prototypes]
| | | |          13 | __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
| | | |             | ^~~~~~~~~~~~~
| | | |
| | | |     Reported-by: kernel test robot <lkp@intel.com>
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/202302070742.iYcnoJwk-lkp@intel.com
| * | | x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu  Sebastian Andrzej Siewior  2023-02-07  3  -26/+27
| | | |
| | | |     The 64bit register constraints in __arch_hweight64() cannot be fulfilled in a 32-bit build. The function is only declared but not used within vclock_gettime.c and gcc does not care. LLVM complains and aborts, reportedly because it validates extended asm even if the latter would get compiled out, see
| | | |
| | | |       https://lore.kernel.org/r/Y%2BJ%2BUQ1vAKr6RHuH@dev-arch.thelio-3990X
| | | |
| | | |     i.e., a long standing design difference between gcc and LLVM.
| | | |
| | | |     Move the "fake a 32 bit kernel configuration" bits from vclock_gettime.c into a common header file. Use this from vclock_gettime.c and vgetcpu.c.
| | | |
| | | |       [ bp: Add background info from Nathan. ]
| | | |
| | | |     Fixes: 92d33063c081a ("x86/vdso: Provide getcpu for x86-32.")
| | | |     Reported-by: kernel test robot <lkp@intel.com>
| | | |     Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | | |     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
| | | |     Link: https://lore.kernel.org/r/Y+IsCWQdXEr8d9Vy@linutronix.de
| * | | selftests: Emit a warning if getcpu() is missing on 32bit  Sebastian Andrzej Siewior  2023-02-06  1  -5/+2
| | | |
| | | |     The VDSO implementation for getcpu() has been wired up on 32bit so warn if missing.
| | | |
| | | |     Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | |     Acked-by: Shuah Khan <skhan@linuxfoundation.org>
| | | |     Link: https://lore.kernel.org/r/20221125094216.3663444-4-bigeasy@linutronix.de
| * | | x86/vdso: Provide getcpu for x86-32.  Sebastian Andrzej Siewior  2023-02-06  4  -3/+6
| | | |
| | | |     Wire up __vdso_getcpu() for x86-32. The 64bit version is reused with trivial modifications.
| | | |
| | | |     Contrary to vclock_gettime.c there is no requirement to fake any defines in the case of 32bit VDSO on a 64bit kernel because the GDT entry from which the CPU and node information is read is always the native one.
| | | |
| | | |     Adapt vdso_getcpu.c by:
| | | |
| | | |      - removing the unneeded time* header files which lead to compile errors for 32bit.
| | | |      - adding segment.h which provides vdso_read_cpunode() and the defines required by it.
| | | |
| | | |     Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | |     Link: https://lore.kernel.org/r/20221125094216.3663444-3-bigeasy@linutronix.de
| * | | x86/cpu: Provide the full setup for getcpu() on x86-32  Sebastian Andrzej Siewior  2023-02-06  2  -7/+5
| | | |
| | | |     setup_getcpu() configures two things:
| | | |
| | | |      - it writes the current CPU & node information into MSR_TSC_AUX
| | | |      - it writes the same information as a GDT entry.
| | | |
| | | |     By using the "full" setup_getcpu() on i386 it is possible to read the CPU information in userland via RDTSCP() or via LSL from the GDT.
| | | |
| | | |     Provide a GDT_ENTRY_CPUNODE for x86-32 and make the setup function unconditionally available.
| | | |
| | | |     Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | | |     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | |     Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
| | | |     Link: https://lore.kernel.org/r/20221125094216.3663444-2-bigeasy@linutronix.de
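The commit says the CPU and node information is published as one value, readable through RDTSCP or LSL. A plain-C sketch of how such a combined value can be packed and unpacked follows. The 12-bit split and the names `CPUNODE_BITS`, `CPUNODE_MASK`, `encode_cpunode()` and `decode_cpunode()` are assumptions for illustration; the kernel's segment.h should be consulted for the real layout, and no RDTSCP or LSL instruction is executed here.

```c
#include <stdint.h>

/* Assumed layout: low 12 bits hold the CPU number, the rest the node. */
#define CPUNODE_BITS 12
#define CPUNODE_MASK 0xfffu

/* Pack CPU and node into the single value that setup code would publish. */
static uint32_t encode_cpunode(unsigned int cpu, unsigned int node)
{
    return (node << CPUNODE_BITS) | (cpu & CPUNODE_MASK);
}

/* Unpack a value obtained by whatever means (RDTSCP or LSL on real x86). */
static void decode_cpunode(uint32_t value, unsigned int *cpu, unsigned int *node)
{
    if (cpu)
        *cpu = value & CPUNODE_MASK;
    if (node)
        *node = value >> CPUNODE_BITS;
}
```

Either output pointer may be NULL when the caller only needs one of the two numbers, mirroring the getcpu() calling convention.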
| * | | x86/vdso: Move VDSO image init to vdso2c generated code  Brian Gerst  2023-01-25  6  -27/+11
| | | |
| | | |     Generate an init function for each VDSO image, replacing init_vdso() and sysenter_setup().
| | | |
| | | |     Signed-off-by: Brian Gerst <brgerst@gmail.com>
| | | |     Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | |     Link: https://lore.kernel.org/r/20230124184019.26850-1-brgerst@gmail.com