summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* watchdog: imx2: Convert to use infrastructure triggered keepalivesGuenter Roeck2016-03-161-61/+13
| | | | | | | | | The watchdog infrastructure now supports handling watchdog keepalive if the watchdog is running while the watchdog device is closed. Convert the driver to use this infrastructure. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: dw_wdt: Convert to use watchdog infrastructureGuenter Roeck2016-03-162-207/+117
| | | | | | | | | | Convert driver to use watchdog infrastructure. This includes infrastructure support to handle watchdog keepalive if the watchdog is running while the watchdog device is closed. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Add support for minimum time between heartbeatsGuenter Roeck2016-03-163-0/+21
| | | | | | | | Some watchdogs require a minimum time between heartbeats. Examples are the watchdogs in DA9062 and AT91SAM9x. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Make stop function optionalGuenter Roeck2016-03-163-10/+18
| | | | | | | | | | Not all hardware watchdogs can be stopped. The driver for such watchdogs would typically only set the WATCHDOG_HW_RUNNING flag in its stop function. Make the stop function optional and set WATCHDOG_HW_RUNNING in the watchdog core if it is not provided. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Introduce WDOG_HW_RUNNING flagGuenter Roeck2016-03-163-20/+64
| | | | | | | | | | | | | | The WDOG_HW_RUNNING flag is expected to be set by watchdog drivers if the hardware watchdog is running. If the flag is set, the watchdog subsystem will ping the watchdog even if the watchdog device is closed. The watchdog driver stop function is now optional and may be omitted if the watchdog can not be stopped. If stopping the watchdog is not possible but the driver implements a stop function, it is responsible to set the WDOG_HW_RUNNING flag in its stop function. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Introduce hardware maximum heartbeat in watchdog coreGuenter Roeck2016-03-163-18/+158
| | | | | | | | | | | | | Introduce an optional hardware maximum heartbeat in the watchdog core. The hardware maximum heartbeat can be lower than the maximum timeout. Drivers can set the maximum hardware heartbeat value in the watchdog data structure. If the configured timeout exceeds the maximum hardware heartbeat, the watchdog core enables a timer function to assist sending keepalive requests to the watchdog driver. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Make set_timeout function optionalGuenter Roeck2016-03-162-2/+14
| | | | | | | | | | | | | For some watchdogs, the watchdog driver handles timeout changes without explicitly setting any registers. In this situation, the watchdog driver might only set the 'timeout' variable but do nothing else. This can as well be handled by the infrastructure, so make the set_timeout callback optional. If WDIOF_SETTIMEOUT is configured but the .set_timeout callback is not available, update the timeout variable in the infrastructure code. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* arm: lpc32xx: remove restart handlerSylvain Lemieux2016-03-162-16/+0
| | | | | | | | | | Remove the restart handler from "mach-lpc32xx"; this functionality is now available in the pnx4008 watchdog driver. Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* arm: lpc32xx: phy3250 remove restart hookSylvain Lemieux2016-03-161-1/+0
| | | | | | | | | | Remove the restart hook assignment from phy3250; this functionality is now managed by the pnx4008 watchdog driver. Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: pnx4008: restart: support "cmd" from userspaceSylvain Lemieux2016-03-161-0/+15
| | | | | | | | | | Added support to verify if a "cmd" is passed from the userspace program rebooting the system; - if a valid "cmd" is available, handle it; - If the received "cmd" is not supported, use the default reboot mode. Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: pnx4008: add support for soft resetSylvain Lemieux2016-03-161-3/+10
| | | | | | | | | | | | Add support for explicit soft reset using the reboot mode. The default reboot mode behavior is unchanged; you can overwrite the default reboot type in the board specific file "DT_MACHINE_START" definition using the "reboot_mode" parameter. Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: pnx4008: add restart handlerSylvain Lemieux2016-03-161-0/+17
| | | | | | | | | | Add restart handler capability to the driver; the restart handler implementation was taken from "mach-lpc32xx" ("lpc23xx_restart" function). Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: pnx4008: update logging during power-onSylvain Lemieux2016-03-161-2/+1
| | | | | | | | | | There is no need to add the driver name in the text to display on the console during the power-on: pnx4008-watchdog 4003c000.watchdog: PNX4008 Watchdog Timer: heartbeat 19 sec Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: tangox_wdt: test clock rate to avoid division by 0Wolfram Sang2016-03-161-4/+10
| | | | | | | | | | The clk API may return 0 on clk_get_rate, so we should check the result before using it as a divisor. For this, refactor the code to use a central error path. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: atlas7_wdt: test clock rate to avoid division by 0Wolfram Sang2016-03-161-0/+5
| | | | | | | | | The clk API may return 0 on clk_get_rate, so we should check the result before using it as a divisor. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: s3c2410_wdt: Add max and min timeout valuesJavier Martinez Canillas2016-03-161-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | The watchdog maximum timeout value is determined by the number of bits for the interval timer counter, its source clock frequency, the number of bits of the prescaler and maximum divider value. This can be calculated with the following equation: max_timeout = counter / (freq / (max_prescale + 1) / max_divider) Setting a maximum timeout value will allow the watchdog core to refuse user-space calls to the WDIOC_SETTIMEOUT ioctl that sets not supported timeout values. For example, systemd tries to set a timeout of 10 minutes on reboot to ensure that the machine will be rebooted even if a reboot failed. This leads to the following error message on an Exynos5422 Odroid XU4 board: [ 147.986045] s3c2410-wdt 101d0000.watchdog: timeout 600 too big Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* Watchdog: introduce ARM SBSA watchdog driverFu Wei2016-03-163-0/+429
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to Server Base System Architecture (SBSA) specification, the SBSA Generic Watchdog has two stage timeouts: the first signal (WS0) is for alerting the system by interrupt, the second one (WS1) is a real hardware reset. More details about the hardware specification of this device: ARM DEN0029B - Server Base System Architecture (SBSA) This driver can operate ARM SBSA Generic Watchdog as a single stage watchdog or a two stages watchdog, it's set up by the module parameter "action". In the single stage mode, when the timeout is reached, your system will be reset by WS1. The first signal (WS0) is ignored. In the two stages mode, when the timeout is reached, the first signal (WS0) will trigger panic. If the system is getting into trouble and cannot be reset by panic or restart properly by the kdump kernel(if supported), then the second stage (as long as the first stage) will be reached, system will be reset by WS1. This function can help administrator to backup the system context info by panic console output or kdump. This driver bases on linux kernel watchdog framework, so it can get timeout from module parameter and FDT at the driver init stage. Signed-off-by: Fu Wei <fu.wei@linaro.org> Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org> Tested-by: Pratyush Anand <panand@redhat.com> Acked-by: Timur Tabi <timur@codeaurora.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* Documentation: add sbsa-gwdt driver documentationFu Wei2016-03-042-0/+38
| | | | | | | | | | | | | | The sbsa-gwdt.txt documentation in devicetree/bindings/watchdog is for introducing SBSA(Server Base System Architecture) Generic Watchdog device node info into FDT. Also add sbsa-gwdt introduction in watchdog-parameters.txt Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Fu Wei <fu.wei@linaro.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Add watchdog timer support for the WinSystems EBC-C384William Breathitt Gray2016-03-014-0/+204
| | | | | | | | | | | | | | The WinSystems EBC-C384 has an onboard watchdog timer. The timeout range supported by the watchdog timer is 1 second to 255 minutes. Timeouts under 256 seconds have a 1 second granularity, while the rest have a 1 minute granularity. This driver adds watchdog timer support for this onboard watchdog timer. The timeout may be configured via the timeout module parameter. Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: ni903x_wdt: Add NI 903x/913x watchdog driverKyle Roeschley2016-03-014-0/+287
| | | | | | | | | | Add support for the watchdog timer on NI cRIO-903x and cDAQ-913x real- time controllers. Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com> Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Add 'action' and 'data' parameters to restart handler callbackGuenter Roeck2016-03-0114-14/+26
| | | | | | | | | The 'action' (or restart mode) and data parameters may be used by restart handlers, so they should be passed to the restart callback functions. Cc: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: ziirave_wdt: Rename "trigger" reset reason "hw watchdog"Martyn Welch2016-03-011-1/+1
| | | | | | | | | | | | | The Zodiac watchdog is implemented on a microcontoller. The reset reason currently labelled "trigger" is not to detect when the watchdog has triggered (as had been initially understood and suggested by the naming), but to inform the reader that the watchdog, which in fact has it's own hardware watchdog, has been reset because the hardware watchdog has triggered. Renaming to "hw watchdog". Signed-off-by: Martyn Welch <martyn.welch@collabora.co.uk> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: rc32434_wdt: fix ioctl error handlingMichael S. Tsirkin2016-03-011-1/+1
| | | | | | | | | | | | | | | | Calling return copy_to_user(...) in an ioctl will not do the right thing if there's a pagefault: copy_to_user returns the number of bytes not copied in this case. Fix up watchdog/rc32434_wdt to do return copy_to_user(...)) ? -EFAULT : 0; instead. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: Add hardware dependency to BCM7038 driverJean Delvare2016-03-011-3/+5
| | | | | | | | | | | The BCM7038 watchdog driver is specific to Broadcom ARM and MIPS SoCs so do not present it on other architectures, unless build-testing. Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Justin Chen <justinpopo6@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: orion: Depend on 32-bit ARMThierry Reding2016-03-011-0/+1
| | | | | | | | | | | | The driver uses the atomic_io_modify() function to update registers, but that function is only available on 32-bit ARM. Recent changes have added ARCH_MVEBU support to 64-bit ARM and hence allowed this driver to build on 64-bit ARM where this function isn't available and thereby causing allmodconfig builds to break. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: ts4800: add hardware dependencyJean Delvare2016-03-011-0/+1
| | | | | | | | | | | | The Technologic Systems TS-4800 is an i.MX515 board, so its drivers are useless unless building a SOC_IMX51 kernel, except for build testing purposes. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Damien Riegel <damien.riegel@savoirfairelinux.com> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* watchdog: w83627hf: Added NCT6102D support.Rob Kramer2016-03-012-3/+20
| | | | | | | | | As used in (and tested on) the ASRock IMB-150 board. Implementation is identical to other NCT chips, just with different registers. Signed-off-by: Rob Kramer <rob@solution-space.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* Linux 4.5-rc6v4.5-rc6Linus Torvalds2016-02-281-1/+1
|
* Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2016-02-282-131/+244
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A rather largish series of 12 patches addressing a maze of race conditions in the perf core code from Peter Zijlstra" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Robustify task_function_call() perf: Fix scaling vs. perf_install_in_context() perf: Fix scaling vs. perf_event_enable() perf: Fix scaling vs. perf_event_enable_on_exec() perf: Fix ctx time tracking by introducing EVENT_TIME perf: Cure event->pending_disable race perf: Fix race between event install and jump_labels perf: Fix cloning perf: Only update context time when active perf: Allow perf_release() with !event->ctx perf: Do not double free perf: Close install vs. exit race
| * perf: Robustify task_function_call()Peter Zijlstra2016-02-251-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since there is no serialization between task_function_call() doing task_curr() and the other CPU doing context switches, we could end up not sending an IPI even if we had to. And I'm not sure I still buy my own argument we're OK. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.340031200@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix scaling vs. perf_install_in_context()Peter Zijlstra2016-02-251-45/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Completely reworks perf_install_in_context() (again!) in order to ensure that there will be no ctx time hole between add_event_to_ctx() and any potential ctx_sched_in(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.279399438@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix scaling vs. perf_event_enable()Peter Zijlstra2016-02-251-19/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the perf_enable_on_exec(), ensure that event timings are consistent across perf_event_enable(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.218288698@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix scaling vs. perf_event_enable_on_exec()Peter Zijlstra2016-02-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent commit 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling") caused this by moving task_ctx_sched_out() from before __perf_event_mask_enable() to after it. The overlooked consequence of that change is that task_ctx_sched_out() would update the ctx time fields, and now __perf_event_mask_enable() uses stale time. In order to fix this, explicitly stop our context's time before enabling the event(s). Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling") Link: http://lkml.kernel.org/r/20160224174948.159242158@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix ctx time tracking by introducing EVENT_TIMEPeter Zijlstra2016-02-251-12/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently any ctx_sched_in() call will re-start the ctx time tracking, this means that calls like: ctx_sched_in(.event_type = EVENT_PINNED); ctx_sched_in(.event_type = EVENT_FLEXIBLE); will have a hole in their ctx time tracking. This is likely harmless but can confuse things a little. By adding EVENT_TIME, we can have the first ctx_sched_in() (is_active: 0 -> !0) start the time and any further ctx_sched_in() will leave the timestamps alone. Secondly, this allows for an early disable like: ctx_sched_out(.event_type = EVENT_TIME); which would update the ctx time (if the ctx is active) and any further calls to ctx_sched_out() would not further modify the ctx time. For ctx_sched_in() any 0 -> !0 transition will automatically include EVENT_TIME. For ctx_sched_out(), any transition that clears EVENT_ALL will automatically clear EVENT_TIME. These two rules ensure that under normal circumstances we need not bother with EVENT_TIME and get natural ctx time behaviour. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.100446561@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Cure event->pending_disable racePeter Zijlstra2016-02-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because event_sched_out() checks event->pending_disable _before_ actually disabling the event, it can happen that the event fires after it checks but before it gets disabled. This would leave event->pending_disable set and the queued irq_work will try and process it. However, if the event trigger was during schedule(), the event might have been de-scheduled by the time the irq_work runs, and perf_event_disable_local() will fail. Fix this by checking event->pending_disable _after_ we call event->pmu->del(). This depends on the latter being a compiler barrier, such that the compiler does not lift the load and re-creates the problem. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix race between event install and jump_labelsPeter Zijlstra2016-02-252-11/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running) only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Fix cloningPeter Zijlstra2016-02-252-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization, any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains the this event has passed through perf_event_exit_task_context(). And thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Only update context time when activePeter Zijlstra2016-02-251-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.860690919@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Allow perf_release() with !event->ctxPeter Zijlstra2016-02-251-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the err_file: fput(event_file) case, the event will not yet have been attached to a context. However perf_release() does assume it has been. Cure this. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.793996260@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Do not double freePeter Zijlstra2016-02-251-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of: err_file: fput(event_file), we'll end up calling perf_release() which in turn will free the event. Do not then free the event _again_. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf: Close install vs. exit racePeter Zijlstra2016-02-251-9/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following scenario: CPU0 CPU1 ctx = find_get_ctx(); perf_event_exit_task_context() mutex_lock(&ctx->mutex); perf_install_in_context(ctx, ...); /* NO-OP */ mutex_unlock(&ctx->mutex); ... perf_release() WARN_ON_ONCE(event->state != STATE_EXIT); Since the event doesn't pass through perf_remove_from_context() because perf_install_in_context() NO-OPs because the ctx is dead, and perf_event_exit_task_context() will not observe the event because its not attached yet, the event->state will not be set. Solve this by revalidating ctx->task after we acquire ctx->mutex and failing the event creation as a whole. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.626853419@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2016-02-285-7/+15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "This update contains: - Hopefully the last ASM CLAC fixups - A fix for the Quark family related to the IMR lock which makes kexec work again - A off-by-one fix in the MPX code. Ironic, isn't it? - A fix for X86_PAE which addresses once more an unsigned long vs phys_addr_t hickup" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mpx: Fix off-by-one comparison with nr_registers x86/mm: Fix slow_virt_to_phys() for X86_PAE again x86/entry/compat: Add missing CLAC to entry_INT80_32 x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32 x86/platform/intel/quark: Change the kernel's IMR lock bit to false
| * | x86/mpx: Fix off-by-one comparison with nr_registersColin Ian King2016-02-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the unlikely event that regno == nr_registers then we get an array overrun on regoff because the invalid register check is currently off-by-one. Fix this with a check that regno is >= nr_registers instead. Detected with static analysis using CoverityScan. Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information" Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1456512931-3388-1-git-send-email-colin.king@canonical.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86/mm: Fix slow_virt_to_phys() for X86_PAE againDexuan Cui2016-02-251-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "d1cd12108346: x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE" was unintentionally removed by the recent "34437e67a672: x86/mm: Fix slow_virt_to_phys() to handle large PAT bit". And, the variable 'phys_addr' was defined as "unsigned long" by mistake -- it should be "phys_addr_t". As a result, Hyper-V network driver in 32-PAE Linux guest can't work again. Fixes: commit 34437e67a672: "x86/mm: Fix slow_virt_to_phys() to handle large PAT bit" Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Cc: olaf@aepfle.de Cc: gregkh@linuxfoundation.org Cc: jasowang@redhat.com Cc: driverdev-devel@linuxdriverproject.org Cc: linux-mm@kvack.org Cc: apw@canonical.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Link: http://lkml.kernel.org/r/1456394292-9030-1-git-send-email-decui@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86/entry/compat: Add missing CLAC to entry_INT80_32Andy Lutomirski2016-02-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't seem to fix a regression -- I don't think the CLAC was ever there. I double-checked in a debugger: entries through the int80 gate do not automatically clear AC. Stable maintainers: I can provide a backport to 4.3 and earlier if needed. This needs to be backported all the way to 3.10. Reported-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # v3.10 and later Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access") Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32Andy Lutomirski2016-02-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both before and after 5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path"), we relied on a uaccess very early in the SYSENTER path to clear AC. After that change, though, we can potentially make it all the way into C code with AC set, which enlarges the attack surface for SMAP bypass by doing SYSENTER with AC set. Strengthen the SMAP protection by addding the missing ASM_CLAC right at the beginning. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3e36be110724896e32a4a1fe73bacb349d3cba94.1456262295.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/platform/intel/quark: Change the kernel's IMR lock bit to falseBryan O'Donoghue2016-02-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when setting up an IMR around the kernel's .text section we lock that IMR, preventing further modification. While superficially this appears to be the right thing to do, in fact this doesn't account for a legitimate change in the memory map such as when executing a new kernel via kexec. In such a scenario a second kernel can have a different size and location to it's predecessor and can view some of the memory occupied by it's predecessor as legitimately usable DMA RAM. If this RAM were then subsequently allocated to DMA agents within the system it could conceivably trigger an IMR violation. This patch fixes the this potential situation by keeping the kernel's .text section IMR lock bit false by default. Suggested-by: Ingo Molnar <mingo@kernel.org> Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boon.leong.ong@intel.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/1456190999-12685-2-git-send-email-pure.logic@nexus-software.ie Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2016-02-281-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixlet from Thomas Gleixner: "A trivial printk typo fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix trivial typo in printk() message
| * | | sched/deadline: Fix trivial typo in printk() messageSteven Rostedt2016-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's "too much" not "to much". Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Juri Lelli <juri.lelli@arm.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160210120422.4ca77e68@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2016-02-282-11/+8
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Four small fixes for irqchip drivers: - Add missing low level irq handler initialization on mxs, so interrupts can acutally be delivered - Add a missing barrier to the GIC driver - Two fixes for the GIC-V3-ITS driver, addressing a double EOI write and a cache flush beyond the actual region" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar() irqchip/mxs: Add missing set_handle_irq() irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1