Commit message | Author | Age | Files | Lines
* timers, sched/clock: Clean up the code a bit | Ingo Molnar | 2015-03-27 | 1 | -51/+56

    Trivial cleanups, to improve the readability of the generic sched_clock() code:

     - Improve and standardize comments
     - Standardize the coding style
     - Use vertical spacing where appropriate
     - etc.

    No code changed:

      md5:
        19a053b31e0c54feaeff1492012b019a  sched_clock.o.before.asm
        19a053b31e0c54feaeff1492012b019a  sched_clock.o.after.asm

    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Daniel Thompson <daniel.thompson@linaro.org>
    Cc: John Stultz <john.stultz@linaro.org>
    Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* timers, sched/clock: Avoid deadlock during read from NMI | Daniel Thompson | 2015-03-27 | 1 | -35/+68

    Currently it is possible for an NMI (or FIQ on ARM) to come in and read sched_clock() whilst update_sched_clock() has locked the seqcount for writing. This results in the NMI handler locking up when it calls raw_read_seqcount_begin().

    This patch fixes the NMI safety issues by providing banked clock data. This is a similar approach to the one used in Thomas Gleixner's 4396e058c52e ("timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC").

    Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
    Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/1427397806-20889-6-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
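The banked-data idea in the commit above can be sketched in portable C11. This is a minimal user-space illustration, not the kernel code: the names `clock_bank`, `banked_clock`, `banked_update()` and `banked_read()` are hypothetical, and sequentially consistent atomics stand in for the kernel's seqcount latch primitives. The writer always leaves one bank consistent, so a reader that interrupts the writer never has to wait for it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the data sched_clock() reads. */
struct clock_bank {
	_Atomic uint64_t epoch_ns;
	_Atomic uint64_t epoch_cyc;
};

struct banked_clock {
	atomic_uint seq;            /* low bit selects the bank readers use */
	struct clock_bank bank[2];
};

static void bank_store(struct clock_bank *b, uint64_t ns, uint64_t cyc)
{
	atomic_store(&b->epoch_ns, ns);
	atomic_store(&b->epoch_cyc, cyc);
}

/* Writer: never blocks readers, one bank is always consistent. */
static void banked_update(struct banked_clock *c, uint64_t ns, uint64_t cyc)
{
	bank_store(&c->bank[1], ns, cyc);  /* refresh the odd copy */
	atomic_fetch_add(&c->seq, 1);      /* steer readers to the odd copy */
	bank_store(&c->bank[0], ns, cyc);  /* now refresh the even copy */
	atomic_fetch_add(&c->seq, 1);      /* steer readers back to the even copy */
}

/* Reader: terminates even if it interrupts banked_update() midway. */
static void banked_read(struct banked_clock *c, uint64_t *ns, uint64_t *cyc)
{
	unsigned int seq;

	assert(ns && cyc);
	do {
		seq = atomic_load(&c->seq);
		*ns = atomic_load(&c->bank[seq & 1].epoch_ns);
		*cyc = atomic_load(&c->bank[seq & 1].epoch_cyc);
	} while (atomic_load(&c->seq) != seq);  /* retry if a flip happened */
}
```

The design choice to note is that the reader retries only when the sequence number changed during its read; with a plain seqcount it would spin while the count is odd, which is the lock-up the commit describes.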
* timers, sched/clock: Remove redundant notrace from update function | Daniel Thompson | 2015-03-27 | 1 | -1/+1

    Currently update_sched_clock() is marked as notrace but this function is not called by ftrace. This is trivially fixed by removing the mark up.

    Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/1427397806-20889-5-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* timers, sched/clock: Remove suspend from clock_read_data() | Daniel Thompson | 2015-03-27 | 1 | -15/+25

    Currently cd.read_data.suspended is read by the hotpath function sched_clock(). This variable need not be accessed on the hotpath. In fact, once it is removed, we can remove the conditional branches from sched_clock() and install a dummy read_sched_clock function to suspend the clock.

    The new master copy of the function pointer (actual_read_sched_clock) is introduced and is used for all reads of the clock hardware except those within sched_clock itself.

    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/1427397806-20889-4-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
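A rough user-space sketch of the "dummy read function" technique described above. Everything here is hypothetical (`fake_hw_counter`, `hot_read`, `actual_read`, a 1:1 cycle-to-nanosecond ratio); it only illustrates how swapping a function pointer can freeze the clock without a `suspended` flag or a branch on the hot path.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*read_fn)(void);

/* Hypothetical hardware counter; a real driver would read a register. */
static uint64_t fake_hw_counter;
static uint64_t hw_read(void) { return fake_hw_counter; }

static uint64_t epoch_ns;   /* time accumulated up to the last update */
static uint64_t epoch_cyc;  /* counter value at the last update */

/* Dummy reader: returns the epoch, so no cycles appear to elapse. */
static uint64_t suspended_read(void) { return epoch_cyc; }

static read_fn actual_read = hw_read;  /* master copy, used off the hot path */
static read_fn hot_read = hw_read;     /* the only pointer the hot path uses */

/* Hot path: no flag test, no conditional branch (1 cycle == 1 ns here). */
static uint64_t clock_ns(void)
{
	return epoch_ns + (hot_read() - epoch_cyc);
}

static void clock_suspend(void)
{
	uint64_t cyc = actual_read();

	epoch_ns += cyc - epoch_cyc;  /* fold elapsed time into the epoch */
	epoch_cyc = cyc;
	hot_read = suspended_read;    /* freeze the clock */
}

static void clock_resume(void)
{
	assert(hot_read == suspended_read);  /* resume only after suspend */
	epoch_cyc = actual_read();           /* skip the time spent suspended */
	hot_read = actual_read;
}
```

Keeping a separate master pointer matters because suspend and resume still need to read the real hardware while the hot-path pointer is parked on the dummy.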
* timers, sched/clock: Optimize cache line usage | Daniel Thompson | 2015-03-27 | 1 | -35/+77

    Currently sched_clock(), a very hot code path, is not optimized to minimise its cache profile. In particular:

      1. cd is not ____cacheline_aligned,

      2. struct clock_data does not distinguish between hotpath and coldpath data, reducing locality of reference in the hotpath,

      3. Some hotpath data is missing from struct clock_data and is marked __read_mostly (which more or less guarantees it will not share a cache line with cd).

    This patch corrects these problems by extracting all hotpath data into a separate structure and using ____cacheline_aligned to ensure the hotpath uses a single (64 byte) cache line.

    Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/1427397806-20889-3-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
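The hot/cold split described above might look roughly like this in standard C11, where `alignas` plays the role of ____cacheline_aligned. The struct and field names are illustrative assumptions, not the kernel's layout, and the 64-byte line size is taken from the commit text.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumption: 64-byte cache lines */

/* Everything the hot path touches, kept small and contiguous. */
struct hot_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint64_t mask;
	uint64_t (*read_cycles)(void);
	uint32_t mult;
	uint32_t shift;
};

struct clock_state {
	/* Hot data starts on a cache line boundary. */
	alignas(CACHE_LINE) struct hot_data hot;

	/* Cold data: only touched when registering or updating the clock. */
	uint64_t wrap_ns;
	unsigned long rate;
};

/* Compile-time checks that the layout keeps its promise. */
static_assert(sizeof(struct hot_data) <= CACHE_LINE,
	      "hot path data must fit in one cache line");
static_assert(offsetof(struct clock_state, hot) == 0,
	      "hot path data must start the structure");
static_assert(alignof(struct clock_state) == CACHE_LINE,
	      "structure must be cache line aligned");
```

The static assertions turn the layout intent into a build failure if a later change grows the hot structure past one line.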
* timers, sched/clock: Match scope of read and write seqcounts | Daniel Thompson | 2015-03-27 | 1 | -15/+11

    Currently the scope of the raw_write_seqcount_begin/end() in sched_clock_register() far exceeds the scope of the read section in sched_clock(). This gives the impression of safety during cursory review but achieves little.

    Note that this is likely to be a latent issue at present because sched_clock_register() is typically called before we enable interrupts, however the issue does risk bugs being needlessly introduced as the code evolves.

    This patch fixes the problem by increasing the scope of the read locking performed by sched_clock() to cover all data modified by sched_clock_register.

    We also improve clarity by moving writes to struct clock_data that do not impact sched_clock() outside of the critical section.

    Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    [ Reworked it slightly to apply to tip/timers/core]
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/1427397806-20889-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge branch 'timers/urgent' into timers/core, to pick up fixes before applying new changes | Ingo Molnar | 2015-03-17 | 2 | -6/+6

    Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * Merge branch 'clockevents/4.0-rc2' of http://git.linaro.org/people/daniel.lezcano/linux into timers/urgent | Ingo Molnar | 2015-03-05 | 2 | -6/+6

    Pull clockevents fixes from Daniel Lezcano:

    " These two patches fix a potential crash at boot time.

      - Fix setup_irq / clockevents_config_and_register init ordering to prevent an interrupt from firing before the handler is set for sun5i and efm32. (Yongbae Park)"

    Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * clockevents: sun5i: Fix setup_irq init sequence | Yongbae Park | 2015-03-05 | 1 | -4/+4

    The interrupt is enabled before the handler is set. Even though this bug has not been observed, it is potentially dangerous as it can lead to a NULL pointer dereference.

    Fix the error by enabling the interrupt after clockevents_config_and_register() is called.

    Cc: stable@vger.kernel.org
    Signed-off-by: Yongbae Park <yongbae2@gmail.com>
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
| | * clocksource: efm32: Fix a NULL pointer dereference | Yongbae Park | 2015-03-05 | 1 | -2/+2

    The initialisation of the efm32 clocksource first sets up the irq and only after that initialises the data needed for irq handling. In case this initialisation is delayed the irq handler would dereference a NULL pointer.

    I'm not aware of anything that could delay the process in such a way, but it's better to be safe than sorry, so set up the irq only when the clock event device is ready.

    Cc: stable@vger.kernel.org
    Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Signed-off-by: Yongbae Park <yongbae2@gmail.com>
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* | clocksource: Rename __clocksource_updatefreq_*() to __clocksource_update_freq_*() | John Stultz | 2015-03-13 | 6 | -14/+15

    Ingo requested this function be renamed to improve readability, so I've renamed __clocksource_updatefreq_scale() as well as the __clocksource_updatefreq_hz/khz() functions to avoid squishedtogethernames.

    This touches some of the sh clocksources, which I've not tested.

    The arch/arm/plat-omap change is just a comment change for consistency.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-13-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | clocksource: Add some debug info about clocksources being registered | John Stultz | 2015-03-13 | 1 | -0/+3

    Print the mask, max_cycles, and max_idle_ns values for clocksources being registered.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-12-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | clocksource, sparc32: Convert to using clocksource_register_hz() | John Stultz | 2015-03-13 | 1 | -5/+1

    While cleaning up some clocksource code, I noticed the time_32 implementation uses the clocksource_hz2mult() helper, but doesn't use the clocksource_register_hz() method.

    I don't believe the Sparc clocksource is a default clocksource, so we shouldn't need to self-define the mult/shift pair.

    So convert the time_32.c implementation to use clocksource_register_hz().

    Untested.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Acked-by: David S. Miller <davem@davemloft.net>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-11-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
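For context on what "self-defining the mult/shift pair" involves, here is a hedged user-space sketch of the arithmetic. The helper names `hz_to_mult()` and `cyc_to_ns()` are stand-ins chosen for this example; the shift value is picked by the caller, which is the part clocksource_register_hz() is meant to take off the driver's hands.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Compute mult so that ns = (cycles * mult) >> shift for a counter
 * running at hz. Rounded to nearest rather than truncated.
 */
static uint32_t hz_to_mult(uint32_t hz, uint32_t shift)
{
	uint64_t tmp;

	assert(hz != 0 && shift < 32);
	tmp = NSEC_PER_SEC << shift;
	tmp += hz / 2;              /* round to nearest */
	return (uint32_t)(tmp / hz);
}

/* Convert a cycle count to nanoseconds with a mult/shift pair. */
static uint64_t cyc_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}
```

For a 1 MHz counter and a shift of 10 this yields a mult of 1024000, so one million cycles convert to exactly one second.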
* | clocksource: Mostly kill clocksource_register() | John Stultz | 2015-03-13 | 5 | -52/+47

    A long running project has been to clean up remaining uses of clocksource_register(), replacing it with the simpler clocksource_register_khz/hz() functions.

    However, there are a few cases where we need to self-define our mult/shift values, so switch the function to a more obviously internal __clocksource_register() name, and consolidate much of the internal logic so we don't have duplication.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-10-git-send-email-john.stultz@linaro.org
    [ Minor cleanups. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | clocksource: Improve clocksource watchdog reporting | John Stultz | 2015-03-13 | 1 | -9/+9

    The clocksource watchdog reporting has been less helpful than desired, as it just printed the delta between the two clocksources. This prevents any useful analysis of why the skew occurred.

    Thus this patch tries to improve the output when we mark a clocksource as unstable, printing out the cycle last and now values for both the current clocksource and the watchdog clocksource. This will allow us to see if the result was due to a false positive caused by a problematic watchdog.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-9-git-send-email-john.stultz@linaro.org
    [ Minor cleanups of kernel messages. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | timekeeping: Add warnings when overflows or underflows are observed | John Stultz | 2015-03-13 | 1 | -7/+57

    It was suggested that the underflow/overflow protection should probably throw some sort of warning out, rather than just silently fixing the issue.

    So this patch adds some warnings here. The flag variables used are not protected by locks, but since we can't print from the reading functions, just being able to say we saw an issue in the update interval is useful enough, and can be slightly racy without real consequence.

    The big complication is that we're only under a read seqlock, so the data could shift under us during our calculation to see if there was a problem. This patch avoids this issue by nesting another seqlock which allows us to snapshot just the required values atomically. So we shouldn't see false positives.

    I also added some basic rate-limiting here, since on one build machine w/ skewed TSCs it was fairly noisy at bootup.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-8-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | timekeeping: Try to catch clocksource delta underflows | John Stultz | 2015-03-13 | 1 | -0/+7

    In the case where there is a broken clocksource where there are multiple actual clocks that aren't perfectly aligned, we may see small "negative" deltas when we subtract 'cycle_last' from 'now'.

    The values are actually negative with respect to the clocksource mask value, not necessarily negative if cast to a s64, but we can check by checking the delta to see if it is a small (relative to the mask) negative value (again negative relative to the mask).

    If so, we assume we jumped backwards somehow and instead use zero for our delta.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-7-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
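The mask-relative check described above can be sketched as follows. This is an illustration, not the kernel function: `masked_delta()` is a hypothetical name, and treating "within one eighth of the mask" as "small" is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the cycle delta between two counter reads, treating a small
 * step backwards (relative to the counter mask) as zero.
 */
static uint64_t masked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	uint64_t delta;

	assert(mask != 0);
	delta = (now - last) & mask;

	/*
	 * A small negative step shows up as a value just below the mask,
	 * so its complement within the mask is small. The mask >> 3
	 * threshold is an assumption of this sketch.
	 */
	if ((~delta & mask) < (mask >> 3))
		delta = 0;

	return delta;
}
```

Note that a genuine counter wrap still produces the correct small positive delta, because the subtraction is done modulo the mask before the check.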
* | timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value | John Stultz | 2015-03-13 | 1 | -14/+35

    When calculating the current delta since the last tick, we currently have no hard protections to prevent a multiplication overflow from occurring.

    This patch introduces infrastructure to allow a cap that limits the clocksource read delta value to the 'max_cycles' value, which is where an overflow would occur.

    Since this is in the hotpath, it adds the extra checking under CONFIG_DEBUG_TIMEKEEPING=y.

    There was some concern that capping time like this could cause problems as we may stop expiring timers, which could go circular if the timer that triggers time accumulation were mis-scheduled too far in the future, which would cause time to stop.

    However, since the mult overflow would result in a smaller time value, we would effectively have the same problem there.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-6-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
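A minimal sketch of the capping idea, assuming a hypothetical helper `capped_delta()` and an `overflow_seen` flag that a slower path could later turn into a warning. It is not the kernel implementation, only the shape of the check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp the cycle delta to max_cycles so that a later
 * (delta * mult) multiplication cannot overflow 64 bits.
 */
static uint64_t capped_delta(uint64_t now, uint64_t last, uint64_t mask,
			     uint64_t max_cycles, int *overflow_seen)
{
	uint64_t delta;

	assert(overflow_seen);
	delta = (now - last) & mask;

	if (delta > max_cycles) {
		*overflow_seen = 1;  /* record it; do not print from here */
		delta = max_cycles;
	}

	return delta;
}
```

Recording a flag instead of printing reflects the constraint that the read path cannot safely emit messages itself.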
* | timekeeping: Add debugging checks to warn if we see delays | John Stultz | 2015-03-13 | 3 | -0/+42

    Recently there have been requests for better sanity checking in the time code, so that it's more clear when something is going wrong, since timekeeping issues could manifest in a large number of strange ways in various subsystems.

    Thus, this patch adds some extra infrastructure to add a check to update_wall_time() to print two new warnings:

     1) if we see the call delayed beyond the 'max_cycles' overflow point,

     2) or if we see the call delayed beyond the clocksource's 'max_idle_ns' value, which is currently 50% of the overflow point.

    This extra infrastructure is conditional on a new CONFIG_DEBUG_TIMEKEEPING option, also added in this patch - default off.

    Tested this a bit by halting qemu for specified lengths of time to trigger the warnings.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-5-git-send-email-john.stultz@linaro.org
    [ Improved the changelog and the messages a bit. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | clocksource: Add 'max_cycles' to 'struct clocksource' | John Stultz | 2015-03-12 | 3 | -15/+20

    In order to facilitate clocksource validation, add a 'max_cycles' field to the clocksource structure which will hold the maximum cycle value that can safely be multiplied without potentially causing an overflow.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-4-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | clocksource: Simplify the logic around clocksource wrapping safety margins | John Stultz | 2015-03-12 | 2 | -16/+14

    The clocksource logic has a number of places where we try to include a safety margin. Most of these are 12.5% safety margins, but they are inconsistently applied and sometimes are applied on top of each other.

    Additionally, in the previous patch, we corrected an issue where we unintentionally in effect created a 50% safety margin, which these 12.5% margins were then added to.

    So to simplify the logic here, this patch removes the various 12.5% margins, and consolidates adding the margin in one place: clocks_calc_max_nsecs().

    Additionally, Linus prefers a 50% safety margin, as it allows bad clock values to be more easily caught. This should really have no net effect, due to the corrected issue earlier which caused greater than 50% margins to be used w/o issue.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Acked-by: Stephen Boyd <sboyd@codeaurora.org> (for the sched_clock.c bit)
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-3-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
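To make the arithmetic concrete, here is a user-space sketch that combines a plain unsigned divide to find the overflow point with a single 50% margin applied in one place. The function name `calc_max_nsecs()` and its exact parameter list are assumptions for this example; it mirrors the description in the commit messages rather than quoting kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Largest interval (in ns) that can be converted without the
 * cycles * mult multiplication overflowing an unsigned 64-bit value.
 * maxadj is the largest adjustment that may be applied to mult.
 */
static uint64_t calc_max_nsecs(uint32_t mult, uint32_t shift, uint32_t maxadj,
			       uint64_t mask, uint64_t *max_cyc)
{
	uint64_t max_cycles, max_nsecs;

	assert(mult > maxadj);

	/* Overflow point for the worst case (largest) multiplier. */
	max_cycles = UINT64_MAX / ((uint64_t)mult + maxadj);

	/* The counter cannot count past its own mask. */
	if (max_cycles > mask)
		max_cycles = mask;

	/* Convert with the smallest multiplier to stay conservative. */
	max_nsecs = (max_cycles * (mult - maxadj)) >> shift;

	if (max_cyc != NULL)
		*max_cyc = max_cycles;

	/* The only safety margin: 50%. */
	return max_nsecs >> 1;
}
```

Applying the margin once at the end makes the effective margin easy to reason about, which is the point of consolidating it.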
* | clocksource: Simplify the clocks_calc_max_nsecs() logic | John Stultz | 2015-03-12 | 1 | -12/+3

    The previous clocks_calc_max_nsecs() code had some unnecessarily complex bit logic to find the max interval that could cause multiplication overflows. Since this is not in the hot path, just do the divide to make it easier to read.

    The previous implementation also had a subtle issue that it avoided overflows with signed 64-bit values, whereas the intervals are always unsigned. This resulted in overly conservative intervals, which other safety margins were then added to, reducing the intended interval length.

    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Cc: Dave Jones <davej@codemonkey.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/1426133800-29329-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge tag 'v4.0-rc2' into timers/core, to refresh the tree before pulling more changes | Ingo Molnar | 2015-03-04 | 8789 | -200491/+378180
| * Linux 4.0-rc2 (tag: v4.0-rc2) | Linus Torvalds | 2015-03-03 | 1 | -1/+1
| * drm/i915: Fix modeset state confusion in the load detect code | Daniel Vetter | 2015-03-03 | 1 | -0/+1

    This is a tricky story of the new atomic state handling and the legacy code fighting each other.

    The bug at hand is an underrun of the framebuffer reference with subsequent hilarity caused by the load detect code. Which is peculiar since the exact same code works fine as the implementation of the legacy setcrtc ioctl.

    Let's look at the ingredients:

    - Currently our code is a crazy mix of legacy modeset interfaces to set the parameters and half-baked atomic state tracking underneath. While this transition is going on we're using the transitional plane helpers to update the atomic side (drm_plane_helper_disable/update and friends), i.e. plane->state->fb. Since the state structure owns the fb those functions take care of that themselves.

      The legacy state (specifically crtc->primary->fb) is still managed by the old code (and mostly by the drm core), with the fb reference counting done by callers (core drm for the ioctl or the i915 load detect code). The relevant commit is

      commit ea2c67bb4affa84080c616920f3899f123786e56
      Author: Matt Roper <matthew.d.roper@intel.com>
      Date:   Tue Dec 23 10:41:52 2014 -0800

          drm/i915: Move to atomic plane helpers (v9)

    - drm_plane_helper_disable has special code to handle multiple calls in a row - it checks plane->crtc == NULL and bails out. This is to match the proper atomic implementation which needs the crtc to get at the implied locking context atomic updates always need. See

      commit acf24a395c5a9290189b080383564437101d411c
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Tue Jul 29 15:33:05 2014 +0200

          drm/plane-helper: transitional atomic plane helpers

    - The universal plane code split out the implicit primary plane from the CRTC into its own full-blown drm_plane object. As part of that the setcrtc ioctl (which updated both the crtc mode and primary plane) learned to set crtc->primary->crtc on modeset to make sure the plane->crtc assignments stay up to date in

      commit e13161af80c185ecd8dc4641d0f5df58f9e3e0af
      Author: Matt Roper <matthew.d.roper@intel.com>
      Date:   Tue Apr 1 15:22:38 2014 -0700

          drm: Add drm_crtc_init_with_planes() (v2)

      Unfortunately we've forgotten to update the load detect code. Which wasn't a problem since the load detect modeset is temporary and always undone before we drop the locks.

    - Finally there is an organically grown history (i.e. don't ask) around who sets the legacy plane->fb for the various driver entry points. Originally updating that was the driver's duty, but for almost all places we've moved that (plus updating the refcounts) into the core. Again the exception is the load detect code.

    Taking all together the following happens:

    - The load detect code doesn't set crtc->primary->crtc. This is only really an issue on crtcs never before used or when userspace explicitly disabled the primary plane.

    - The plane helper glue code short-circuits because of that and leaves a non-NULL fb behind in plane->state->fb and plane->fb. The state fb isn't a real problem (it's properly refcounted on its own), it's just the canary.

    - Load detect code drops the reference for that fb, but doesn't set plane->fb = NULL. This is ok since it's still living in that old world where drivers had to clear the pointer but the core/callers handled the refcounting.

    - On the next modeset the drm core notices plane->fb and takes care of refcounting it properly by doing another unref. This drops the refcount to zero, leaving state->plane now pointing at freed memory.

    - intel_plane_duplicate_state still assumes it owns a reference to that very state->fb and bad things start to happen.

    Fix this all by applying the same duct-tape as for the legacy setcrtc ioctl code and set crtc->primary->crtc properly.

    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: Paul Bolle <pebolle@tiscali.nl>
    Cc: Rob Clark <robdclark@gmail.com>
    Cc: Paulo Zanoni <przanoni@gmail.com>
    Cc: Sean Paul <seanpaul@chromium.org>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Reported-by: Paul Bolle <pebolle@tiscali.nl>
    Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * Merge tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio | Linus Torvalds | 2015-03-02 | 2 | -8/+15

    Pull GPIO fixes from Linus Walleij:
     "Two GPIO fixes:

       - Fix a translation problem in of_get_named_gpiod_flags()

       - Fix a long standing container_of() mistake in the TPS65912 driver"

    * tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
      gpio: tps65912: fix wrong container_of arguments
      gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node
| | * gpio: tps65912: fix wrong container_of arguments | Nicolas Saenz Julienne | 2015-02-23 | 1 | -4/+10

    The gpio_chip operations receive a pointer to the gpio_chip struct which is contained in the driver's private struct, yet the container_of call in those functions points to the mfd struct defined in include/linux/mfd/tps65912.h.

    Cc: Stable <stable@vger.kernel.org>
    Signed-off-by: Nicolas Saenz Julienne <nicolassaenzj@gmail.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
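The class of mistake fixed above is easier to see with a small stand-alone example. The struct names below (`gpio_chip_sketch`, `mfd_sketch`, `driver_priv`) are hypothetical stand-ins, and `container_of` is re-implemented here with `offsetof` so the sketch builds outside the kernel. The point is that the type passed to container_of must be the struct that actually embeds the member.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel macro, without type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct gpio_chip_sketch { int ngpio; };  /* stand-in for struct gpio_chip */
struct mfd_sketch { int id; };           /* stand-in for the mfd struct */

/* The driver's private struct is what embeds the chip. */
struct driver_priv {
	struct mfd_sketch *mfd;
	struct gpio_chip_sketch gpio_chip;
};

/* Correct: walk back from the chip to the struct that contains it. */
static struct driver_priv *to_priv(struct gpio_chip_sketch *gc)
{
	assert(gc != NULL);
	return container_of(gc, struct driver_priv, gpio_chip);
}

/* The mfd struct is then reached through the private struct's pointer. */
static struct mfd_sketch *to_mfd(struct gpio_chip_sketch *gc)
{
	return to_priv(gc)->mfd;
}
```

Naming the wrong containing type compiles without complaint in this simplified macro but yields a pointer into unrelated memory, which is why such bugs can go unnoticed for a long time.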
| | * gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node | Hans Holmberg | 2015-02-23 | 1 | -4/+5

    The change:

    7b8792bbdffdff3abda704f89c6a45ea97afdc62
    gpiolib: of: Correct error handling in of_get_named_gpiod_flags

    assumed that only one gpio-chip is registered per of-node. Some drivers register more than one chip per of-node, so adjust the matching function of_gpiochip_find_and_xlate to not stop looking for chips if a node-match is found and the translation fails.

    Cc: Stable <stable@vger.kernel.org>
    Fixes: 7b8792bbdffd ("gpiolib: of: Correct error handling in of_get_named_gpiod_flags")
    Signed-off-by: Hans Holmberg <hans.holmberg@intel.com>
    Acked-by: Alexandre Courbot <acourbot@nvidia.com>
    Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
    Tested-by: Tyler Hall <tylerwhall@gmail.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| * | Merge branch 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal | Linus Torvalds | 2015-03-02 | 12 | -67/+145

    Pull thermal management fixes from Eduardo Valentin:
     "Specifics:

       - Several fixes in tmon tool.

       - Fixes in intel int340x for _ART and _TRT tables.

       - Add id for Avoton SoC into powerclamp driver.

       - Fixes in RCAR thermal driver to remove race conditions and fix fail path

       - Fixes in TI thermal driver: removal of unnecessary code and build fix if !CONFIG_PM_SLEEP

       - Cleanups in exynos thermal driver

       - Add stubs for include/linux/thermal.h. Now drivers using thermal calls but that also work without CONFIG_THERMAL will be able to compile for systems that don't care about thermal.

      Note: I am sending this pull on Rui's behalf while he fixes issues in his Linux box"

    * 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
      thermal: int340x_thermal: Ignore missing _ART, _TRT tables
      thermal/intel_powerclamp: add id for Avoton SoC
      tools/thermal: tmon: silence 'set but not used' warnings
      tools/thermal: tmon: use pkg-config to determine library dependencies
      tools/thermal: tmon: support cross-compiling
      tools/thermal: tmon: add .gitignore
      tools/thermal: tmon: fixup tui windowing calculations
      tools/thermal: tmon: tui: don't hard-code dialog window size assumptions
      tools/thermal: tmon: add min/max macros
      tools/thermal: tmon: add --target-temp parameter
      thermal: exynos: Clean-up code to use oneline entry for exynos compatible table
      thermal: rcar: Make error and remove paths symmetrical with init
      thermal: rcar: Fix race condition between init and interrupt
      thermal: Introduce dummy functions when thermal is not defined
      ti-soc-thermal: Delete an unnecessary check before the function call "cpufreq_cooling_unregister"
      thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP
| | * \ Merge branch 'tmon-fixes' of .git into next  Zhang Rui  2015-02-28  5  -14/+63
| | |\ \
| | | * | tools/thermal: tmon: silence 'set but not used' warnings  Brian Norris  2015-02-28  1  -0/+3
| | | | | gcc complains about the 'cols' variable being unused. This is
| | | | | unavoidable, given the ncurses getmaxyx() macro-based API, which
| | | | | wants to assign to a variable directly, even when we're not going to
| | | | | use it.
| | | | |
| | | | | Warning:
| | | | |
| | | | |   gcc -O1 -Wall -Wshadow -W -Wformat -Wimplicit-function-declaration -Wimplicit-int -fstack-protector -D VERSION=\"1.0\" -c -o tui.o tui.c
| | | | |   tui.c: In function ‘show_dialogue’:
| | | | |   tui.c:288:12: warning: variable ‘cols’ set but not used [-Wunused-but-set-variable]
| | | | |     int rows, cols;
| | | | |               ^
| | | | |
| | | | | So, add a hack to get rid of that warning.
| | | | |
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
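The warning described above comes from a macro that writes to both of its arguments even when the caller needs only one. The sketch below reproduces the situation without linking against ncurses: `fake_getmaxyx` and `struct fake_window` are hypothetical stand-ins, and the `(void)cols;` cast is one common way to silence the warning, not necessarily the hack the commit added.

```c
/* Hypothetical stand-in for the ncurses getmaxyx() macro: like the real
 * one, it assigns to its arguments directly instead of taking pointers. */
#define fake_getmaxyx(win, y, x) ((y) = (win)->rows, (x) = (win)->cols)

/* Hypothetical window type, used only for this illustration. */
struct fake_window {
	int rows;
	int cols;
};

/* Returns the number of rows. The column count is fetched only because
 * the macro insists on storing it somewhere. */
int window_rows(const struct fake_window *win)
{
	int rows, cols;

	fake_getmaxyx(win, rows, cols);
	(void)cols;	/* explicit "use" so gcc stops warning about it */

	return rows;
}
```

Casting to `void` counts as a read of the variable, so `-Wunused-but-set-variable` no longer fires, and the generated code is unchanged.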
| | | * | tools/thermal: tmon: use pkg-config to determine library dependencies  Brian Norris  2015-02-28  1  -1/+10
| | | | | Some distros (e.g., Arch Linux) don't package the tinfo library
| | | | | separately from ncurses, so don't unconditionally include it.
| | | | | Instead, use pkg-config.
| | | | |
| | | | | The $(STATIC) ugliness is to handle the reported build case from
| | | | | commit 6b533269fb25 ("tools/thermal: tmon: fix compilation errors
| | | | | when building statically"), where a developer wants to be able to
| | | | | build with:
| | | | |
| | | | |   make LDFLAGS=-static
| | | | |
| | | | | which requires an additional pkg-config flag.
| | | | |
| | | | | Finally, support a lowest common denominator fallback (-lpanel
| | | | | -lncurses) for build systems that don't have pkg-config entries for
| | | | | ncurses.
| | | | |
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | | * | tools/thermal: tmon: support cross-compiling  Brian Norris  2015-02-28  1  -2/+2
| | | | | We might want to prepare CFLAGS outside of this Makefile, so don't
| | | | | overwrite its initial value.
| | | | |
| | | | | Then, support $(CROSS_COMPILE), so we can use a cross-compile
| | | | | toolchain.
| | | | |
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | | * | tools/thermal: tmon: add .gitignore  Brian Norris  2015-02-28  1  -0/+1
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | | * | tools/thermal: tmon: fixup tui windowing calculations  Brian Norris  2015-02-28  1  -5/+16
| | | | | The number of rows in the dialog varies according to the number of
| | | | | cooling devices. However, some of the windowing computations were
| | | | | assuming a fixed number of rows. This computation is OK when we have
| | | | | between 4 and 9 cooling devices (and they wrap to the next column),
| | | | | but with fewer devices, we end up printing off the end of the
| | | | | window.
| | | | |
| | | | | This unifies the row computation into a single function and uses
| | | | | that throughout the TUI code. This also accounts for increasing the
| | | | | number of rows when there are more than 9 total cooling devices.
| | | | |
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
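The idea of a single row-computation helper can be sketched as follows. This is an illustration under stated assumptions rather than the tmon source: the names `dialog_rows` and `PREFERRED_ROWS`, the two-column layout, and the value 5 are all hypothetical, chosen so that a short list shrinks the dialog and a long list grows it.

```c
/* Hypothetical preferred number of rows per column in the dialog. */
#define PREFERRED_ROWS 5

static int int_min(int a, int b) { return a < b ? a : b; }
static int int_max(int a, int b) { return a > b ? a : b; }

/* Rows needed to show `entries` items in at most two columns:
 *  - few entries: one row per entry, so nothing is drawn past the end
 *  - mid range:   the preferred height, wrapping into the second column
 *  - many:        grow so that two columns can still hold every entry */
int dialog_rows(int entries)
{
	int rows = int_max(PREFERRED_ROWS, (entries + 1) / 2);

	return int_min(rows, entries);
}
```

Every caller that sizes or positions the dialog would then use `dialog_rows()` instead of its own constant, which is the unification the commit message describes.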
| | | * | tools/thermal: tmon: tui: don't hard-code dialog window size assumptions  Brian Norris  2015-02-28  1  -4/+5
| | | | | We can use the ncurses API to get the number of rows.
| | | | |
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | | * | tools/thermal: tmon: add min/max macros  Brian Norris  2015-02-28  1  -0/+12
| | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | | * | tools/thermal: tmon: add --target-temp parameter  Brian Norris  2015-02-28  2  -2/+14
| | | |/
| | | | If we launch in daemon mode (--daemon), we don't have the ncurses UI,
| | | | but we might want to set the target temperature still. For example,
| | | | someone might stick the following in their boot script:
| | | |
| | | |   tmon --control intel_powerclamp --target-temp 90 --log --daemon
| | | |
| | | | This would turn on CPU idle injection when we're around 90 degrees
| | | | celsius, and would log temperature and throttling info to
| | | | /var/tmp/tmon.log.
| | | |
| | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com>
| | | | Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
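A command line like the one in the message above is typically parsed with getopt_long(). The following is a minimal sketch, not the tmon implementation: `struct sketch_opts`, `parse_opts`, and the short option letters are assumptions made for illustration; only the long option names come from the commit message.

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical option container for this sketch. */
struct sketch_opts {
	const char *control;	/* cooling device type, e.g. intel_powerclamp */
	double target_temp;	/* degrees Celsius; negative means "not set" */
	int logging;
	int daemon_mode;
};

/* Parses the long options named in the commit message.
 * Returns 0 on success and -1 on an unrecognised option. */
int parse_opts(int argc, char **argv, struct sketch_opts *o)
{
	static const struct option longopts[] = {
		{ "control",     required_argument, NULL, 'c' },
		{ "daemon",      no_argument,       NULL, 'd' },
		{ "log",         no_argument,       NULL, 'l' },
		{ "target-temp", required_argument, NULL, 'T' },
		{ NULL, 0, NULL, 0 }
	};
	int c;

	o->control = NULL;
	o->target_temp = -1.0;
	o->logging = 0;
	o->daemon_mode = 0;

	optind = 1;	/* restart scanning in case getopt was used before */
	while ((c = getopt_long(argc, argv, "c:dlT:", longopts, NULL)) != -1) {
		switch (c) {
		case 'c':
			o->control = optarg;
			break;
		case 'd':
			o->daemon_mode = 1;
			break;
		case 'l':
			o->logging = 1;
			break;
		case 'T':
			o->target_temp = strtod(optarg, NULL);
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because the target temperature arrives through the option parser, it is available before any UI is created, which is what makes it usable in daemon mode.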
| | * | thermal: int340x_thermal: Ignore missing _ART, _TRT tables  Srinivas Pandruvada  2015-02-28  1  -6/+4
| | | | It is possible that _ART/_TRT tables are missing or have errors.
| | | | Ignore those failures, as INT3400 thermal zone is still required for
| | | | _OSC or mode switch.
| | | |
| | | | Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
| | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | * | thermal/intel_powerclamp: add id for Avoton SoC  Miguel Bernal Marin  2015-02-28  1  -0/+1
| | | | Enable Intel Powerclamp driver on Atom* Processor C2000 Product
| | | | Family for Microservers (Avoton).
| | | |
| | | | Avoton - SoCs for micro-servers has package C-states which can be
| | | | used for idle injection.
| | | |
| | | | Reported-by: Jose Navarro <jose.navarro@intel.com>
| | | | Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
| | | | Tested-by: Jose Carlos Venegas Munoz <jos.c.venegas.munoz@intel.com>
| | | | Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
| | | | Signed-off-by: Zhang Rui <rui.zhang@intel.com>
| | * | thermal: exynos: Clean-up code to use oneline entry for exynos compatible table  Chanwoo Choi  2015-02-24  1  -28/+10
| | | | This patch cleans up the code to use one line per entry in the
| | | | exynos compatible table.
| | | |
| | | | Cc: Zhang Rui <rui.zhang@intel.com>
| | | | Cc: Eduardo Valentin <edubezval@gmail.com>
| | | | Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
| | | | Acked-by: Lukasz Majewski <l.majewski@samsung.com>
| | | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
| | * | thermal: rcar: Make error and remove paths symmetrical with init  Geert Uytterhoeven  2015-02-24  1  -2/+2
| | | | Swap interrupt disable and thermal zone unregistration in the error
| | | | and remove paths, to make them more symmetrical with the
| | | | initialization path.
| | | |
| | | | Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
| | | | Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
| | | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
| | * | thermal: rcar: Fix race condition between init and interrupt  Geert Uytterhoeven  2015-02-24  1  -13/+9
| | | | As soon as the interrupt has been enabled by devm_request_irq(), the
| | | | interrupt routine may be called, depending on the current status of
| | | | the hardware.
| | | |
| | | | However, at that point rcar_thermal_common hasn't been initialized
| | | | completely yet. E.g. rcar_thermal_common.base is still NULL, causing
| | | | a NULL pointer dereference:
| | | |
| | | |   Unable to handle kernel NULL pointer dereference at virtual address 0000000c
| | | |   pgd = c0004000
| | | |   [0000000c] *pgd=00000000
| | | |   Internal error: Oops: 5 [#1] SMP ARM
| | | |   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-rc7-ape6evm-04564-gb6e46cb7cbe82389 #30
| | | |   Hardware name: Generic R8A73A4 (Flattened Device Tree)
| | | |   task: ee8953c0 ti: ee896000 task.ti: ee896000
| | | |   PC is at rcar_thermal_irq+0x1c/0xf0
| | | |   LR is at _raw_spin_lock_irqsave+0x48/0x54
| | | |
| | | | Postpone the call to devm_request_irq() until all initialization has
| | | | been done to fix this.
| | | |
| | | | Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
| | | | Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
| | | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
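The ordering problem can be shown in a userspace sketch. Everything here is a hypothetical stand-in: `fake_request_irq` models a devm_request_irq() whose interrupt fires immediately on registration, and `struct fake_common` models driver state with a register base pointer. The handler returns -1 where the real kernel would oops.

```c
#include <stddef.h>

/* Hypothetical driver state; `base` plays the role of the mapped registers. */
struct fake_common {
	unsigned int *base;
};

typedef int (*fake_irq_handler)(void *data);

/* Stand-in for devm_request_irq(): the worst case is assumed, i.e. the
 * hardware already has an interrupt pending and the handler runs at once. */
static int fake_request_irq(fake_irq_handler handler, void *data)
{
	return handler(data);
}

/* Handler that needs fully initialised state. */
static int fake_irq(void *data)
{
	struct fake_common *common = data;

	if (!common->base)
		return -1;	/* the real driver dereferences NULL here */
	return (int)common->base[0];
}

/* Racy order: the handler can run before `base` is set. */
int probe_racy(struct fake_common *common, unsigned int *regs)
{
	int ret;

	common->base = NULL;
	ret = fake_request_irq(fake_irq, common);
	common->base = regs;
	return ret;
}

/* Fixed order: finish initialisation, then request the interrupt. */
int probe_fixed(struct fake_common *common, unsigned int *regs)
{
	common->base = regs;
	return fake_request_irq(fake_irq, common);
}
```

The fix is purely a matter of ordering: nothing that the handler reads may still be unset when the interrupt is requested.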
| | * | thermal: Introduce dummy functions when thermal is not defined  Nishanth Menon  2015-02-24  1  -2/+54
| | | | When CONFIG_THERMAL is not enabled, it is better to introduce
| | | | equivalent dummy functions in the exported header than to introduce
| | | | #ifdeffery in drivers using the function.
| | | |
| | | | This will prevent issues such as that reported in:
| | | | http://www.spinics.net/lists/linux-next/msg31573.html
| | | |
| | | | While at it switch over to IS_ENABLED for thermal macros to allow
| | | | for thermal framework to be built as framework and relevant APIs be
| | | | usable by relevant drivers as a result.
| | | |
| | | | Reported-by: Guenter Roeck <linux@roeck-us.net>
| | | | Signed-off-by: Nishanth Menon <nm@ti.com>
| | | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
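The stub pattern described above can be sketched in plain C. The names `CONFIG_FAKE_THERMAL`, `fake_zone_register`, and `fake_zone_get_temp` are hypothetical and do not come from include/linux/thermal.h; the sketch also returns NULL and -ENODEV for simplicity, which may differ from what the kernel stubs return.

```c
#include <errno.h>
#include <stddef.h>

struct fake_zone;	/* opaque handle, hypothetical */

#ifdef CONFIG_FAKE_THERMAL
/* Real implementations would live in the framework. */
struct fake_zone *fake_zone_register(const char *name);
int fake_zone_get_temp(struct fake_zone *tz, int *temp);
#else
/* Framework disabled: the header supplies do-nothing stubs so that
 * callers still compile and simply see "no such device". */
static inline struct fake_zone *fake_zone_register(const char *name)
{
	(void)name;
	return NULL;
}

static inline int fake_zone_get_temp(struct fake_zone *tz, int *temp)
{
	(void)tz;
	(void)temp;
	return -ENODEV;
}
#endif

/* Driver code: no #ifdef needed here, whichever variant is in effect. */
int driver_read_temp(const char *name, int *temp)
{
	struct fake_zone *tz = fake_zone_register(name);

	return fake_zone_get_temp(tz, temp);
}
```

Keeping the conditional in one header means each driver has a single code path, and the compiler can discard the stub calls entirely when the framework is disabled.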
| | * | ti-soc-thermal: Delete an unnecessary check before the function call ↵  Markus Elfring  2015-02-24  1  -1/+1
| | | | "cpufreq_cooling_unregister"
| | | |
| | | | The cpufreq_cooling_unregister() function tests whether its argument
| | | | is NULL and then returns immediately. Thus the test around the call
| | | | is not needed.
| | | |
| | | | This issue was detected by using the Coccinelle software.
| | | |
| | | | Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
| | | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
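The redundancy can be shown with a small stand-in. `fake_cooling_unregister` and `struct fake_cdev` are hypothetical; the only property borrowed from the commit message is that the callee itself returns early on a NULL argument.

```c
#include <stdlib.h>

/* Hypothetical cooling device handle. */
struct fake_cdev {
	int id;
};

/* Counts real unregistrations, for illustration only. */
int fake_unregister_calls;

/* Like the function named in the commit, this tolerates NULL. */
void fake_cooling_unregister(struct fake_cdev *cdev)
{
	if (!cdev)
		return;
	fake_unregister_calls++;
	free(cdev);
}

/* Before the cleanup the caller looked like:
 *
 *	if (cdev)
 *		fake_cooling_unregister(cdev);
 *
 * The guard duplicates the callee's own check, so it can be dropped. */
void teardown(struct fake_cdev *cdev)
{
	fake_cooling_unregister(cdev);
}
```

This is the same convention that free() follows, which is why an unconditional call is safe in both places.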
| | * | thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP  Grygorii Strashko  2015-02-24  1  -1/+1
| | |/
| | | Fix following build warning if CONFIG_PM_SLEEP is not set:
| | |
| | |   drivers/thermal/ti-soc-thermal/ti-bandgap.c:1478:12: warning: 'ti_bandgap_suspend' defined but not used [-Wunused-function]
| | |    static int ti_bandgap_suspend(struct device *dev)
| | |               ^
| | |   drivers/thermal/ti-soc-thermal/ti-bandgap.c:1492:12: warning: 'ti_bandgap_resume' defined but not used [-Wunused-function]
| | |    static int ti_bandgap_resume(struct device *dev)
| | |               ^
| | |
| | | Acked-by: Nishanth Menon <nm@ti.com>
| | | Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
| | | Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
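Warnings of this kind appear when static callbacks are compiled in but the table that references them is compiled out. The log shows only a one-line change, so the sketch below is an assumption about the general shape of such a fix: the callbacks and their only user sit under the same condition. All `fake_` names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical device and PM callback table. */
struct fake_device {
	int suspended;
};

struct fake_pm_ops {
	int (*suspend)(struct fake_device *dev);
	int (*resume)(struct fake_device *dev);
};

#ifdef CONFIG_PM_SLEEP
/* Callbacks and the table that uses them share one guard, so neither
 * can exist without the other. */
static int fake_suspend(struct fake_device *dev)
{
	dev->suspended = 1;
	return 0;
}

static int fake_resume(struct fake_device *dev)
{
	dev->suspended = 0;
	return 0;
}

static const struct fake_pm_ops fake_ops = {
	.suspend = fake_suspend,
	.resume = fake_resume,
};
#define FAKE_PM_OPS (&fake_ops)
#else
#define FAKE_PM_OPS NULL
#endif

/* What the driver hands to the core: a table or nothing. */
const struct fake_pm_ops *driver_pm_ops(void)
{
	return FAKE_PM_OPS;
}
```

If the functions were guarded by a broader symbol than the table, a configuration with the broad symbol set and CONFIG_PM_SLEEP unset would define them without any user, which is exactly what -Wunused-function reports.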
| * | Merge tag 'md/4.0-fixes' of git://neil.brown.name/md  Linus Torvalds  2015-03-02  3  -12/+20
| |\ \
| | | | Pull md fixes from Neil Brown:
| | | |  "Three md fixes:
| | | |
| | | |   - fix a read-balance problem that was reported 2 years ago, but
| | | |     that I never noticed the report :-(
| | | |
| | | |   - fix for rare RAID6 problem causing incorrect bitmap updates when
| | | |     two devices fail.
| | | |
| | | |   - add __ATTR_PREALLOC annotation now that it is possible"
| | | |
| | | | * tag 'md/4.0-fixes' of git://neil.brown.name/md:
| | | |   md: mark some attributes as pre-alloc
| | | |   raid5: check faulty flag for array status during recovery.
| | | |   md/raid1: fix read balance when a drive is write-mostly.
| | * | md: mark some attributes as pre-alloc  NeilBrown  2015-02-25  1  -6/+8
| | | | Since __ATTR_PREALLOC was introduced in v3.19-rc1~78^2~18 it can now
| | | | be used by md.
| | | |
| | | | This ensures that writing to these sysfs attributes will never block
| | | | due to a memory allocation. Such blocking could become a deadlock if
| | | | mdmon is trying to reconfigure an array after a failure prior to
| | | | re-enabling writes.
| | | |
| | | | Signed-off-by: NeilBrown <neilb@suse.de>
| | * | raid5: check faulty flag for array status during recovery.  Eric Mei  2015-02-25  1  -4/+9
| | | | When more than one drive has failed, we may start rebuilding one
| | | | drive while another faulty drive is still in the array. To determine
| | | | whether the array will be optimal after rebuilding, the current code
| | | | only checks whether a drive is missing, which could potentially lead
| | | | to data corruption. This patch adds a check of the Faulty flag.
| | | |
| | | | Signed-off-by: NeilBrown <neilb@suse.de>
| | * | md/raid1: fix read balance when a drive is write-mostly.  Tomáš Hodek  2015-02-25  1  -2/+3
| | |/
| | | When a drive is marked write-mostly it should only be the target of
| | | reads if there is no other option.
| | |
| | | This behaviour was broken by
| | |
| | |   commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
| | |       md/raid1: read balance chooses idlest disk for SSD
| | |
| | | which causes a write-mostly device to be *preferred* in some cases.
| | |
| | | Restore correct behaviour by checking and setting best_dist_disk and
| | | best_pending_disk rather than best_disk.
| | |
| | | We only need to test one of these as they are both changed from -1
| | | or >=0 at the same time.
| | |
| | | As we leave min_pending and best_dist unchanged, any
| | | non-write-mostly device will appear better than the write-mostly
| | | device.
| | |
| | | Reported-by: Tomáš Hodek <tomas.hodek@volny.cz>
| | | Reported-by: Dark Penguin <darkpenguin@yandex.ru>
| | | Signed-off-by: NeilBrown <neilb@suse.de>
| | | Link: http://marc.info/?l=linux-raid&m=135982797322422
| | | Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
| | | Cc: stable@vger.kernel.org (3.6+)
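The rule that a write-mostly device is read only when nothing else is available can be sketched as below. This is a deliberately simplified model, not the raid1 read_balance() code: `struct fake_disk` and `pick_read_disk` are hypothetical, and only the pending-request count is considered, without the distance heuristics the commit mentions.

```c
#include <limits.h>

/* Hypothetical per-disk state for this sketch. */
struct fake_disk {
	int write_mostly;	/* nonzero: avoid for reads if possible */
	int pending;		/* outstanding requests on this disk */
};

/* Returns the index of the disk to read from, or -1 if there is none.
 * A write-mostly disk is remembered only as a fallback and never updates
 * the running minimum, so any normal disk beats it regardless of load. */
int pick_read_disk(const struct fake_disk *disks, int n)
{
	int best = -1;
	int fallback = -1;
	int min_pending = INT_MAX;
	int i;

	for (i = 0; i < n; i++) {
		if (disks[i].write_mostly) {
			if (fallback < 0)
				fallback = i;
			continue;
		}
		if (disks[i].pending < min_pending) {
			min_pending = disks[i].pending;
			best = i;
		}
	}

	return best >= 0 ? best : fallback;
}
```

Leaving `min_pending` untouched for write-mostly disks mirrors the reasoning in the message: an idle write-mostly device must not look better than a busy normal one.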