author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-29 01:31:08 +0100 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-29 01:31:08 +0100 |
commit | abb22e44cff3f11d9e087bdd46c04bb32ff57678 (patch) | |
tree | 2e690946d91b9e498028503eaad73cd4f805cbec /Documentation/driver-api | |
parent | Merge tag 'sound-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ti... (diff) | |
parent | thermal: stm32: Fix low threshold interrupt flood (diff) | |
Merge tag 'thermal-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Depromote debug print on the db8500 platform (Linus Walleij)
- Fix compilation warning when compiling with make W=1 (Amit Kucheria)
- Code cleanup and refactoring, regmap conversion and add hwmon support
on Qoriq (Andrey Smirnov)
- Add an idle injection cpu cooling device and its documentation,
rename the cpu_cooling device to cpufreq_cooling device (Daniel
Lezcano)
- Convert unexported functions to static, add the __init annotation in
the thermal-of code and remove the pointless wrapper functions
(Daniel Lezcano)
- Fix register offset for Armada XP and register reset bit
initialization (Zak Hays)
- Enable hwmon on the rockchip (Stefan Schaeckeler)
- Add the thermal sensor for the H6/H5/H3/A64/A83T/R40 sun8i platform
and their device tree bindings, followed by a fix for the ths number
and the sparse warnings (Yangtao Li)
- Code cleanup for the sun8i and hwmon support (Yangtao Li)
- Silence some messages which are misleading given the changes made in
the previous version on generic-adc (Martin Blumenstingl)
- Rename exynos to Exynos (Krzysztof Kozlowski)
- Add the bcm2711 thermal driver with the device tree bindings (Stefan
Wahren)
- Use usleep_range() instead of udelay() as the call is always done in
a sleep-able context (Geert Uytterhoeven)
- Do code cleanup and re-organization to set the scene for a new
process for the brcmstb (Florian Fainelli)
- Fix bindings check issues on brcm (Stefan Wahren)
- Add Jasper Lake support on int340x (Nivedita Swaminathan)
- Add Comet Lake support on intel pch (Gayatri Kammela)
- Fix unmatched pci_release_region() on x86 (Chuhong Yuan)
- Remove temperature boundaries for rcar and rcar3 (Niklas Söderlund)
- Fix return value to -ENODEV when thermal_zone_of_sensor_register() is
called while the of-node is missing (Peter Mamonov)
- Code cleanup, interrupt bouncing, and better support on stm32 (Pascal
Paillet)
* tag 'thermal-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (66 commits)
thermal: stm32: Fix low threshold interrupt flood
thermal: stm32: Improve temperature computing
thermal: stm32: Handle multiple trip points
thermal: stm32: Disable interrupts at probe
thermal: stm32: Rework sensor mode management
thermal: stm32: Fix icifr register name
thermal: of: Make thermal_zone_of_sensor_register return -ENODEV if a sensor OF node is missing
thermal: rcar_gen3_thermal: Remove temperature bound
thermal: rcar_thermal: Remove temperature bound
thermal: intel: intel_pch_thermal: Add Comet Lake (CML) platform support
thermal: intel: Fix unmatched pci_release_region
thermal: int340x: processor_thermal: Add Jasper Lake support
dt-bindings: brcm,avs-ro-thermal: Fix binding check issues
thermal: brcmstb_thermal: Register different ops per process
thermal: brcmstb_thermal: Restructure interrupt registration
thermal: brcmstb_thermal: Add 16nm process thermal parameters
dt-bindings: thermal: Define BCM7216 thermal sensor compatible
thermal: brcmstb_thermal: Prepare to support a different process
thermal: brcmstb_thermal: Do not use DT coefficients
thermal: rcar_thermal: Use usleep_range() instead of udelay()
...
Diffstat (limited to 'Documentation/driver-api')
-rw-r--r-- | Documentation/driver-api/thermal/cpu-idle-cooling.rst | 189 |
-rw-r--r-- | Documentation/driver-api/thermal/exynos_thermal.rst | 8 |
2 files changed, 193 insertions, 4 deletions
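
Reviewer note: the cpu-idle-cooling.rst document added by this merge states that, with a fixed idle duration, the running duration follows either from the duty cycle (the cooling device state) or from the ratio between the running power and the target power. The sketch below works through both relations. It is an illustration only, not the kernel implementation: the function names, the microsecond unit, and the assumption that the idle power is zero are all hypothetical choices made for this example.

```python
def running_time_from_duty_cycle(idle_us: float, duty_cycle_pct: float) -> float:
    """Running duration so that idle / (idle + running) equals the duty cycle.

    Hypothetical helper: duty_cycle_pct is the cooling device state expressed
    as a percentage, idle_us is the fixed idle injection duration.
    """
    if not 0 < duty_cycle_pct <= 100:
        raise ValueError("duty cycle must be in (0, 100]")
    return idle_us * (100.0 / duty_cycle_pct - 1.0)


def running_time_from_power(idle_us: float, p_running: float, p_target: float) -> float:
    """Trunning = Tidle / ((P(opp)running / P(opp)target) - 1).

    Assumes the idle power is zero, which is what reduces the average-power
    equation to this form. The relation only makes sense for a power
    reduction, so a running power at or below the target is rejected.
    """
    if p_target <= 0 or p_running <= p_target:
        raise ValueError("running power must exceed a positive target power")
    return idle_us / (p_running / p_target - 1.0)


if __name__ == "__main__":
    idle_us = 10_000.0  # assumed fixed idle injection duration
    for duty in (25, 50):
        print(duty, running_time_from_duty_cycle(idle_us, duty))
    print(running_time_from_power(idle_us, p_running=4.0, p_target=1.0))
```

With a 25% duty cycle the CPU runs three times as long as it idles, and with 50% the two durations are equal, which matches the diagrams in the added document. The power-based form can be checked by plugging its result back into the average-power equation.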
diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
new file mode 100644
index 000000000000..e4f0859486c7
--- /dev/null
+++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
@@ -0,0 +1,189 @@
+
+Situation:
+----------
+
+Under certain circumstances a SoC can reach a critical temperature
+limit and is unable to stabilize the temperature around a temperature
+control. When the SoC has to stabilize the temperature, the kernel can
+act on a cooling device to mitigate the dissipated power. When the
+critical temperature is reached, a decision must be taken to reduce
+the temperature, that, in turn impacts performance.
+
+Another situation is when the silicon temperature continues to
+increase even after the dynamic leakage is reduced to its minimum by
+clock gating the component. This runaway phenomenon can continue due
+to the static leakage. The only solution is to power down the
+component, thus dropping the dynamic and static leakage that will
+allow the component to cool down.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and under-utilize the CPU, thus
+losing performance. In other words, one OPP under-utilizes the CPU
+with a power less than the requested power budget and the next OPP
+exceeds the power budget. An intermediate OPP could have been used if
+it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. Acting on the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq, however different vendors have a
+plethora of OPP density, and some have large power gap between OPPs,
+that will result in loss of performance during thermal control and
+loss of power in other scenarios.
+
+At a specific OPP, we can assume that injecting idle cycle on all CPUs
+belong to the same cluster, with a duration greater than the cluster
+idle state target residency, we lead to dropping the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+	    Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The base concept of the idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle, it provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs belonging to the same cluster, inject
+their idle cycles synchronously, the cluster can reach its power down
+state with a minimum power consumption and reduce the static leakage
+to almost zero. However, these idle cycles injection will add extra
+latencies as the CPUs will have to wakeup from a deep sleep state.
+
+We use a fixed duration of idle injection that gives an acceptable
+performance penalty and a fixed latency. Mitigation can be increased
+or decreased by modulating the duty cycle of the idle injection.
+
+     ^
+     |
+     |
+     |-------                         -------
+     |_______|_______________________|_______|___________
+
+     <------>
+       idle  <---------------------->
+                    running
+
+      <----------------------------->
+              duty cycle 25%
+
+
+The implementation of the cooling device bases the number of states on
+the duty cycle percentage. When no mitigation is happening the cooling
+device state is zero, meaning the duty cycle is 0%.
+
+When the mitigation begins, depending on the governor's policy, a
+starting state is selected. With a fixed idle duration and the duty
+cycle (aka the cooling device state), the running duration can be
+computed.
+
+The governor will change the cooling device state thus the duty cycle
+and this variation will modulate the cooling effect.
+
+     ^
+     |
+     |
+     |-------                 -------
+     |_______|_______________|_______|___________
+
+     <------>
+       idle  <-------------->
+                running
+
+      <----------------------------->
+              duty cycle 33%
+
+
+     ^
+     |
+     |
+     |-------         -------
+     |_______|_______|_______|___________
+
+     <------>
+       idle  <------>
+              running
+
+      <------------->
+       duty cycle 50%
+
+The idle injection duration value must comply with the constraints:
+
+- It is less than or equal to the latency we tolerate when the
+  mitigation begins. It is platform dependent and will depend on the
+  user experience, reactivity vs performance trade off we want. This
+  value should be specified.
+
+- It is greater than the idle state’s target residency we want to go
+  for thermal mitigation, otherwise we end up consuming more energy.
+
+Power considerations
+--------------------
+
+When we reach the thermal trip point, we have to sustain a specified
+power for a specific temperature but at this time we consume:
+
+ Power = Capacitance x Voltage^2 x Frequency x Utilisation
+
+... which is more than the sustainable power (or there is something
+wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are a
+fixed value, ‘Voltage’ and the ‘Frequency’ are fixed artificially
+because we don’t want to change the OPP. We can group the
+‘Capacitance’ and the ‘Utilisation’ into a single term which is the
+‘Dynamic Power Coefficient (Cdyn)’ Simplifying the above, we have:
+
+ Pdyn = Cdyn x Voltage^2 x Frequency
+
+The power allocator governor will ask us somehow to reduce our power
+in order to target the sustainable power defined in the device
+tree. So with the idle injection mechanism, we want an average power
+(Ptarget) resulting in an amount of time running at full power on a
+specific OPP and idle another amount of time. That could be put in a
+equation:
+
+ P(opp)target = ((Trunning x (P(opp)running) + (Tidle x P(opp)idle)) /
+			(Trunning + Tidle)
+  ...
+
+ Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)
+
+At this point if we know the running period for the CPU, that gives us
+the idle injection we need. Alternatively if we have the idle
+injection duration, we can compute the running duration with:
+
+ Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)
+
+Practically, if the running power is less than the targeted power, we
+end up with a negative time value, so obviously the equation usage is
+bound to a power reduction, hence a higher OPP is needed to have the
+running power greater than the targeted power.
+
+However, in this demonstration we ignore three aspects:
+
+ * The static leakage is not defined here, we can introduce it in the
+   equation but assuming it will be zero most of the time as it is
+   difficult to get the values from the SoC vendors
+
+ * The idle state wake up latency (or entry + exit latency) is not
+   taken into account, it must be added in the equation in order to
+   rigorously compute the idle injection
+
+ * The injected idle duration must be greater than the idle state
+   target residency, otherwise we end up consuming more energy and
+   potentially invert the mitigation effect
+
+So the final equation is:
+
+ Trunning = (Tidle - Twakeup ) x
+		(((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )
diff --git a/Documentation/driver-api/thermal/exynos_thermal.rst b/Documentation/driver-api/thermal/exynos_thermal.rst
index 5bd556566c70..764df4ab584d 100644
--- a/Documentation/driver-api/thermal/exynos_thermal.rst
+++ b/Documentation/driver-api/thermal/exynos_thermal.rst
@@ -4,7 +4,7 @@ Kernel driver exynos_tmu
 
 Supported chips:
 
-* ARM SAMSUNG EXYNOS4, EXYNOS5 series of SoC
+* ARM Samsung Exynos4, Exynos5 series of SoC
 
   Datasheet: Not publicly available
 
@@ -14,7 +14,7 @@ Authors: Amit Daniel <amit.daniel@samsung.com>
 TMU controller Description:
 ---------------------------
 
-This driver allows to read temperature inside SAMSUNG EXYNOS4/5 series of SoC.
+This driver allows to read temperature inside Samsung Exynos4/5 series of SoC.
 
 The chip only exposes the measured 8-bit temperature code value
 through a register.
@@ -43,7 +43,7 @@ The three equations are:
     Trimming info for 85 degree Celsius (stored at TRIMINFO register)
     Temperature code measured at 85 degree Celsius which is unchanged
 
-TMU(Thermal Management Unit) in EXYNOS4/5 generates interrupt
+TMU(Thermal Management Unit) in Exynos4/5 generates interrupt
 when temperature exceeds pre-defined levels.
 The maximum number of configurable threshold is five.
 The threshold levels are defined as follows::
@@ -67,7 +67,7 @@ TMU driver description:
 The exynos thermal driver is structured as::
 
 					Kernel Core thermal framework
-				(thermal_core.c, step_wise.c, cpu_cooling.c)
+				(thermal_core.c, step_wise.c, cpufreq_cooling.c)
 								^
 								|
 								|