Purpose
The purpose of this page is to document how to enable basic CPU thermal management on Renesas platforms (R-Car H2 being the primary target) in a generic way.
DISCLAIMER
FOR ILLUSTRATIVE PURPOSES ONLY!!! DVFS & thermal management parameters (trip points, CPU OPPs, policy) are all empirical and not optimized. Platform stability and reliability are not guaranteed.
Known bugs/limitations
As H2 silicon characterization data are not available, all CPU OPPs use the same voltage. Only the CPU frequency is scaled, which limits the effectiveness of thermal management.
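To illustrate why frequency-only scaling limits the benefit, here is a minimal sketch based on the textbook CMOS approximation that dynamic power scales with V² × f. The operating points below are hypothetical and are not H2 characterization data, and leakage power is left out of the model entirely.

```python
def relative_dynamic_power(freq_ratio, volt_ratio=1.0):
    """Dynamic power relative to the nominal OPP, using P_dyn ~ C * V^2 * f.

    freq_ratio is f / f_nominal and volt_ratio is V / V_nominal.
    Leakage power is deliberately not modelled here.
    """
    return freq_ratio * volt_ratio ** 2


# Hypothetical OPP at half the nominal frequency (illustrative values only).
freq_only = relative_dynamic_power(0.5)        # supply voltage unchanged
with_dvfs = relative_dynamic_power(0.5, 0.8)   # supply voltage lowered by 20 %

print(f"frequency scaling only: {freq_only:.2f} x nominal dynamic power")
print(f"frequency + voltage:    {with_dvfs:.2f} x nominal dynamic power")
```

In this simplified model, halving the frequency alone halves the dynamic power, while combining it with a lower voltage reduces it further because the voltage term is squared.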
We also tried to enable voltage scaling (defining empirical supply voltages). However, even though no error was reported on the SW side, the supply voltage did not physically change. We traced the I2C data written to the chip registers and it looked coherent (a data byte was updated following the voltage formula “300mV + 10mV*step”). As we could not get the DA9210 datasheet (including the register map) in time to analyze the register configuration, we could not debug this further.
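As a sanity check when reading such an I2C trace, the voltage formula quoted above can be turned into a small conversion helper. This is only a sketch of that formula: the function names are made up for illustration, and the valid step range and the register that holds the step are not known here, since the datasheet was not available.

```python
BASE_MV = 300   # offset taken from the formula "300mV + 10mV*step"
STEP_MV = 10    # increment per step, from the same formula


def step_to_millivolts(step):
    """Voltage in mV that a given step value would select under the formula."""
    return BASE_MV + STEP_MV * step


def millivolts_to_step(millivolts):
    """Step value expected for a target voltage in mV.

    Raises ValueError if the voltage cannot be expressed by the formula.
    No upper bound is checked because the real register width is unknown here.
    """
    offset = millivolts - BASE_MV
    if offset < 0 or offset % STEP_MV != 0:
        raise ValueError(f"{millivolts} mV is not 300 mV + N * 10 mV")
    return offset // STEP_MV


# Example: the data byte one would expect to see for a 1 V (1000 mV) request.
step = millivolts_to_step(1000)
print(f"1000 mV -> step {step} (0x{step:02x})")   # 1000 mV -> step 70 (0x46)
```

Comparing the byte seen on the bus with the value computed this way confirms that the software side follows the formula, but it cannot show whether the regulator actually applies it.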
How To
The upstream H2 kernel already includes a temperature sensor driver, but neither a thermal policy nor a cpufreq driver.
(Optional) Add device tree ethernet support (to support bootp)
-
-
-
Fix clock frequency change using kick bit
-
Add CPUFreq support (originated by Guennadi Liakhovetski and updated with Device Tree support)
-
-
-
-
Add missing clock handling to rcar-thermal driver when using device tree (temporary HACK until DT support is fixed)
(Optional) Minor updates to rcar-thermal driver
-
-
Register cpufreq cooling device, add passive trip point and bind it to thermal zone 0.
-
(Optional) Boot all 8 H2 cores (4 × Cortex-A15 + 4 × Cortex-A7)
-
-
Enable multicluster operation on the kernel command line using “apmu=multicluster”
All these changes are available in this branch, and with additional debug traces in this branch.
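Once a kernel with these changes is booted, the standard Linux thermal sysfs attributes can be used to check that the passive trip point and the cooling device were registered. The sketch below assumes the usual /sys/class/thermal layout and the thermal zone 0 mentioned above; the helper names are illustrative and are not part of the patches.

```python
from pathlib import Path


def read_attr(path):
    """Return the content of a sysfs attribute without the trailing newline."""
    return Path(path).read_text().strip()


def describe_thermal_setup(thermal_root="/sys/class/thermal"):
    """Collect the policy, trip points and cooling devices visible in sysfs."""
    root = Path(thermal_root)
    zone = root / "thermal_zone0"
    info = {"policy": read_attr(zone / "policy"), "trips": [], "cooling_devices": []}

    # Each trip point exposes trip_point_N_temp (millidegrees C) and trip_point_N_type.
    for temp_file in sorted(zone.glob("trip_point_*_temp")):
        type_file = temp_file.with_name(temp_file.name.replace("_temp", "_type"))
        info["trips"].append((read_attr(type_file), int(read_attr(temp_file)) / 1000.0))

    # Each registered cooling device has a "type" attribute naming it.
    for device in sorted(root.glob("cooling_device*")):
        info["cooling_devices"].append(read_attr(device / "type"))

    return info
```

On the target, calling `describe_thermal_setup()` and printing the result should list one passive trip point and a cpufreq-based cooling device if the registration step worked.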
Results
Temperature sensor and CPU0 clock frequency were traced against time in 3 different scenarios:
No Thermal Management, CPU Fan ON
No Thermal Management, CPU Fan OFF
Prototype Thermal Management Enabled, CPU Fan OFF
With the following conditions:
The 4 C-A15 CPU cores were loaded using the cpuloadgen tool, whose sources can be found here.
The trace and GNUPlot script files can be found here (see included README file for further details and instructions).
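For readers who want to collect a similar trace themselves, the following is a minimal sketch of a sampler that logs the zone temperature and the CPU0 frequency against time. It is not the script from the archive above. It assumes the standard sysfs paths for thermal zone 0 and for cpufreq on CPU0, and the whitespace-separated output format is an arbitrary choice that gnuplot can read.

```python
import time
from pathlib import Path

TEMP_PATH = "/sys/class/thermal/thermal_zone0/temp"                   # millidegrees C
FREQ_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"   # kHz


def sample(temp_path=TEMP_PATH, freq_path=FREQ_PATH):
    """Return (temperature in degrees C, CPU0 frequency in MHz)."""
    temp_c = int(Path(temp_path).read_text()) / 1000.0
    freq_mhz = int(Path(freq_path).read_text()) / 1000.0
    return temp_c, freq_mhz


def trace(out_path, duration_s, period_s=1.0,
          temp_path=TEMP_PATH, freq_path=FREQ_PATH):
    """Write one 'time temp freq' line every period_s seconds for duration_s seconds."""
    start = time.monotonic()
    with open(out_path, "w") as out:
        out.write("# time_s temp_c cpu0_mhz\n")
        while True:
            elapsed = time.monotonic() - start
            if elapsed > duration_s:
                break
            temp_c, freq_mhz = sample(temp_path, freq_path)
            out.write(f"{elapsed:.1f} {temp_c:.1f} {freq_mhz:.0f}\n")
            time.sleep(period_s)
```

A call such as `trace("trace.dat", duration_s=600)` would run on the target while the load generator is active; the file name and duration are examples only.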
Below is a plot of these data:
As one can see, as soon as the temperature reaches 35°C (trip point 0), the thermal policy (step-wise) lowers the CPU OPP by one step, and each further 5°C rise lowers the CPU speed by another step. However, because only the CPU frequency is scaled (all CPU OPPs share the same 1V supply), this is not enough to stop the temperature rise; it only slows it down. This is no real surprise: voltage scaling is the real key to thermal management, because dynamic power scales with the square of the supply voltage and leakage current also grows with both voltage and temperature.
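The throttling behaviour described above can be summarised by a small model. This is only a sketch of the observed behaviour with the 35°C trip point and 5°C increments from this page. It is not the kernel's step_wise governor, which reacts to the temperature trend between polls, and the function name and the optional `max_state` cap are assumptions made for illustration.

```python
def cooling_state(temp_c, trip_c=35.0, step_c=5.0, max_state=None):
    """Number of OPP steps removed at a given temperature in this simple model.

    0 below the trip point, 1 once the trip point is reached, then one more
    step for every further step_c degrees. max_state optionally caps the
    result at the number of throttling steps the platform offers.
    """
    if temp_c < trip_c:
        return 0
    state = 1 + int((temp_c - trip_c) // step_c)
    if max_state is not None:
        state = min(state, max_state)
    return state


for temp in (30, 35, 40, 47):
    print(f"{temp} C -> {cooling_state(temp)} OPP step(s) down")
```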
Going further
Possible directions for improving thermal management:
Get silicon characterization data to optimize CPU OPPs, notably voltage levels.
Define another trip point from which CPU cores may be hot plugged out to further reduce leakage currents.
If thermal management is enabled by default on future Renesas development platforms, add circuitry to control the (noisy…) CPU fan from the thermal management policy.
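The hot-plug idea can be prototyped from user space before any kernel work, using the standard Linux CPU hotplug attribute `/sys/devices/system/cpu/cpuN/online`. The sketch below is only an illustration: the `hotplug_trip_c` threshold and the function names are hypothetical, writing to sysfs requires root, and the boot CPU may not be removable.

```python
from pathlib import Path

CPU_ROOT = "/sys/devices/system/cpu"


def set_cpu_online(cpu, online, cpu_root=CPU_ROOT):
    """Bring a CPU core online (True) or take it offline (False) through sysfs."""
    Path(cpu_root, f"cpu{cpu}", "online").write_text("1" if online else "0")


def unplug_above(temp_c, hotplug_trip_c, cpus, cpu_root=CPU_ROOT):
    """Take the given cores offline once temp_c reaches the hypothetical trip point.

    Returns the list of cores that were taken offline (empty below the trip).
    """
    if temp_c < hotplug_trip_c:
        return []
    for cpu in cpus:
        set_cpu_online(cpu, False, cpu_root)
    return list(cpus)
```

A real implementation would more likely live in the kernel as a cooling device bound to a dedicated trip point, so that the thermal policy decides when cores are removed and restored.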
renesas_r-car_h2/thermal_management.txt · Last modified: 2014/03/05 11:00 by ptitiano