PSCI

This page is based on ARM's PSCI Specification, version C (DEN022C), see References.

Introduction

The Power State Coordination Interface (PSCI) is an ARM standard introduced for its new ARMv8 64-bit architecture to virtualize CPU power management across exception levels, i.e. between software working at different privilege levels: OS kernel, hypervisor and Secure Platform Firmware (SPF). It is used to manage power in the following situations:

  • CPU idle management
  • Dynamic addition/removal of cores (hotplug)
  • Secondary core boot
  • big.LITTLE migration models
  • System shutdown and reset

Exception Levels

ARMv8 has two execution states:

  • AArch32 (32 bits), backward-compatible with ARMv7
  • AArch64 (64 bits)

Exception levels (EL) are different depending on the execution state. AArch32 retains ARMv7 modes:

AArch32 state | AArch64 state | Stack and typical vendor
Non-secure EL0 (PL0) | Non-secure EL0 | Unprivileged applications
Non-secure EL1 (PL1) | Non-secure EL1 | Rich OS kernels (Linux, Windows, iOS, etc.)
Non-secure EL2 (PL2) | Non-secure EL2 | Hypervisors
Secure EL0 (PL0) | Secure EL0 | Trusted OS applications
Secure EL3 (PL1) | Secure EL1 | Trusted OS kernels from Trusted OS vendors such as Trustonic
Secure EL3 (PL1) | Secure EL3 | Secure Monitor, executing secure platform firmware provided by silicon vendors and OEMs, ARM Trusted Firmware

Note that secure states don't match between AArch32 and AArch64.

PSCI allows lower ELs (rich OS kernel in EL1, hypervisor in EL2) to request power management actions from higher ELs (hypervisor or SPF for a kernel, SPF only for a hypervisor), using either of two instructions:

  • Secure Monitor Call (SMC), which allows:
    • EL1 → EL3
    • EL2 → EL3 (if EL2 is implemented)
  • Hypervisor Call (HVC), which allows:
    • EL1 → EL2 only, more suited for platforms without a SPF.

This interface is useful only if at least one of EL2 and EL3 is implemented; otherwise, physical PM is directly managed by the OS at EL1. When both ELs are implemented, the hypervisor (EL2) must trap SMCs (see footnote 1) and propagate them to the SPF if deemed necessary.
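The rules above can be sketched as a small decision function. This is only an illustration of one plausible policy, not something taken from the specification: the function name and its parameters are hypothetical, and it assumes that an EL1 caller under a hypervisor uses HVC (a guest could equally issue an SMC that EL2 traps).

```python
def select_conduit(caller_el, has_el2, has_el3):
    """Pick the instruction a PSCI caller would use (hypothetical helper).

    Returns "HVC", "SMC", or None when no higher EL exists and the OS
    has to manage physical power itself.
    """
    if caller_el == 1:
        if has_el2:
            return "HVC"  # assumption: the hypervisor is the first callee
        if has_el3:
            return "SMC"  # no hypervisor: call the SPF directly
        return None       # neither EL2 nor EL3: PSCI is of no use
    if caller_el == 2:
        return "SMC" if has_el3 else None  # a hypervisor can only call EL3
    raise ValueError("PSCI callers run at EL1 or EL2")


print(select_conduit(1, has_el2=True, has_el3=True))
print(select_conduit(2, has_el2=True, has_el3=True))
```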

CPU Idle management

Idle cores are put in low-power states by the OSPM. States are characterized by:

  • Power consumption
  • Wakeup latency

Power states are typically chosen according to their latency. In idle management, cores in low-power states can be woken up at any given time, and are still considered available by the system.
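As a minimal sketch of latency-driven selection, the snippet below picks the lowest-power state whose wakeup latency still fits a given budget. The state names follow this page, but the power and latency figures, the IdleState type and the pick_state() helper are made-up assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    power_mw: float         # hypothetical power consumption
    wakeup_latency_us: int  # hypothetical wakeup latency


# Illustrative values only; real figures are platform specific.
STATES = [
    IdleState("standby", 50.0, 1),
    IdleState("retention", 20.0, 50),
    IdleState("power-down", 1.0, 500),
]


def pick_state(max_latency_us):
    """Return the lowest-power state that wakes up within the budget."""
    candidates = [s for s in STATES if s.wakeup_latency_us <= max_latency_us]
    if not candidates:
        return None  # no low-power state is acceptable: keep running
    return min(candidates, key=lambda s: s.power_mw)


print(pick_state(100))
```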

Possible states:

  • Run: the core is operational
  • Standby:
    • powered up
    • context preserved
    • no reset required at wakeup
    • debug registers accessible
    • typically just a WFI/WFE
  • Retention:
    • similar to standby: powered up, context preserved, no reset required
    • debug registers not accessible
    • requires programming of the power controller, therefore needs PSCI
    • lower power consumption and different wakeup latency
  • Power-down:
    • Core powered off
    • Context must be saved beforehand (for each EL)
    • On wakeup, the core must be reset and saved contexts restored

ARM expects power controllers to be programmed by the highest EL implemented, typically the SPF. Therefore, power states deeper than standby need to use PSCI to request power controller programming. Furthermore, PSCI allows context saving for each EL by passing pointers to the saved context to the next EL (EL1 to EL2 and EL2 to EL3, if both are implemented).

PM Topology

Hierarchy of shared and exclusive power domains:

Power domains (or affinity levels):

  • System
  • Cluster
  • Core

Power states:

  • Local: for a particular node or affinity instance (e.g. core 0, cluster 1…)
  • Composite: state of a node and its parents (core+cluster), when a change should have an impact on several levels (the last core of cluster going idle requests cluster-level state change)

A power level cannot be in a lower power state than the highest power state of its children (it may get into a low power state only when none of its children are in a higher power state).

Valid (low-power) composite states:

System Cluster Core
Run Run Standby
Run Run Retention
Run Run Power-down
Run Retention Retention
Run Retention Power-down
Run Power-down Power-down
Retention Retention Retention
Retention Retention Power-down
Retention Power-down Power-down
Power-down Power-down Power-down
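The table can be regenerated from the rule stated above: a parent may never be in a deeper state than its child. The sketch below enumerates the combinations under that rule. The numeric depth ordering and the helper names are assumptions made for illustration.

```python
from itertools import product

# Larger number = deeper (lower-power) state; the ordering is an assumption.
DEPTH = {"Run": 0, "Standby": 1, "Retention": 2, "Power-down": 3}

NODE_STATES = ("Run", "Retention", "Power-down")      # system and cluster
CORE_STATES = ("Standby", "Retention", "Power-down")  # low-power core states


def is_valid(system, cluster, core):
    """A parent is never in a deeper state than its child."""
    return DEPTH[system] <= DEPTH[cluster] <= DEPTH[core]


VALID = [
    combo
    for combo in product(NODE_STATES, NODE_STATES, CORE_STATES)
    if is_valid(*combo)
]

for system, cluster, core in VALID:
    print(system, cluster, core)
```

The enumeration yields the same ten rows as the table, in a different order.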

Power state coordination

Entry into local power states for high level nodes in a power topology (e.g. clusters or system) requires coordinating children nodes. For example, entry into a cluster power-down state is only possible when all cores in the cluster are powered down. To achieve this, every core but the last one has to be placed into a power-down state, and the last one places itself and the cluster into a power-down state.

PSCI supports two modes of power state coordination:

  • Platform coordinated mode: Default mode, where the PSCI implementation is responsible for setting higher level nodes into low-power states. The callee chooses the most appropriate power state for high level nodes, according to:
    • Depth: the high level node must not go deeper than requested
    • Latency: it must not enter a higher-latency state than requested either. Note that latency might not always increase with depth.
  • OS initiated mode (introduced in PSCIv1.0): The calling OS is responsible for coordination. Idle states for high level nodes are only selected when the last running core in the node goes idle. To avoid races between the PSCI view and the OS view of the node state:
    • the OS indicates when the calling core is the last running core at a particular power hierarchy level. It must also specify which power hierarchy level the core is last in (cluster or system).
    • the PSCI implementation rejects any request inconsistent with its view of the core state

The coordination mode is set with the PSCI_SET_SUSPEND_MODE function.
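A rough model of both modes, assuming a single cluster and ignoring latency: in platform coordinated mode the implementation settles on the shallowest cluster state any core asked for, and in OS initiated mode it rejects a "last core" claim that contradicts its own view. All names here are hypothetical and the logic is a simplification of what a real implementation does.

```python
# Larger number = deeper state (illustrative ordering).
DEPTH = {"Run": 0, "Retention": 1, "Power-down": 2}


def platform_coordinated(cluster_votes):
    """Cluster state chosen by the implementation.

    cluster_votes holds the cluster-level state requested by each core;
    a core that is still running implicitly votes "Run".
    """
    return min(cluster_votes, key=lambda state: DEPTH[state])


def os_initiated_ok(running_cores, caller, claims_last):
    """Accept the request unless the OS wrongly claims to be the last core.

    running_cores is the implementation's own view of which cores run.
    """
    others_running = bool(set(running_cores) - {caller})
    return not (claims_last and others_running)


print(platform_coordinated(["Power-down", "Retention", "Power-down"]))
print(os_initiated_ok({0, 1}, caller=0, claims_last=True))
```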

CPU Hotplug

Hotplug dynamically switches cores on and off. Contrary to idle power-down:

  • The core is no longer available for processing (including interrupts)
  • No wakeup events on unplugged cores
  • Core hotplugging requires an explicit command (through PSCI)

Hotplugging is done with calls to CPU_OFF and CPU_ON. These calls need to save the core context and provide a return address, for each calling EL.

big.LITTLE

big.LITTLE systems can use three scheduling models:

  • Cluster migration: only one cluster (either big or LITTLE) is active at any one time. When the active cluster load crosses a given threshold, the “cluster context” is migrated to the other cluster. This requires every EL to migrate its own context to the inbound cluster.
  • CPU migration: each big core is paired with a LITTLE one, and migration between cores is dealt with at the pair level. As for the previous model, only one core in a pair is active at any given time.
  • Global Task Scheduling: The OS operates across all cores in all clusters, and is aware of the compute capacity differences between big and LITTLE cores. The scheduler assigns tasks to cores based on the task compute requirements. OSPM idle management and hotplug power off unused or under-utilized cores.

The PSCI provides an interface for Trusted OS context migration, and turning cores on/off.

System PM

PSCI provides functions for:

  • shutdown
  • reset
  • suspend

The "system" is the one seen by the caller: for a guest OS it is its virtual machine, for a host OS or hypervisor it is the physical platform.

PSCI functions

PSCI v1.0 defines 18 functions, accumulated across three versions so far: v0.1, v0.2 and v1.0. According to the SMC calling convention, these are backward-compatible between minor revisions.

SMC Calling Conventions

SMC exceptions are generated by the SMC instruction and handled by the Secure Monitor. The operation performed by the Secure Monitor is determined by the parameters passed in through registers.

Two calling conventions are defined:

  • SMC32: 32-bit interface which can be used by both 32-bit and 64-bit clients. The function ID is passed in R0 (AArch32) or W0 (AArch64), followed by up to six 32-bit arguments (see footnote 2). PSCI calls only use R0 to R3 (or W0 to W3), with return values in R0 or W0.
  • SMC64: 64-bit interface which can be used only by 64-bit clients. The function ID is passed in W0, followed by up to six 64-bit arguments. PSCI calls only use X0 to X3, with return values in X0 or W0, depending on the return parameter size.

An AArch32 client calling a 64-bit function will get a return code of 0xFFFFFFFF (matching PSCI NOT_SUPPORTED error code = -1). While AArch64 SMC32 calls are legit, the caller must ensure it limits itself to 32-bit arguments.

The SMCCC also requires the immediate value used with an SMC (or HVC) instruction be 0.

Each function is characterised by a 32-bit function ID (of the form 0x8400XXXX for SMC32 functions, and 0xC400XXXX for SMC64 functions). Every function exists in a 32-bit variant; only some have a 64-bit variant.
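The two prefixes differ only in bit 30, so building and classifying function IDs takes a few lines. The sketch below uses hypothetical helper names; the constants come straight from the ID patterns given here.

```python
SMC32_BASE = 0x84000000  # prefix of SMC32 PSCI function IDs
SMC64_BASE = 0xC4000000  # prefix of SMC64 PSCI function IDs
SMC64_BIT = 1 << 30      # the only bit that differs between the prefixes


def psci_fn_id(number, smc64=False):
    """Build a function ID from its 16-bit function number."""
    if not 0 <= number <= 0xFFFF:
        raise ValueError("function number must fit in 16 bits")
    return (SMC64_BASE if smc64 else SMC32_BASE) | number


def is_smc64(fn_id):
    """True when the ID designates the 64-bit variant of a function."""
    return bool(fn_id & SMC64_BIT)


print(hex(psci_fn_id(0x03)))              # CPU_ON, SMC32 variant
print(hex(psci_fn_id(0x03, smc64=True)))  # CPU_ON, SMC64 variant
```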

When using HVC, the format of the call, in terms of immediate value, and register usage, is the same as in the SMC case. The only change is the replacement of the SMC instruction with HVC.

Error Codes

These are 32-bit signed integers, for both 32-bit and 64-bit calls.

Symbol Value
SUCCESS 0
NOT_SUPPORTED -1
INVALID_PARAMETERS -2
DENIED -3
ALREADY_ON -4
ON_PENDING -5
INTERNAL_FAILURE -6
NOT_PRESENT -7
DISABLED -8
INVALID_ADDRESS -9
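Since a caller reads the return value as a raw register, it has to be reinterpreted as a signed 32-bit integer before being compared against this table. A minimal sketch follows; the PsciError enum and decode_return() helper are illustrative names, and only the low 32 bits of the register are considered.

```python
from enum import IntEnum


class PsciError(IntEnum):
    SUCCESS = 0
    NOT_SUPPORTED = -1
    INVALID_PARAMETERS = -2
    DENIED = -3
    ALREADY_ON = -4
    ON_PENDING = -5
    INTERNAL_FAILURE = -6
    NOT_PRESENT = -7
    DISABLED = -8
    INVALID_ADDRESS = -9


def decode_return(reg_value):
    """Reinterpret the low 32 bits of a register as a signed integer."""
    value = reg_value & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value


print(PsciError(decode_return(0xFFFFFFFF)).name)
```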

Functions Overview

  • PSCI_VERSION: Return the version of PSCI implemented (16-bit major and 16-bit minor number)
  • CPU_SUSPEND: Suspend execution on a core or higher level topology node. Intended for use in idle subsystems where the core is expected to return to execution through a wakeup event
  • CPU_OFF: Power down the calling core. Intended for use in hotplug. A core that is powered down by CPU_OFF can only be powered up again in response to a CPU_ON
  • CPU_ON: Power up a core. Used to power up cores that either:
    • Have not yet been booted into the calling supervisory software
    • Have been previously powered down with a CPU_OFF call
  • AFFINITY_INFO: Enable the caller to request status of an affinity instance (a particular core/cluster/the whole system)
  • MIGRATE: Used to ask a uniprocessor Trusted OS to migrate its context to a specific core
  • MIGRATE_INFO_TYPE: Allows a caller to identify the level of multicore support present in the Trusted OS
  • MIGRATE_INFO_UP_CPU: For a uniprocessor Trusted OS, returns the current resident core
  • SYSTEM_OFF: Shutdown the system
  • SYSTEM_RESET: Reset the system
  • PSCI_FEATURES: Query API that allows discovering whether a specific PSCI function is implemented and its features
  • CPU_FREEZE: Places the core into an IMPLEMENTATION DEFINED low-power state. Unlike CPU_OFF it is still valid for interrupts to be targeted to the core. However, the core must remain in the low-power state until a CPU_ON command is issued for it
  • CPU_DEFAULT_SUSPEND: Will place a core into an IMPLEMENTATION DEFINED low-power state. Unlike CPU_SUSPEND the caller need not specify a power_state parameter
  • NODE_HW_STATE: Intended to return the true HW state of a node in the power domain topology of the system
  • SYSTEM_SUSPEND: Used to implement suspend to RAM. The semantics are equivalent to a CPU_SUSPEND to the deepest low-power state
  • PSCI_SET_SUSPEND_MODE: Allows setting the mode used by CPU_SUSPEND to coordinate power states
  • PSCI_STAT_RESIDENCY: Returns the amount of time the platform has spent in the given power state since cold boot
  • PSCI_STAT_COUNT: Returns the number of times the platform has used the given power state since cold boot
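As an example of consuming one of these calls, the value returned by PSCI_VERSION can be split into its two 16-bit halves. The sketch assumes the major number sits in the upper half and the minor number in the lower half; decode_version() is a hypothetical helper name.

```python
def decode_version(ret):
    """Split a PSCI_VERSION return value into (major, minor).

    Assumption: major in bits [31:16], minor in bits [15:0].
    """
    return (ret >> 16) & 0xFFFF, ret & 0xFFFF


major, minor = decode_version(0x00010000)
print(f"PSCI v{major}.{minor}")
```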

Versions

PSCI v0.1

No standard IDs were defined in this version (see footnote 3).

Function | SMC32 ID | SMC64 ID
CPU_SUSPEND | 0x8400XXXX | 0xC400XXXX
CPU_ON | 0x8400XXXX | 0xC400XXXX
CPU_OFF | 0x8400XXXX | 0xC400XXXX
MIGRATE | 0x8400XXXX | 0xC400XXXX

PSCI v0.2

Changes from v0.1:

  • Standard IDs introduced
  • New functions
  • PSCI version number added
  • Parameters and return values changes for v0.1 functions
  • All functions except MIGRATE, MIGRATE_INFO_TYPE and MIGRATE_INFO_UP_CPU made mandatory.

Function | SMC32 ID | SMC64 ID | Mandatory | New
PSCI_VERSION | 0x84000000 | NA | Mandatory | New
CPU_SUSPEND | 0x84000001 | 0xC4000001 | Mandatory |
CPU_OFF | 0x84000002 | NA | Mandatory |
CPU_ON | 0x84000003 | 0xC4000003 | Mandatory |
AFFINITY_INFO | 0x84000004 | 0xC4000004 | Mandatory | New
MIGRATE | 0x84000005 | 0xC4000005 | Optional |
MIGRATE_INFO_TYPE | 0x84000006 | NA | Optional | New
MIGRATE_INFO_UP_CPU | 0x84000007 | 0xC4000007 | Optional | New
SYSTEM_OFF | 0x84000008 | NA | Mandatory | New
SYSTEM_RESET | 0x84000009 | NA | Mandatory | New

Note: MIGRATE_INFO_CPU is mandatory when MIGRATE is implemented

PSCI v1.0

Function | SMC32 ID | SMC64 ID | Mandatory | New
PSCI_VERSION | 0x84000000 | NA | Mandatory |
CPU_SUSPEND | 0x84000001 | 0xC4000001 | Mandatory |
CPU_OFF | 0x84000002 | NA | Mandatory |
CPU_ON | 0x84000003 | 0xC4000003 | Mandatory |
AFFINITY_INFO | 0x84000004 | 0xC4000004 | Mandatory |
MIGRATE | 0x84000005 | 0xC4000005 | Optional |
MIGRATE_INFO_TYPE | 0x84000006 | NA | Optional |
MIGRATE_INFO_UP_CPU | 0x84000007 | 0xC4000007 | Optional |
SYSTEM_OFF | 0x84000008 | NA | Mandatory |
SYSTEM_RESET | 0x84000009 | NA | Mandatory |
PSCI_FEATURES | 0x8400000A | NA | Mandatory | New
CPU_FREEZE | 0x8400000B | NA | Optional | New
CPU_DEFAULT_SUSPEND | 0x8400000C | 0xC400000C | Optional | New
NODE_HW_STATE | 0x8400000D | 0xC400000D | Optional | New
SYSTEM_SUSPEND | 0x8400000E | 0xC400000E | Optional | New
PSCI_SET_SUSPEND_MODE | 0x8400000F | NA | Optional | New
PSCI_STAT_RESIDENCY | 0x84000010 | 0xC4000010 | Optional | New
PSCI_STAT_COUNT | 0x84000011 | 0xC4000011 | Optional | New

Note: If either one of PSCI_STAT_RESIDENCY and PSCI_STAT_COUNT is implemented, the other must be too.

Summary

The PSCI version column gives the version in which the function was introduced.

Function | SMC32 ID | SMC64 ID | Mandatory | PSCI version | Note
PSCI_VERSION | 0x84000000 | NA | Mandatory | 0.2 |
CPU_SUSPEND | 0x84000001 | 0xC4000001 | Mandatory | 0.1 |
CPU_OFF | 0x84000002 | NA | Mandatory | 0.1 |
CPU_ON | 0x84000003 | 0xC4000003 | Mandatory | 0.1 |
AFFINITY_INFO | 0x84000004 | 0xC4000004 | Mandatory | 0.2 |
MIGRATE | 0x84000005 | 0xC4000005 | Optional | 0.1 |
MIGRATE_INFO_TYPE | 0x84000006 | NA | Optional | 0.2 | Mandatory when MIGRATE is implemented
MIGRATE_INFO_UP_CPU | 0x84000007 | 0xC4000007 | Optional | 0.2 |
SYSTEM_OFF | 0x84000008 | NA | Mandatory | 0.2 |
SYSTEM_RESET | 0x84000009 | NA | Mandatory | 0.2 |
PSCI_FEATURES | 0x8400000A | NA | Mandatory | 1.0 |
CPU_FREEZE | 0x8400000B | NA | Optional | 1.0 |
CPU_DEFAULT_SUSPEND | 0x8400000C | 0xC400000C | Optional | 1.0 |
NODE_HW_STATE | 0x8400000D | 0xC400000D | Optional | 1.0 |
SYSTEM_SUSPEND | 0x8400000E | 0xC400000E | Optional | 1.0 |
PSCI_SET_SUSPEND_MODE | 0x8400000F | NA | Optional | 1.0 |
PSCI_STAT_RESIDENCY | 0x84000010 | 0xC4000010 | Optional | 1.0 | If either one is implemented, the other must be too
PSCI_STAT_COUNT | 0x84000011 | 0xC4000011 | Optional | 1.0 |
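The summary lends itself to a small lookup table, for instance to list what a given PSCI version offers. The sketch below transcribes the rows above into tuples; the tuple layout and the two query helpers are assumptions made for illustration, and the "mandatory" flag reflects the status from v0.2 onwards.

```python
# (name, function number, has SMC64 variant, mandatory, version introduced)
FUNCTIONS = [
    ("PSCI_VERSION", 0x00, False, True, (0, 2)),
    ("CPU_SUSPEND", 0x01, True, True, (0, 1)),
    ("CPU_OFF", 0x02, False, True, (0, 1)),
    ("CPU_ON", 0x03, True, True, (0, 1)),
    ("AFFINITY_INFO", 0x04, True, True, (0, 2)),
    ("MIGRATE", 0x05, True, False, (0, 1)),
    ("MIGRATE_INFO_TYPE", 0x06, False, False, (0, 2)),
    ("MIGRATE_INFO_UP_CPU", 0x07, True, False, (0, 2)),
    ("SYSTEM_OFF", 0x08, False, True, (0, 2)),
    ("SYSTEM_RESET", 0x09, False, True, (0, 2)),
    ("PSCI_FEATURES", 0x0A, False, True, (1, 0)),
    ("CPU_FREEZE", 0x0B, False, False, (1, 0)),
    ("CPU_DEFAULT_SUSPEND", 0x0C, True, False, (1, 0)),
    ("NODE_HW_STATE", 0x0D, True, False, (1, 0)),
    ("SYSTEM_SUSPEND", 0x0E, True, False, (1, 0)),
    ("PSCI_SET_SUSPEND_MODE", 0x0F, False, False, (1, 0)),
    ("PSCI_STAT_RESIDENCY", 0x10, True, False, (1, 0)),
    ("PSCI_STAT_COUNT", 0x11, True, False, (1, 0)),
]


def available_in(version):
    """Names of the functions that exist in the given (major, minor)."""
    return [name for name, _, _, _, since in FUNCTIONS if since <= version]


def mandatory_in(version):
    """Names of the mandatory functions for a v0.2 or later version."""
    return [
        name
        for name, _, _, mandatory, since in FUNCTIONS
        if mandatory and since <= version
    ]


print(len(available_in((0, 2))), len(mandatory_in((1, 0))))
```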

Implementation

ARM Trusted Firmware

The ARM Trusted Firmware implements both an SPF and a Trusted OS (OP-TEE OS). The source code is available at https://github.com/ARM-software/arm-trusted-firmware. For HiKey, see https://github.com/96boards/arm-trusted-firmware.

PSCI support in SPF (version 1.1):

PSCI function | Supported | Comments
PSCI_VERSION | Yes | The version returned is 1.0
CPU_SUSPEND | Yes* | The original power_state format is used
CPU_OFF | Yes* |
CPU_ON | Yes* |
AFFINITY_INFO | Yes |
MIGRATE | Yes** |
MIGRATE_INFO_TYPE | Yes** |
MIGRATE_INFO_CPU | Yes** |
SYSTEM_OFF | Yes* |
SYSTEM_RESET | Yes* |
PSCI_FEATURES | Yes |
CPU_FREEZE | No |
CPU_DEFAULT_SUSPEND | No |
CPU_HW_STATE | No | Named NODE_HW_STATE in specification
SYSTEM_SUSPEND | No |
PSCI_SET_SUSPEND_MODE | No |
PSCI_STAT_RESIDENCY | No |
PSCI_STAT_COUNT | No |
* Need platform hooks to be supported (see plat_pm_ops, platform_setup_pm(), documented in docs/porting-guide.md)
** Need SPD (Trusted OS) hooks to be supported - not supported by OP-TEE OS yet.

All non-supported functions are optional PSCIv1.0 functions.

ARM FVPs and Juno are supported. Support for HiKey 96board is WIP (only affinst_on() and affinst_on_finish() implemented at the time of this writing).

Linux (4.0.0-rc4 and Hikey 3.18.0)

Both arm and arm64 have partial support for PSCI (v0.1 and v0.2, none for v1.0). MIGRATE_INFO_UP_CPU is the only missing PSCIv0.2 (optional) function.

Supported functions (arch/arm{,64}/kernel/psci.c):

  • psci_cpu_suspend
  • psci_cpu_off
  • psci_cpu_on
  • psci_migrate
  • psci_affinity_info
  • psci_migrate_info_type
  • psci_sys_reset
  • psci_sys_poweroff
  • psci_get_version

The device tree is used to discover whether PSCI is supported and in which version, and to get the conduit used (HVC/SMC) and, for v0.1, the function IDs. PSCI code is in arch/arm{,64}/kernel/psci*.
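The discovery logic can be modelled on a dictionary standing in for the DT node. This is a sketch, not the kernel code: it assumes the "arm,psci" and "arm,psci-0.2" compatible strings, the "method" property and the per-function ID properties of the Linux PSCI device tree binding, and the probe_psci() helper and the example v0.1 IDs are hypothetical.

```python
# SMC32 IDs standardised from v0.2 onwards (subset used by this sketch).
STANDARD_IDS = {
    "cpu_suspend": 0x84000001,
    "cpu_off": 0x84000002,
    "cpu_on": 0x84000003,
    "migrate": 0x84000005,
}


def probe_psci(node):
    """Return version, conduit and function IDs described by a psci node."""
    compatible = node.get("compatible", [])
    conduit = node.get("method")
    if conduit not in ("smc", "hvc"):
        raise ValueError("missing or invalid 'method' property")
    if "arm,psci-0.2" in compatible:
        return {"version": (0, 2), "conduit": conduit, "ids": STANDARD_IDS}
    if "arm,psci" in compatible:
        # v0.1 has no standard IDs: each one comes from the node itself.
        ids = {name: node[name] for name in STANDARD_IDS if name in node}
        return {"version": (0, 1), "conduit": conduit, "ids": ids}
    return None  # no PSCI support described


# Hypothetical v0.1 node with platform-chosen function IDs.
legacy_node = {
    "compatible": ["arm,psci"],
    "method": "smc",
    "cpu_off": 0x84000011,
    "cpu_on": 0x84000012,
}
print(probe_psci(legacy_node))
```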

64-bit ARM

In arch/arm64/kernel/setup.c, setup_arch() calls psci_init() (defined in arch/arm64/kernel/psci.c). psci_init() parses the DT and calls psci_0_{1,2}_init(), according to the PSCI version found. This registers the psci_<PSCI_FN_NAME> functions in struct psci_operations psci_ops, and registers psci_sys_reset() and psci_sys_poweroff() directly as arm_pm_restart() and pm_power_off() respectively. The psci_ops functions are called through cpu_psci_cpu_* wrappers (e.g. cpu_psci_cpu_boot() wraps psci_ops.cpu_on()), which are packed in struct cpu_operations cpu_psci_ops (e.g. cpu_psci_ops.cpu_boot = cpu_psci_cpu_boot).

cpu_psci_ops is referenced in arch/arm64/kernel/cpu_ops.c, as an element of struct cpu_operations *supported_cpu_ops[]. For each CPU, cpu_read_ops() selects it as the final struct cpu_operations cpu_ops, rather than smp_spin_table_ops, when the CPU's enable-method DT property is “psci” (the other possible value is “spin-table”).

32-bit ARM

Discovery from the DT is similar (in arch/arm/kernel/psci.c), but instead of struct cpu_operations, struct smp_operations is used (kernel/psci_smp.c). These operations are registered as the main operations in setup.c (via smp_set_ops()).

References

1) See bit TSC of HCR_EL2 register, on ARMv8 platforms
2) PSCI functions use four at most
3) IDs still need to follow the SMCCC, which assigns 0x8400XXXX to 32-bit standard service calls and 0xC400XXXX to 64-bit ones
psci.txt · Last modified: 2015/03/17 07:41 by hchaumette