| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Print the FDT error description along with the error message if failed
to set the "linux,drconf-usable-memory" property in the kdump kernel's
FDT.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221216122708.182154-1-sourabhjain@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
machine_is() can't provide correct results before probe_machine() has
run. Warn when it's used too early in boot, placing the WARN_ON() in a
helper function so the reported file:line indicates exactly what went
wrong.
checkpatch complains about __attribute__((weak)) in the patch, so
change that to __weak, and align the line continuations as well.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210-warn-on-machine-is-before-probe-machine-v2-1-b57f8243c51c@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E500MC64 is a processor pre-dating E5500 that has never been
commercialised. Use -mcpu=e5500 for E5500 core.
More details at https://gcc.gnu.org/PR108149
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fa71ed20d22c156225436374f0ab847daac893bc.1671475543.git.christophe.leroy@csgroup.eu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a PCI error is encountered 6th time in an hour we
set the channel state to perm_failure and notify the
driver about the permanent failure.
However, after upstream commit 38ddc011478e ("powerpc/eeh:
Make permanently failed devices non-actionable"), EEH handler
stops calling any routine once the device is marked as
permanent failure. This issue can lead to fatal consequences
like kernel hang with certain PCI devices.
Following log is observed with lpfc driver, with and without
this change, Without this change kernel hangs, If PCI error
is encountered 6 times for a device in an hour.
Without the change
EEH: Beginning: 'error_detected(permanent failure)'
PCI 0132:60:00.0#600000: EEH: not actionable (1,1,1)
PCI 0132:60:00.1#600000: EEH: not actionable (1,1,1)
EEH: Finished:'error_detected(permanent failure)'
With the change
EEH: Beginning: 'error_detected(permanent failure)'
EEH: Invoking lpfc->error_detected(permanent failure)
EEH: lpfc driver reports: 'disconnect'
EEH: Invoking lpfc->error_detected(permanent failure)
EEH: lpfc driver reports: 'disconnect'
EEH: Finished:'error_detected(permanent failure)'
To fix the issue, set channel state to permanent failure after
notifying the drivers.
Fixes: 38ddc011478e ("powerpc/eeh: Make permanently failed devices non-actionable")
Suggested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230209105649.127707-1-ganeshgr@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Cc: stable@vger.kernel.org # v5.18+
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230127135755.79929-22-mathieu.desnoyers@efficios.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the tokens for all implemented RTAS functions now available via
rtas_function_token(), which is optimal and safe for arbitrary
contexts, there is no need to use rtas_token() or cache its result.
Most conversions are trivial, but a few are worth describing in more
detail:
* Error injection token comparisons for lockdown purposes are
consolidated into a simple predicate: token_is_restricted_errinjct().
* A couple of special cases in block_rtas_call() do not use
rtas_token() but perform string comparisons against names in the
function table. These are converted to compare against token values
instead, which is logically equivalent but less expensive.
* The lookup for the ibm,os-term token can be deferred until needed,
instead of caching it at boot to avoid device tree traversal during
panic.
* Since rtas_function_token() accesses a read-only data structure
without taking any locks, xmon's lookup of set-indicator can be
performed as needed instead of cached at startup.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-20-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users of rtas_token() supply a string argument that can't be validated
at build time. A typo or misspelling has to be caught by inspection or
by observing wrong behavior at runtime.
Since the core RTAS code now has consolidated the names of all
possible RTAS functions and mapped them to their tokens, token lookup
can be implemented using symbolic constants to index a static array.
So introduce rtas_function_token(), a replacement API which does that,
along with a rtas_service_present()-equivalent helper,
rtas_function_implemented(). Callers supply an opaque predefined
function handle which is used internally to index the function
table. Typos or other inappropriate arguments yield build errors, and
the function handle is a type that can't be easily confused with RTAS
tokens or other integer types.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-19-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the TLB block invalidate characteristics discovery to the new
papr_sysparm API. This occurs too early in boot to use
papr_sysparm_buf_alloc(), so use a static buffer.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-18-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
| |
The new papr_sysparm API handles the details of system parameter
retrieval. Use that instead of open-coding the RTAS call, work area
management, and retries.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-17-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
/proc/powerpc/lparcfg derives the LPAR name and SPLPAR characteristics
it reports using bare calls to the RTAS ibm,get-system-parameter
function. Convert these to the higher-level papr_sysparm API, which
handles the tedious details.
While the SPLPAR string parsing code could stand to be updated, that
should be done in a separate change. It is minimally modified here to
reduce the risk of changing behavior.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-16-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
| |
Convert the direct invocation of the ibm,get-system-parameter RTAS
function to papr_sysparm_get().
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-15-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a set of APIs for retrieving and updating PAPR system
parameters. This encapsulates the toil of temporary RTAS work area
management, RTAS function call retries, and translation of RTAS call
statuses to conventional error values.
There are several places in the kernel that already retrieve system
parameters by calling the RTAS ibm,get-system-parameter function
directly. These will be converted to papr_sysparm_get() in changes to
follow.
As for updating system parameters, current practice is to use
sys_rtas() from user space; there are no in-kernel users of the RTAS
ibm,set-system-parameter function. However this will become deprecated
in time because it is not compatible with lockdown.
The papr_sysparm_* APIs will form the common basis for in-kernel
and user space access to system parameters. The code to expose the
set/get capabilities to user space will follow.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-14-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
| |
Hold a work area object for the duration of the RTAS
ibm,configure-connector sequence, eliminating locking and copying
around each RTAS call.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-13-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Various pseries-specific RTAS functions take a temporary "work area"
parameter - a buffer in memory accessible to RTAS. Typically such
functions are passed the statically allocated rtas_data_buf buffer as
the argument. This buffer is protected by a global spinlock. So users
of rtas_data_buf cannot perform sleeping operations while accessing
the buffer.
Most RTAS functions that have a work area parameter can return a
status (-2/990x) that indicates that the caller should retry. Before
retrying, the caller may need to reschedule or sleep (see
rtas_busy_delay() for details). This combination of factors
leads to uncomfortable constructions like this:
do {
spin_lock(&rtas_data_buf_lock);
rc = rtas_call(token, __pa(rtas_data_buf, ...);
if (rc == 0) {
/* parse or copy out rtas_data_buf contents */
}
spin_unlock(&rtas_data_buf_lock);
} while (rtas_busy_delay(rc));
Another unfortunately common way of handling this is for callers to
blithely ignore the possibility of a -2/990x status and hope for the
best.
If users were allowed to perform blocking operations while owning a
work area, the programming model would become less tedious and
error-prone. Users could schedule away, sleep, or perform other
blocking operations without having to release and re-acquire
resources.
We could continue to use a single work area buffer, and convert
rtas_data_buf_lock to a mutex. But that would impose an unnecessarily
coarse serialization on all users. As awkward as the current design
is, it prevents longer running operations that need to repeatedly use
rtas_data_buf from blocking the progress of others.
There are more considerations. One is that while 4KB is fine for all
current in-kernel uses, some RTAS calls can take much smaller buffers,
and some (VPD, platform dumps) would likely benefit from larger
ones. Another is that at least one RTAS function (ibm,get-vpd)
has *two* work area parameters. And finally, we should expect the
number of work area users in the kernel to increase over time as we
introduce lockdown-compatible ABIs to replace less safe use cases
based on sys_rtas/librtas.
So a special-purpose allocator for RTAS work area buffers seems worth
trying.
Properties:
* The backing memory for the allocator is reserved early in boot in
order to satisfy RTAS addressing requirements, and then managed with
genalloc.
* Allocations can block, but they never fail (mempool-like).
* Prioritizes first-come, first-serve fairness over throughput.
* Early boot allocations before the allocator has been initialized are
served via an internal static buffer.
Intended to replace rtas_data_buf. New code that needs RTAS work area
buffers should prefer this API.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
| |
Decompose the RTAS entry C code into tracing and non-tracing variants,
calling the just-added tracepoints in the tracing-enabled path. Skip
tracing in contexts known to be unsafe (real mode, CPU offline).
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-11-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add two sets of tracepoints to be used around RTAS entry:
* rtas_input/rtas_output, which emit the function name, its inputs,
the returned status, and any other outputs. These produce an API-level
record of OS<->RTAS activity.
* rtas_ll_entry/rtas_ll_exit, which are lower-level and emit the
entire contents of the parameter block (aka rtas_args) on entry and
exit. Likely useful only for debugging.
With uses of these tracepoints in do_enter_rtas() to be added in the
following patch, examples of get-time-of-day and event-scan functions
as rendered by trace-cmd (with some multi-line formatting manually
imposed on the rtas_ll_* entries to avoid extremely long lines in the
commit message):
cat-36800 [059] 4978.518303: rtas_input: get-time-of-day arguments:
cat-36800 [059] 4978.518306: rtas_ll_entry: token=3 nargs=0 nret=8
params: [0]=0x00000000 [1]=0x00000000 [2]=0x00000000 [3]=0x00000000
[4]=0x00000000 [5]=0x00000000 [6]=0x00000000 [7]=0x00000000
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
cat-36800 [059] 4978.518366: rtas_ll_exit: token=3 nargs=0 nret=8
params: [0]=0x00000000 [1]=0x000007e6 [2]=0x0000000b [3]=0x00000001
[4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
cat-36800 [059] 4978.518366: rtas_output: get-time-of-day status: 0, other outputs: 2022 11 1 0 14 8 772648000
kworker/39:1-336 [039] 4982.731623: rtas_input: event-scan arguments: 4294967295 0 80484920 2048
kworker/39:1-336 [039] 4982.731626: rtas_ll_entry: token=6 nargs=4 nret=1
params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
[4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
kworker/39:1-336 [039] 4982.731676: rtas_ll_exit: token=6 nargs=4 nret=1
params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
[4]=0x00000001 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
kworker/39:1-336 [039] 4982.731677: rtas_output: event-scan status: 1, other outputs:
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-10-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make do_enter_rtas() take a pointer to struct rtas_args and do the
__pa() conversion in one place instead of leaving it to callers. This
also makes it possible to introduce enter/exit tracepoints that access
the rtas_args struct fields.
There's no apparent reason to force inlining of do_enter_rtas()
either, and it seems to bloat the code a bit. Let the compiler decide.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-9-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The core RTAS support code and its clients perform two types of lookup
for RTAS firmware function information.
First, mapping a known function name to a token. The typical use case
invokes rtas_token() to retrieve the token value to pass to
rtas_call(). rtas_token() relies on of_get_property(), which performs
a linear search of the /rtas node's property list under a lock with
IRQs disabled.
Second, and less common: given a token value, looking up some
information about the function. The primary example is the sys_rtas
filter path, which linearly scans a small table to match the token to
a rtas_filter struct. Another use case to come is RTAS entry/exit
tracepoints, which will require efficient lookup of function names
from token values. Currently there is no general API for this.
We need something much like the existing rtas_filters table, but more
general and organized to facilitate efficient lookups.
Introduce:
* A new rtas_function type, aggregating function name, token,
and filter. Other function characteristics could be added in the
future.
* An array of rtas_function, where each element corresponds to a known
RTAS function. All information in the table is static save the token
values, which are derived from the device tree at boot. The array is
sorted by function name to allow binary search.
* A named constant for each known RTAS function, used to index the
function array. These also will be used in a client-facing API to be
added later.
* An xarray that maps valid tokens to rtas_function objects.
Fold the existing rtas_filter table into the new rtas_function array,
with the appropriate adjustments to block_rtas_call(). Remove
now-redundant fields from struct rtas_filter. Preserve the function of
the CONFIG_CPU_BIG_ENDIAN guard in the current filter table by
introducing a per-function flag that is set for the function entries
related to pseries LPAR migration. These have never had working users
via sys_rtas on ppc64le; see commit de0f7349a0dd ("powerpc/rtas:
prevent suspend-related sys_rtas use on LE").
Convert rtas_token() to use a lockless binary search on the function
table. Fall back to the old behavior for lookups against names that
are not known to be RTAS functions, but issue a warning. rtas_token()
is for function names; it is not a general facility for accessing
arbitrary properties of the /rtas node. All known misuses of
rtas_token() have been converted to more appropriate of_ APIs in
preceding changes.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-8-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pseries platform has been LPAR-only for several generations, and
the PAPR spec:
* Guarantees that timebase synchronization is performed by
the platform ("The timebase registers are synchronized by the
platform before CPUs are given to the OS" - 7.3.8 SMP Support).
* Completely omits the RTAS freeze-time-base and thaw-time-base RTAS
functions, which are CHRP artifacts.
This code is effectively unused on currently supported models, so drop
it.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-7-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some RTAS functions that have work area parameters impose alignment
requirements on the work area passed to them by the OS. Examples
include:
- ibm,configure-connector
- ibm,update-nodes
- ibm,update-properties
4KB is the greatest alignment required by PAPR for such
buffers. rtas_data_buf used to have a __page_aligned attribute in the
arch/ppc64 days, but that was changed to __cacheline_aligned for
unknown reasons by commit 033ef338b6e0 ("powerpc: Merge rtas.c into
arch/powerpc/kernel"). That works out to 128-byte alignment
on ppc64, which isn't right.
This was found by inspection and I'm not aware of any real problems
caused by this. Either current RTAS implementations don't enforce the
alignment constraints, or rtas_data_buf is always being placed at a
4KB boundary by accident (or both, perhaps).
Use __aligned(SZ_4K) to ensure the rtas_data_buf has alignment
appropriate for all users.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-6-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
pSeries_cmo_feature_init() ignores this, making it possible to fail to
detect cooperative memory overcommit capabilities during boot.
Move the RTAS call into a conventional rtas_busy_delay()-based
loop, dropping unnecessary clearing of rtas_data_buf.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-5-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
lparcfg's parse_system_parameter_string() ignores this, making it
possible to intermittently report incorrect SPLPAR characteristics.
Move the RTAS call into a coventional rtas_busy_delay()-based loop.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-4-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
pseries_lpar_read_hblkrm_characteristics() ignores this, making it
possible to incorrectly detect TLB block invalidation characteristics
at boot.
Move the RTAS call into a coventional rtas_busy_delay()-based loop.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 1211ee61b4a8 ("powerpc/pseries: Read TLB Block Invalidate Characteristics")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-3-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again. read_24x7_sys_info()
ignores this, allowing transient failures in reporting processor
module information.
Move the RTAS call into a coventional rtas_busy_delay()-based loop,
along with the parsing of results on success.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 8ba214267382 ("powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-2-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some code that runs early in boot calls RTAS functions that can return
-2 or 990x statuses, which mean the caller should retry. An example is
pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but
treats these benign statuses as errors instead of retrying.
pSeries_cmo_feature_init() and similar code should be made to retry
until they succeed or receive a real error, using the usual pattern:
do {
rc = rtas_call(token, etc...);
} while (rtas_busy_delay(rc));
But rtas_busy_delay() will perform a timed sleep on any 990x
status. This isn't safe so early in boot, before the CPU scheduler and
timer subsystem have initialized.
The -2 RTAS status is much more likely to occur during single-threaded
boot than 990x in practice, at least on PowerVM. This is because -2
usually means that RTAS made progress but exhausted its self-imposed
timeslice, while 990x is associated with concurrent requests from the
OS causing internal contention. Regardless, according to the language
in PAPR, the OS should be prepared to handle either type of status at
any time.
Add a fallback path to rtas_busy_delay() to handle this as safely as
possible, performing a small delay on 990x. Include a counter to
detect retry loops that aren't making progress and bail out. Add __ref
to rtas_busy_delay() since it now conditionally calls an __init
function.
This was found by inspection and I'm not aware of any real
failures. However, the implementation of rtas_busy_delay() before
commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
was not susceptible to this problem, so let's treat this as a
regression.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for loading keys from the PLPKS on pseries machines, with the
"ibm,plpks-sb-v1" format.
The object format is expected to be the same, so there shouldn't be any
functional differences between objects retrieved on powernv or pseries.
Unlike on powernv, on pseries the format string isn't contained in the
device tree. Use secvar_ops->format() to fetch the format string in a
generic manner, rather than searching the device tree ourselves.
(The current code searches the device tree for a node compatible with
"ibm,edk2-compat-v1". This patch switches to calling secvar_ops->format(),
which in the case of OPAL/powernv means opal_secvar_format(), which
searches the device tree for a node compatible with "ibm,secvar-backend"
and checks its "format" property. These are equivalent, as skiboot creates
a node with both "ibm,edk2-compat-v1" and "ibm,secvar-backend" as
compatible strings.)
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-27-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A few improvements to load_powerpc.c:
- include integrity.h for the pr_fmt()
- move all error reporting out of get_cert_list()
- use ERR_PTR() to better preserve error detail
- don't use pr_err() for missing keys
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-26-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pseries platform can support dynamic secure boot (i.e. secure boot
using user-defined keys) using variables contained with the PowerVM LPAR
Platform KeyStore (PLPKS). Using the powerpc secvar API, expose the
relevant variables for pseries dynamic secure boot through the existing
secvar filesystem layout.
The relevant variables for dynamic secure boot are signed in the
keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall.
Object labels in the keystore are encoded using ucs2 format. With our
fixed variable names we don't have to care about encoding outside of the
necessary byte padding.
When a user writes to a variable, the first 8 bytes of data must contain
the signed update flags as defined by the hypervisor.
When a user reads a variable, the first 4 bytes of data contain the
policies defined for the object.
Limitations exist due to the underlying implementation of sysfs binary
attributes, as is the case for the OPAL secvar implementation -
partial writes are unsupported and writes cannot be larger than PAGE_SIZE.
(Even when using bin_attributes, which can be larger than a single page,
sysfs only gives us one page's worth of write buffer at a time, and the
hypervisor does not expose an interface for partial writes.)
Co-developed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
[mpe: Add NLS dependency to fix build errors, squash fix from ajd]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-25-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before interacting with the PLPKS, we ask the hypervisor to generate a
password for the current boot, which is then required for most further
PLPKS operations.
If we kexec into a new kernel, the new kernel will try and fail to
generate a new password, as the password has already been set.
Pass the password through to the new kernel via the device tree, in
/chosen/ibm,plpks-pw. Check for the presence of this property before
trying to generate a new password - if it exists, use the existing
password and remove it from the device tree.
This only works with the kexec_file_load() syscall, not the older
kexec_load() syscall, however if you're using Secure Boot then you want
to be using kexec_file_load() anyway.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-24-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add helper function to get the PLPKS password length. This will be used
in a later patch to support passing the password between kernels over
kexec.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-23-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the H_PKS_GEN_PASSWORD hcall returns H_IN_USE, operations that require
authentication (i.e. anything other than reading a world-readable variable)
will not work.
The current error message doesn't explain this clearly enough. Reword it
to emphasise that authenticated operations will fail.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-22-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems a bit unnecessary for the PLPKS code to have a user-visible
config option when it doesn't do anything on its own, and there's existing
options for enabling Secure Boot-related features.
It should be enabled by PPC_SECURE_BOOT, which will eventually be what
uses PLPKS to populate keyrings.
However, we can't get of the separate option completely, because it will
also be used for SED Opal purposes.
Change PSERIES_PLPKS into a hidden option, which is selected by
PPC_SECURE_BOOT.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-21-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, plpks_read_var() allocates a buffer to pass to the
H_PKS_READ_OBJECT hcall, then allocates another buffer into which the data
is copied, and returns that buffer to the caller.
This is a bit over the top - while we probably still want to allocate a
separate buffer to pass to the hypervisor in the hcall, we can let the
caller allocate the final buffer and specify the size.
Don't allocate var->data in plpks_read_var(), instead expect the caller to
allocate it. If the caller needs to discover the size, it can set
var->data to NULL and var->datalen will be populated. Update header file
to document this.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-20-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The plpks code converts hypervisor return codes into their Linux
equivalents so that users can understand them. Having access to the
original return codes is really useful for debugging, so add a
pr_debug() so we don't lose information from the conversion.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-19-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Platform Keystore provides a signed update interface which can be used
to create, replace or append to certain variables in the PKS in a secure
fashion, with the hypervisor requiring that the update be signed using the
Platform Key.
Implement an interface to the H_PKS_SIGNED_UPDATE hcall in the plpks
driver to allow signed updates to PKS objects.
(The plpks driver doesn't need to do any cryptography or otherwise handle
the actual signed variable contents - that will be handled by userspace
tooling.)
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
[ajd: split patch, add timeout handling and misc cleanups]
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-18-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The plpks driver uses the H_PKS_GET_CONFIG hcall to retrieve configuration
and status information about the PKS from the hypervisor.
Update _plpks_get_config() to handle some additional fields. Add getter
functions to allow the PKS configuration information to be accessed from
other files. Validate that the values we're getting comply with the spec.
While we're here, move the config struct in _plpks_get_config() off the
stack - it's getting large and we also need to make sure it doesn't cross
a page boundary.
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
[ajd: split patch, extend to support additional v3 API fields, minor fixes]
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-17-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the constants defined in plpks.c to plpks.h, and standardise their
naming, so that PLPKS consumers can make use of them later on.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-16-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move plpks.h from platforms/pseries/ to include/asm/. This is necessary
for later patches to make use of the PLPKS from code in other subsystems.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-15-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If attempting to read the size or data attributes of a non-existent
variable (which will be possible after a later patch to expose the PLPKS
via the secvar interface), don't spam the kernel log with error messages.
Only print errors for return codes that aren't ENOENT.
Reported-by: Sudhakar Kuppusamy <sudhakar@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-14-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Due to sysfs constraints, when writing to a variable, we can only handle
writes of up to PAGE_SIZE.
It's possible that the maximum object size is larger than PAGE_SIZE, in
which case, print a warning on boot so that the user is aware.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-13-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the list of variables is populated by calling
secvar_ops->get_next() repeatedly, which is explicitly modelled on the
OPAL API (including the keylen parameter).
For the upcoming PLPKS backend, we have a static list of variable names.
It is messy to fit that into get_next(), so instead, let the backend put
a NULL-terminated array of variable names into secvar_ops->var_names,
which will be used if get_next() is undefined.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-12-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The forthcoming pseries consumer of the secvar API wants to expose a
number of config variables. Allowing secvar implementations to provide
their own sysfs attributes makes it easy for consumers to expose what
they need to.
This is not being used by the OPAL secvar implementation at present, and
the config directory will not be created if no attributes are set.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-11-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove unnecessary prefixes from error messages in secvar_sysfs_init()
(the file defines pr_fmt, so putting "secvar:" in every message is
unnecessary). Make capitalisation and punctuation more consistent.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-10-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the max object size is handled in the core secvar code with an
entirely OPAL-specific implementation, so create a new max_size() op and
move the existing implementation into the powernv platform. Should be
no functional change.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-9-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code that handles the format string in secvar-sysfs.c is entirely
OPAL specific, so create a new "format" op in secvar_operations to make
the secvar code more generic. No functional change.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-8-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The secvar format string and object size sysfs files are both ASCII
text, and should use sysfs_emit(). No functional change.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-7-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The secvar code only supports one consumer at a time.
Multiple consumers aren't possible at this point in time, but we'd want
it to be obvious if it ever could happen.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-6-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no reason for secvar_operations to use uint64_t vs the more
common kernel type u64.
The types are compatible, but they require different printk format
strings which can lead to confusion.
Change all the secvar related routines to use u64.
Reviewed-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-5-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
secvar_ops->get_next() returns -ENOENT when there are no more variables
to return, which is expected behaviour.
Fix this by returning 0 if get_next() returns -ENOENT.
This fixes an issue introduced in commit bd5d9c743d38 ("powerpc: expose
secure variables to userspace via sysfs"), but the return code of
secvar_sysfs_load() was never checked so this issue never mattered.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-4-ajd@linux.ibm.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A number of structures and buffers passed to PKS hcalls have alignment
requirements, which could on occasion cause problems:
- Authorisation structures must be 16-byte aligned and must not cross a
page boundary
- Label structures must not cross page boundaries
- Password output buffers must not cross page boundaries
To ensure correct alignment, we adjust the allocation size of each of
these structures/buffers to be the closest power of 2 that is at least the
size of the structure/buffer (since kmalloc() guarantees that an
allocation of a power of 2 size will be aligned to at least that size).
Reported-by: Benjamin Gray <bgray@linux.ibm.com>
Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore")
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-3-ajd@linux.ibm.com
|