summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* spi: pxa2xx: Validate the correctness of the SSP typeAndy Shevchenko2022-10-242-2/+5
| | | | | | | | | Currently we blindly apply the SSP type value from any source of the information. Increase robustness by validating the value before use. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20221021190018.63646-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: amlogic: meson-spicc: Use pinctrl to drive CLK line when idleMark Brown2022-10-212-29/+85
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge series from Amjad Ouled-Ameur <aouledameur@baylibre.com>: Between SPI transactions, all SPI pins are in HiZ state. When using the SS signal from the SPICC controller it's not an issue because when the transaction resumes all pins come back to the right state at the same time as SS. The problem is when we use CS as a GPIO. In fact, between the GPIO CS state change and SPI pins state change from idle, you can have a missing or spurious clock transition. Set a bias on the clock depending on the clock polarity requested before CS goes active, by passing a special "idle-low" and "idle-high" pinctrl state and setting the right state at a start of a message.
| * spi: meson-spicc: Use pinctrl to drive CLK line when idleAmjad Ouled-Ameur2022-10-211-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Between SPI transactions, all SPI pins are in HiZ state. When using the SS signal from the SPICC controller it's not an issue because when the transaction resumes all pins come back to the right state at the same time as SS. The problem is when we use CS as a GPIO. In fact, between the GPIO CS state change and SPI pins state change from idle, you can have a missing or spurious clock transition. Set a bias on the clock depending on the clock polarity requested before CS goes active, by passing a special "idle-low" and "idle-high" pinctrl state and setting the right state at a start of a message Reported-by: Da Xue <da@libre.computer> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com> Link: https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: dt-bindings: amlogic, meson-gx-spicc: Add pinctrl names for SPI signal ↵Amjad Ouled-Ameur2022-10-211-28/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | states SPI pins of the SPICC Controller in Meson-GX needs to be controlled by pin biais when idle. Therefore define three pinctrl names: - default: SPI pins are controlled by spi function. - idle-high: SCLK pin is pulled-up, but MOSI/MISO are still controlled by spi function. - idle-low: SCLK pin is pulled-down, but MOSI/MISO are still controlled by spi function. Reported-by: Da Xue <da@libre.computer> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-1-0342d8e10c49@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
* | spi: pxa2xx: Minor cleanupsMark Brown2022-10-211-13/+8
|\ \ | | | | | | | | | | | | | | | Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>: This series has a couple of cleanups for the pxa2xx driver.
| * | spi: pxa2xx: Switch from PM ifdeffery to pm_ptr()Andy Shevchenko2022-10-211-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleaning up the driver to use pm_ptr() macro instead of ifdeffery that makes it simpler and allows the compiler to remove those functions if built without CONFIG_PM and CONFIG_PM_SLEEP support. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20221020194500.10225-6-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * | spi: pxa2xx: Consistently use dev variable in pxa2xx_spi_init_pdata()Andy Shevchenko2022-10-211-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | We have a temporary variable to keep a pointer to a struct device in the pxa2xx_spi_init_pdata(). Use it consistently there. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20221020194500.10225-5-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
* | | spi: spi-imx: remove unused struct spi_imx_devtype_data::disable_dma callbackMarc Kleine-Budde2022-10-211-8/+0
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 7a908832ace7 ("spi: imx: add fallback feature") the last user of the struct spi_imx_devtype_data::disable_dma callback was removed. However the disable_dma member of struct spi_imx_devtype_data and the callback itself was not removed. Remove struct spi_imx_devtype_data::disable_dma and mx51_disable_dma() as they are unused. Cc: Robin Gong <yibin.gong@nxp.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20221021131051.1777984-1-mkl@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
* | spi: Introduce spi_get_device_match_data() helperAndy Shevchenko2022-10-212-0/+15
|/ | | | | | | | | | | | The proposed spi_get_device_match_data() helper is for retrieving a driver data associated with the ID in an ID table. First, it tries to get driver data of the device enumerated by firmware interface (usually Device Tree or ACPI). If none is found it falls back to the SPI ID table matching. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20221020195421.10482-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: spi-zyqnmp-gqspi: Add tap delay and Versal platform supportMark Brown2022-10-194-37/+184
|\ | | | | | | | | | | | | | | | | | | | | Merge series from Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com>: A bunch of improvements to the driver: - Fix kernel-doc warnings in GQSPI driver. - Avoid setting CPOL, CPHA & baud rate multiple times. - Add Versal platform support in GQSPI driver. - Add tap delay support in GQSPI driver.
| * spi: spi-zynqmp-gqspi: Add tap delay support for GQSPI controller on Versal ↵Amit Kumar Mahapatra2022-10-191-19/+67
| | | | | | | | | | | | | | | | | | | | | | | | platform Add tap delay support for GQSPI controller on Versal platform. Signed-off-by: Naga Sureshkumar Relli <nagasure@xilinx.com> Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-8-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: dt-bindings: zynqmp-qspi: Add support for Xilinx Versal QSPIAmit Kumar Mahapatra2022-10-191-1/+3
| | | | | | | | | | | | | | | | | | Add new compatible to support QSPI controller on Xilinx Versal SoCs. Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221011062040.12116-7-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-zynqmp-gqspi: Add tap delay support for ZynqMP GQSPI ControllerNaga Sureshkumar Relli2022-10-191-4/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GQSPI controller uses the internal clock for loopback mode. The loopback mode is used with the high-speed Quad SPI timing mode, where the memory interface clock needs to be greater than 40 MHz. Based on the tap delay value programmed, the internal clock is delayed and used for capturing the data. Based upon the frequency of operation set the recommended tap delay values in GQSPI driver. Signed-off-by: Naga Sureshkumar Relli <nagasure@xilinx.com> Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-6-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * firmware: xilinx: Add qspi firmware interfaceRajan Vaja2022-10-192-0/+26
| | | | | | | | | | | | | | | | | | | | Add support for QSPI ioctl functions and enums. Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com> Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-5-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-zynqmp-gqspi: Avoid setting baud rate multiple times for same SPI ↵Amit Kumar Mahapatra2022-10-191-14/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | frequency During every transfer the GQSPI driver configures the baud rate value. But when there is no change in the SPI clock frequency the driver should avoid rewriting the same baud rate value to the configuration register. Update GQSPI driver to rewrite the baud rate value if there is any change in SPI clock frequency. Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-4-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-zynqmp-gqspi: Set CPOL and CPHA during hardware initAmit Kumar Mahapatra2022-10-191-15/+17
| | | | | | | | | | | | | | | | | | | | | | During every transfer GQSPI driver writes the CPOL & CPHA values to the configuration register. But the CPOL & CPHA values do not change in between multiple transfers, so moved the CPOL & CPHA initialization to hardware init so that the values are written only once. Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-3-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-zynqmp-gqspi: Fix kernel-doc warningsAmit Kumar Mahapatra2022-10-191-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Document zynqmp_qspi ctrl and op_lock member description. It also adds return documentation for 'zynqmp_qspi_setuprxdma' and zynqmp_qspi_read_op. Fixes below kernel-doc warnings- spi-zynqmp-gqspi.c:178: warning: Function parameter or member 'ctlr' not described in 'zynqmp_qspi' spi-zynqmp-gqspi.c:178: warning: Function parameter or member 'op_lock' not described in 'zynqmp_qspi' spi-zynqmp-gqspi.c:737: warning: No description found for return value of 'zynqmp_qspi_setuprxdma' spi-zynqmp-gqspi.c:822: warning: No description found for return value of 'zynqmp_qspi_read_op' Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com> Link: https://lore.kernel.org/r/20221011062040.12116-2-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
* | spi: img-spfi: Use devm_platform_get_and_ioremap_resource()Yang Yingliang2022-10-191-2/+1
| | | | | | | | | | | | | | | | | | Use the devm_platform_get_and_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221019093318.1183190-1-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
* | spi: fsl-cpm: substitute empty_zero_page with helper ZERO_PAGE(0)Giulio Benetti2022-10-191-1/+1
|/ | | | | | | | | Not all zero page implementations use empty_zero_page global pointer so let's substitute empty_zero_page occurence with helper ZERO_PAGE(0). Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com> Link: https://lore.kernel.org/r/20221018215755.33566-2-giulio.benetti@benettiengineering.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: bcm-qspi: Make bcm_qspi_remove() return voidUwe Kleine-König2022-10-184-6/+9
| | | | | | | | | | | | The function bcm_qspi_remove() returns zero unconditionally. Make it return void. This is a preparation for making platform remove callbacks return void. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221017200143.1426528-1-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: pxa2xx: Simplify with devm_platform_get_and_ioremap_resource()Andy Shevchenko2022-10-181-2/+1
| | | | | | | | | | | | Convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. No functional changes. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20221017171243.57078-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: cadence-quadspi: Use devm_platform_{get_and_}ioremap_resource()Yang Yingliang2022-10-171-5/+2
| | | | | | | | | Use the devm_platform_{get_and}_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220928145852.1882221-2-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: bcm63xx: Use devm_platform_get_and_ioremap_resource()Yang Yingliang2022-10-171-2/+1
| | | | | | | | | Use the devm_platform_get_and_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220928145852.1882221-1-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: microchip: pci1xxxx: Add driver for SPI controller of PCI1XXXX PCIe switchTharun Kumar P2022-10-173-0/+407
| | | | | | | | | | | | Microchip pci1xxxx is a PCIe switch with a multi-function endpoint on one of its downstream ports. SPI is one of the functions in the multi-function endpoint. This function has 2 SPI masters, operates at a maximum frequency of 30 MHz and supports 7 client devices per master. This patch adds complete functionality to the SPI function except for suspend and resume. Signed-off-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com> Link: https://lore.kernel.org/r/20221006050514.115564-2-tharunkumar.pasumarthi@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>
* spi: microchip-core: Remove the unused function mchp_corespi_enable()Jiapeng Chong2022-10-171-9/+0
| | | | | | | | | | | | | | The function mchp_corespi_enable() is defined in the spi-microchip-core.c file, but not called elsewhere, so delete this unused function. drivers/spi/spi-microchip-core.c:122:20: warning: unused function 'mchp_corespi_enable'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2418 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221017092141.9789-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Mark Brown <broonie@kernel.org>
* Merge existing fixes from spi/for-6.1 into new branchMark Brown2022-10-176-6/+11
|\
| * spi: intel: Fix the offset to get the 64K erase opcodeMauro Lima2022-10-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | According to documentation, the 64K erase opcode is located in VSCC range [16:23] instead of [8:15]. Use the proper value to shift the mask over the correct range. Signed-off-by: Mauro Lima <mauro.lima@eclypsium.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20221012152135.28353-1-mauro.lima@eclypsium.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: aspeed: Fix typo in mode_bits field for AST2600 platformChin-Ting Kuo2022-10-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | Both quad SPI TX and RX modes can be supported on AST2600. Correct typo in mode_bits field in both ast2600_fmc_data and ast2600_spi_data structs. Signed-off-by: Chin-Ting Kuo <chin-ting_kuo@aspeedtech.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20221005083209.222272-1-chin-ting_kuo@aspeedtech.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: mpc52xx: Replace NO_IRQ by 0Christophe Leroy2022-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | NO_IRQ is used to check the return of irq_of_parse_and_map(). On some architecture NO_IRQ is 0, on other architectures it is -1. irq_of_parse_and_map() returns 0 on error, independent of NO_IRQ. So use 0 instead of using NO_IRQ. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/f41e09d710879726eacb98daedf16d0847303b9b.1665034444.git.christophe.leroy@csgroup.eu Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-mem: Fix typo (of -> or)Jonathan Neuschäfer2022-10-101-1/+1
| | | | | | | | | | | | | | | | | | In this instance, "or" makes more sense than "of", so I guess that "or" was intended and "of" was a typo. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Link: https://lore.kernel.org/r/20221008151459.1421406-1-j.neuschaefer@gmx.net Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: spi-gxp: fix typo in SPDX identifier lineBird, Tim2022-10-041-1/+1
| | | | | | | | | | | | | | | | Use '-' instead of '=' in identifier: "GPL-2.0-or-later" Signed-off-by: Tim Bird <tim.bird@sony.com> Link: https://lore.kernel.org/r/BYAPR13MB2503FF6412666D29FEAC8DCDFD5B9@BYAPR13MB2503.namprd13.prod.outlook.com Signed-off-by: Mark Brown <broonie@kernel.org>
| * spi: tegra210-quad: Fix combined sequenceKrishna Yarlagadda2022-10-031-0/+5
| | | | | | | | | | | | | | | | | | | | Return value should be updated to zero in combined sequence routine if transfer is completed successfully. Currently it holds timeout value resulting in errors. Signed-off-by: Krishna Yarlagadda <kyarlagadda@nvidia.com> Link: https://lore.kernel.org/r/20221001122148.9158-1-kyarlagadda@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org>
* | Linux 6.1-rc1v6.1-rc1Linus Torvalds2022-10-171-2/+2
| |
* | Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds2022-10-17185-421/+378
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
| * | prandom: remove unused functionsJason A. Donenfeld2022-10-123-23/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With no callers left of prandom_u32() and prandom_bytes(), as well as get_random_int(), remove these deprecated wrappers, in favor of get_random_u32() and get_random_bytes(). Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use get_random_bytes() when possibleJason A. Donenfeld2022-10-1219-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prandom_bytes() function has been a deprecated inline wrapper around get_random_bytes() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use get_random_u32() when possibleJason A. Donenfeld2022-10-1271-100/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Acked-by: Helge Deller <deller@gmx.de> # for parisc Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use get_random_{u8,u16}() when possible, part 2Jason A. Donenfeld2022-10-126-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, simply use the get_random_{u8,u16}() functions, which are faster than wasting the additional bytes from a 32-bit value. This was done by hand, identifying all of the places where one of the random integer functions was used in a non-32-bit context. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use get_random_{u8,u16}() when possible, part 1Jason A. Donenfeld2022-10-1221-34/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, simply use the get_random_{u8,u16}() functions, which are faster than wasting the additional bytes from a 32-bit value. This was done mechanically with this coccinelle script: @@ expression E; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; typedef __be16; typedef __le16; typedef u8; @@ ( - (get_random_u32() & 0xffff) + get_random_u16() | - (get_random_u32() & 0xff) + get_random_u8() | - (get_random_u32() % 65536) + get_random_u16() | - (get_random_u32() % 256) + get_random_u8() | - (get_random_u32() >> 16) + get_random_u16() | - (get_random_u32() >> 24) + get_random_u8() | - (u16)get_random_u32() + get_random_u16() | - (u8)get_random_u32() + get_random_u8() | - (__be16)get_random_u32() + (__be16)get_random_u16() | - (__le16)get_random_u32() + (__le16)get_random_u16() | - prandom_u32_max(65536) + get_random_u16() | - prandom_u32_max(256) + get_random_u8() | - E->inet_id = get_random_u32() + E->inet_id = get_random_u16() ) @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; identifier v; @@ - u16 v = get_random_u32(); + u16 v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; identifier v; @@ - u8 v = get_random_u32(); + u8 v = get_random_u8(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; u16 v; @@ - v = get_random_u32(); + v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; u8 v; @@ - v = get_random_u32(); + v = get_random_u8(); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Examine limits @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value < 256: coccinelle.RESULT = cocci.make_ident("get_random_u8") elif value < 65536: coccinelle.RESULT = cocci.make_ident("get_random_u16") else: print("Skipping large mask of %s" % (literal)) cocci.include_match(False) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; identifier add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + (RESULT() & LITERAL) Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use prandom_u32_max() when possible, part 2Jason A. Donenfeld2022-10-124-19/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done by hand, covering things that coccinelle could not do on its own. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext2, ext4, and sbitmap Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * | treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld2022-10-1289-218/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | | Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of ↵Linus Torvalds2022-10-1736-71/+1265
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels when using bperf (perf BPF based counters) with cgroups. - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that monitors bandwidth, latency, bus utilization and buffer occupancy. Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst. - User space tasks can migrate between CPUs, so when tracing selected CPUs, system-wide sideband is still needed, fix it in the setup of Intel PT on hybrid systems. - Fix metricgroups title message in 'perf list', it should state that the metrics groups are to be used with the '-M' option, not '-e'. - Sync the msr-index.h copy with the kernel sources, adding support for using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well as decoding it when printing the MSR tracepoint arguments. - Fix program header size and alignment when generating a JIT ELF in 'perf inject'. - Add multiple new Intel PT 'perf test' entries, including a jitdump one. - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when running on PowerPC due to an invalid topology number in that arch. - Fix the 'perf test' for arm_coresight failures on the ARM Juno system. - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this option to the or expression expected in the intercepted perf_event_open() syscall. - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the 'perf annotate' asm parser. - Fix 'perf mem record -C' option processing, it was being chopped up when preparing the underlying 'perf record -e mem-events' and thus being ignored, requiring using '-- -C CPUs' as a workaround. - Improvements and tidy ups for 'perf test' shell infra. - Fix Intel PT information printing segfault in uClibc, where a NULL format was being passed to fprintf. * tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits) tools arch x86: Sync the msr-index.h copy with the kernel sources perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver perf auxtrace arm: Refactor event list iteration in auxtrace_record__init() perf tests stat+json_output: Include sanity check for topology perf tests stat+csv_output: Include sanity check for topology perf intel-pt: Fix system_wide dummy event for hybrid perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc perf test: Fix attr tests for PERF_FORMAT_LOST perf test: test_intel_pt.sh: Add 9 tests perf inject: Fix GEN_ELF_TEXT_OFFSET for jit perf test: test_intel_pt.sh: Add jitdump test perf test: test_intel_pt.sh: Tidy some alignment perf test: test_intel_pt.sh: Print a message when skipping kernel tracing perf test: test_intel_pt.sh: Tidy some perf record options perf test: test_intel_pt.sh: Fix return checking again perf: Skip and warn on unknown format 'configN' attrs perf list: Fix metricgroups title message perf mem: Fix -C option behavior for perf mem record perf annotate: Add missing condition flags for arm64 ...
| * | | tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo2022-10-151-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To pick up the changes in: b8d1d163604bd1e6 ("x86/apic: Don't disable x2APIC if locked") ca5b7c0d9621702e ("perf/x86/amd/lbr: Add LbrExtV2 branch record support") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-10-14 18:06:34.294561729 -0300 +++ after 2022-10-14 18:06:41.285744044 -0300 @@ -264,6 +264,7 @@ [0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE", [0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX", [0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO", + [0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT", [0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG", [0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS", [0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL", $ Now one can trace systemwide asking to see backtraces to where that MSR is being read/written, see this example with a previous update: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" Using CPUID AuthenticAMD-25-21-0 0x6a0 0x6a8 New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) 0x6a0 0x6a8 New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packetQi Liu2022-10-157-0/+396
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for using 'perf report --dump-raw-trace' to parse PTT packet. Example usage: Output will contain raw PTT data and its textual representation, such as (8DW format): 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000 offset: 0 ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0 . . ... HISI PTT data: size 4194304 bytes . 00000000: 00 00 00 00 Prefix . 00000004: 08 20 00 60 Header DW0 . 00000008: ff 02 00 01 Header DW1 . 0000000c: 20 08 00 00 Header DW2 . 00000010: 10 e7 44 ab Header DW3 . 00000014: 2a a8 1e 01 Time . 00000020: 00 00 00 00 Prefix . 00000024: 01 00 00 60 Header DW0 . 00000028: 0f 1e 00 01 Header DW1 . 0000002c: 04 00 00 00 Header DW2 . 00000030: 40 00 81 02 Header DW3 . 00000034: ee 02 00 00 Time .... This patch only add basic parsing support according to the definition of the PTT packet described in Documentation/trace/hisi-ptt.rst. And the fields of each packet can be further decoded following the PCIe Spec's definition of TLP packet. Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driverQi Liu2022-10-157-1/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HiSilicon PCIe tune and trace device (PTT) could dynamically tune the PCIe link's events, and trace the TLP headers). This patch add support for PTT device in perf tool, so users could use 'perf record' to get TLP headers trace data. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()Qi Liu2022-10-151-19/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add find_pmu_for_event() and use to simplify logic in auxtrace_record_init(). find_pmu_for_event() will be reused in subsequent patches. Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-2-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf tests stat+json_output: Include sanity check for topologyAthira Rajeev2022-10-151-4/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Testcase stat+json_output.sh fails in powerpc: 86: perf stat JSON output linter : FAILED! The testcase "stat+json_output.sh" verifies perf stat JSON output. The test covers aggregation modes like per-socket, per-core, per-die, -A (no_aggr mode) along with few other tests. It counts expected fields for various commands. For example say -A (i.e, AGGR_NONE mode), expects 7 fields in the output having "CPU" as first field. Same way, for per-socket, it expects the first field in result to point to socket id. The testcases compares the result with expected count. The values for socket, die, core and cpu are fetched from topology directory: /sys/devices/system/cpu/cpu*/topology. For example, socket value is fetched from "physical_package_id" file of topology directory. (cpu__get_topology_int() in util/cpumap.c) If a platform fails to fetch the topology information, values will be set to -1. For example, incase of pSeries platform of powerpc, value for "physical_package_id" is restricted and not exposed. So, -1 will be assigned. Perf code has a checks for valid cpu id in "aggr_printout" (stat-display.c), which displays the fields. So, in cases where topology values not exposed, first field of the output displaying will be empty. This cause the testcase to fail, as it counts number of fields in the output. Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output, becos of -1 value obtained from topology files for some, only 6 fields are printed. Hence a testcase failure reported due to mismatch in number of fields in the output. Patch here adds a sanity check in the testcase for topology. Check will help to skip the test if -1 value found. Fixes: 0c343af2a2f82844 ("perf test: JSON format checking") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Suggested-by: Ian Rogers <irogers@google.com> Suggested-by: James Clark <james.clark@arm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Claire Jensen <cjense@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221006155149.67205-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf tests stat+csv_output: Include sanity check for topologyAthira Rajeev2022-10-151-4/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Testcase stat+csv_output.sh fails in powerpc: 84: perf stat CSV output linter: FAILED! The testcase "stat+csv_output.sh" verifies perf stat CSV output. The test covers aggregation modes like per-socket, per-core, per-die, -A (no_aggr mode) along with few other tests. It counts expected fields for various commands. For example say -A (i.e, AGGR_NONE mode), expects 7 fields in the output having "CPU" as first field. Same way, for per-socket, it expects the first field in result to point to socket id. The testcases compares the result with expected count. The values for socket, die, core and cpu are fetched from topology directory: /sys/devices/system/cpu/cpu*/topology. For example, socket value is fetched from "physical_package_id" file of topology directory. (cpu__get_topology_int() in util/cpumap.c) If a platform fails to fetch the topology information, values will be set to -1. For example, incase of pSeries platform of powerpc, value for "physical_package_id" is restricted and not exposed. So, -1 will be assigned. Perf code has a checks for valid cpu id in "aggr_printout" (stat-display.c), which displays the fields. So, in cases where topology values not exposed, first field of the output displaying will be empty. This cause the testcase to fail, as it counts number of fields in the output. Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output, becos of -1 value obtained from topology files for some, only 6 fields are printed. Hence a testcase failure reported due to mismatch in number of fields in the output. Patch here adds a sanity check in the testcase for topology. Check will help to skip the test if -1 value found. Fixes: 7473ee56dbc91c98 ("perf test: Add checking for perf stat CSV output.") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Suggested-by: Ian Rogers <irogers@google.com> Suggested-by: James Clark <james.clark@arm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Claire Jensen <cjense@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221006155149.67205-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf intel-pt: Fix system_wide dummy event for hybridAdrian Hunter2022-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | User space tasks can migrate between CPUs, so when tracing selected CPUs, system-wide sideband is still needed, however evlist->core.has_user_cpus is not set in the hybrid case, so check the target cpu_list instead. Fixes: 7d189cadbeebc778 ("perf intel-pt: Track sideband system-wide when needed") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf intel-pt: Fix segfault in intel_pt_print_info() with uClibcAdrian Hunter2022-10-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | uClibc segfaulted because NULL was passed as the format to fprintf(). That happened because one of the format strings was missing and intel_pt_print_info() didn't check that before calling fprintf(). Add the missing format string, and check format is not NULL before calling fprintf(). Fixes: 11fa7cb86b56d361 ("perf tools: Pass Intel PT information for decoding MTC and CYC") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>