| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per the infiniband spec an SMI MAD can have any PKey. Checking the pkey
on SMI MADs is not necessary, and it seems that some older adapters
using the mthca driver don't follow the convention of using the default
PKey, resulting in false denials, or errors querying the PKey cache.
SMI MAD security is still enforced, only agents allowed to manage the
subnet are able to receive or send SMI MADs.
Reported-by: Chris Blake <chrisrblake93@gmail.com>
Cc: <stable@vger.kernel.org> # v4.12
Fixes: 47a2b338fe63 ("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The alternate port number is used as an array index in the IB
security implementation, invalid values can result in a kernel panic.
Cc: <stable@vger.kernel.org> # v4.12
Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now the only LSM security enforcement mechanism available is
specific to InfiniBand. Bypass enforcement for non-IB link types.
This fixes a regression where modify_qp fails for iWARP because
querying the PKEY returns -EINVAL.
Cc: Paul Moore <paul@paul-moore.com>
Cc: Don Dutile <ddutile@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Potnuri Bharat Teja <bharat@chelsio.com>
Fixes: d291f1a65232("IB/core: Enforce PKey security on QPs")
Fixes: 47a2b338fe63("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Tested-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In general, dma_alloc_coherent() returns a CPU virtual address and
a DMA address, and we have no guarantee that the underlying memory
even has an associated struct page at all.
This patch gets rid of the page operation after dma_alloc_coherent,
and records the VA returned form dma_alloc_coherent in the struct
of hem in hns RoCE driver.
Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In general dma_alloc_coherent() returns a CPU virtual address and
a DMA address, and we have no guarantee that the virtual address
is either in the linear map or vmalloc. It could be in some other special
place. We have no guarantee that the underlying memory even has
an associated struct page at all.
In current code, there are incorrect usage as below:
dma_alloc_coherent + virt_to_page + vmap. There will probably
introduce coherency problem. This patch fixes it to get rid of
virt_to_page and vmap calls at Leon's suggestion. The related
link: https://lkml.org/lkml/2017/11/7/34
Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the smmu is enabled, the length of sg obtained from
__iommu_map_sg_attrs is not 4kB. When the IOVA is set with the sg
dma address, the IOVA will not be page continuous. so, the current
code has MTPT configuration error that probably cause dma operation
failure. In order to fix this issue, the IOVA should be calculated
based on the sg length.
Fixes: 3958cc5("RDMA/hns: Configure the MTPT in hip08")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Once infiniband is compiled as a core component its subsystem must be
enabled before device initialization. Otherwise there is a NULL pointer
dereference during mlx4_core init, calltrace:
->device_add
if (dev->class) {
deref dev->class->p =>NULLPTR
#Config
CONFIG_NET_DEVLINK=y
CONFIG_MAY_USE_DEVLINK=y
CONFIG_MLX4_EN=y
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch limits the initial value for PSN to 24 bits as
spec requires.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Established CM event is sent prior to modifying QP to RTS state.
This can result in application closing the connection before the
QP is actually in RTS state. Move sending of established CM
event to after modify QP to RTS.
Fixes: f27b4746f378 ("i40iw: add connection management code")
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For loopback, a MPA request event is generated when cm_node
is initialized, which allows applications to act on the
connect request before i40iw_connect() has completed.
In some cases, the reject flow executes in parallel with
the connect flow and doesn't delete an APBVT entry,
because the apbvt_set variable is still not set by the
connect flow. Move the MPA request event to the end of
i40iw_connect() to notify application for a connect
request, after connect has completed.
Fixes: f27b4746f378 ("i40iw: add connection management code")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARP table entry indexes are aliased to 12bits
instead of the intended 16bits when uploaded to
the QP Context. This will present an issue when the
number of connections exceeds 4096 as ARP entries are
reused. Fix this by adjusting the mask to account for
the full 16bits.
Fixes: 4e9042e647ff ("i40iw: add hw and utils files")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
| |
When the event type is I40IW_TIMER_TYPE_CLOSE, there is no sqbuf and
it should not be freed as one in i40iw_schedule_cm_timer().
Fixes: f27b4746f378 ("i40iw: add connection management code")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is only one sdbuf per Control QP (CQP) for
programming Segment Descriptor (SD). If multiple SD work
requests are posted simultaneously, the sdbuf is reused
by all WQEs and new WQEs can corrupt previous WQEs sdbuf
leading to incorrect SD programming.
Fix this by allocating one sdbuf per CQP SQ WQE. When an
SD command is posted, it will use the corresponding sdbuf
for the WQE.
Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls")
Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If NO_DMA=y:
ERROR: "bad_dma_ops" [net/sunrpc/xprtrdma/rpcrdma.ko] undefined!
ERROR: "bad_dma_ops" [net/smc/smc.ko] undefined!
ERROR: "bad_dma_ops" [net/rds/rds_rdma.ko] undefined!
ERROR: "bad_dma_ops" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "bad_dma_ops" [drivers/nvme/target/nvmet-rdma.ko] undefined!
ERROR: "bad_dma_ops" [drivers/nvme/host/nvme-rdma.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srpt/ib_srpt.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srp/ib_srp.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/ulp/isert/ib_isert.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/ulp/ipoib/ib_ipoib.ko] undefined!
ERROR: "bad_dma_ops" [drivers/infiniband/core/ib_core.ko] undefined!
Before, this was handled implicitly by the dependency on PCI.
Add an explicit dependency on HAS_DMA to fix this.
Fixes: 931bc0d91639f8fb ("IB: Move PCI dependency from root KConfig to HW's KConfigs")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible the bth1 variable could be used uninitialized so going
ahead and giving it a default value.
Otherwise we leak stack memory to the network.
Fixes: 5b6cabb0db77 ("IB/hfi1: Add 16B RC/UC support")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull ARM fixes from Russell King:
- LPAE fixes for kernel-readonly regions
- Fix for get_user_pages_fast on LPAE systems
- avoid tying decompressor to a particular platform if DEBUG_LL is
enabled
- BUG if we attempt to return to userspace but the to-be-restored PSR
value keeps us in privileged mode (defeating an issue that ftracetest
found)
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: BUG if jumping to usermode address in kernel mode
ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
ARM: make decompressor debug output user selectable
ARM: fix get_user_pages_fast
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Detect if we are returning to usermode via the normal kernel exit paths
but the saved PSR value indicates that we are in kernel mode. This
could occur due to corrupted stack state, which has been observed with
"ftracetest".
This ensures that we catch the problem case before we get to user code.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, for ARM kernels with CONFIG_ARM_LPAE and
CONFIG_STRICT_KERNEL_RWX enabled, the 2MiB pages mapping the
kernel code and rodata are writable. They are marked read-only in
a software bit (L_PMD_SECT_RDONLY) but the hardware read-only bit
is not set (PMD_SECT_AP2).
For user mappings, the logic that propagates the software bit
to the hardware bit is in set_pmd_at(); but for the kernel,
section_update() writes the PMDs directly, skipping this logic.
The fix is to set PMD_SECT_AP2 for read-only sections in
section_update(), at the same time as L_PMD_SECT_RDONLY.
Fixes: 1e3479225acb ("ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error")
Signed-off-by: Philip Derrin <philip@cog.systems>
Reported-by: Neil Dick <neil@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Tested-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When CONFIG_ARM_LPAE is set, the PMD dump relies on the software
read-only bit to determine whether a page is writable. This
concealed a bug which left the kernel text section writable
(AP2=0) while marked read-only in the software bit.
In a kernel with the AP2 bug, the dump looks like this:
---[ Kernel Mapping ]---
0xc0000000-0xc0200000 2M RW NX SHD
0xc0200000-0xc0600000 4M ro x SHD
0xc0600000-0xc0800000 2M ro NX SHD
0xc0800000-0xc4800000 64M RW NX SHD
The fix is to check that the software and hardware bits are both
set before displaying "ro". The dump then shows the true perms:
---[ Kernel Mapping ]---
0xc0000000-0xc0200000 2M RW NX SHD
0xc0200000-0xc0600000 4M RW x SHD
0xc0600000-0xc0800000 2M RW NX SHD
0xc0800000-0xc4800000 64M RW NX SHD
Fixes: ded947798469 ("ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE")
Signed-off-by: Philip Derrin <philip@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make the decompressor debug output user selectable, otherwise merely
enabling DEBUG_LL causes the decompressor to become board specific,
thereby preventing a multi-platform kernel from booting. Enabling
DEBUG_LL doesn't cause the kernel itself to become platform specific
unless EARLY_PRINTK is enabled, or one of the debugging routines is
added in a path that results in it being called.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Ensure that get_user_pages_fast() is not able to access memory which
has been mapped with PROT_NONE.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Glexiner:
- unbreak the irq trigger type check for legacy platforms
- a handful fixes for ARM GIC v3/4 interrupt controllers
- a few trivial fixes all over the place
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/matrix: Make - vs ?: Precedence explicit
irqchip/imgpdc: Use resource_size function on resource object
irqchip/qcom: Fix u32 comparison with value less than zero
irqchip/exiu: Fix return value check in exiu_init()
irqchip/gic-v3-its: Remove artificial dependency on PCI
irqchip/gic-v4: Add forward definition of struct irq_domain_ops
irqchip/gic-v3: pr_err() strings should end with newlines
irqchip/s3c24xx: pr_err() strings should end with newlines
irqchip/gic-v3: Fix ppi-partitions lookup
irqchip/gic-v4: Clear IRQ_DISABLE_UNLAZY again if mapping fails
genirq: Track whether the trigger type has been set
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Noticed with a Clang build. This improves the readability of the ?:
expression, as it has lower precedence than the - expression. Show
explicitly that - is evaluated first.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171122205645.GA27125@beast
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
drivers/irqchip/irq-imgpdc.c:327:20-23: WARNING: Suspicious code.
resource_size is maybe missing with res_regs
Generated by: scripts/coccinelle/api/resource_size.cocci
Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: marc.zyngier@arm.com
Cc: jason@lakedaemon.net
Link: https://lkml.kernel.org/r/1511215361-8279-1-git-send-email-gomonovych@gmail.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The comparison of u32 nregs being less than zero is never true since
nregs is unsigned. Fix this by making nregs a signed integer.
Fixes: f20cc9b00c7b ("irqchip/qcom: Add IRQ combiner driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kernel-janitors@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/20171117183553.2739-1-colin.king@canonical.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In case of error, the function of_iomap() returns NULL pointer not
ERR_PTR().
Replace the IS_ERR() test of the return value with NULL test and return
a proper error code.
Fixes: 706cffc1b912 ("irqchip/exiu: Add support for Socionext Synquacer EXIU controller")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/1510642648-123574-1-git-send-email-weiyongjun1@huawei.com
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip updates for 4.15, take #4 from Marc Zyngier
- A core irq fix for legacy cases where the irq trigger is not reported
by firmware
- A couple of GICv3/4 fixes (Kconfig, of-node refcount, error handling)
- Trivial pr_err fixes
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The GICv3 ITS doesn't really depend on PCI. Only the PCI/MSI
part of it does, and there is no reason not to blow away most
of the irqchip stack because PCI is not selected (though not
selecting PCI seem to be asking for punishment, but hey...).
So let's split the PCI-specific part from the ITS in the Kconfig
file, and let's make that part depend on PCI. Architecture specific
hacks (arch/arm{,64}/Kconfig) will be addressed in a separate patch.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In some randconfig scenarios, including arm-gic-v4.h results
in a spurious wawrning about the $SUBJECT structure not being
defined. Adding a forward definition keeps it quiet.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
pr_err() messages should end with a new-line to avoid other messages
being concatenated.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
pr_err() messages should end with a new-line to avoid other messages
being concatenated.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix child-node lookup during initialisation, which ended up searching
the whole device tree depth-first starting at the parent rather than
just matching on its children.
To make things worse, the parent gic node was prematurely freed, while
the ppi-partitions node was leaked.
Fixes: e3825ba1af3a ("irqchip/gic-v3: Add support for partitioned PPIs")
Cc: stable <stable@vger.kernel.org> # 4.7
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Should the call to irq_set_vcpu_affinity() fail at map time,
we should restore the normal lazy-disable behaviour instead
of staying with the eager disable that GICv4 requires.
Reported-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When requesting a shared interrupt, we assume that the firmware
support code (DT or ACPI) has called irqd_set_trigger_type
already, so that we can retrieve it and check that the requester
is being reasonnable.
Unfortunately, we still have non-DT, non-ACPI systems around,
and these guys won't call irqd_set_trigger_type before requesting
the interrupt. The consequence is that we fail the request that
would have worked before.
We can either chase all these use cases (boring), or address it
in core code (easier). Let's have a per-irq_desc flag that
indicates whether irqd_set_trigger_type has been called, and
let's just check it when checking for a shared interrupt.
If it hasn't been set, just take whatever the interrupt
requester asks.
Fixes: 382bd4de6182 ("genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs")
Cc: stable@vger.kernel.org
Reported-and-tested-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- topology enumeration fixes
- KASAN fix
- two entry fixes (not yet the big series related to KASLR)
- remove obsolete code
- instruction decoder fix
- better /dev/mem sanity checks, hopefully working better this time
- pkeys fixes
- two ACPI fixes
- 5-level paging related fixes
- UMIP fixes that should make application visible faults more debuggable
- boot fix for weird virtualization environment
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/decoder: Add new TEST instruction pattern
x86/PCI: Remove unused HyperTransport interrupt support
x86/umip: Fix insn_get_code_seg_params()'s return value
x86/boot/KASLR: Remove unused variable
x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
x86/pkeys/selftests: Fix protection keys write() warning
x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
x86/mpx/selftests: Fix up weird arrays
x86/pkeys: Update documentation about availability
x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
x86/smpboot: Fix __max_logical_packages estimate
x86/topology: Avoid wasting 128k for package id array
perf/x86/intel/uncore: Cache logical pkg id in uncore driver
x86/acpi: Reduce code duplication in mp_override_legacy_irq()
x86/acpi: Handle SCI interrupts above legacy space gracefully
x86/boot: Fix boot failure when SMP MP-table is based at 0
x86/mm: Limit mmap() of /dev/mem to valid physical addresses
x86/selftests: Add test for mapping placement for 5-level paging
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The kbuild test robot reported this build warning:
Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c
Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx)
Warning: objdump says 3 bytes, but insn_get_length() says 2
Warning: decoded and checked 1569014 instructions with 1 warnings
This sequence seems to be a new instruction not in the opcode map in the Intel SDM.
The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8.
Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of
the ModR/M Byte (bits 2,1,0 in parenthesis)"
In that table, opcodes listed by the index REG bits as:
000 001 010 011 100 101 110 111
TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX
So, it seems TEST Ib is assigned to 001.
Add the new pattern.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
There are no in-tree callers of ht_create_irq(), the driver interface for
HyperTransport interrupts, left. Remove the unused entry point and all the
supporting code.
See 8b955b0dddb3 ("[PATCH] Initial generic hypertransport interrupt
support").
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pci@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: https://lkml.kernel.org/r/20171122221337.3877.23362.stgit@bhelgaas-glaptop.roam.corp.google.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In order to save on redundant structs definitions
insn_get_code_seg_params() was made to return two 4-bit values in a char
but clang complains:
arch/x86/lib/insn-eval.c:780:10: warning: implicit conversion from 'int' to 'char'
changes value from 132 to -124 [-Wconstant-conversion]
return INSN_CODE_SEG_PARAMS(4, 8);
~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~
./arch/x86/include/asm/insn-eval.h:16:57: note: expanded from macro 'INSN_CODE_SEG_PARAMS'
#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
Those two values do get picked apart afterwards the opposite way of how
they were ORed so wrt to the LSByte, the return value is the same.
But this function returns -EINVAL in the error case, which is an int. So
make it return an int which is the native word size anyway and thus fix
the clang warning.
Reported-by: Kees Cook <keescook@google.com>
Reported-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ricardo.neri-calderon@linux.intel.com
Link: https://lkml.kernel.org/r/20171123091951.1462-1-bp@alien8.de
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
There are two variables "rc" in mem_avoid_memmap. One at the top of the
function and another one inside the while() loop. Drop the outer one as it
is unused. Cleanup some whitespace damage while at it.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gregkh@linuxfoundation.org
Cc: n-horiguchi@ah.jp.nec.com
Cc: keescook@chromium.org
Link: https://lkml.kernel.org/r/20171123090847.15293-1-fanc.fnst@cn.fujitsu.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Running this code with IRQs enabled (where dummy_lock is a spinlock):
static void check_load_gs_index(void)
{
/* This will fail. */
load_gs_index(0xffff);
spin_lock(&dummy_lock);
spin_unlock(&dummy_lock);
}
Will generate a lockdep warning. The issue is that the actual write
to %gs would cause an exception with IRQs disabled, and the exception
handler would, as an inadvertent side effect, update irqflag tracing
to reflect the IRQs-off status. native_load_gs_index() would then
turn IRQs back on and return with irqflag tracing still thinking that
IRQs were off. The dummy lock-and-unlock causes lockdep to notice the
error and warn.
Fix it by adding the missing tracing.
Apparently nothing did this in a context where it mattered. I haven't
tried to find a code path that would actually exhibit the warning if
appropriately nasty user code were running.
I suspect that the security impact of this bug is very, very low --
production systems don't run with lockdep enabled, and the warning is
mostly harmless anyway.
Found during a quick audit of the entry code to try to track down an
unrelated bug that Ingo found in some still-in-development code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
[ Note, this commit is a cherry-picked version of:
d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
... for easier x86 entry code testing and back-porting. ]
The KASAN shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt. However,
since that no longer zeroes the mapped pages, it is not suitable for
KASAN, which requires zeroed shadow memory.
Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate(). Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.
Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
before it. This means that users of entry_SYSCALL_64_after_hwframe()
were responsible for invoking TRACE_IRQS_OFF, and the one and only
user (Xen, added in the same commit) got it wrong.
I think this would manifest as a warning if a Xen PV guest with
CONFIG_DEBUG_LOCKDEP=y were used with context tracking. (The
context tracking bit is to cause lockdep to get invoked before we
turn IRQs back on.) I haven't tested that for real yet because I
can't get a kernel configured like that to boot at all on Xen PV.
Move TRACE_IRQS_OFF below the label.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 8a9949bc71a7 ("x86/xen/64: Rearrange the SYSCALL entries")
Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
write() is marked as having a must-check return value. Check it and
abort if we fail to write an error message from a signal handler.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171111001232.94813E58@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
'si_pkey' is now #defined to be the name of the new siginfo field that
protection keys uses. Rename it not to conflict.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171111001231.DFFC8285@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The MPX hardware data structurse are defined in a weird way: they define
their size in bytes and then union that with the type with which we want
to access them.
Yes, this is weird, but it does work. But, new GCC's complain that we
are accessing the array out of bounds. Just make it a zero-sized array
so gcc will stop complaining. There was not really a bug here.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171111001229.58A7933D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Now that CPUs that implement Memory Protection Keys are publicly
available we can be a bit less oblique about where it is available.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171111001228.DC748A10@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
used
Print a rate-limited warning when a user-space program attempts to execute
any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR
and SMSW).
This is useful, because when CONFIG_X86_INTEL_UMIP=y is selected and
supported by the hardware, user space programs that try to execute such
instructions will receive a SIGSEGV signal that they might not expect.
In the specific cases for which emulation is provided (instructions SGDT,
SIDT and SMSW in protected and virtual-8086 modes), no signal is
generated. However, a warning is helpful to encourage updates in such
programs to avoid the use of such instructions.
Warnings are printed via a customized printk() function that also provides
information about the program that attempted to use the affected
instructions.
Utility macros are defined to wrap umip_printk() for the error and warning
kernel log levels.
While here, replace an existing call to the generic rate-limited pr_err()
with the new umip_pr_err().
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1511233476-17088-1-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.
This occurs when the total number of active cores across all packages is
less than the maximum core count for a single package. e.g.:
On a 4 package system with 20 cores/package where only 4 cores are
enabled on each package, the value of __max_logical_packages is
calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
Calculate __max_logical_packages after the cpu enumeration has completed.
Use the boot cpu's data to extrapolate the number of packages.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Analyzing large early boot allocations unveiled the logical package id
storage as a prominent memory waste. Since commit 1f12e32f4cd5
("x86/topology: Create logical package id") every 64-bit system allocates a
128k array to convert logical package ids.
This happens because the array is sized for MAX_LOCAL_APIC which is always
32k on 64bit systems, and it needs 4 bytes for each entry.
This is fairly wasteful, especially for the common case of having only one
socket, which uses exactly 4 byte out of 128K.
There is no user of the package id map which is performance critical, so
the lookup is not required to be O(1). Store the logical processor id in
cpu_data and use a loop based lookup.
To keep the mapping stable accross cpu hotplug operations, add a flag to
cpu_data which is set when the CPU is brought up the first time. When the
flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
subsequent bringups.
[ tglx: Rename the flag to 'initialized', use proper pointers instead of
repeated cpu_data(x) evaluation and massage changelog. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
|