| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull kvm fixes from Paolo Bonzini:
"This is a pretty large update. I think it is roughly as big as what I
usually had for the _whole_ rc period.
There are a few bad bugs where the guest can OOPS or crash the host.
We have also started looking at attack models for nested
virtualization; bugs that usually result in the guest ring 0 crashing
itself become more worrisome if you have nested virtualization,
because the nested guest might bring down the non-nested guest as
well. For current uses of nested virtualization these do not really
have a security impact, but you never know and bugs are bugs
nevertheless.
A lot of these bugs are in 3.17 too, resulting in a large number of
stable@ Ccs. I checked that all the patches apply there with no
conflicts"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vfio: fix unregister kvm_device_ops of vfio
KVM: x86: Wrong assertion on paging_tmpl.h
kvm: fix excessive pages un-pinning in kvm_iommu_map error path.
KVM: x86: PREFETCH and HINT_NOP should have SrcMem flag
KVM: x86: Emulator does not decode clflush well
KVM: emulate: avoid accessing NULL ctxt->memopp
KVM: x86: Decoding guest instructions which cross page boundary may fail
kvm: x86: don't kill guest on unknown exit reason
kvm: vmx: handle invvpid vm exit gracefully
KVM: x86: Handle errors when RIP is set during far jumps
KVM: x86: Emulator fixes for eip canonical checks on near branches
KVM: x86: Fix wrong masking on relative jump/call
KVM: x86: Improve thread safety in pit
KVM: x86: Prevent host from panicking on shared MSR writes.
KVM: x86: Check non-canonical addresses upon WRMSR
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Even after the recent fix, the assertion on paging_tmpl.h is triggered.
Apparently, the assertion wants to check that the PAE is always set on
long-mode, but does it in incorrect way. Note that the assertion is not
enabled unless the code is debugged by defining MMU_DEBUG.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The decode phase of the x86 emulator assumes that every instruction with the
ModRM flag, and which can be used with RIP-relative addressing, has either
SrcMem or DstMem. This is not the case for several instructions - prefetch,
hint-nop and clflush.
Adding SrcMem|NoAccess for prefetch and hint-nop and SrcMem for clflush.
This fixes CVE-2014-8480.
Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, all group15 instructions are decoded as clflush (e.g., mfence,
xsave). In addition, the clflush instruction requires no prefix (66/f2/f3)
would exist. If prefix exists it may encode a different instruction (e.g.,
clflushopt).
Creating a group for clflush, and different group for each prefix.
This has been the case forever, but the next patch needs the cflush group
in order to fix a bug introduced in 3.17.
Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A failure to decode the instruction can cause a NULL pointer access.
This is fixed simply by moving the "done" label as close as possible
to the return.
This fixes CVE-2014-8481.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Once an instruction crosses a page boundary, the size read from the second page
disregards the common case that part of the operand resides on the first page.
As a result, fetch of long insturctions may fail, and thereby cause the
decoding to fail as well.
Cc: stable@vger.kernel.org
Fixes: 5cfc7e0f5e5e1adf998df94f8e36edaf5d30d38e
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was
triggered by a priveledged application. Let's not kill the guest: WARN
and inject #UD instead.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On systems with invvpid instruction support (corresponding bit in
IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid
causes vm exit, which is currently not handled and results in
propagation of unknown exit to userspace.
Fix this by installing an invvpid vm exit handler.
This is CVE-2014-3646.
Cc: stable@vger.kernel.org
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Far jmp/call/ret may fault while loading a new RIP. Currently KVM does not
handle this case, and may result in failed vm-entry once the assignment is
done. The tricky part of doing so is that loading the new CS affects the
VMCS/VMCB state, so if we fail during loading the new RIP, we are left in
unconsistent state. Therefore, this patch saves on 64-bit the old CS
descriptor and restores it if loading RIP failed.
This fixes CVE-2014-3647.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Before changing rip (during jmp, call, ret, etc.) the target should be asserted
to be canonical one, as real CPUs do. During sysret, both target rsp and rip
should be canonical. If any of these values is noncanonical, a #GP exception
should occur. The exception to this rule are syscall and sysenter instructions
in which the assigned rip is checked during the assignment to the relevant
MSRs.
This patch fixes the emulator to behave as real CPUs do for near branches.
Far branches are handled by the next patch.
This fixes CVE-2014-3647.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Relative jumps and calls do the masking according to the operand size, and not
according to the address size as the KVM emulator does today.
This patch fixes KVM behavior.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There's a race condition in the PIT emulation code in KVM. In
__kvm_migrate_pit_timer the pit_timer object is accessed without
synchronization. If the race condition occurs at the wrong time this
can crash the host kernel.
This fixes CVE-2014-3611.
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The previous patch blocked invalid writes directly when the MSR
is written. As a precaution, prevent future similar mistakes by
gracefulling handle GPs caused by writes to shared MSRs.
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
[Remove parts obsoleted by Nadav's patch. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is
written to certain MSRs. The behavior is "almost" identical for AMD and Intel
(ignoring MSRs that are not implemented in either architecture since they would
anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if
non-canonical address is written on Intel but not on AMD (which ignores the top
32-bits).
Accordingly, this patch injects a #GP on the MSRs which behave identically on
Intel and AMD. To eliminate the differences between the architecutres, the
value which is written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is turned to
canonical value before writing instead of injecting a #GP.
Some references from Intel and AMD manuals:
According to Intel SDM description of WRMSR instruction #GP is expected on
WRMSR "If the source register contains a non-canonical address and ECX
specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE,
IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP."
According to AMD manual instruction manual:
LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the
LSTAR and CSTAR registers. If an RIP written by WRMSR is not in canonical
form, a general-protection exception (#GP) occurs."
IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the
base field must be in canonical form or a #GP fault will occur."
IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must
be in canonical form."
This patch fixes CVE-2014-3610.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix regression in xen_clocksource_read() which caused all Xen guests
to crash early in boot.
- Several fixes for super rare race conditions in the p2m.
- Assorted other minor fixes.
* tag 'stable/for-linus-3.18-b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Allocate memory for physdev_pci_device_add's optarr
x86/xen: panic on bad Xen-provided memory map
x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read()
x86/xen: avoid race in p2m handling
x86/xen: delay construction of mfn_list_list
x86/xen: avoid writing to freed memory after race in p2m handling
xen/balloon: Don't continue ballooning when BP_ECANCELED is encountered
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Panic if Xen provides a memory map with 0 entries. Although this is
unlikely, it is better to catch the error at the point of seeing the map
than later on as a symptom of some other crash.
Signed-off-by: Martin Kelly <martkell@amazon.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 89cbc76768c2 ("x86: Replace __get_cpu_var uses") replaced
__get_cpu_var() with this_cpu_ptr() in xen_clocksource_read() in such a
way that instead of accessing a structure pointed to by a per-cpu pointer
we are trying to get to a per-cpu structure.
__this_cpu_read() of the pointer is the more appropriate accessor.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When a new p2m leaf is allocated this leaf is linked into the p2m tree
via cmpxchg. Unfortunately the compare value for checking the success
of the update is read after checking for the need of a new leaf. It is
possible that a new leaf has been linked into the tree concurrently
in between. This could lead to a leaked memory page and to the loss of
some p2m entries.
Avoid the race by using the read compare value for checking the need
of a new p2m leaf and use ACCESS_ONCE() to get it.
There are other places which seem to need ACCESS_ONCE() to ensure
proper operation. Change them accordingly.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The 3 level p2m tree for the Xen tools is constructed very early at
boot by calling xen_build_mfn_list_list(). Memory needed for this tree
is allocated via extend_brk().
As this tree (other than the kernel internal p2m tree) is only needed
for domain save/restore, live migration and crash dump analysis it
doesn't matter whether it is constructed very early or just some
milliseconds later when memory allocation is possible by other means.
This patch moves the call of xen_build_mfn_list_list() just after
calling xen_pagetable_p2m_copy() simplifying this function, too, as it
doesn't have to bother with two parallel trees now. The same applies
for some other internal functions.
While simplifying code, make early_can_reuse_p2m_middle() static and
drop the unused second parameter. p2m_mid_identity_mfn can be removed
as well, it isn't used either.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In case a race was detected during allocation of a new p2m tree
element in alloc_p2m() the new allocated mid_mfn page is freed without
updating the pointer to the found value in the tree. This will result
in overwriting the just freed page with the mfn of the p2m leaf.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull weak function declaration removal from Bjorn Helgaas:
"The "weak" attribute is commonly used for the default version of a
function, where an architecture can override it by providing a strong
version.
Some header file declarations included the "weak" attribute. That's
error-prone because it causes every implementation to be weak, with no
strong version at all, and the linker chooses one based on link order.
What we want is the "weak" attribute only on the *definition* of the
default implementation. These changes remove "weak" from the
declarations, leaving it on the default definitions"
* tag 'remove-weak-declarations' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
uprobes: Remove "weak" from function declarations
memory-hotplug: Remove "weak" from memory_block_size_bytes() declaration
kgdb: Remove "weak" from kgdb_arch_pc() declaration
ARC: kgdb: generic kgdb_arch_pc() suffices
vmcore: Remove "weak" from function declarations
clocksource: Remove "weak" from clocksource_default_clock() declaration
x86, intel-mid: Remove "weak" from function declarations
audit: Remove "weak" from audit_classify_compat_syscall() declaration
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For the following interfaces:
get_penwell_ops()
get_cloverview_ops()
get_tangier_ops()
there is only one implementation, so they do not need to be marked "weak".
Remove the "weak" attribute from their declarations.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
CC: David Cohen <david.a.cohen@linux.intel.com>
CC: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
CC: x86@kernel.org
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI updates from Peter Anvin:
"This patchset falls under the "maintainers that grovel" clause in the
v3.18-rc1 announcement. We had intended to push it late in the merge
window since we got it into the -tip tree relatively late.
Many of these are relatively simple things, but there are a couple of
key bits, especially Ard's and Matt's patches"
* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
rtc: Disable EFI rtc for x86
efi: rtc-efi: Export platform:rtc-efi as module alias
efi: Delete the in_nmi() conditional runtime locking
efi: Provide a non-blocking SetVariable() operation
x86/efi: Adding efi_printks on memory allocationa and pci.reads
x86/efi: Mark initialization code as such
x86/efi: Update comment regarding required phys mapped EFI services
x86/efi: Unexport add_efi_memmap variable
x86/efi: Remove unused efi_call* macros
efi: Resolve some shadow warnings
arm64: efi: Format EFI memory type & attrs with efi_md_typeattr_format()
ia64: efi: Format EFI memory type & attrs with efi_md_typeattr_format()
x86: efi: Format EFI memory type & attrs with efi_md_typeattr_format()
efi: Introduce efi_md_typeattr_format()
efi: Add macro for EFI_MEMORY_UCE memory attribute
x86/efi: Clear EFI_RUNTIME_SERVICES if failing to enter virtual mode
arm64/efi: Do not enter virtual mode if booting with efi=noruntime or noefi
arm64/efi: uefi_init error handling fix
efi: Add kernel param efi=noruntime
lib: Add a generic cmdline parse function parse_option_str
...
|
| |\
| | |
| | |
| | |
| | | |
Conflicts:
arch/x86/boot/compressed/eboot.c
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
commit 5dc3826d9f08 ("efi: Implement mandatory locking for UEFI Runtime
Services") implemented some conditional locking when accessing variable
runtime services that Ingo described as "pretty disgusting".
The intention with the !efi_in_nmi() checks was to avoid live-locks when
trying to write pstore crash data into an EFI variable. Such lockless
accesses are allowed according to the UEFI specification when we're in a
"non-recoverable" state, but whether or not things are implemented
correctly in actual firmware implementations remains an unanswered
question, and so it would seem sensible to avoid doing any kind of
unsynchronized variable accesses.
Furthermore, the efi_in_nmi() tests are inadequate because they don't
account for the case where we call EFI variable services from panic or
oops callbacks and aren't executing in NMI context. In other words,
live-locking is still possible.
Let's just remove the conditional locking altogether. Now we've got the
->set_variable_nonblocking() EFI variable operation we can abort if the
runtime lock is already held. Aborting is by far the safest option.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
All other calls to allocate memory seem to make some noise already, with the
exception of two calls (for gop, uga) in the setup_graphics path.
The purpose is to be noisy on worrysome errors immediately.
commit fb86b2440de0 ("x86/efi: Add better error logging to EFI boot
stub") introduces printing false alarms for lots of hardware. Rather
than playing Whack a Mole with non-fatal exit conditions, try the other
way round.
This is per Matt Fleming's suggestion:
> Where I think we could improve things
> is by adding efi_printk() message in certain error paths. Clearly, not
> all error paths need such messages, e.g. the EFI_INVALID_PARAMETER path
> you highlighted above, but it makes sense for memory allocation and PCI
> read failures.
Link: http://article.gmane.org/gmane.linux.kernel.efi/4628
Signed-off-by: Andre Müller <andre.muller@web.de>
Cc: Ulf Winkelvos <ulf@winkelvos.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The 32 bit and 64 bit implementations differ in their __init annotations
for some functions referenced from the common EFI code. Namely, the 32
bit variant is missing some of the __init annotations the 64 bit variant
has.
To solve the colliding annotations, mark the corresponding functions in
efi_32.c as initialization code, too -- as it is such.
Actually, quite a few more functions are only used during initialization
and therefore can be marked __init. They are therefore annotated, too.
Also add the __init annotation to the prototypes in the efi.h header so
users of those functions will see it's meant as initialization code
only.
This patch also fixes the "prelog" typo. ("prologue" / "epilogue" might
be more appropriate but this is C code after all, not an opera! :D)
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 3f4a7836e331 ("x86/efi: Rip out phys_efi_get_time()") left
set_virtual_address_map as the only runtime service needed with a
phys mapping but missed to update the preceding comment. Fix that.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This variable was accidentally exported, even though it's only used in
this compilation unit and only during initialization.
Remove the bogus export, make the variable static instead and mark it
as __initdata.
Fixes: 200001eb140e ("x86 boot: only pick up additional EFI memmap...")
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Complement commit 62fa6e69a436 ("x86/efi: Delete most of the efi_call*
macros") and delete the stub macros for the !CONFIG_EFI case, too. In
fact, there are no EFI calls in this case so we don't need a dummy for
efi_call() even.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
An example log excerpt demonstrating the change:
Before the patch:
> efi: mem00: type=7, attr=0xf, range=[0x0000000000000000-0x000000000009f000) (0MB)
> efi: mem01: type=2, attr=0xf, range=[0x000000000009f000-0x00000000000a0000) (0MB)
> efi: mem02: type=7, attr=0xf, range=[0x0000000000100000-0x0000000000400000) (3MB)
> efi: mem03: type=2, attr=0xf, range=[0x0000000000400000-0x0000000000800000) (4MB)
> efi: mem04: type=10, attr=0xf, range=[0x0000000000800000-0x0000000000808000) (0MB)
> efi: mem05: type=7, attr=0xf, range=[0x0000000000808000-0x0000000000810000) (0MB)
> efi: mem06: type=10, attr=0xf, range=[0x0000000000810000-0x0000000000900000) (0MB)
> efi: mem07: type=4, attr=0xf, range=[0x0000000000900000-0x0000000001100000) (8MB)
> efi: mem08: type=7, attr=0xf, range=[0x0000000001100000-0x0000000001400000) (3MB)
> efi: mem09: type=2, attr=0xf, range=[0x0000000001400000-0x0000000002613000) (18MB)
> efi: mem10: type=7, attr=0xf, range=[0x0000000002613000-0x0000000004000000) (25MB)
> efi: mem11: type=4, attr=0xf, range=[0x0000000004000000-0x0000000004020000) (0MB)
> efi: mem12: type=7, attr=0xf, range=[0x0000000004020000-0x00000000068ea000) (40MB)
> efi: mem13: type=2, attr=0xf, range=[0x00000000068ea000-0x00000000068f0000) (0MB)
> efi: mem14: type=3, attr=0xf, range=[0x00000000068f0000-0x0000000006c7b000) (3MB)
> efi: mem15: type=6, attr=0x800000000000000f, range=[0x0000000006c7b000-0x0000000006c7d000) (0MB)
> efi: mem16: type=5, attr=0x800000000000000f, range=[0x0000000006c7d000-0x0000000006c85000) (0MB)
> efi: mem17: type=6, attr=0x800000000000000f, range=[0x0000000006c85000-0x0000000006c87000) (0MB)
> efi: mem18: type=3, attr=0xf, range=[0x0000000006c87000-0x0000000006ca3000) (0MB)
> efi: mem19: type=6, attr=0x800000000000000f, range=[0x0000000006ca3000-0x0000000006ca6000) (0MB)
> efi: mem20: type=10, attr=0xf, range=[0x0000000006ca6000-0x0000000006cc6000) (0MB)
> efi: mem21: type=6, attr=0x800000000000000f, range=[0x0000000006cc6000-0x0000000006d95000) (0MB)
> efi: mem22: type=5, attr=0x800000000000000f, range=[0x0000000006d95000-0x0000000006e22000) (0MB)
> efi: mem23: type=7, attr=0xf, range=[0x0000000006e22000-0x0000000007165000) (3MB)
> efi: mem24: type=4, attr=0xf, range=[0x0000000007165000-0x0000000007d22000) (11MB)
> efi: mem25: type=7, attr=0xf, range=[0x0000000007d22000-0x0000000007d25000) (0MB)
> efi: mem26: type=3, attr=0xf, range=[0x0000000007d25000-0x0000000007ea2000) (1MB)
> efi: mem27: type=5, attr=0x800000000000000f, range=[0x0000000007ea2000-0x0000000007ed2000) (0MB)
> efi: mem28: type=6, attr=0x800000000000000f, range=[0x0000000007ed2000-0x0000000007ef6000) (0MB)
> efi: mem29: type=7, attr=0xf, range=[0x0000000007ef6000-0x0000000007f00000) (0MB)
> efi: mem30: type=9, attr=0xf, range=[0x0000000007f00000-0x0000000007f02000) (0MB)
> efi: mem31: type=10, attr=0xf, range=[0x0000000007f02000-0x0000000007f06000) (0MB)
> efi: mem32: type=4, attr=0xf, range=[0x0000000007f06000-0x0000000007fd0000) (0MB)
> efi: mem33: type=6, attr=0x800000000000000f, range=[0x0000000007fd0000-0x0000000007ff0000) (0MB)
> efi: mem34: type=7, attr=0xf, range=[0x0000000007ff0000-0x0000000008000000) (0MB)
After the patch:
> efi: mem00: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x000000000009f000) (0MB)
> efi: mem01: [Loader Data | | | | | |WB|WT|WC|UC] range=[0x000000000009f000-0x00000000000a0000) (0MB)
> efi: mem02: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000000100000-0x0000000000400000) (3MB)
> efi: mem03: [Loader Data | | | | | |WB|WT|WC|UC] range=[0x0000000000400000-0x0000000000800000) (4MB)
> efi: mem04: [ACPI Memory NVS | | | | | |WB|WT|WC|UC] range=[0x0000000000800000-0x0000000000808000) (0MB)
> efi: mem05: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000000808000-0x0000000000810000) (0MB)
> efi: mem06: [ACPI Memory NVS | | | | | |WB|WT|WC|UC] range=[0x0000000000810000-0x0000000000900000) (0MB)
> efi: mem07: [Boot Data | | | | | |WB|WT|WC|UC] range=[0x0000000000900000-0x0000000001100000) (8MB)
> efi: mem08: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000001100000-0x0000000001400000) (3MB)
> efi: mem09: [Loader Data | | | | | |WB|WT|WC|UC] range=[0x0000000001400000-0x0000000002613000) (18MB)
> efi: mem10: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000002613000-0x0000000004000000) (25MB)
> efi: mem11: [Boot Data | | | | | |WB|WT|WC|UC] range=[0x0000000004000000-0x0000000004020000) (0MB)
> efi: mem12: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000004020000-0x00000000068ea000) (40MB)
> efi: mem13: [Loader Data | | | | | |WB|WT|WC|UC] range=[0x00000000068ea000-0x00000000068f0000) (0MB)
> efi: mem14: [Boot Code | | | | | |WB|WT|WC|UC] range=[0x00000000068f0000-0x0000000006c7b000) (3MB)
> efi: mem15: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006c7b000-0x0000000006c7d000) (0MB)
> efi: mem16: [Runtime Code |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006c7d000-0x0000000006c85000) (0MB)
> efi: mem17: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006c85000-0x0000000006c87000) (0MB)
> efi: mem18: [Boot Code | | | | | |WB|WT|WC|UC] range=[0x0000000006c87000-0x0000000006ca3000) (0MB)
> efi: mem19: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006ca3000-0x0000000006ca6000) (0MB)
> efi: mem20: [ACPI Memory NVS | | | | | |WB|WT|WC|UC] range=[0x0000000006ca6000-0x0000000006cc6000) (0MB)
> efi: mem21: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006cc6000-0x0000000006d95000) (0MB)
> efi: mem22: [Runtime Code |RUN| | | | |WB|WT|WC|UC] range=[0x0000000006d95000-0x0000000006e22000) (0MB)
> efi: mem23: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000006e22000-0x0000000007165000) (3MB)
> efi: mem24: [Boot Data | | | | | |WB|WT|WC|UC] range=[0x0000000007165000-0x0000000007d22000) (11MB)
> efi: mem25: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000007d22000-0x0000000007d25000) (0MB)
> efi: mem26: [Boot Code | | | | | |WB|WT|WC|UC] range=[0x0000000007d25000-0x0000000007ea2000) (1MB)
> efi: mem27: [Runtime Code |RUN| | | | |WB|WT|WC|UC] range=[0x0000000007ea2000-0x0000000007ed2000) (0MB)
> efi: mem28: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000007ed2000-0x0000000007ef6000) (0MB)
> efi: mem29: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000007ef6000-0x0000000007f00000) (0MB)
> efi: mem30: [ACPI Reclaim Memory| | | | | |WB|WT|WC|UC] range=[0x0000000007f00000-0x0000000007f02000) (0MB)
> efi: mem31: [ACPI Memory NVS | | | | | |WB|WT|WC|UC] range=[0x0000000007f02000-0x0000000007f06000) (0MB)
> efi: mem32: [Boot Data | | | | | |WB|WT|WC|UC] range=[0x0000000007f06000-0x0000000007fd0000) (0MB)
> efi: mem33: [Runtime Data |RUN| | | | |WB|WT|WC|UC] range=[0x0000000007fd0000-0x0000000007ff0000) (0MB)
> efi: mem34: [Conventional Memory| | | | | |WB|WT|WC|UC] range=[0x0000000007ff0000-0x0000000008000000) (0MB)
Both the type enum and the attribute bitmap are decoded, with the
additional benefit that the memory ranges line up as well.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If enter virtual mode failed due to some reason other than the efi call
the EFI_RUNTIME_SERVICES bit in efi.flags should be cleared thus users
of efi runtime services can check the bit and handle the case instead of
assume efi runtime is ok.
Per Matt, if efi call SetVirtualAddressMap fails we will be not sure
it's safe to make any assumptions about the state of the system. So
kernel panics instead of clears EFI_RUNTIME_SERVICES bit.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
noefi kernel param means actually disabling efi runtime, Per suggestion
from Leif Lindholm efi=noruntime should be better. But since noefi is
already used in X86 thus just adding another param efi=noruntime for
same purpose.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There should be a generic function to parse params like a=b,c
Adding parse_option_str in lib/cmdline.c which will return true
if there's specified option set in the params.
Also updated efi=old_map parsing code to use the new function
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
noefi param can be used for arches other than X86 later, thus move it
out of x86 platform code.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Gracefully handle failures to allocate memory for the image, which might
be arbitrarily large.
efi_bgrt_init can fail in various ways as well, usually because the
BIOS-provided BGRT structure does not match expectations. Add
appropriate error messages rather than failing silently.
Reported-by: Srihari Vijayaraghavan <linux.bug.reporting@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81321
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need a way to customize the behaviour of the EFI boot stub, in
particular, we need a way to disable the "chunking" workaround, used
when reading files from the EFI System Partition.
One of my machines doesn't cope well when reading files in 1MB chunks to
a buffer above the 4GB mark - it appears that the "chunking" bug
workaround triggers another firmware bug. This was only discovered with
commit 4bf7111f5016 ("x86/efi: Support initrd loaded above 4G"), and
that commit is perfectly valid. The symptom I observed was a corrupt
initrd rather than any kind of crash.
efi= is now used to specify EFI parameters in two very different
execution environments, the EFI boot stub and during kernel boot.
There is also a slight performance optimization by enabling efi=nochunk,
but that's offset by the fact that you're more likely to run into
firmware issues, at least on x86. This is the rationale behind leaving
the workaround enabled by default.
Also provide some documentation for EFI_READ_CHUNK_SIZE and why we're
using the current value of 1MB.
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Roy Franz <roy.franz@linaro.org>
Cc: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
According to section 7.1 of the UEFI spec, Runtime Services are not fully
reentrant, and there are particular combinations of calls that need to be
serialized. Use a spinlock to serialize all Runtime Services with respect
to all others, even if this is more than strictly needed.
We've managed to get away without requiring a runtime services lock
until now because most of the interactions with EFI involve EFI
variables, and those operations are already serialised with
__efivars->lock.
Some of the assumptions underlying the decision whether locks are
needed or not (e.g., SetVariable() against ResetSystem()) may not
apply universally to all [new] architectures that implement UEFI.
Rather than try to reason our way out of this, let's just implement at
least what the spec requires in terms of locking.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull audit updates from Eric Paris:
"So this change across a whole bunch of arches really solves one basic
problem. We want to audit when seccomp is killing a process. seccomp
hooks in before the audit syscall entry code. audit_syscall_entry
took as an argument the arch of the given syscall. Since the arch is
part of what makes a syscall number meaningful it's an important part
of the record, but it isn't available when seccomp shoots the
syscall...
For most arch's we have a better way to get the arch (syscall_get_arch)
So the solution was two fold: Implement syscall_get_arch() everywhere
there is audit which didn't have it. Use syscall_get_arch() in the
seccomp audit code. Having syscall_get_arch() everywhere meant it was
a useless flag on the stack and we could get rid of it for the typical
syscall entry.
The other changes inside the audit system aren't grand, fixed some
records that had invalid spaces. Better locking around the task comm
field. Removing some dead functions and structs. Make some things
static. Really minor stuff"
* git://git.infradead.org/users/eparis/audit: (31 commits)
audit: rename audit_log_remove_rule to disambiguate for trees
audit: cull redundancy in audit_rule_change
audit: WARN if audit_rule_change called illegally
audit: put rule existence check in canonical order
next: openrisc: Fix build
audit: get comm using lock to avoid race in string printing
audit: remove open_arg() function that is never used
audit: correct AUDIT_GET_FEATURE return message type
audit: set nlmsg_len for multicast messages.
audit: use union for audit_field values since they are mutually exclusive
audit: invalid op= values for rules
audit: use atomic_t to simplify audit_serial()
kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0]
audit: reduce scope of audit_log_fcaps
audit: reduce scope of audit_net_id
audit: arm64: Remove the audit arch argument to audit_syscall_entry
arm64: audit: Add audit hook in syscall_trace_enter/exit()
audit: x86: drop arch from __audit_syscall_entry() interface
sparc: implement is_32bit_task
sparc: properly conditionalize use of TIF_32BIT
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Since the arch is found locally in __audit_syscall_entry(), there is no need to
pass it in as a parameter. Delete it from the parameter list.
x86* was the only arch to call __audit_syscall_entry() directly and did so from
assembly code.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-audit@redhat.com
Signed-off-by: Eric Paris <eparis@redhat.com>
---
As this patch relies on changes in the audit tree, I think it
appropriate to send it through my tree rather than the x86 tree.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We have a function where the arch can be queried, syscall_get_arch().
So rather than have every single piece of arch specific code use and/or
duplicate syscall_get_arch(), just have the audit code use the
syscall_get_arch() code.
Based-on-patch-by: Richard Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: linux-mips@linux-mips.org
Cc: linux@lists.openrisc.net
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: linux-xtensa@linux-xtensa.org
Cc: x86@kernel.org
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch defines syscall_get_arch() for the um platform. It adds a
new syscall.h header file to define this. It copies the HOST_AUDIT_ARCH
definition from ptrace.h. (that definition will be removed when we
switch audit to use this new syscall_get_arch() function)
Based-on-patch-by: Richard Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
"A second (and last) round of late coming fixes and changes, almost all
of them in perf tooling:
User visible tooling changes:
- Add period data column and make it default in 'perf script' (Jiri
Olsa)
- Add a visual cue for toggle zeroing of samples in 'perf top'
(Taeung Song)
- Improve callchains when using libunwind (Namhyung Kim)
Tooling fixes and infrastructure changes:
- Fix for double free in 'perf stat' when using some specific invalid
command line combo (Yasser Shalabi)
- Fix off-by-one bugs in map->end handling (Stephane Eranian)
- Fix off-by-one bug in maps__find(), also related to map->end
handling (Namhyung Kim)
- Make struct symbol->end be the first addr after the symbol range,
to make it match the convention used for struct map->end. (Arnaldo
Carvalho de Melo)
- Fix perf_evlist__add_pollfd() error handling in 'perf kvm stat
live' (Jiri Olsa)
- Fix python test build by moving callchain_param to an object linked
into the python binding (Jiri Olsa)
- Document sysfs events/ interfaces (Cody P Schafer)
- Fix typos in perf/Documentation (Masanari Iida)
- Add missing 'struct option' forward declaration (Arnaldo Carvalho
de Melo)
- Add option to copy events when queuing for sorting across cpu
buffers and enable it for 'perf kvm stat live', to avoid having
events left in the queue pointing to the ring buffer be rewritten
in high volume sessions. (Alexander Yarygin, improving work done
by David Ahern):
- Do not include a struct hists per perf_evsel, untangling the
histogram code from perf_evsel, to pave the way for exporting a
minimalistic tools/lib/api/perf/ library usable by tools/perf and
initially by the rasd daemon being developed by Borislav Petkov,
Robert Richter and Jean Pihet. (Arnaldo Carvalho de Melo)
- Make perf_evlist__open(evlist, NULL, NULL), i.e. without cpu and
thread maps mean syswide monitoring, reducing the boilerplate for
tools that only want system wide mode. (Arnaldo Carvalho de Melo)
- Move exit stuff from perf_evsel__delete to perf_evsel__exit, delete
should be just a front end for exit + free (Arnaldo Carvalho de
Melo)
- Add support to new style format of kernel PMU event. (Kan Liang)
and other misc fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
perf script: Add period as a default output column
perf script: Add period data column
perf evsel: No need to drag util/cgroup.h
perf evlist: Add missing 'struct option' forward declaration
perf evsel: Move exit stuff from __delete to __exit
kprobes/x86: Remove stale ARCH_SUPPORTS_KPROBES_ON_FTRACE define
perf kvm stat live: Enable events copying
perf session: Add option to copy events when queueing
perf Documentation: Fix typos in perf/Documentation
perf trace: Use thread_{,_set}_priv helpers
perf kvm: Use thread_{,_set}_priv helpers
perf callchain: Create an address space per thread
perf report: Set callchain_param.record_mode for future use
perf evlist: Fix for double free in tools/perf stat
perf test: Add test case for pmu event new style format
perf tools: Add support to new style format of kernel PMU event
perf tools: Parse the pmu event prefix and suffix
Revert "perf tools: Default to cpu// for events v5"
perf Documentation: Remove Ruplicated docs for powerpc cpu specific events
perf Documentation: sysfs events/ interfaces
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit e7dbfe349d12 ("kprobes/x86: Move ftrace-based kprobe code
into kprobes-ftrace.c") switched from using
ARCH_SUPPORTS_KPROBES_ON_FTRACE to CONFIG_KPROBES_ON_FTRACE but
missed removing the define.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: masami.hiramatsu.pt@hitachi.com
Cc: ananth@in.ibm.com
Cc: a.p.zijlstra@chello.nl
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Pull MIPS updates from Ralf Baechle:
"This is the MIPS pull request for the next kernel:
- Zubair's patch series adds CMA support for MIPS. Doing so it also
touches ARM64 and x86.
- remove the last instance of IRQF_DISABLED from arch/mips
- updates to two of the MIPS defconfig files.
- cleanup of how cache coherency bits are handled on MIPS and
implement support for write-combining.
- platform upgrades for Alchemy
- move MIPS DTS files to arch/mips/boot/dts/"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (24 commits)
MIPS: ralink: remove deprecated IRQF_DISABLED
MIPS: pgtable.h: Implement the pgprot_writecombine function for MIPS
MIPS: cpu-probe: Set the write-combine CCA value on per core basis
MIPS: pgtable-bits: Define the CCA bit for WC writes on Ingenic cores
MIPS: pgtable-bits: Move the CCA bits out of the core's ifdef blocks
MIPS: DMA: Add cma support
x86: use generic dma-contiguous.h
arm64: use generic dma-contiguous.h
asm-generic: Add dma-contiguous.h
MIPS: BPF: Add new emit_long_instr macro
MIPS: ralink: Move device-trees to arch/mips/boot/dts/
MIPS: Netlogic: Move device-trees to arch/mips/boot/dts/
MIPS: sead3: Move device-trees to arch/mips/boot/dts/
MIPS: Lantiq: Move device-trees to arch/mips/boot/dts/
MIPS: Octeon: Move device-trees to arch/mips/boot/dts/
MIPS: Add support for building device-tree binaries
MIPS: Create common infrastructure for building built-in device-trees
MIPS: SEAD3: Enable DEVTMPFS
MIPS: SEAD3: Regenerate defconfigs
MIPS: Alchemy: DB1300: Add touch penirq support
...
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
dma-contiguous.h is now in asm-generic. Use that to avoid code
repetition in x86.
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: catalin.marinas@arm.com
Cc: will.deacon@arm.com
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: arnd@arndb.de
Cc: gregkh@linuxfoundation.org
Cc: m.szyprowski@samsung.com
Cc: x86@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-arch@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/7359/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
CR4 isn't constant; at least the TSD and PCE bits can vary.
TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.
This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Pull networking fixes from David Miller:
1) Include fixes for netrom and dsa (Fabian Frederick and Florian
Fainelli)
2) Fix FIXED_PHY support in stmmac, from Giuseppe CAVALLARO.
3) Several SKB use after free fixes (vxlan, openvswitch, vxlan,
ip_tunnel, fou), from Li ROngQing.
4) fec driver PTP support fixes from Luwei Zhou and Nimrod Andy.
5) Use after free in virtio_net, from Michael S Tsirkin.
6) Fix flow mask handling for megaflows in openvswitch, from Pravin B
Shelar.
7) ISDN gigaset and capi bug fixes from Tilman Schmidt.
8) Fix route leak in ip_send_unicast_reply(), from Vasily Averin.
9) Fix two eBPF JIT bugs on x86, from Alexei Starovoitov.
10) TCP_SKB_CB() reorganization caused a few regressions, fixed by Cong
Wang and Eric Dumazet.
11) Don't overwrite end of SKB when parsing malformed sctp ASCONF
chunks, from Daniel Borkmann.
12) Don't call sock_kfree_s() with NULL pointers, this function also has
the side effect of adjusting the socket memory usage. From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
r8152: return -EBUSY for runtime suspend
ipv4: fix a potential use after free in fou.c
ipv4: fix a potential use after free in ip_tunnel_core.c
hyperv: Add handling of IP header with option field in netvsc_set_hash()
openvswitch: Create right mask with disabled megaflows
vxlan: fix a free after use
openvswitch: fix a use after free
ipv4: dst_entry leak in ip_send_unicast_reply()
ipv4: clean up cookie_v4_check()
ipv4: share tcp_v4_save_options() with cookie_v4_check()
ipv4: call __ip_options_echo() in cookie_v4_check()
atm: simplify lanai.c by using module_pci_driver
...
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
1.
JIT compiler using multi-pass approach to converge to final image size,
since x86 instructions are variable length. It starts with large
gaps between instructions (so some jumps may use imm32 instead of imm8)
and iterates until total program size is the same as in previous pass.
This algorithm works only if program size is strictly decreasing.
Programs that use LD_ABS insn need additional code in prologue, but it
was not emitted during 1st pass, so there was a chance that 2nd pass would
adjust imm32->imm8 jump offsets to the same number of bytes as increase in
prologue, which may cause algorithm to erroneously decide that size converged.
Fix it by always emitting largest prologue in the first pass which
is detected by oldproglen==0 check.
Also change error check condition 'proglen != oldproglen' to fail gracefully.
2.
while staring at the code realized that 64-byte buffer may not be enough
when 1st insn is large, so increase it to 128 to avoid buffer overflow
(theoretical maximum size of prologue+div is 109) and add runtime check.
Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT")
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
|