| Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
When we copy a user thread with CLONE_VM, we also have to reset
windowbase and windowstart to start a pristine stack frame. Otherwise,
overflows can happen using the address 0 as the stack pointer.
Also add a special case for vfork, which continues on the
parent stack until it calls execve. Because this could be a call8, we
need to spill the stack pointer of the previus frame (if still 'live' in
the register file).
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
Define virt_to_bus and bus_to_virt as virt_to_phys, and phys_to_virt,
respectively.
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
Xtensa implements a method that allows to generate a arbitrary output
for each system call by defining the __SYSCALL(number, function, num_args).
This usually requires to include uapi/unistd.h twice. Instead of removing
the guard agains multiple inclusion entirely, allow to include unistd.h again
only if __SYSCALL is defined. Note that __SYSCALL gets always undefined at
the end of the file.
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
|
|
An interesting effect of using the generic version of linkage.h
is that the padding is defined in terms of x86 NOPs, which can have
even more interesting effects when the assembly code looks like this:
ENTRY(func1)
mov x0, xzr
ENDPROC(func1)
// fall through
ENTRY(func2)
mov x0, #1
ret
ENDPROC(func2)
Admittedly, the code is not very nice. But having code from another
architecture doesn't look completely sane either.
The fix is to add arm64's version of linkage.h, which causes the insertion
of proper AArch64 NOPs.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The min/max call needed to have explicit types on some architectures
(e.g. mn10300). Use clamp_t instead to avoid the warning:
kernel/sys.c: In function 'override_release':
kernel/sys.c:1287:10: warning: comparison of distinct pointer types lacks a cast [enabled by default]
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Namhyung Kim reported that the build fails with:
GEN python/perf.so
gcc: error: python_ext_build/tmp//../../libtraceevent.a: No such file or directory
error: command 'gcc' failed with exit status 1
cp: cannot stat `python_ext_build/lib/perf.so': No such file or directory
make: *** [python/perf.so] Error 1
We need to propagate the TE_PATH variable to the setup.py file.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-8umiPbm4sxpknKivbjgykhut@git.kernel.org
[ Fixed superfluous variable build error. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Emit the magic string that indicates a module has a signature after the
signature data instead of before it. This allows module_sig_check() to
be made simpler and faster by the elimination of the search for the
magic string. Instead we just need to do a single memcmp().
This works because at the end of the signature data there is the
fixed-length signature information block. This block then falls
immediately prior to the magic number.
From the contents of the information block, it is trivial to calculate
the size of the signature data and thus the size of the actual module
data.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The module build process no longer creates intermediate files for module
signing, so remove them from .gitignore.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Turn sign-file into perl and merge in x509keyid. The latter doesn't
need to be a separate script as it doesn't actually need to work out the
SHA1 sum of the X.509 certificate itself, since it can get that from the
X.509 certificate.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
/proc/<pid>/numa_maps scans vma and show mempolicy under
mmap_sem. It sometimes accesses task->mempolicy which can
be freed without mmap_sem and numa_maps can show some
garbage while scanning.
This patch tries to take reference count of task->mempolicy at reading
numa_maps before calling get_vma_policy(). By this, task->mempolicy
will not be freed until numa_maps reaches its end.
V2->v3
- updated comments to be more verbose.
- removed task_lock() in numa_maps code.
V1->V2
- access task->mempolicy only once and remember it. Becase kernel/exit.c
can overwrite it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If there is only one match, the unique matched entry should be returned.
Without the fix, the upcoming dma debug interfaces ("dma-debug: new
interfaces to debug dma mapping errors") can't work reliably because
only device and dma_addr are passed to dma_mapping_error().
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Thierry reported that the "iron out" patch for isolate_freepages_block()
had problems due to the strict check being too strict with "mm:
compaction: Iron out isolate_freepages_block() and
isolate_freepages_range() -fix1". It's possible that more pages than
necessary are isolated but the check still fails and I missed that this
fix was not picked up before RC1. This same problem has been identified
in 3.7-RC1 by Tony Prisk and should be addressed by the following patch.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Tony Prisk <linux@prisktech.co.nz>
Reported-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Richard Davies <richard@arachsys.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix this build error:
drivers/firmware/memmap.c:240:19: error: conflicting types for 'memmap_init'
arch/ia64/include/asm/pgtable.h:565:17: note: previous declaration of 'memmap_init' was here
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bernhard Walle <bwalle@suse.de>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
free_pid_ns() operates in a recursive fashion:
free_pid_ns(parent)
put_pid_ns(parent)
kref_put(&ns->kref, free_pid_ns);
free_pid_ns
thus if there was a huge nesting of namespaces the userspace may trigger
avalanche calling of free_pid_ns leading to kernel stack exhausting and a
panic eventually.
This patch turns the recursion into an iterative loop.
Based on a patch by Andrew Vagin.
[akpm@linux-foundation.org: export put_pid_ns() to modules]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
lm3639_bled_mode_store() error paths
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Calling uname() with the UNAME26 personality set allows a leak of kernel
stack contents. This fixes it by defensively calculating the length of
copy_to_user() call, making the len argument unsigned, and initializing
the stack buffer to zero (now technically unneeded, but hey, overkill).
CVE-2012-0957
Reported-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 5ab1c309b344 ("coredump: pass siginfo_t* to do_coredump() and
below, not merely signr") added siginfo_t to linux/coredump.h but forgot
to include asm/siginfo.h. This breaks the build for UML/i386. (And any
other arch where asm/siginfo.h is not magically preincluded...)
In file included from arch/x86/um/elfcore.c:2:0: include/linux/coredump.h:15:25: error: unknown type name 'siginfo_t'
make[1]: *** [arch/x86/um/elfcore.o] Error 1
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: "Jonathan M. Foote" <jmfoote@cert.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In commit 0b173bc4daa8 ("mm: kill vma flag VM_CAN_NONLINEAR") we
replaced the VM_CAN_NONLINEAR test with checking whether the mapping has
a '->remap_pages()' vm operation, but there is no guarantee that there
it even has a vm_ops pointer at all.
Add the appropriate test for NULL vm_ops.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Rusty had clearly not actually tested his module signing changes that I
(trustingly) applied as commit e2a666d52b48 ("kbuild: sign the modules
at install time"). That commit had multiple bugs:
- using "${#VARIABLE}" to get the number of characters in a shell
variable may look clever, but it's locale-dependent: it returns the
number of *characters*, not bytes. And we do need bytes.
So don't use "${#..}" expansion, do the stupid "wc -c" thing instead
(where "c" stands for "bytes", not "characters", despite the letter.
- Rusty had confused "siglen" and "signerlen", and his conversion
didn't set "signerlen" at all, and incorrectly set "siglen" to the
size of the signer, not the size of the signature.
End result: the modified sign-file script did create something that
superficially *looked* like a signature, but didn't actually work at
all, and would fail the signature check. Oops.
Tssk, tssk, Rusty.
But Rusty was definitely right that this whole thing should be rewritten
in perl by somebody who has the perl-fu to do so. That is not me,
though - I'm just doing an emergency fix for the shell script.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit cb6b6df111e4 ("xen/pv-on-hvm kexec: add quirk for Xen 3.4 and
shutdown watches.") added the xen_strict_xenbus_quirk() function with an
old K&R-style declaration without proper typing, causing gcc to rightly
complain:
drivers/xen/xenbus/xenbus_xs.c:628:13: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
because we really don't live in caves using stone-age tools any more,
and the kernel has always used properly typed ANSI C function
declarations.
So if a function doesn't take arguments, we tell the compiler so
explicitly by adding the proper "void" in the prototype.
I'm sure there are tons of other examples of this kind of stuff in the
tree, but this is the one that hits my workstation config, so..
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since I will be maintaining ACPI together with Len from now on, add my
address to the ACPI maintainers list in the MAINTAINERS file (this is
the address to send patches to).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ehci_fsl_setup_phy is supposed to return an int, but had a void return
value in the case of controller_ver being invalid.
Introduced by commit 3735ba8db8e6 ("powerpc/usb: fix bug of CPU hang
when missing USB PHY clock"), which missed one return.
Signed-off-by: Ben Collins <ben.c@servergy.com>
Cc: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add the following system calls to the syscall table:
fallocate
sendmmsg
umount2
syncfs
epoll_create1
inotify_init1
signalfd4
dup3
pipe2
timerfd_create
timerfd_settime
timerfd_gettime
eventfd2
preadv
pwritev
fanotify_init
fanotify_mark
process_vm_readv
process_vm_writev
name_to_handle_at
open_by_handle_at
sync_file_range
perf_event_open
rt_tgsigqueueinfo
clock_adjtime
prlimit64
kcmp
Note that we have to use the 'sys_sync_file_range2' version, so that
the 64-bit arguments are aligned correctly to the argument registers.
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
Fix two compiler warnings complaining about truncating a value on
a 64-bit host, and about declaring an unused variable that is only
used for a specific configuration.
Signed-off-by: Chris Zankel <chris@zankel.net>
|
|
Linus deleted the old code and put signing on the install command,
I fixed it to extract the keyid and signer-name within sign-file
and cleaned up that script now it always signs in-place.
Some enthusiast should convert sign-key to perl and pull
x509keyid into it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold sysfs attribute which was
supposed to communicate to userspace tools that BIOS CMCI threshold has
been honoured.
However, this info is not of any importance to userspace - it should
rather get the actual error count it has been thresholded already from
MCi_STATUS[38:52].
So drop this before it becomes a used interface (good thing we caught
this early in 3.7-rc1, right after the merge window closed).
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
|
Code Aurora Forum (CAF) is becoming a part of Linux Foundation Labs.
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
|
|
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
|
|
aesni_enc()
Calling convention for internal functions and 'asmlinkage' functions is
different on x86-32. Therefore do not directly cast aesni_enc as XTS tweak
function, but use wrapper function in between. Fixes crash with "XTS +
aesni_intel + x86-32" combination.
Cc: stable@vger.kernel.org
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 38f38657444d ("xattr: extract simple_xattr code from tmpfs") moved
some code from tmpfs but introduced a subtle bug along the way.
If the name passed to simple_xattr_remove() does not exist in the list of
xattrs, then it is possible to call kfree(new_xattr) when new_xattr is
actually initialized to itself on the stack via uninitialized_var().
This causes a BUG() since the memory was not allocated via the slab
allocator and was not bypassed through to the page allocator because it
was too large.
Initialize the local variable to NULL so the kfree() never takes place.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If a debugger tries to zero a hardware debug control register, the
kernel will try to infer both the type and length of the breakpoint
in order to sanity-check against the requested regset type. This will
fail because the encoding will appear as a zero-length breakpoint.
This patch changes the control register setting so that disabled
breakpoints are treated as HW_BREAKPOINT_EMPTY and no further
sanity-checking is required.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The user_hwdebug_state structure contains implicit padding to conform to
the alignment requirements of the AArch64 ABI (namely that aggregates
must be aligned to their most aligned member).
This patch fixes the ptrace functions operating on struct
user_hwdebug_state so that the padding is handled correctly.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
For historical reasons, ARM used to set r0-r2 in start_thread() to the
first values on the user stack when starting a new user application. The
same logic has been inherited in AArch64. The x0 register is overridden
by the sys_execve() return value so it's always zero on success. The x1
and x2 registers are ignored by AArch64 and EABI AArch32 applications,
so we can safely remove the register setting for both native and compat
user space.
This also fixes a potential fault with the kernel accessing user space
stack directly.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
According to Documentation/arm64/booting.txt, the kernel image must be
loaded at a pre-defined offset from the start of RAM so that the kernel
can calculate PHYS_OFFSET based on this address. If the DT contains
memory blocks below this PHYS_OFFSET, report them and ignore the
corresponding memory range.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
With commit 576094b7 (time: Introduce new GENERIC_TIME_VSYSCALL) the old
update_vsyscall() prototype is no longer available. This patch updates
the arm64 port.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: John Stultz <john.stultz@linaro.org>
|
|
With commit 786d35d4 (make most arch asm/module.h files use
asm-generic/module.h) arm64 needs to enable MODULES_USE_ELF_RELA for
loadable modules.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
|
|
Change event type to switch for the power and autopower switches.
Additionally, this patch aligns the keycodes with the other linkstation
boards already supported by linux.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
Don't use the specific board name in a the common device tree include file.
Instead use the common name 'lsxl'.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
Move the CACHE_FEROCEON_L2 test to kirkwood_l2_init, since linking
fails on the reference to feroceon_l2_init.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
|
|
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Based on information from the ZTE Windows drivers.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This minor change adds a new system to which the "Fix Compliance Mode
on SN65LVPE502CP Hardware" patch has to be applied also.
System added:
Vendor: Hewlett-Packard. System Model: Z1
Signed-off-by: Alexis R. Cortes <alexis.cortes@ti.com>
Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make sure port data is initialised before creating sysfs attributes to
avoid a race.
A recent patch ("USB: io_ti: fix port-data memory leak") got the
sysfs-attribute creation and port-data initialisation ordering wrong.
Cc: <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make sure sysfs attributes are created at port probe.
A recent patch ("USB: iuu_phoenix: fix port-data memory leak") removed
the sysfs-attribute creation by mistake.
Reported-by: Yuanhan Liu <yuanhan.liu@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The CPUNum Field in EBase register is 10bit wide, so after 1 bit right
shift, the mask value should be 0x1ff.
Signed-off-by: jerin jacob <jerinjacobk@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4420/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Earlier without cpuidle framework on pseries, the native arch
idle routine comprised of both snooze and nap
states. smt_snooze_delay variable was used to delay
the idle process entry to deeper idle state like nap.
With the coming of cpuidle, this arch specific idle was replaced
by two different idle routines, one for supporting snooze and other
for nap. This enabled addition of more
low level idle states on pseries in the future.
On adopting the generic cpuidle framework for POWER systems,
the decision of which idle state to choose from, given a predicted
idle time is taken by the menu governor based on
target_residency and exit_latency of the idle states.
target_residency is the minimum time to be resident in that idle state.
Exit_latency is time taken to exit out of idle state.
Deeper the idle state, both the target residency and exit latency
would be higher.
In the current design, smt_snooze_delay is used as target_residency
for the snooze state which is incorrect, as it is not the
minimum but the maximum duration to be in snooze state.
This would result in the governor in taking bad decision,
as presently target_residency of nap < target_residency of snooze
inspite of nap being deeper idle state.
This patch aims to fix this problem by replacing the smt_snooze_delay loop
in snooze state, with the need_resched() as the governor is aware of
entry and exit of various idle transitions based on which
next idle time prediction.
The governor is intelligent enough to determine the idle state the needs to
be transitioned to and maintains a whole of heuristics including
io load, previous idle states predictions etc for the same, based on
which idle state entry decision is taken.
With this fix, of setting target_residency of snooze to 0
nap to smt_snooze_delay
if the predicted idle time is less
than smt_snooze_delay (target_residency of nap)
value governor would pick snooze state, else nap. This adhers to the
previous native idle design.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
smt_snooze_delay was designed to delay idle loop's nap entry
in the native idle code before it got ported over to use as part of
the cpuidle framework.
A -ve value assigned to smt_snooze_delay should result in
busy looping, in other words disabling the entry to nap state.
- https://lists.ozlabs.org/pipermail/linuxppc-dev/2010-May/082450.html
This particular functionality can be achieved currently by
echo 1 > /sys/devices/system/cpu/cpu*/state1/disable
but it is broken when one assigns -ve value to the smt_snooze_delay
variable either via sysfs entry or ppc64_cpu util.
This patch aims to fix this, by disabling nap state when smt_snooze_delay
variable is set to -ve value.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|