summaryrefslogtreecommitdiffstats
path: root/security (unfollow)
Commit message (Collapse)AuthorFilesLines
2015-03-12rcu: Move rcu_report_unblock_qs_rnp() to common codePaul E. McKenney2-38/+41
The rcu_report_unblock_qs_rnp() function is invoked when the last task blocking the current grace period exits its outermost RCU read-side critical section. Previously, this was called only from rcu_read_unlock_special(), and was therefore defined only when CONFIG_RCU_PREEMPT=y. However, this function will be invoked even when CONFIG_RCU_PREEMPT=n once CPU-hotplug operations are processed only at the beginnings of RCU grace periods. The reason for this change is that the last task on a given leaf rcu_node structure's ->blkd_tasks list might well exit its RCU read-side critical section between the time that recent CPU-hotplug operations were applied and when the new grace period was initialized. This situation could result in RCU waiting forever on that leaf rcu_node structure, because if all that structure's CPUs were already offline, there would be no quiescent-state events to drive that structure's part of the grace period. This commit therefore moves rcu_report_unblock_qs_rnp() to common code that is built unconditionally so that the quiescent-state-forcing code can clean up after this situation, avoiding the grace-period stall. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-12rcu: Rework preemptible expedited bitmask handlingPaul E. McKenney1-23/+75
Currently, the rcu_node tree ->expmask bitmasks are initially set to reflect the online CPUs. This is pointless, because only the CPUs preempted within RCU read-side critical sections by the preceding synchronize_sched_expedited() need to be tracked. This commit therefore instead sets up these bitmasks based on the state of the ->blkd_tasks lists. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUsPaul E. McKenney1-2/+0
Offline CPUs cannot safely invoke trace events, but such CPUs do execute within rcu_cpu_notify(). Therefore, this commit removes the trace events from rcu_cpu_notify(). These trace events are for utilization, against which rcu_cpu_notify() execution time should be negligible. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcutorture: Enable slow grace-period initializationsPaul E. McKenney1-0/+1
This commit sets CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y, but leaves the default time zero. This can be overridden by passing the "--bootargs rcutree.gp_init_delay=1" argument to kvm.sh. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Provide diagnostic option to slow down grace-period initializationPaul E. McKenney3-0/+40
Grace-period initialization normally proceeds quite quickly, so that it is very difficult to reproduce races against grace-period initialization. This commit therefore allows grace-period initialization to be artificially slowed down, increasing race-reproduction probability. A pair of new Kconfig parameters are provided, CONFIG_RCU_TORTURE_TEST_SLOW_INIT to enable the slowdowns, and CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY to specify the number of jiffies of slowdown to apply. A boot-time parameter named rcutree.gp_init_delay allows boot-time delay to be specified. By default, no delay will be applied even if CONFIG_RCU_TORTURE_TEST_SLOW_INIT is set. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Detect stalls caused by failure to propagate up rcu_node treePaul E. McKenney1-2/+3
If all CPUs have passed through quiescent states, then stalls might be due to starvation of the grace-period kthread or to failure to propagate the quiescent states up the rcu_node combining tree. The current stall warning messages do not differentiate, so this commit adds a printout of the root rcu_node structure's ->qsmask field. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Eliminate empty HOTPLUG_CPU ifdefPaul E. McKenney1-4/+0
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Simplify sync_rcu_preempt_exp_init()Paul E. McKenney1-4/+1
This commit eliminates a boolean and associated "if" statement by rearranging the code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Put all orphan-callback-related code under same commentPaul E. McKenney1-1/+1
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Consolidate offline-CPU callback initializationPaul E. McKenney1-4/+5
Currently, both rcu_cleanup_dead_cpu() and rcu_send_cbs_to_orphanage() initialize the outgoing CPU's callback list. However, only rcu_cleanup_dead_cpu() invokes rcu_send_cbs_to_orphanage(), and it does so unconditionally, which means that only one of these initializations is required. This commit therefore consolidates the callback-list initialization with the rest of the callback handling in rcu_send_cbs_to_orphanage(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11metag: Use common outgoing-CPU-notification codePaul E. McKenney1-3/+2
This commit removes the open-coded CPU-offline notification with new common code. This change avoids calling scheduler code using RCU from an offline CPU that RCU is ignoring. This commit is compatible with the existing code in not checking for timeout during a prior offline for a given CPU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: <linux-metag@vger.kernel.org>
2015-03-11blackfin: Use common outgoing-CPU-notification codePaul E. McKenney1-4/+2
This commit removes the open-coded CPU-offline notification with new common code. This change avoids calling scheduler code using RCU from an offline CPU that RCU is ignoring. This commit is compatible with the existing code in not checking for timeout during a prior offline for a given CPU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Miao <realmz6@gmail.com> Cc: <adi-buildroot-devel@lists.sourceforge.net>
2015-03-11x86: Use common outgoing-CPU-notification codePaul E. McKenney4-45/+44
This commit removes the open-coded CPU-offline notification with new common code. Among other things, this change avoids calling scheduler code using RCU from an offline CPU that RCU is ignoring. It also allows Xen to notice at online time that the CPU did not go offline correctly. Note that Xen has the surviving CPU carry out some cleanup operations, so if the surviving CPU times out, these cleanup operations might have been carried out while the outgoing CPU was still running. It might therefore be unwise to bring this CPU back online, and this commit avoids doing so. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <x86@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: <xen-devel@lists.xenproject.org>
2015-03-11smpboot: Add common code for notification from dying CPUPaul E. McKenney2-0/+168
RCU ignores offlined CPUs, so they cannot safely run RCU read-side code. (They -can- use SRCU, but not RCU.) This means that any use of RCU during or after the call to arch_cpu_idle_dead(). Unfortunately, commit 2ed53c0d6cc99 added a complete() call, which will contain RCU read-side critical sections if there is a task waiting to be awakened. Which, as it turns out, there almost never is. In my qemu/KVM testing, the to-be-awakened task is not yet asleep more than 99.5% of the time. In current mainline, failure is even harder to reproduce, requiring a virtualized environment that delays the outgoing CPU by at least three jiffies between the time it exits its stop_machine() task at CPU_DYING time and the time it calls arch_cpu_idle_dead() from the idle loop. However, this problem really can occur, especially in virtualized environments, and therefore really does need to be fixed This suggests moving back to the polling loop, but using a much shorter wait, with gentle exponential backoff instead of the old 100-millisecond wait. Most of the time, the loop will exit without waiting at all, and almost all of the remaining uses will wait only five microseconds. If the outgoing CPU is preempted, a loop will wait one jiffy, then increase the wait by a factor of 11/10ths, rounding up. As before, there is a five-second timeout. This commit therefore provides common-code infrastructure to do the dying-to-surviving CPU handoff in a safe manner. This code also provides an indication at CPU-online of whether the CPU to be onlined previously timed out on offline. The new cpu_check_up_prepare() function returns -EBUSY if this CPU previously took more than five seconds to go offline, or -EAGAIN if it has not yet managed to go offline. The rationale for -EAGAIN is that it might still be preempted, so an additional wait might well find it correctly offlined. Architecture-specific code can decide how to handle these conditions. Systems in which CPUs take themselves completely offline might respond to an -EBUSY return as if it was a zero (success) return. Systems in which the surviving CPU must take some action might take it at this time, or might simply mark the other CPU as unusable. Note that architectures that take the easy way out and simply pass the -EBUSY and -EAGAIN upwards will change the sysfs API. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> [ paulmck: Fixed state machine for architectures that don't check earlier CPU-hotplug results as suggested by James Hogan. ]
2015-02-23Linux 4.0-rc1v4.0-rc1Linus Torvalds1-4/+4
.. after extensive statistical analysis of my G+ polling, I've come to the inescapable conclusion that internet polls are bad. Big surprise. But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38% margin, in a poll that people weren't even supposed to participate in. Who can argue with solid numbers like that? 5,796 votes from people who can't even follow the most basic directions? In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%, but with a total of 29,110 votes right now. Now, arguably, that vote spread is only about 3,200 votes, which is less than the almost six thousand votes that the "please ignore" poll got, so it could be considered noise. But hey, I asked, so I'll honor the votes.
2015-02-22autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocationAl Viro1-2/+6
X-Coverup: just ask spender Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22procfs: fix race between symlink removals and traversalsAl Viro3-12/+22
use_pde()/unuse_pde() in ->follow_link()/->put_link() resp. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22debugfs: leave freeing a symlink body until inode evictionAl Viro1-17/+17
As it is, we have debugfs_remove() racing with symlink traversals. Supply ->evict_inode() and do freeing there - inode will remain pinned until we are done with the symlink body. And rip the idiocy with checking if dentry is positive right after we'd verified debugfs_positive(), which is a stronger check... Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22Documentation/filesystems/Locking: ->get_sb() is long goneAl Viro1-2/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22trylock_super(): replacement for grab_super_passive()Konstantin Khlebnikov3-26/+22
I've noticed significant locking contention in memory reclaimer around sb_lock inside grab_super_passive(). Grab_super_passive() is called from two places: in icache/dcache shrinkers (function super_cache_scan) and from writeback (function __writeback_inodes_wb). Both are required for progress in memory allocator. Grab_super_passive() acquires sb_lock to increment sb->s_count and check sb->s_instances. It seems sb->s_umount locked for read is enough here: super-block deactivation always runs under sb->s_umount locked for write. Protecting super-block itself isn't a problem: in super_cache_scan() sb is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers are still active. Inside writeback super-block comes from inode from bdi writeback list under wb->list_lock. This patch removes locking sb_lock and checks s_instances under s_umount: generic_shutdown_super() unlinks it under sb->s_umount locked for write. New variant is called trylock_super() and since it only locks semaphore, callers must call up_read(&sb->s_umount) instead of drop_super(sb) when they're done. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversionsDavid Howells1-1/+1
Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup() rather than d_is_dir() when checking a dir watch and give an error on fake directories. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversionsDavid Howells4-9/+9
Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack thereof) in cachefiles: (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as it doesn't want to deal with automounts in its cache. (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in cachefiles. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)David Howells34-71/+71
Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really *is* negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') || die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") || die $coccifile; print($fd "$_\n") || die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 || die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22SELinux: Use d_is_positive() rather than testing dentry->d_inodeDavid Howells1-2/+2
Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid of direct references to d_inode outside of the VFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22Smack: Use d_is_positive() rather than testing dentry->d_inodeDavid Howells1-2/+2
Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of direct references to d_inode outside of the VFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()David Howells1-3/+1
Use d_is_dir() rather than d_inode and S_ISDIR(). Note that this will include fake directories such as automount triggers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inodeDavid Howells1-1/+1
Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing dentry->d_inode as the dentry may cover another layer that has an inode when the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sbDavid Howells2-12/+12
mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and should avoid file_inode() also since it is really dealing with the path. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: Split DCACHE_FILE_TYPE into regular and special typesDavid Howells2-8/+27
Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and socket files). d_is_reg() and d_is_special() are added to detect these subtypes and d_is_file() is left as the union of the two. This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to use d_is_reg(dentry) instead. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: Add a fallthrough flag for marking virtual dentriesDavid Howells2-1/+27
Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is a virtual dentry that covers another one in a lower layer that should be used instead. This may be recorded on medium if directory integration is stored there. The flag can be set with d_set_fallthru() and tested with d_is_fallthru(). Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: Add a whiteout dentry typeDavid Howells1-6/+18
Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function. A d_is_miss() accessor is also added for ordinary cache misses and d_is_negative() is modified to indicate either an ordinary miss or an enforced miss (whiteout). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: Introduce inode-getting helpers for layered/unioned fs environmentsDavid Howells1-0/+57
Introduce some function for getting the inode (and also the dentry) in an environment where layered/unioned filesystems are in operation. The problem is that we have places where we need *both* the union dentry and the lower source or workspace inode or dentry available, but we can only have a handle on one of them. Therefore we need to derive the handle to the other from that. The idea is to introduce an extra field in struct dentry that allows the union dentry to refer to and pin the lower dentry. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-21kernel: make READ_ONCE() valid on const argumentsLinus Torvalds1-3/+3
The use of READ_ONCE() causes lots of warnings witht he pending paravirt spinlock fixes, because those ends up having passing a member to a 'const' structure to READ_ONCE(). There should certainly be nothing wrong with using READ_ONCE() with a const source, but the helper function __read_once_size() would cause warnings because it would drop the 'const' qualifier, but also because the destination would be marked 'const' too due to the use of 'typeof'. Use a union of types in READ_ONCE() to avoid this issue. Also make sure to use parenthesis around the macro arguments to avoid possible operator precedence issues. Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-21blk-throttle: check stats_cpu before reading it from sysfsThadeu Lima de Souza Cascardo1-0/+3
When reading blkio.throttle.io_serviced in a recently created blkio cgroup, it's possible to race against the creation of a throttle policy, which delays the allocation of stats_cpu. Like other functions in the throttle code, just checking for a NULL stats_cpu prevents the following oops caused by that race. [ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020 [ 1117.285252] Faulting instruction address: 0xc0000000003efa2c [ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1] [ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV [ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4 [ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5 [ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000 [ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980 [ 1137.734202] REGS: c000000f1d213500 TRAP: 0300 Not tainted (3.19.0) [ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 42008884 XER: 20000000 [ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0 GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000 GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000 GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808 GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000 GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80 GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850 [ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180 [ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180 [ 1137.734943] Call Trace: [ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable) [ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0 [ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70 [ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150 [ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50 [ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510 [ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200 [ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80 [ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0 [ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100 [ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98 [ 1137.735383] Instruction dump: [ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24 [ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018 And here is one code that allows to easily reproduce this, although this has first been found by running docker. void run(pid_t pid) { int n; int status; int fd; char *buffer; buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE); n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid); fd = open(CGPATH "/test/tasks", O_WRONLY); write(fd, buffer, n); close(fd); if (fork() > 0) { fd = open("/dev/sda", O_RDONLY | O_DIRECT); read(fd, buffer, 512); close(fd); wait(&status); } else { fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY); n = read(fd, buffer, BUFFER_SIZE); close(fd); } free(buffer); exit(0); } void test(void) { int status; mkdir(CGPATH "/test", 0666); if (fork() > 0) wait(&status); else run(getpid()); rmdir(CGPATH "/test"); } int main(int argc, char **argv) { int i; for (i = 0; i < NR_TESTS; i++) test(); return 0; } Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-20MIPS: sead3: Corrected get_c0_perfcount_intNiklas Cassel1-1/+1
Commit e9de688dac65 ("irqchip: mips-gic: Support local interrupts") updated several platforms. This is a copy paste error. Signed-off-by: Niklas Cassel <niklass@axis.com> Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9245/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: mm: Remove dead macro definitionsAndreas Ruprecht2-16/+0
In commit c441d4a54c6e ("MIPS: mm: Only build one microassembler that is suitable"), the Makefile at arch/mips/mm was rewritten to only build the "right" microassembler file, depending on whether CONFIG_CPU_MICROMIPS is set or not. In the files, however, there are still preprocessor definitions depending on CONFIG_CPU_MICROMIPS. The #ifdef around them can now never evaluate to true, so let's remove them altogether. This inconsistency was found using the undertaker-checkpatch tool. Signed-off-by: Andreas Ruprecht <rupran@einserver.de> Reviewed-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Paul Bolle <pebolle@tiscali.nl> Patchwork: https://patchwork.linux-mips.org/patch/9267/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20IB/qib: Add blank line after declarationMike Marciniszyn17-6/+67
Upstream checkpatch now requires this. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20IB/qib: Fix checkpatch warningsMike Marciniszyn15-44/+42
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20i2c: ocores: rework clk code to handle NULL cookieWolfram Sang1-14/+21
For, !HAVE_CLK the clk API returns a NULL cookie. Rework the initialization code to handle that. If clk_get_rate() delivers 0, we use the fallback mechanisms. The patch is pretty easy when ignoring white space issues (git diff -b). Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Max Filippov <jcmvbkbc@gmail.com>
2015-02-20MIPS: OCTEON: irq: add CIB and other fixesDavid Daney2-269/+823
- Use of_irq_init() to initialize interrupt controllers - Get rid of some unlikely() - Add CIB to support SATA and other interrupts - Add support for CIU SUM2 interrupt sources Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Signed-off-by: Peter Swain <peter.swain@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: devicetree@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8947/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Don't do acknowledge operations for level triggered irqs.David Daney1-2/+43
The acknowledge bits don't exist for level triggered irqs, so setting them causes the simulator to terminate. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Leonid Rosenboim <lrosenboim@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8946/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: More OCTEONIII supportChandrakala Chavva4-2/+326
Read clock rate from the correct CSR. Don't clear COP0_DCACHE for OCTEONIII. Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8945/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Remove setting of processor specific CVMCTL icache bits.Chad Reese1-20/+0
CN38XX pass 1 required icache prefetching to be turned off. This chip never reached production and is long dead. Other processor specific icache settings are done by the bootloader. Remove these bits from the kernel. Signed-off-by: Chad Reese <kreese@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: David Daney <david.daney@cavium.com> Patchwork: https://patchwork.linux-mips.org/patch/8944/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Core-15169 Workaround and general CVMSEG cleanup.David Daney2-6/+17
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8943/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Update octeon-model.h code for new SoCs.David Daney5-27/+90
Add coverage for OCTEON III models. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8942/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Implement DCache errata workaround for all CN6XXXDavid Daney3-4/+8
Make messages refer to all CN6XXX. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8941/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Add little-endian support to asm/octeon/octeon.hDavid Daney1-30/+105
Also update union octeon_cvmemctl with new OCTEON II fields. [aleksey.makarov@auriga.com: use __BITFIELD_FIELD] Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8940/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Implement the core-16057 workaroundDavid Daney1-0/+22
Disable ICache prefetch for certian Octeon II processors. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/8938/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Delete unused COP2 saving codeAleksey Makarov1-26/+0
Commit 2c952e06e4f5 ("MIPS: Move cop2 save/restore to switch_to()") removes assembler code to store COP2 registers. Commit a36d8225bceb ("MIPS: OCTEON: Enable use of FPU") mistakenly restores it Fixes: a36d8225bceb ("MIPS: OCTEON: Enable use of FPU") Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: David Daney <david.daney@cavium.com> Patchwork: https://patchwork.linux-mips.org/patch/8937/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-02-20MIPS: OCTEON: Use correct instruction to read 64-bit COP0 registerChandrakala Chavva1-3/+3
Use dmfc0/dmtc0 instructions for reading CvmMemCtl COP0 register, its a 64-bit wide. Signed-off-by: Chandrakala Chavva <cchavva@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: David Daney <david.daney@cavium.com> Patchwork: https://patchwork.linux-mips.org/patch/8936/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>