summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* modules, lock around setting of MODULE_STATE_UNFORMEDPrarit Bhargava2014-10-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A panic was seen in the following sitation. There are two threads running on the system. The first thread is a system monitoring thread that is reading /proc/modules. The second thread is loading and unloading a module (in this example I'm using my simple dummy-module.ko). Note, in the "real world" this occurred with the qlogic driver module. When doing this, the following panic occurred: ------------[ cut here ]------------ kernel BUG at kernel/module.c:3739! invalid opcode: 0000 [#1] SMP Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module] CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013 task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000 RIP: 0010:[<ffffffff810d64c5>] [<ffffffff810d64c5>] module_flags+0xb5/0xc0 RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000 RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000 RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000 R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000 FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00 ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0 Call Trace: [<ffffffff810d666c>] m_show+0x19c/0x1e0 [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0 [<ffffffff812281ed>] proc_reg_read+0x3d/0x80 [<ffffffff811c0f2c>] vfs_read+0x9c/0x170 [<ffffffff811c1a58>] SyS_read+0x58/0xb0 [<ffffffff81605829>] system_call_fastpath+0x16/0x1b Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 RIP [<ffffffff810d64c5>] module_flags+0xb5/0xc0 RSP <ffff88080fa7fe18> Consider the two processes running on the system. CPU 0 (/proc/modules reader) CPU 1 (loading/unloading module) CPU 0 opens /proc/modules, and starts displaying data for each module by traversing the modules list via fs/seq_file.c:seq_open() and fs/seq_file.c:seq_read(). For each module in the modules list, seq_read does op->start() <-- this is a pointer to m_start() op->show() <- this is a pointer to m_show() op->stop() <-- this is a pointer to m_stop() The m_start(), m_show(), and m_stop() module functions are defined in kernel/module.c. The m_start() and m_stop() functions acquire and release the module_mutex respectively. ie) When reading /proc/modules, the module_mutex is acquired and released for each module. m_show() is called with the module_mutex held. It accesses the module struct data and attempts to write out module data. It is in this code path that the above BUG_ON() warning is encountered, specifically m_show() calls static char *module_flags(struct module *mod, char *buf) { int bx = 0; BUG_ON(mod->state == MODULE_STATE_UNFORMED); ... The other thread, CPU 1, in unloading the module calls the syscall delete_module() defined in kernel/module.c. The module_mutex is acquired for a short time, and then released. free_module() is called without the module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED, also without the module_mutex. Some additional code is called and then the module_mutex is reacquired to remove the module from the modules list: /* Now we can delete it from the lists */ mutex_lock(&module_mutex); stop_machine(__unlink_module, mod, NULL); mutex_unlock(&module_mutex); This is the sequence of events that leads to the panic. CPU 1 is removing dummy_module via delete_module(). It acquires the module_mutex, and then releases it. CPU 1 has NOT set dummy_module->state to MODULE_STATE_UNFORMED yet. CPU 0, which is reading the /proc/modules, acquires the module_mutex and acquires a pointer to the dummy_module which is still in the modules list. CPU 0 calls m_show for dummy_module. The check in m_show() for MODULE_STATE_UNFORMED passed for dummy_module even though it is being torn down. Meanwhile CPU 1, which has been continuing to remove dummy_module without holding the module_mutex, now calls free_module() and sets dummy_module->state to MODULE_STATE_UNFORMED. CPU 0 now calls module_flags() with dummy_module and ... static char *module_flags(struct module *mod, char *buf) { int bx = 0; BUG_ON(mod->state == MODULE_STATE_UNFORMED); and BOOM. Acquire and release the module_mutex lock around the setting of MODULE_STATE_UNFORMED in the teardown path, which should resolve the problem. Testing: In the unpatched kernel I can panic the system within 1 minute by doing while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done and while (true) do cat /proc/modules; done in separate terminals. In the patched kernel I was able to run just over one hour without seeing any issues. I also verified the output of panic via sysrq-c and the output of /proc/modules looks correct for all three states for the dummy_module. dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-) dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE) dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
* moduleparam: Resolve missing-field-initializer warningMark Rustad2014-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolve a missing-field-initializer warning, that is produced by every reference to module_param_call, by using designated initialization for the first field. That is enough to silence the complaint. The message is only seen when doing a W=2 build. I happened to be using gcc 4.8.3, but I think most versions would produce the warning when it is enabled. It can either be silenced by using even a single designated initializer as I did here, or providing values for all of the fields. Because of the number of references to the macro, this change silences many warnings in W=2 builds. One instance of the full warning message looks like this: /home/share/git/nn-mdr/include/linux/moduleparam.h:198:16: warning: missing initializer for field ‘free’ of ‘struct kernel_param_ops’ [-Wmissing-field-initializers] static struct kernel_param_ops __param_ops_##name = \ ^ /home/share/git/nn-mdr/fs/fuse/inode.c:35:1: note: in expansion of macro ‘module_param_call’ module_param_call(max_user_bgreq, set_global_limit, param_get_uint, ^ /home/share/git/nn-mdr/include/linux/moduleparam.h:56:9: note: ‘free’ declared here void (*free)(void *arg); Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* kbuild: handle module compression while running 'make modules_install'.Bertrand Jacquin2014-08-273-1/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since module-init-tools (gzip) and kmod (gzip and xz) support compressed modules, it could be useful to include a support for compressing modules right after having them installed. Doing this in kbuild instead of per distro can permit to make this kind of usage more generic. This patch add a Kconfig entry to "Enable loadable module support" menu and let you choose to compress using gzip (default) or xz. Both gzip and xz does not used any extra -[1-9] option since Andi Kleen and Rusty Russell prove no gain is made using them. gzip is called with -n argument to avoid storing original filename inside compressed file, that way we can save some more bytes. On a v3.16 kernel, 'make allmodconfig' generated 4680 modules for a total of 378MB (no strip, no sign, no compress), the following table shows observed disk space gain based on the allmodconfig .config : | time | +-------------+-----------------+ | manual .ko | make | size | percent | compression | modules_install | | gain +-------------+-----------------+------+-------- - | | 18.61s | 378M | GZIP | 3m16s | 3m37s | 102M | 73.41% XZ | 5m22s | 5m39s | 77M | 79.83% The gain for restricted environnement seems to be interesting while uncompress can be time consuming but happens only while loading a module, that is generally done only once. This is fully compatible with signed modules while the signed module is compressed. module-init-tools or kmod handles decompression and provide to other layer the uncompressed but signed payload. Reviewed-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Bertrand Jacquin <beber@meleeweb.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* modinst: wrap long lines in order to enhance cmd_modules_installBertrand Jacquin2014-08-271-1/+5
| | | | | | | | | Note: shouldn't we use 'install -D $(2)/$@ $@' instead of mkdir and cp ? Reviewed-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Bertrand Jacquin <beber@meleeweb.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* modsign: lookup lines ending in .ko in .mod filesBertrand Jacquin2014-08-271-1/+1
| | | | | | | | | | | | | This does the same as commit ef591a5 (scripts/Makefile.modpost: error in finding modules from .mod files), but for scripts/Makefile.modsign Maybe we should also apply to Makefile.modsign and Makefile.modinst the change applied to Makefile.modpost by commit ea4054a (modpost: handle huge numbers of modules) ? Reviewed-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Bertrand Jacquin <beber@meleeweb.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* modpost: simplify file name generation of *.mod.c filesMathias Krause2014-08-271-1/+1
| | | | | | | | | | | | | Avoid the variable length array (vla), just use PATH_MAX instead. This not only makes this code clang friedly, it also leads to a code size reduction: text data bss dec hex filename 51765 2224 12416 66405 10365 scripts/mod/modpost.old 51677 2224 12416 66317 1030d scripts/mod/modpost.new Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* modpost: reduce visibility of symbols and constify r/o arraysMathias Krause2014-08-271-11/+12
| | | | | | | | | | | | | | | Internally used symbols of modpost don't need to be externally visible; make them static. Also constify the string arrays so they resist in the r/o section instead of being runtime writable. Those changes lead to a small size reduction as can be seen below: text data bss dec hex filename 51381 2640 12416 66437 10385 scripts/mod/modpost.old 51765 2224 12416 66405 10365 scripts/mod/modpost.new Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* param: check for tainting before calling set op.Rusty Russell2014-08-272-27/+11
| | | | | | | This means every set op doesn't need to call it, and it can move into params.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* drm/i915: taint the kernel if unsafe module parameters are setJani Nikula2014-08-271-4/+4
| | | | | | | | | | | | Taint the kernel if the semaphores, enable_rc6, enable_fbc, or ppgtt module parameters are modified. These module parameters are for debugging and testing only, and should never be changed from their platform specific default values by the users. We do not provide support for people enabling all the experimental features. Make this clear by tainting the kernel if the parameters are set. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: add module_param_unsafe and module_param_named_unsafeJani Nikula2014-08-271-0/+18
| | | | | | | | | | | | | | Add the helpers to be used by modules wishing to expose unsafe debugging or testing module parameters that taint the kernel when set. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jean Delvare <khali@linux-fr.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Jon Mason <jon.mason@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: make it possible to have unsafe, tainting module paramsJani Nikula2014-08-273-10/+47
| | | | | | | | | | | | | | | | Add flags field to struct kernel_params, and add the first flag: unsafe parameter. Modifying a kernel parameter with the unsafe flag set, either via the kernel command line or sysfs, will issue a warning and taint the kernel. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jean Delvare <khali@linux-fr.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Jon Mason <jon.mason@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: rename KERNEL_PARAM_FL_NOARG to avoid confusionJani Nikula2014-08-274-7/+7
| | | | | | | | | | | | | | Make it clear this is about kernel_param_ops, not kernel_param (which will soon have a flags field of its own). No functional changes. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jean Delvare <khali@linux-fr.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Jon Mason <jon.mason@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge branch 'for-linus' of ↵Linus Torvalds2014-08-267-5/+54
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - wire up the system calls seccomp, getrandom and memfd_create - use static system information as input to add_device_randomness - .. and three bug fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/sclp: remove unnecessary XTABS flag s390/3215: fix tty output containing tabs s390: wire up memfd_create syscall s390: add system information as device randomness s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL s390: wire up seccomp and getrandom syscalls
| * s390/sclp: remove unnecessary XTABS flagMartin Schwidefsky2014-08-151-1/+1
| | | | | | | | | | | | | | The sclp line mode terminal driver scans the tty output for '\t', there is no need to set the XTABS flag in c_oflag. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390/3215: fix tty output containing tabsMartin Schwidefsky2014-08-151-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git commit 37f81fa1f63ad38e16125526bb2769ae0ea8d332 "n_tty: do O_ONLCR translation as a single write" surfaced a bug in the 3215 device driver. In combination this broke tab expansion for tty ouput. The cause is an asymmetry in the behaviour of tty3215_ops->write vs tty3215_ops->put_char. The put_char function scans for '\t' but the write function does not. As the driver has logic for the '\t' expansion remove XTABS from c_oflag of the initial termios as well. Reported-by: Stephen Powell <zlinuxman@wowway.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390: wire up memfd_create syscallHeiko Carstens2014-08-123-1/+4
| | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390: add system information as device randomnessMartin Schwidefsky2014-08-121-0/+19
| | | | | | | | | | | | | | The virtual-machine cpu information data block and the cpu-id of the boot cpu can be used as source of device randomness. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPLMichael Holzheu2014-08-121-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For CCW and SCSI IPL the hardware sets the subchannel ID and number correctly at 0xb8. For kdump at 0xb8 normally there is the data of the previously IPLed system. In order to be clean now for kdump and kexec always set the subchannel ID and number to zero. This tells the next OS that no CCW/SCSI IPL has been done. Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390: wire up seccomp and getrandom syscallsHeiko Carstens2014-08-123-1/+7
| | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | Documentation: this_cpu_ops.txt: Update description of this_cpu_opsPranith Kumar2014-08-261-42/+171
| | | | | | | | | | | | | | | | | | | | Update the description for per cpu operations to clarify use cases of this_cpu operations and add considerations for remote access. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | scripts/kernel-doc: recognize __meminitRandy Dunlap2014-08-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix scripts/kernel-doc to recognize __meminit in a function prototype and to strip it, as done with many other attributes. Fixes this warning: Warning(..//mm/page_alloc.c:2973): cannot understand function prototype: 'void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) ' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Linux 3.17-rc2v3.17-rc2Linus Torvalds2014-08-261-1/+1
| |
* | Merge tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2014-08-263-29/+77
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client fixes from Trond Myklebust: "Highlights: - more fixes for read/write codepath regressions * sleeping while holding the inode lock * stricter enforcement of page contiguity when coalescing requests * fix up error handling in the page coalescing code - don't busy wait on SIGKILL in the file locking code" * tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait nfs: can_coalesce_requests must enforce contiguity nfs: disallow duplicate pages in pgio page vectors nfs: don't sleep with inode lock in lock_and_join_requests nfs: fix error handling in lock_and_join_requests nfs: use blocking page_group_lock in add_request nfs: fix nonblocking calls to nfs_page_group_lock nfs: change nfs_page_group_lock argument
| * | nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_waitDavid Jeffery2014-08-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a SIGKILL is sent to a task waiting in __nfs_iocounter_wait, it will busy-wait or soft lockup in its while loop. nfs_wait_bit_killable won't sleep, and the loop won't exit on the error return. Stop the busy-wait by breaking out of the loop when nfs_wait_bit_killable returns an error. Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: can_coalesce_requests must enforce contiguityWeston Andros Adamson2014-08-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 6094f83864c1d1296566a282cba05ba613f151ee "nfs: allow coalescing of subpage requests" got rid of the requirement that requests cover whole pages, but it made some incorrect assumptions. It turns out that callers of this interface can map adjacent requests (by file position as seen by req_offset + req->wb_bytes) to different pages, even when they could share a page. An example is the direct I/O interface - iov_iter_get_pages_alloc may return one segment with a partial page filled and the next segment (which is adjacent in the file position) starts with a new page. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: disallow duplicate pages in pgio page vectorsWeston Andros Adamson2014-08-231-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adjacent requests that share the same page are allowed, but should only use one entry in the page vector. This avoids overruning the page vector - it is sized based on how many bytes there are, not by request count. This fixes issues that manifest as "Redzone overwritten" bugs (the vector overrun) and hangs waiting on page read / write, as it waits on the same page more than once. This also adds bounds checking to the page vector with a graceful failure (WARN_ON_ONCE and pgio error returned to application). Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: don't sleep with inode lock in lock_and_join_requestsWeston Andros Adamson2014-08-233-1/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This handles the 'nonblock=false' case in nfs_lock_and_join_requests. If the group is already locked and blocking is allowed, drop the inode lock and wait for the group lock to be cleared before trying it all again. This should fix warnings found in peterz's tree (sched/wait branch), where might_sleep() checks are added to wait.[ch]. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: fix error handling in lock_and_join_requestsWeston Andros Adamson2014-08-231-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes handling of errors from nfs_page_group_lock in nfs_lock_and_join_requests. It now releases the inode lock and the reference to the head request. Reported-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: use blocking page_group_lock in add_requestWeston Andros Adamson2014-08-231-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __nfs_pageio_add_request was calling nfs_page_group_lock nonblocking, but this can return -EAGAIN which would end up passing -EIO to the application. There is no reason not to block in this path, so change the two calls to do so. Also, there is no need to check the return value of nfs_page_group_lock when nonblock=false, so remove the error handling code. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: fix nonblocking calls to nfs_page_group_lockWeston Andros Adamson2014-08-231-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nfs_page_group_lock was calling wait_on_bit_lock even when told not to block. Fix by first trying test_and_set_bit, followed by wait_on_bit_lock if and only if blocking is allowed. Return -EAGAIN if nonblocking and the test_and_set of the bit was already locked. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: change nfs_page_group_lock argumentWeston Andros Adamson2014-08-232-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Flip the meaning of the second argument from 'wait' to 'nonblock' to match related functions. Update all five calls to reflect this change. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | Merge tag 'renesas-sh-drivers-for-v3.17' of ↵Linus Torvalds2014-08-264-3/+11
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas Pull SH driver fix from Simon Horman: "Confine SH_INTC to platforms that need it" * tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: sh: intc: Confine SH_INTC to platforms that need it
| * | | sh: intc: Confine SH_INTC to platforms that need itGeert Uytterhoeven2014-08-224-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the sh-intc driver is compiled on all SuperH and non-multiplatform SH-Mobile platforms, while it's only used on a limited number of platforms: - SuperH: SH2(A), SH3(A), SH4(A)(L) (all but SH5) - ARM: sh7372, sh73a0 Drop the "default y" on SH_INTC, make all CPU platforms that use it select it, and protect all sub-options by "if SH_INTC" to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
* | | | Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds2014-08-2618-59/+142
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull MIPS fixes from Ralf Baechle: "Pretty much all across the field so with this we should be in reasonable shape for the upcoming -rc2" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: OCTEON: make get_system_type() thread-safe MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init' MIPS: EVA: Add new EVA header MIPS: scall64-o32: Fix indirect syscall detection MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64 MIPS: Loongson: Fix COP2 usage for preemptible kernel MIPS: NL: Fix nlm_xlp_defconfig build error MIPS: Remove race window in page fault handling MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G MIPS: Alchemy: Fix db1200 PSC clock enablement MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785 MIPS: Remove duplicated include from numa.c MIPS: Add common plat_irq_dispatch declaration MIPS: MSP71xx: remove unused plat_irq_dispatch() argument MIPS: GIC: Remove useless parens from GICBIS(). MIPS: perf: Mark pmu interupt IRQF_NO_THREAD
| * | | | MIPS: OCTEON: make get_system_type() thread-safeAaro Koskinen2014-08-191-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_system_type() is not thread-safe on OCTEON. It uses static data, also more dangerous issue is that it's calling cvmx_fuse_read_byte() every time without any synchronization. Currently it's possible to get processes stuck looping forever in kernel simply by launching multiple readers of /proc/cpuinfo: (while true; do cat /proc/cpuinfo > /dev/null; done) & (while true; do cat /proc/cpuinfo > /dev/null; done) & ... Fix by initializing the system type string only once during the early boot. Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com> Cc: stable@vger.kernel.org Reviewed-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7437/ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: CPS: Initialize EVA before bringing up VPEs from secondary coresMarkos Chandras2014-08-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CPS code is doing several memory loads when configuring the VPEs from secondary cores, so the segmentation control registers must be initialized in time otherwise the kernel will crash with strange TLB exceptions. Reviewed-by: Paul Burton <paul.burton@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7424/ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init'Markos Chandras2014-08-191-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename 'eva_entry' to 'platform_eva_init' as required by the new 'eva_init' macro in the eva.h header. Since this macro is now used in a platform dependent way, it must not depend on its caller so move the t1 register initialization inside this macro. Also set the .reorder assembler option in case the caller may have previously set .noreorder. This may allow a few assembler optimizations. Finally include missing headers and document the register usage for this macro. Reviewed-by: Paul Burton <paul.burton@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7423/ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: EVA: Add new EVA headerMarkos Chandras2014-08-191-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generic code may need to perform certain operations when EVA is enabled, for example, configure the segmentation registers during boot. In order to avoid using more CONFIG_EVA ifdefs in the arch code, such functions will be added in this header instead. Initially this header contains a macro which will be used by generic code later on during VPEs configuration on secondary cores. All it does is to call the platform specific EVA init code in case EVA is enabled. Reviewed-by: Paul Burton <paul.burton@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7422/ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: scall64-o32: Fix indirect syscall detectionMarkos Chandras2014-08-191-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 4c21b8fd8f14 (MIPS: seccomp: Handle indirect system calls (o32)) added indirect syscall detection for O32 processes running on MIPS64 but it did not work as expected. The reason is the the scall64-o32 implementation differs compared to scall32-o32. In the former, the v0 (syscall number) register contains the absolute syscall number (4000 + X) whereas in the latter it contains the relative syscall number (X). Fix the code to avoid doing an extra addition, and load the v0 register directly to the first argument for syscall_trace_enter. Moreover, set the .reorder assembler option in order to have better control on this part of the assembly code. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7481/ Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64Markos Chandras2014-08-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On MIPS64, O32 processes set both TIF_32BIT_ADDR and TIF_32BIT_REGS so the previous condition treated O32 applications as N32 when evaluating seccomp filters. Fix the condition to check both TIF_32BIT_{REGS, ADDR} for the N32 AUDIT flag. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/7480/ Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: James Hogan <james.hogan@imgtec.com>
| * | | | MIPS: Loongson: Fix COP2 usage for preemptible kernelHuacai Chen2014-08-191-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preemptible kernel, only TIF_USEDFPU flag is reliable to distinguish whether _init_fpu()/_restore_fp() is needed. Because the value of the CP0_Status.CU1 isn't changed during preemption. V2: Fix coding style. Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/7515/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: NL: Fix nlm_xlp_defconfig build errorGuenter Roeck2014-08-191-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nlm_xlp_defconfig build fails with ./arch/mips/include/asm/mach-netlogic/topology.h:15:0: error: "topology_core_id" redefined [-Werror] In file included from include/linux/smp.h:59:0, [ ...] from arch/mips/mm/dma-default.c:12: ./arch/mips/include/asm/smp.h:41:0: note: this is the location of the previous definition and similar errors. This is caused by commit bda4584cd943d7 ("MIPS: Support CPU topology files in sysfs") which adds the defines to arch/mips/include/asm/smp.h. Remove the defines from arch/mips/include/asm/mach-netlogic/topology.h as no longer necessary. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: Huacai Chen <chenhc@lemote.com> Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/7513/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Remove race window in page fault handlingLars Persson2014-08-192-13/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multicore MIPSes without I/D hardware coherency suffered from a race condition in the page fault handler. The page table entry was published before any pending lazy D-cache flush was committed, hence it allowed execution of stale page cache data by other VPEs in the system. To make the cache handling safe we need to perform flushing already in the set_pte_at function. MIPSes without coherent I-caches can get a small increase in flushes due to the unavailability of the execute flag in set_pte_at. [ralf@linux-mips.org: outlining set_pte_at() saves a good k in a test build, so I moved its definition from pgtable.h to cache.c.] Signed-off-by: Lars Persson <larper@axis.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7511/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2GMarkos Chandras2014-08-191-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using kstrtol to parse the "{e,}memsize" variables was wrong because this parses signed long numbers. In case of '{e,}memsize' >= 2G, the top bit is set, resulting to -ERANGE errors and possibly random system memory boundaries. We fix this by replacing "kstrtol" with "kstrtoul". We also improve the code to check the kstrtoul return value and print a warning if an error was returned. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: <stable@vger.kernel.org> # v3.15+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7543/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Alchemy: Fix db1200 PSC clock enablementManuel Lauss2014-08-191-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable PSC0 (I2C/SPI) clock and leave PSC1 (Audio) alone. This patch restores functionality to both Audio and I2C/SPI. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: Linux-MIPS <linux-mips@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/7544/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785Hauke Mehrtens2014-08-191-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds some code based on code from the Broadcom GPL tar to fix the reboot problems on BCM4705/BCM4785. I tried rebooting my device for ~10 times and have never seen a problem. This reverts the changes in the previous commit and adds the real fix as suggested by Rafał. Setting bit 22 in Reg 22, sel 4 puts the BIU (Bus Interface Unit) into async mode. The previous commit was 316cad5c1d4daee998cd1f83ccdb437f6f20d45c [MIPS: BCM47XX: make reboot more relaiable] Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: jogo@openwrt.org Cc: zajec5@gmail.com Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7545/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Remove duplicated include from numa.cWei Yongjun2014-08-191-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Huacai Chen <chenhc@lemote.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/7537/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Add common plat_irq_dispatch declarationSergey Ryazanov2014-08-192-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add common declaration to get rid of following sparse warning: "symbol 'plat_irq_dispatch' was not declared. Should it be static?" Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Linux MIPS <linux-mips@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/7539/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: MSP71xx: remove unused plat_irq_dispatch() argumentSergey Ryazanov2014-08-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unused argument to make the plat_irq_dispatch() function declaration similar to the realization of other platforms. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Linux MIPS <linux-mips@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/7538/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: GIC: Remove useless parens from GICBIS().Ralf Baechle2014-08-181-1/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>