linux - linux

	Commit message (Collapse)	Author	Files	Lines
2015-09-18	s390/compat: remove superfluous compat wrappers	Heiko Carstens	2	-102/+51
	A couple of compat wrapper functions are simply trampolines to the real system call. This happened because the compat wrapper defines will only sign and zero extend system call parameters which are of different size on s390/s390x (longs and pointers). All other parameters will be correctly sign and zero extended by the normal system call wrappers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-18	s390/compat: do not trace compat wrapper functions	Heiko Carstens	1	-6/+6
	Add notrace to the compat wrapper define to disable tracing of compat wrapper functions. These are supposed to be very small and more or less just a trampoline to the real system call. Also fix indentation. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/s390x: allocate sys_membarrier system call number	Mathieu Desnoyers	2	-1/+3
	Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: linux-api@vger.kernel.org CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux-s390@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK	Michael Holzheu	1	-5/+0
	This config option is completely irrelevant for zfcpdump and unfortunately causes a kernel panic on recent kernels in "mspro_block_init()/driver_register()". Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390: wire up userfaultfd system call	Heiko Carstens	3	-1/+5
	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/vtime: correct scaled cputime for SMT	Martin Schwidefsky	1	-4/+8
	The scaled cputime is supposed to be derived from the normal per-thread cputime by dividing it with the average thread density in the last interval. The calculation of the scaling values for the average thread density is incorrect. The current, incorrect calculation: Ci = cycle count with i active threads T = unscaled cputime, sT = scaled cputime sT = T * (C1 + C2 + ... + Cn) / (1C1 + 2C2 + ... + nCn) The calculation happens to yield the correct numbers for the simple cases with only one Ci value not zero. But for cases with multiple Ci values not zero it fails. E.g. on a SMT-2 system with one thread active half the time and two threads active for the other half of the time it fails, the scaling factor should be 3/4 but the formula gives 2/3. The correct formula is sT = T (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn) Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/cpum_cf: Corrected return code for unauthorized counter sets	Hendrik Brueckner	1	-4/+8
	Previously, the cpum_cf PMU returned -EPERM if a counter is requested and the counter set to which the counter belongs is not authorized. According to the perf_event_open() system call manual, an error code of EPERM indicates an unsupported exclude setting or CAP_SYS_ADMIN is missing. Use ENOENT to indicate that particular counters are not available when the counter set which contains the counter is not authorized. For generic events, this might trigger a fall back, for example, to a software event. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/compat: correct uc_sigmask of the compat signal frame	Martin Schwidefsky	1	-4/+23
	The uc_sigmask in the ucontext structure is an array of words to keep the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t only has 64 bits). For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit there are two 4 byte words. The compat signal handler code uses a simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t. As s390 is a big-endian architecture this is incorrect, the two words in the 31 bit sigset_t array need to be swapped. Cc: <stable@vger.kernel.org> Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390: fix floating point register corruption	Heiko Carstens	1	-0/+2
	The critical section cleanup code misses to add the offset of the thread_struct to the task address. Therefore, if the critical section code gets executed, it may corrupt the task struct or restore the contents of the floating point registers from the wrong memory location. Fixes d0164ee20d "s390/kernel: remove save_fpu_regs() parameter and use __LC_CURRENT instead". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-17	s390/hibernate: fix save and restore of vector registers	Martin Schwidefsky	1	-35/+3
	The swsusp_arch_suspend()/swsusp_arch_resume() functions currently only save and restore the floating point registers. If the task that started the hibernation process is using vector registers they can get lost. To fix this just call save_fpu_regs in swsusp_arch_suspend(), the restore will happen automatically on return to user space. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-16	ia64: Enable userfaultfd and membarrier system calls	Luck, Tony	3	-1/+5
	Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-15	modsign: Fix GPL/OpenSSL licence incompatibility	David Woodhouse	2	-10/+13
	The GPL does not permit us to link against the OpenSSL library. Use LGPL for sign-file and extract-file instead. [ The whole "openssl isn't compatible with gpl" is really just fear-mongering, but there's no reason not to make modsign LGPL, so nobody cares. - Linus ] Reported-by: Julian Andres Klode <jak@jak-linux.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Julian Andres Klode <jak@jak-linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-14	pinctrl: samsung: s3c24xx: fix syntax error	Linus Walleij	1	-1/+1
	?SYNTAX ERROR irq_desc_get_irq_chip() does not exist. It should be irq_desc_get_chip(). Tested by compiling s3c2410_defconfig. Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	pinctrl: core: Warn about NULL gpio_chip in pinctrl_ready_for_gpio_range()	Tony Lindgren	1	-0/+3
	If the gpio driver is confused about the numbers for gpio-ranges, pinctrl_ready_for_gpio_range() may get called with invalid GPIO causing a NULL pointer exception. Let's instead provide a warning that allows fixing the problem and return with error. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	pinctrl: join lines that can be a single line within 80 columns	Masahiro Yamada	1	-2/+1
	There is no reason to break a line shorter than 80 columns. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	pinctrl: digicolor: convert null test to IS_ERR test	Julia Lawall	1	-2/+2
	Since commit 323de9efdf3e ("pinctrl: make pinctrl_register() return proper error code"), pinctrl_register returns an error code rather than NULL on failure. Update a driver that was introduced more recently. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e,e1,e2; @@ e = pinctrl_register(...) ... when != e = e1 if ( - e == NULL + IS_ERR(e) ) { ... return - e2 + PTR_ERR(e) ; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	pinctrl: qcom: ssbi: convert null test to IS_ERR test	Julia Lawall	2	-4/+4
	Since commit 323de9efdf3e ("pinctrl: make pinctrl_register() return proper error code"), pinctrl_register returns an error code rather than NULL on failure. Update some drivers that were introduced more recently. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e,e1,e2; @@ e = pinctrl_register(...) ... when != e = e1 if ( - e == NULL + IS_ERR(e) ) { ... return - e2 + PTR_ERR(e) ; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: omap: Fix GPIO numbering for deferred probe	Tony Lindgren	1	-1/+3
	If gpio-omap probe fails with -EPROBE_DEFER, the GPIO numbering keeps increasing. Only increase the gpio count if gpiochip_add() was successful as otherwise the numbers will increase for each probe attempt. Cc: Javier Martinez Canillas <javier@dowhile0.org> Cc: Kevin Hilman <khilman@deeprootsystems.com> Cc: Santosh Shilimkar <ssantosh@kernel.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	Documentation: gpio: Explain that <function>-gpio is also supported	Javier Martinez Canillas	1	-3/+3
	The GPIO documentation mentions that GPIOs are mapped by defining a <function>-gpios property in the consumer device's node but a -gpio sufix is also supported after commit: dd34c37aa3e8 ("gpio: of: Allow -gpio suffix for property names") Update the documentation to match the implementation. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: omap: Fix gpiochip_add() handling for deferred probe	Tony Lindgren	1	-1/+4
	Currently we gpio-omap breaks if gpiochip_add() returns -EPROBE_DEFER: [ 0.570000] gpiochip_add: GPIOs 0..31 (gpio) failed to register [ 0.570000] omap_gpio 48310000.gpio: Could not register gpio chip -517 ... [ 3.670000] omap_gpio 48310000.gpio: Unbalanced pm_runtime_enable! Let's fix the issue by adding the missing pm_runtime_put() on error. Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Javier Martinez Canillas <javier@dowhile0.org> Cc: Kevin Hilman <khilman@deeprootsystems.com> Cc: Santosh Shilimkar <ssantosh@kernel.org> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: sx150x: Remove unnecessary MODULE_ALIAS()	Javier Martinez Canillas	1	-1/+0
	The driver has a I2C device id table that is used to create the module aliases and also "sx150x" isn't a supported I2C id, so it's never used. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	Documentation: gpio: board: describe the con_id parameter	Dirk Behme	2	-0/+12
	The con_id parameter has to match the GPIO description and is automatically extended by the GPIO suffix if not NULL. I had to look into the code to understand this and properly find the GPIO I've been looking for, so document this. Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Acked-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	Documentation: gpio: board: add flags parameter to gpiod_get*() functions	Dirk Behme	1	-12/+13
	With commit 39b2bbe3d715 ("gpio: add flags argument to gpiod_get() functions") the gpiod_get() functions got a 'flags' parameter. Reflect this in the documentation, too. Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: Propagate errors from chip->get()	Bjorn Andersson	1	-7/+14
	It's possible to have gpio chips hanging off unreliable remote buses where the get() operation will fail to acquire a readout of the current gpio state. Propagate these errors to the consumer so that they can act on, retry or ignore these failing reads, instead of treating them as the line being held high. Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: rcar: GPIO_RCAR doesn't relate to ARM	Kuninori Morimoto	1	-1/+1
	8cd1470("gpio: rcar: Add r8a7795 (R-Car H3) support") added GPIO support for r8a7795. r8a7795 based on CONFIG_ARM64. OTOH, GPIO_RCAR driver can be compiled fine on non-ARM. This patch removed ARM dependency for it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: mxs: need to check return value of irq_alloc_generic_chip	Peng Fan	1	-2/+11
	Need to check return value of irq_alloc_generic_chip, because it may return NULL. 1. Change mxs_gpio_init_gc return type from void to int. 2. Add a new lable out_irqdomain_remove to remove the irq domain when mxc_gpio_init_gc fail. Signed-off-by: Peng Fan <van.freenix@gmail.com> Cc: Alexandre Courbot <gnurou@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-14	gpio: mxc: need to check return value of irq_alloc_generic_chip	Peng Fan	1	-2/+10
	Need to check return value of irq_alloc_generic_chip, because it may return NULL. 1. Change mxc_gpio_init_gc return type from void to int. 2. Add a new lable out_irqdomain_remove to remove the irq domain when mxc_gpio_init_gc fail. Signed-off-by: Peng Fan <van.freenix@gmail.com> Cc: Alexandre Courbot <gnurou@gmail.com> [Manually rebased] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-09-13	hwmon: (nct6775) Add support for NCT6793D	Guenter Roeck	3	-18/+38
	NCT6793D is register compatible with NCT6792D. Also move nct6775_sio_names[] closer to enum kinds to simplify adding new chips. Tested-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-09-13	hwmon: (nct6775) Swap STEP_UP_TIME and STEP_DOWN_TIME registers for most chips	Guenter Roeck	1	-6/+10
	The STEP_UP_TIME and STEP_DOWN_TIME registers are swapped for all chips but NCT6775. Reported-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-09-13	Linux 4.3-rc1v4.3-rc1	Linus Torvalds	1	-2/+2

2015-09-12	blk: rq_data_dir() should not return a boolean	Linus Torvalds	1	-1/+1
	rq_data_dir() returns either READ or WRITE (0 == READ, 1 == WRITE), not a boolean value. Now, admittedly the "!= 0" doesn't really change the value (0 stays as zero, 1 stays as one), but it's not only redundant, it confuses gcc, and causes gcc to warn about the construct switch (rq_data_dir(req)) { case READ: ... case WRITE: ... that we have in a few drivers. Now, the gcc warning is silly and stupid (it seems to warn not about the switch value having a different type from the case statements, but about _any_ boolean switch value), but in this case the code itself is silly and stupid too, so let's just change it, and get rid of warnings like this: drivers/block/hd.c: In function ‘hd_request’: drivers/block/hd.c:630:11: warning: switch condition has boolean value [-Wswitch-bool] switch (rq_data_dir(req)) { The odd '!= 0' came in when "cmd_flags" got turned into a "u64" in commit 5953316dbf90 ("block: make rq->cmd_flags be 64-bit") and is presumably because the old code (that just did a logical 'and' with 1) would then end up making the type of rq_data_dir() be u64 too. But if we want to retain the old regular integer type, let's just cast the result to 'int' rather than use that rather odd '!= 0'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	writeback: plug writeback in wb_writeback() and writeback_inodes_wb()	Linus Torvalds	1	-0/+6
	We had to revert the pluggin in writeback_sb_inodes() because the wb->list_lock is held, but we could easily plug at a higher level before taking that lock, and unplug after releasing it. This does that. Chris will run performance numbers, just to verify that this approach is comparable to the alternative (we could just drop and re-take the lock around the blk_finish_plug() rather than these two commits. I'd have preferred waiting for actual performance numbers before picking one approach over the other, but I don't want to release rc1 with the known "sleeping function called from invalid context" issue, so I'll pick this cleanup version for now. But if the numbers show that we really want to plug just at the writeback_sb_inodes() level, and we should just play ugly games with the spinlock, we'll switch to that. Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	thermal: fix intel PCH thermal driver mismerge	Linus Torvalds	1	-7/+4
	I didn't notice this when merging the thermal code from Zhang, but his merge (commit 5a924a07f882: "Merge branches 'thermal-core' and 'thermal-intel' of .git into next") of the thermal-core and thermal-intel branches was wrong. In thermal-core, commit 17e8351a7739 ("thermal: consistently use int for temperatures") converted the thermal layer to use "int" for temperatures. But in parallel, in the thermal-intel branch commit d0a12625d2ff ("thermal: Add Intel PCH thermal driver") added support for the intel PCH thermal sensor using the old interfaces that used "unsigned long" pointers. This resulted in warnings like this: drivers/thermal/intel_pch_thermal.c:184:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types] .get_temp = pch_thermal_get_temp, ^ drivers/thermal/intel_pch_thermal.c:184:14: note: (near initialization for ‘tzd_ops.get_temp’) drivers/thermal/intel_pch_thermal.c:186:19: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types] .get_trip_temp = pch_get_trip_temp, ^ drivers/thermal/intel_pch_thermal.c:186:19: note: (near initialization for ‘tzd_ops.get_trip_temp’) This fixes it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	ARCv2: [axs103_smp] Reduce clk for SMP FPGA configs	Vineet Gupta	1	-0/+2
	Newer bitfiles needs the reduced clk even for SMP builds Cc: <stable@vger.kernel.org> #4.2 Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	[CIFS] mount option sec=none not displayed properly in /proc/mounts	Steve French	1	-1/+4
	When the user specifies "sec=none" in a cifs mount, we set sec_type as unspecified (and set a flag and the username will be null) rather than setting sectype as "none" so cifs_show_security was not properly displaying it in cifs /proc/mounts entries. Signed-off-by: Steve French <steve.french@primarydata.com> Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
2015-09-12	revert "ocfs2/dlm: use list_for_each_entry instead of list_for_each"	Andrew Morton	1	-2/+4
	Revert commit f83c7b5e9fd6 ("ocfs2/dlm: use list_for_each_entry instead of list_for_each"). list_for_each_entry() will dereference its `pos' argument, which can be NULL in dlm_process_recovery_data(). Reported-by: Julia Lawall <julia.lawall@lip6.fr> Reported-by: Fengguang Wu <fengguang.wu@gmail.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	mm/early_ioremap: add explicit #include of asm/early_ioremap.h	Ard Biesheuvel	1	-0/+1
	Commit 6b0f68e32ea8 ("mm: add utility for early copy from unmapped ram") introduces a function copy_from_early_mem() into mm/early_ioremap.c which itself calls early_memremap()/early_memunmap(). However, since early_memunmap() has not been declared yet at this point in the .c file, nor by any explicitly included header files, we are depending on a transitive include of asm/early_ioremap.h to declare it, which is fragile. So instead, include this header explicitly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Salter <msalter@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void	Joe Perches	4	-50/+45
	The seq_<foo> function return values were frequently misused. See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") All uses of these return values have been removed, so convert the return types to void. Miscellanea: o Move seq_put_decimal_<type> and seq_escape prototypes closer the other seq_vprintf prototypes o Reorder seq_putc and seq_puts to return early on overflow o Add argument names to seq_vprintf and seq_printf o Update the seq_escape kernel-doc o Convert a couple of leading spaces to tabs in seq_escape Signed-off-by: Joe Perches <joe@perches.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mark Brown <broonie@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	selftests: enhance membarrier syscall test	Mathieu Desnoyers	1	-25/+75
	Update the membarrier syscall self-test to match the membarrier interface. Extend coverage of the interface. Consider ENOSYS as a "SKIP" test, since it is a valid configuration, but does not allow testing the system call. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	selftests: add membarrier syscall test	Pranith Kumar	4	-0/+84
	Add a self test for the membarrier system call. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	sys_membarrier(): system-wide memory barrier (generic, x86)	Mathieu Desnoyers	11	-1/+151
	Here is an implementation of a new system call, sys_membarrier(), which executes a memory barrier on all threads running on the system. It is implemented by calling synchronize_sched(). It can be used to distribute the cost of user-space memory barriers asymmetrically by transforming pairs of memory barriers into pairs consisting of sys_membarrier() and a compiler barrier. For synchronization primitives that distinguish between read-side and write-side (e.g. userspace RCU [1], rwlocks), the read-side can be accelerated significantly by moving the bulk of the memory barrier overhead to the write-side. The existing applications of which I am aware that would be improved by this system call are as follows: * Through Userspace RCU library (http://urcu.so) - DNS server (Knot DNS) https://www.knot-dns.cz/ - Network sniffer (http://netsniff-ng.org/) - Distributed object storage (https://sheepdog.github.io/sheepdog/) - User-space tracing (http://lttng.org) - Network storage system (https://www.gluster.org/) - Virtual routers (https://events.linuxfoundation.org/sites/events/files/slides/DPDK_RCU_0MQ.pdf) - Financial software (https://lkml.org/lkml/2015/3/23/189) Those projects use RCU in userspace to increase read-side speed and scalability compared to locking. Especially in the case of RCU used by libraries, sys_membarrier can speed up the read-side by moving the bulk of the memory barrier cost to synchronize_rcu(). * Direct users of sys_membarrier - core dotnet garbage collector (https://github.com/dotnet/coreclr/issues/198) Microsoft core dotnet GC developers are planning to use the mprotect() side-effect of issuing memory barriers through IPIs as a way to implement Windows FlushProcessWriteBuffers() on Linux. They are referring to sys_membarrier in their github thread, specifically stating that sys_membarrier() is what they are looking for. To explain the benefit of this scheme, let's introduce two example threads: Thread A (non-frequent, e.g. executing liburcu synchronize_rcu()) Thread B (frequent, e.g. executing liburcu rcu_read_lock()/rcu_read_unlock()) In a scheme where all smp_mb() in thread A are ordering memory accesses with respect to smp_mb() present in Thread B, we can change each smp_mb() within Thread A into calls to sys_membarrier() and each smp_mb() within Thread B into compiler barriers "barrier()". Before the change, we had, for each smp_mb() pairs: Thread A Thread B previous mem accesses previous mem accesses smp_mb() smp_mb() following mem accesses following mem accesses After the change, these pairs become: Thread A Thread B prev mem accesses prev mem accesses sys_membarrier() barrier() follow mem accesses follow mem accesses As we can see, there are two possible scenarios: either Thread B memory accesses do not happen concurrently with Thread A accesses (1), or they do (2). 1) Non-concurrent Thread A vs Thread B accesses: Thread A Thread B prev mem accesses sys_membarrier() follow mem accesses prev mem accesses barrier() follow mem accesses In this case, thread B accesses will be weakly ordered. This is OK, because at that point, thread A is not particularly interested in ordering them with respect to its own accesses. 2) Concurrent Thread A vs Thread B accesses Thread A Thread B prev mem accesses prev mem accesses sys_membarrier() barrier() follow mem accesses follow mem accesses In this case, thread B accesses, which are ensured to be in program order thanks to the compiler barrier, will be "upgraded" to full smp_mb() by synchronize_sched(). * Benchmarks On Intel Xeon E5405 (8 cores) (one thread is calling sys_membarrier, the other 7 threads are busy looping) 1000 non-expedited sys_membarrier calls in 33s =3D 33 milliseconds/call. * User-space user of this system call: Userspace RCU library Both the signal-based and the sys_membarrier userspace RCU schemes permit us to remove the memory barrier from the userspace RCU rcu_read_lock() and rcu_read_unlock() primitives, thus significantly accelerating them. These memory barriers are replaced by compiler barriers on the read-side, and all matching memory barriers on the write-side are turned into an invocation of a memory barrier on all active threads in the process. By letting the kernel perform this synchronization rather than dumbly sending a signal to every process threads (as we currently do), we diminish the number of unnecessary wake ups and only issue the memory barriers on active threads. Non-running threads do not need to execute such barrier anyway, because these are implied by the scheduler context switches. Results in liburcu: Operations in 10s, 6 readers, 2 writers: memory barriers in reader: 1701557485 reads, 2202847 writes signal-based scheme: 9830061167 reads, 6700 writes sys_membarrier: 9952759104 reads, 425 writes sys_membarrier (dyn. check): 7970328887 reads, 425 writes The dynamic sys_membarrier availability check adds some overhead to the read-side compared to the signal-based scheme, but besides that, sys_membarrier slightly outperforms the signal-based scheme. However, this non-expedited sys_membarrier implementation has a much slower grace period than signal and memory barrier schemes. Besides diminishing the number of wake-ups, one major advantage of the membarrier system call over the signal-based scheme is that it does not need to reserve a signal. This plays much more nicely with libraries, and with processes injected into for tracing purposes, for which we cannot expect that signals will be unused by the application. An expedited version of this system call can be added later on to speed up the grace period. Its implementation will likely depend on reading the cpu_curr()->mm without holding each CPU's rq lock. This patch adds the system call to x86 and to asm-generic. [1] http://urcu.so membarrier(2) man page: MEMBARRIER(2) Linux Programmer's Manual MEMBARRIER(2) NAME membarrier - issue memory barriers on a set of threads SYNOPSIS #include <linux/membarrier.h> int membarrier(int cmd, int flags); DESCRIPTION The cmd argument is one of the following: MEMBARRIER_CMD_QUERY Query the set of supported commands. It returns a bitmask of supported commands. MEMBARRIER_CMD_SHARED Execute a memory barrier on all threads running on the system. Upon return from system call, the caller thread is ensured that all running threads have passed through a state where all memory accesses to user-space addresses match program order between entry to and return from the system call (non-running threads are de facto in such a state). This covers threads from all pro=E2=80=90 cesses running on the system. This command returns 0. The flags argument needs to be 0. For future extensions. All memory accesses performed in program order from each targeted thread is guaranteed to be ordered with respect to sys_membarrier(). If we use the semantic "barrier()" to represent a compiler barrier forcing memory accesses to be performed in program order across the barrier, and smp_mb() to represent explicit memory barriers forcing full memory ordering across the barrier, we have the following ordering table for each pair of barrier(), sys_membarrier() and smp_mb(): The pair ordering is detailed as (O: ordered, X: not ordered): barrier() smp_mb() sys_membarrier() barrier() X X O smp_mb() X O O sys_membarrier() O O O RETURN VALUE On success, these system calls return zero. On error, -1 is returned, and errno is set appropriately. For a given command, with flags argument set to 0, this system call is guaranteed to always return the same value until reboot. ERRORS ENOSYS System call is not implemented. EINVAL Invalid arguments. Linux 2015-04-15 MEMBARRIER(2) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Nicholas Miell <nmiell@comcast.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	MODSIGN: fix a compilation warning in extract-cert	David Howells	1	-1/+1
	Fix the following warning when compiling extract-cert: scripts/extract-cert.c: In function `write_cert': scripts/extract-cert.c:89:2: warning: format not a string literal and no format arguments [-Wformat-security] ERR(!i2d_X509_bio(wb, x509), cert_dst); ^ whereby the ERR() macro is taking cert_dst as the format string. "%s" should be used as the format string as the path could contain special characters. Signed-off-by: David Howells <dhowells@redhat.com> Reported-by: Jim Davis <jim.epost@gmail.com> Acked-by : David Woodhouse <david.woodhouse@intel.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12	IB/ehca: Deprecate driver, move to staging, schedule deletion	Doug Ledford	36	-3/+9
	The ehca driver is only supported on IBM machines with a custom EBus. As they have opted to build their newer machines using more industry standard technology and haven't really been pushing EBus capable machines for a while, this driver can now safely be moved to the staging area and scheduled for eventual removal. This plan was brought to IBM's attention and received their sign-off. Cc: alexs@linux.vnet.ibm.com Cc: hnguyen@de.ibm.com Cc: raisch@de.ibm.com Cc: stefan.roscher@de.ibm.com Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-11	Revert "writeback: plug writeback at a high level"	Linus Torvalds	1	-3/+4
	This reverts commit d353d7587d02116b9732d5c06615aed75a4d3a47. Doing the block layer plug/unplug inside writeback_sb_inodes() is broken, because that function is actually called with a spinlock held: wb->list_lock, as pointed out by Chris Mason. Chris suggested just dropping and re-taking the spinlock around the blk_finish_plug() call (the plgging itself can happen under the spinlock), and that would technically work, but is just disgusting. We do something fairly similar - but not quite as disgusting because we at least have a better reason for it - in writeback_single_inode(), so it's not like the caller can depend on the lock being held over the call, but in this case there just isn't any good reason for that "release and re-take the lock" pattern. [ In general, we should really strive to avoid the "release and retake" pattern for locks, because in the general case it can easily cause subtle bugs when the caller caches any state around the call that might be invalidated by dropping the lock even just temporarily. ] But in this case, the plugging should be easy to just move up to the callers before the spinlock is taken, which should even improve the effectiveness of the plug. So there is really no good reason to play games with locking here. I'll send off a test-patch so that Dave Chinner can verify that that plug movement works. In the meantime this just reverts the problematic commit and adds a comment to the function so that we hopefully don't make this mistake again. Reported-by: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11	scsi_dh: fix randconfig build error	Christoph Hellwig	1	-1/+1
	It looks like the Kconfig check that was meant to fix this (commit fe9233fb6914a0eb20166c967e3020f7f0fba2c9 [SCSI] scsi_dh: fix kconfig related build errors) was actually reversed, but no-one noticed until the new set of patches which separated DM and SCSI_DH). Fixes: fe9233fb6914a0eb20166c967e3020f7f0fba2c9 Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-09-11	ARM: 8431/1: fix alignement of __bug_table section entries	Robert Jarzmik	1	-0/+1
	On old ARM chips, unaligned accesses to memory are not trapped and fixed. On module load, symbols are relocated, and the relocation of __bug_table symbols is done on a u32 basis. Yet the section is not aligned to a multiple of 4 address, but to a multiple of 2. This triggers an Oops on pxa architecture, where address 0xbf0021ea is the first relocation in the __bug_table section : apply_relocate(): pxa3xx_nand: section 13 reloc 0 sym '' Unable to handle kernel paging request at virtual address bf0021ea pgd = e1cd0000 [bf0021ea] pgd=c1cce851, pte=c1cde04f, *ppte=c1cde01f Internal error: Oops: 23 [#1] ARM Modules linked in: CPU: 0 PID: 606 Comm: insmod Not tainted 4.2.0-rc8-next-20150828-cm-x300+ #887 Hardware name: CM-X300 module task: e1c68700 ti: e1c3e000 task.ti: e1c3e000 PC is at apply_relocate+0x2f4/0x3d4 LR is at 0xbf0021ea pc : [<c000e7c8>] lr : [<bf0021ea>] psr: 80000013 sp : e1c3fe30 ip : 60000013 fp : e49e8c60 r10: e49e8fa8 r9 : 00000000 r8 : e49e7c58 r7 : e49e8c38 r6 : e49e8a58 r5 : e49e8920 r4 : e49e8918 r3 : bf0021ea r2 : bf007034 r1 : 00000000 r0 : bf000000 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 0000397f Table: c1cd0018 DAC: 00000051 Process insmod (pid: 606, stack limit = 0xe1c3e198) [<c000e7c8>] (apply_relocate) from [<c005ce5c>] (load_module+0x1248/0x1f5c) [<c005ce5c>] (load_module) from [<c005dc54>] (SyS_init_module+0xe4/0x170) [<c005dc54>] (SyS_init_module) from [<c000a420>] (ret_fast_syscall+0x0/0x38) Fix this by ensuring entries in __bug_table are all aligned to at least of multiple of 4. This transforms a module section __bug_table as : - [12] __bug_table PROGBITS 00000000 002232 000018 00 A 0 0 1 + [12] __bug_table PROGBITS 00000000 002232 000018 00 A 0 0 4 Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-11	arm/xen: Enable user access to the kernel before issuing a privcmd call	Julien Grall	1	-0/+15
	When Xen is copying data to/from the guest it will check if the kernel has the right to do the access. If not, the hypercall will return an error. After the commit a5e090acbf545c0a3b04080f8a488b17ec41fe02 "ARM: software-based privileged-no-access support", the kernel can't access any longer the user space by default. This will result to fail on every hypercall made by the userspace (i.e via privcmd). We have to enable the userspace access and then restore the correct permission every time the privcmd is used to made an hypercall. I didn't find generic helpers to do a these operations, so the change is only arm32 specific. Reported-by: Riku Voipio <riku.voipio@linaro.org> Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-11	ARM: domains: add memory dependencies to get_domain/set_domain	Russell King	1	-2/+4
	We need to have memory dependencies on get_domain/set_domain to avoid the compiler over-optimising these inline assembly instructions. Loads/stores must not be reordered across a set_domain(), so introduce a compiler barrier for that assembly. The value of get_domain() must not be cached across a set_domain(), but we still want to allow the compiler to optimise it away. Introduce a dependency on current_thread_info()->cpu_domain to avoid this; the new memory clobber in set_domain() should therefore cause the compiler to re-load this. The other advantage of using this is we should have its address in the register set already, or very soon after at most call sites. Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-11	ARM: domains: thread_info.h no longer needs asm/domains.h	Russell King	1	-1/+0
	As of 1eef5d2f1b46 ("ARM: domains: switch to keeping domain value in register") we no longer need to include asm/domains.h into asm/thread_info.h. Remove it. Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-11	CIFS: fix type confusion in copy offload ioctl	Jann Horn	1	-0/+6
	This might lead to local privilege escalation (code execution as kernel) for systems where the following conditions are met: - CONFIG_CIFS_SMB2 and CONFIG_CIFS_POSIX are enabled - a cifs filesystem is mounted where: - the mount option "vers" was used and set to a value >=2.0 - the attacker has write access to at least one file on the filesystem To attack this, an attacker would have to guess the target_tcon pointer (but guessing wrong doesn't cause a crash, it just returns an error code) and win a narrow race. CC: Stable <stable@vger.kernel.org> Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Steve French <smfrench@gmail.com>