path: root/include
Commit message  Author  Age  Files  Lines
* NFSv4.1: Allow revoked stateids to skip the call to TEST_STATEID  Trond Myklebust  2016-09-27  1  -0/+1
|
| In some cases (e.g. when the SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED sequence flag is set) we may already know that the stateid was revoked and that the only valid operation we can call is FREE_STATEID. In those cases, allow the stateid to carry the information in the type field, so that we skip the redundant call to TEST_STATEID.
|
| Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| Tested-by: Oleg Drokin <green@linuxhacker.ru>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* nfs: allow blocking locks to be awoken by lock callbacks  Jeff Layton  2016-09-22  1  -0/+3
|
| Add a waitqueue head to the client structure. Have clients set a wait on that queue prior to requesting a lock from the server. If the lock is blocked, then we can use that to wait for wakeups.
|
| Note that we do need to do this "manually" since we need to set the wait on the waitqueue prior to requesting the lock, but requesting a lock can involve activities that can block.
|
| However, only do that for NFSv4.1 locks, either by compiling out all of the waitqueue handling when CONFIG_NFS_V4_1 is disabled, or skipping all of it at runtime if we're dealing with v4.0, or v4.1 servers that don't send lock callbacks.
|
| Note too that even when we expect to get a lock callback, RFC5661 section 20.11.4 is pretty clear that we still need to poll for them, so we do still sleep on a timeout. We do however always poll at the longest interval in that case.
|
| Signed-off-by: Jeff Layton <jlayton@redhat.com>
| [Anna: nfs4_retry_setlk() "status" should default to -ERESTARTSYS]
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* nfs: add a new NFS4_OPEN_RESULT_MAY_NOTIFY_LOCK constant  Jeff Layton  2016-09-22  1  -2/+3
|
| As defined in RFC 5661, section 18.16.
|
| Signed-off-by: Jeff Layton <jlayton@redhat.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* pnfs: add a new mechanism to select a layout driver according to an ordered list  Jeff Layton  2016-09-19  1  -0/+1
|
| Currently, the layout driver selection code always chooses the first one from the list. That's not really ideal however, as the server can send the list of layout types in any order that it likes. It's up to the client to select the best one for its needs.
|
| This patch adds an ordered list of preferred driver types and has the selection code sort the list of available layout drivers according to it. Any unrecognized layout type is sorted to the end of the list.
|
| For now, the order of preference is hardcoded, but it should be possible to make this configurable in the future.
|
| Signed-off-by: Jeff Layton <jlayton@redhat.com>
| Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
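The sorting step described in the pnfs commit above can be sketched in user space roughly as follows. This is only an illustration, not the kernel code: the preference table, its numeric values, and the names `preferred`, `layout_rank` and `sort_layout_types` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical preference order, most preferred first (placeholder values). */
static const uint32_t preferred[] = { 4, 1, 3 };
#define N_PREFERRED (sizeof(preferred) / sizeof(preferred[0]))

/* Rank of a layout type; unrecognized types rank after every known one. */
static size_t layout_rank(uint32_t type)
{
    for (size_t i = 0; i < N_PREFERRED; i++)
        if (preferred[i] == type)
            return i;
    return N_PREFERRED;
}

/* qsort() comparator: lower rank sorts first. */
static int layout_cmp(const void *a, const void *b)
{
    size_t ra = layout_rank(*(const uint32_t *)a);
    size_t rb = layout_rank(*(const uint32_t *)b);

    return (ra > rb) - (ra < rb);
}

/* Reorder the server-advertised list by client preference. */
static void sort_layout_types(uint32_t *types, size_t n)
{
    qsort(types, n, sizeof(*types), layout_cmp);
}
```

After sorting, a client in this sketch would simply walk the list from the front and take the first type for which it has a driver.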
* xprtrdma: Support larger inline thresholds  Chuck Lever  2016-09-19  1  -2/+2
|
| The Version One default inline threshold is still 1KB. But allow testing with thresholds up to 64KB.
|
| This maximum is somewhat arbitrary. There's no fundamental architectural limit I'm aware of, but it's good to keep the size of Receive buffers reasonable. Now that Send can use a s/g list, a Send buffer is only as large as each RPC requires. Receive buffers are always the size of the inline threshold, however.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* xprtrdma: Client-side support for rpcrdma_connect_private  Chuck Lever  2016-09-19  1  -0/+4
|
| Send an RDMA-CM private message on connect, and look for one during a connection-established event.
|
| Both sides can communicate their various implementation limits. Implementations that don't support this sideband protocol ignore it.
|
| Once the client knows the server's inline threshold maxima, it can adjust the use of Reply chunks, and eliminate most use of Position Zero Read chunks. Moderately-sized I/O can be done using a pure inline RDMA Send instead of RDMA operations that require memory registration.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* rpcrdma: RDMA/CM private message data structure  Chuck Lever  2016-09-19  1  -0/+35
|
| Introduce data structure used by both client and server to exchange implementation details during RDMA/CM connection establishment.
|
| This is an experimental out-of-band exchange between Linux RPC-over-RDMA Version One implementations, replacing the deprecated CCP (see RFC 5666bis). The purpose of this extension is to enable prototyping of features that might be introduced in a subsequent version of RPC-over-RDMA.
|
| Suggested by Christoph Hellwig and Devesh Sharma.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC: Add a transport-specific private field in rpc_rqst  Chuck Lever  2016-09-19  1  -0/+1
|
| Currently there's a hidden and indirect mechanism for finding the rpcrdma_req that goes with an rpc_rqst. It depends on getting from the rq_buffer pointer in struct rpc_rqst to the struct rpcrdma_regbuf that controls that buffer, and then to the struct rpcrdma_req it goes with.
|
| This was done back in the day to avoid the need to add a per-rqst pointer or to alter the buf_free API when support for RPC-over-RDMA was introduced.
|
| I'm about to change the way regbuf's work to support larger inline thresholds. Now is a good time to replace this indirect mechanism with something that is more straightforward. I guess this should be considered a clean up.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC: Separate buffer pointers for RPC Call and Reply messages  Chuck Lever  2016-09-19  1  -2/+3
|
| For xprtrdma, the RPC Call and Reply buffers are involved in real I/O operations.
|
| To start with, the DMA direction of the I/O for a Call is opposite that of a Reply.
|
| In the current arrangement, the Reply buffer address is on a four-byte alignment just past the call buffer. Would be friendlier on some platforms if that was at a DMA cache alignment instead.
|
| Because the current arrangement allocates a single memory region which contains both buffers, the RPC Reply buffer often contains a page boundary in it when the Call buffer is large enough (which is frequent).
|
| It would be a little nicer for setting up DMA operations (and possible registration of the Reply buffer) if the two buffers were separated, well-aligned, and contained as few page boundaries as possible.
|
| Now, I could just pad out the single memory region used for the pair of buffers. But frequently that would mean a lot of unused space to ensure the Reply buffer did not have a page boundary.
|
| Add a separate pointer to rpc_rqst that points right to the RPC Reply buffer. This makes no difference to xprtsock, but it will help xprtrdma in subsequent patches.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC: Generalize the RPC buffer release API  Chuck Lever  2016-09-19  2  -2/+2
|
| xprtrdma needs to allocate the Call and Reply buffers separately. TBH, the reliance on using a single buffer for the pair of XDR buffers is transport implementation-specific.
|
| Instead of passing just the rq_buffer into the buf_free method, pass the task structure and let buf_free take care of freeing both XDR buffers at once.
|
| There's a micro-optimization here. In the common case, both xprt_release and the transport's buf_free method were checking if rq_buffer was NULL. Now the check is done only once per RPC.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC: Generalize the RPC buffer allocation API  Chuck Lever  2016-09-19  2  -2/+2
|
| xprtrdma needs to allocate the Call and Reply buffers separately. TBH, the reliance on using a single buffer for the pair of XDR buffers is transport implementation-specific.
|
| Transports that want to allocate separate Call and Reply buffers will ignore the "size" argument anyway. Don't bother passing it.
|
| The buf_alloc method can't return two pointers. Instead, make the method's return value an error code, and set the rq_buffer pointer in the method itself.
|
| This gives call_allocate an opportunity to terminate an RPC instead of looping forever when a permanent problem occurs. If a request is just bogus, or the transport is in a state where it can't allocate resources for any request, there needs to be a way to kill the RPC right there and not loop.
|
| This immediately fixes a rare problem in the backchannel send path, which loops if the server happens to send a CB request whose call+reply size is larger than a page (which it shouldn't do yet).
|
| One more issue: looks like xprt_inject_disconnect was incorrectly placed in the failure path in call_allocate. It needs to be in the success path, as it is for other call-sites.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
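The change of calling convention described above (return an error code, set the buffer pointer inside the method) might look like this in a stand-alone sketch. Everything here is hypothetical: `struct demo_rqst`, `DEMO_MAX_BUF`, the function names and the particular errno values are assumptions made for illustration and are not taken from the SUNRPC code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; not the kernel's struct rpc_rqst. */
struct demo_rqst {
    size_t callsize;
    size_t rcvsize;
    void *buffer;
};

#define DEMO_MAX_BUF 4096u   /* assumed per-request limit */

/*
 * Returns 0 and sets req->buffer on success. -EINVAL marks a request that
 * can never be satisfied; -ENOMEM marks a failure that may be temporary.
 */
static int demo_buf_alloc(struct demo_rqst *req)
{
    if (req->callsize > DEMO_MAX_BUF ||
        req->rcvsize > DEMO_MAX_BUF - req->callsize)
        return -EINVAL;

    req->buffer = malloc(req->callsize + req->rcvsize);
    return req->buffer ? 0 : -ENOMEM;
}

enum demo_next { DEMO_PROCEED, DEMO_RETRY, DEMO_FAIL };

/* Caller: retry only when the failure may go away, otherwise give up. */
static enum demo_next demo_call_allocate(struct demo_rqst *req)
{
    int status = demo_buf_alloc(req);

    if (status == 0)
        return DEMO_PROCEED;
    if (status == -ENOMEM)
        return DEMO_RETRY;
    return DEMO_FAIL;
}
```

With a pointer-only return value the caller could not tell the two failure cases apart, which is what makes an endless retry loop possible.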
* SUNRPC: Refactor rpc_xdr_buf_init()  Chuck Lever  2016-09-19  2  -1/+13
|
| Clean up: there is some XDR initialization logic that is common to the forward channel and backchannel. Move it to an XDR header so it can be shared.
|
| rpc_rqst::rq_buffer points to a buffer containing big-endian data. Update its annotation as part of the clean up.
|
| Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC: rpc_clnt_add_xprt setup function for NFS layer  Andy Adamson  2016-09-19  1  -0/+12
|
| Use a setup function to call into the NFS layer to test an rpc_xprt for session trunking so as to not leak the rpc_xprt_switch into the nfs layer.
|
| Search for the address in the rpc_xprt_switch first so as not to put an unnecessary EXCHANGE_ID on the wire.
|
| Signed-off-by: Andy Adamson <andros@netapp.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC search xprt switch for sockaddr  Andy Adamson  2016-09-19  2  -0/+4
|
| Signed-off-by: Andy Adamson <andros@netapp.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC rpc_clnt_xprt_switch_add_xprt  Andy Adamson  2016-09-19  1  -0/+1
|
| Give the NFS layer access to the rpc_xprt_switch_add_xprt function
|
| Signed-off-by: Andy Adamson <andros@netapp.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* SUNRPC rpc_clnt_xprt_switch_put  Andy Adamson  2016-09-19  1  -0/+2
|
| Give the NFS layer access to the xprt_switch_put function
|
| Signed-off-by: Andy Adamson <andros@netapp.com>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* pnfs: track multiple layout types in fsinfo structure  Jeff Layton  2016-09-19  1  -1/+6
|
| The current NFSv4.1/pNFS client assumes that the MDS supports only one layout type. While that is true for most existing servers, this can change in the near future.
|
| For now, this patch just plumbs in the ability to track a list of layouts in the fsinfo structure. The existing behavior of the client is preserved, by having it just select the first entry in the list.
|
| Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
| Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
| Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
| Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2016-09-18  1  -0/+2
|\
| | Pull SMP build fixlet from Thomas Gleixner:
| |  "Add a missing include in cpuhotplug.h"
| |
| | * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| |   cpu/hotplug: Include linux/types.h in linux/cpuhotplug.h
| * cpu/hotplug: Include linux/types.h in linux/cpuhotplug.h  Paul Burton  2016-09-14  1  -0/+2
| |
| | The linux/cpuhotplug.h header makes use of the bool type, but wasn't including linux/types.h to ensure that type has been defined. Fix this by including linux/types.h in preparation for including linux/cpuhotplug.h in a file that doesn't do so already.
| |
| | Signed-off-by: Paul Burton <paul.burton@imgtec.com>
| | Cc: linux-mips@linux-mips.org
| | Cc: Richard Cochran <rcochran@linutronix.de>
| | Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | Cc: Ralf Baechle <ralf@linux-mips.org>
| | Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
| | Link: http://lkml.kernel.org/r/20160914100027.20945-1-paul.burton@imgtec.com
| | Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2016-09-18  1  -0/+10
|\ \
| | | Pull irq fixes from Thomas Gleixner:
| | |  "Two patches from Boris which address a potential deadlock in the atmel irq chip driver"
| | |
| | | * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | |   irqchip/atmel-aic: Fix potential deadlock in ->xlate()
| | |   genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
| * | genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers  Boris Brezillon  2016-09-13  1  -0/+10
| |/
| | Some irqchip drivers need to take the generic chip lock outside of the irq context.
| |
| | Provide the irq_gc_{lock_irqsave,unlock_irqrestore}() helpers to allow one to disable irqs while entering a critical section protected by gc->lock.
| |
| | Note that we do not provide optimized version of these helpers for !SMP, because they are not called from the hot-path.
| |
| | [ tglx: Added a comment when these helpers should be [not] used ]
| |
| | Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
| | Cc: Jason Cooper <jason@lakedaemon.net>
| | Cc: Marc Zyngier <marc.zyngier@arm.com>
| | Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
| | Cc: stable@vger.kernel.org
| | Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
| | Link: http://lkml.kernel.org/r/1473775109-4192-1-git-send-email-boris.brezillon@free-electrons.com
| | Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | fix iov_iter_fault_in_readable()  Al Viro  2016-09-17  1  -1/+1
| |
| | ... by turning it into what used to be multipages counterpart
| |
| | Cc: stable@vger.kernel.org
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  Linus Torvalds  2016-09-14  1  -7/+13
|\ \
| | | Pull uaccess fixes from Al Viro:
| | |  "Fixes for broken uaccess primitives - mostly lack of proper zeroing in copy_from_user()/get_user()/__get_user(), but for several architectures there's more (broken clear_user() on frv and strncpy_from_user() on hexagon)"
| | |
| | | * 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
| | |   avr32: fix copy_from_user()
| | |   microblaze: fix __get_user()
| | |   microblaze: fix copy_from_user()
| | |   m32r: fix __get_user()
| | |   blackfin: fix copy_from_user()
| | |   sparc32: fix copy_from_user()
| | |   sh: fix copy_from_user()
| | |   sh64: failing __get_user() should zero
| | |   score: fix copy_from_user() and friends
| | |   score: fix __get_user/get_user
| | |   s390: get_user() should zero on failure
| | |   ppc32: fix copy_from_user()
| | |   parisc: fix copy_from_user()
| | |   openrisc: fix copy_from_user()
| | |   nios2: fix __get_user()
| | |   nios2: copy_from_user() should zero the tail of destination
| | |   mn10300: copy_from_user() should zero on access_ok() failure...
| | |   mn10300: failing __get_user() and get_user() should zero
| | |   mips: copy_from_user() must zero the destination on access_ok() failure
| | |   ARC: uaccess: get_user to zero out dest in cause of fault
| | |   ...
| * | asm-generic: make get_user() clear the destination on errors  Al Viro  2016-09-13  1  -3/+7
| | |
| | | both for access_ok() failures and for faults halfway through
| | |
| | | Cc: stable@vger.kernel.org
| | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | asm-generic: make copy_from_user() zero the destination properly  Al Viro  2016-09-10  1  -4/+6
| | |
| | | ... in all cases, including the failing access_ok()
| | |
| | | Note that some architectures using asm-generic/uaccess.h have __copy_from_user() not zeroing the tail on failure halfway through. This variant works either way.
| | |
| | | Cc: stable@vger.kernel.org
| | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
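The zeroing rule stated in the commit above (zero whatever was not copied, whether the range check failed or the copy stopped halfway) can be illustrated with a user-space simulation. The helpers `demo_access_ok` and `demo_raw_copy` and the `fault_after` parameter are hypothetical stand-ins; real user-memory access is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical range check: a NULL source plays the role of a bad address. */
static int demo_access_ok(const void *src, size_t n)
{
    (void)n;
    return src != NULL;
}

/*
 * Simulated raw copy that "faults" after fault_after bytes and does not
 * zero the tail itself. Returns the number of bytes NOT copied.
 */
static size_t demo_raw_copy(void *dst, const void *src, size_t n,
                            size_t fault_after)
{
    size_t done = n < fault_after ? n : fault_after;

    memcpy(dst, src, done);
    return n - done;
}

/* Wrapper: every byte that was not copied is zeroed, in all failure cases. */
static size_t demo_copy_from_user(void *dst, const void *src, size_t n,
                                  size_t fault_after)
{
    size_t left = n;

    if (demo_access_ok(src, n))
        left = demo_raw_copy(dst, src, n, fault_after);
    if (left)
        memset((char *)dst + (n - left), 0, left);
    return left;
}
```

Because the wrapper zeroes the tail itself, it does not matter whether the underlying raw copy also does so, which matches the "works either way" remark in the commit message.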
* | | Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2016-09-13  1  -3/+4
|\ \ \
| | | | Pull locking fix from Ingo Molnar:
| | | |  "Another lockless_dereference() Sparse fix"
| | | |
| | | | * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | | |   locking/barriers: Don't use sizeof(void) in lockless_dereference()
| * | | locking/barriers: Don't use sizeof(void) in lockless_dereference()  Johannes Berg  2016-09-05  1  -3/+4
| | | |
| | | | My previous commit:
| | | |
| | | |   112dc0c8069e ("locking/barriers: Suppress sparse warnings in lockless_dereference()")
| | | |
| | | | caused sparse to complain that (in radix-tree.h) we use sizeof(void) since that rcu_dereference()s a void *.
| | | |
| | | | Really, all we need is to have the expression *p in here somewhere to make sure p is a pointer type, and sizeof(*p) was the thing that came to my mind first to make sure that's done without really doing anything at runtime.
| | | |
| | | | Another thing I had considered was using typeof(*p), but obviously we can't just declare a typeof(*p) variable either, since that may end up being void. Declaring a variable as typeof(*p)* gets around that, and still checks that typeof(*p) is valid, so do that. This type construction can't be done for _________p1 because that will actually be used and causes sparse address space warnings, so keep a separate unused variable for it.
| | | |
| | | | Reported-by: Fengguang Wu <fengguang.wu@intel.com>
| | | | Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | | Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
| | | | Cc: Linus Torvalds <torvalds@linux-foundation.org>
| | | | Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Cc: Thomas Gleixner <tglx@linutronix.de>
| | | | Cc: kbuild-all@01.org
| | | | Fixes: 112dc0c8069e ("locking/barriers: Suppress sparse warnings in lockless_dereference()")
| | | | Link: http://lkml.kernel.org/r/1472192160-4049-1-git-send-email-johannes@sipsolutions.net
| | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
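The typeof(*p)* construction from the commit above can be tried outside the kernel with a small macro. This is a sketch only: it needs GNU C (statement expressions and `__typeof__`, as in GCC or Clang), the macro name `DEMO_CHECKED_LOAD` and the demo variables are hypothetical, and the READ_ONCE() and barrier parts of the real lockless_dereference() are deliberately left out.

```c
#include <assert.h>

/*
 * Load a pointer and check at compile time that p really is a pointer:
 * demo_check_ is never used, but declaring it requires *(p) to be a valid
 * expression. Because it is declared as "pointer to typeof(*p)", it is
 * also legal when p is a void pointer, where sizeof(*p) would be
 * sizeof(void).
 */
#define DEMO_CHECKED_LOAD(p)                                   \
({                                                             \
    __typeof__(p) demo_val_ = (p);                             \
    __typeof__(*(p)) *demo_check_ __attribute__((unused));     \
    demo_val_;                                                 \
})

static int demo_value = 42;
static void *demo_slot = &demo_value;

/* Read through a void pointer slot, the case that tripped up sizeof(*p). */
static int demo_read_slot(void)
{
    int *ip = DEMO_CHECKED_LOAD(demo_slot);

    return *ip;
}
```

Passing a non-pointer such as an `int` to the macro should fail to compile, since `*(p)` is then not a valid expression.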
* | | | Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2016-09-13  1  -6/+21
|\ \ \ \
| | | | | Pull EFI fixes from Ingo Molnar:
| | | | |  "This contains a Xen fix, an arm64 fix and a race condition / robustization set of fixes related to ExitBootServices() usage and boundary conditions"
| | | | |
| | | | | * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | | | |   x86/efi: Use efi_exit_boot_services()
| | | | |   efi/libstub: Use efi_exit_boot_services() in FDT
| | | | |   efi/libstub: Introduce ExitBootServices helper
| | | | |   efi/libstub: Allocate headspace in efi_get_memory_map()
| | | | |   efi: Fix handling error value in fdt_find_uefi_params
| | | | |   efi: Make for_each_efi_memory_desc_in_map() cope with running on Xen
| * | | | efi/libstub: Introduce ExitBootServices helper  Jeffrey Hugo  2016-09-05  1  -0/+10
| | | | |
| | | | | The spec allows ExitBootServices to fail with EFI_INVALID_PARAMETER if a race condition has occurred where the EFI has updated the memory map after the stub grabbed a reference to the map. The spec defines a retry procedure with specific requirements to handle this scenario.
| | | | |
| | | | | This scenario was previously observed on x86 - commit d3768d885c6c ("x86, efi: retry ExitBootServices() on failure") but the current fix is not spec compliant and the scenario is now observed on the Qualcomm Technologies QDF2432 via the FDT stub which does not handle the error and thus causes boot failures. The user will notice the boot failure as the kernel is not executed and the system may drop back to a UEFI shell, but will be unresponsive to input and the system will require a power cycle to recover.
| | | | |
| | | | | Add a helper to the stub library that correctly adheres to the spec in the case of EFI_INVALID_PARAMETER from ExitBootServices and can be universally used across all stub implementations.
| | | | |
| | | | | Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
| | | | | Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
| | | | | Cc: Mark Rutland <mark.rutland@arm.com>
| | | | | Cc: Leif Lindholm <leif.lindholm@linaro.org>
| | | | | Cc: Ingo Molnar <mingo@kernel.org>
| | | | | Cc: <stable@vger.kernel.org>
| | | | | Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
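The retry idea in the commit above can be modelled with a simulated firmware in plain C. This is a rough sketch under stated assumptions, not the stub helper itself: `struct demo_fw`, the status enum and all function names are hypothetical, the map key is reduced to a counter, and the single retry is a choice made for this sketch.

```c
#include <assert.h>

enum demo_status { DEMO_SUCCESS, DEMO_INVALID_PARAMETER };

/* Simulated firmware: the map key changes whenever the map is updated. */
struct demo_fw {
    unsigned long current_key;
    int updates_pending;    /* map changes still to happen behind our back */
    int get_map_calls;
};

static unsigned long demo_get_memory_map(struct demo_fw *fw)
{
    fw->get_map_calls++;
    return fw->current_key;
}

static enum demo_status demo_exit_boot_services(struct demo_fw *fw,
                                                unsigned long key)
{
    if (fw->updates_pending > 0) {  /* the map changes before the call lands */
        fw->updates_pending--;
        fw->current_key++;
    }
    return key == fw->current_key ? DEMO_SUCCESS : DEMO_INVALID_PARAMETER;
}

/* Try once; on a stale key, fetch the map again and retry. */
static enum demo_status demo_exit_boot(struct demo_fw *fw)
{
    unsigned long key = demo_get_memory_map(fw);
    enum demo_status st = demo_exit_boot_services(fw, key);

    if (st == DEMO_INVALID_PARAMETER) {
        /* A real stub would have to reuse its existing buffer here. */
        key = demo_get_memory_map(fw);
        st = demo_exit_boot_services(fw, key);
    }
    return st;
}
```

The comment in the retry branch is the link to the neighbouring efi_get_memory_map() change: the second fetch of the map has to fit in a buffer that was sized before the first attempt.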
| * | | | efi/libstub: Allocate headspace in efi_get_memory_map()  Jeffrey Hugo  2016-09-05  1  -5/+10
| | | | |
| | | | | efi_get_memory_map() allocates a buffer to store the memory map that it retrieves. This buffer may need to be reused by the client after ExitBootServices() is called, at which point allocations are no longer permitted. To support this usecase, provide the allocated buffer size back to the client, and allocate some additional headroom to account for any reasonable growth in the map that is likely to happen between the call to efi_get_memory_map() and the client reusing the buffer.
| | | | |
| | | | | Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
| | | | | Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
| | | | | Cc: Mark Rutland <mark.rutland@arm.com>
| | | | | Cc: Leif Lindholm <leif.lindholm@linaro.org>
| | | | | Cc: Ingo Molnar <mingo@kernel.org>
| | | | | Cc: <stable@vger.kernel.org>
| | | | | Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
| * | | | efi: Make for_each_efi_memory_desc_in_map() cope with running on Xen  Jan Beulich  2016-09-05  1  -1/+1
| |/ / /
| | | | While commit 55f1ea15216 ("efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps") made an attempt to deal with empty memory maps, it didn't address the case where the map field never gets set, as is apparently the case when running under Xen.
| | | |
| | | | Reported-by: <lists@ssl-mail.com>
| | | | Tested-by: <lists@ssl-mail.com>
| | | | Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
| | | | Cc: Jiri Slaby <jslaby@suse.cz>
| | | | Cc: Mark Rutland <mark.rutland@arm.com>
| | | | Cc: <stable@vger.kernel.org> # v4.7+
| | | | Signed-off-by: Jan Beulich <jbeulich@suse.com>
| | | | [ Guard the loop with a NULL check instead of pointer underflow ]
| | | | Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds  2016-09-12  7  -4/+24
|\ \ \ \
| |_|_|/
|/| | |
| | | | Pull networking fixes from David Miller:
| | | |  "Mostly small sets of driver fixes scattered all over the place.
| | | |
| | | |   1) Mediatek driver fixes from Sean Wang. Forward port not written correctly during TX map, missed handling of EPROBE_DEFER, and mistaken use of put_page() instead of skb_free_frag().
| | | |
| | | |   2) Fix socket double-free in KCM code, from WANG Cong.
| | | |
| | | |   3) QED driver fixes from Sudarsana Reddy Kalluru, including a fix for using the dcbx buffers before initializing them.
| | | |
| | | |   4) Mellanox Switch driver fixes from Jiri Pirko, including a fix for double fib removals and an error handling fix in mlxsw_sp_module_init().
| | | |
| | | |   5) Fix kernel panic when enabling LLDP in i40e driver, from Dave Ertman.
| | | |
| | | |   6) Fix padding of TSO packets in thunderx driver, from Sunil Goutham.
| | | |
| | | |   7) TCP's rcv_wup not initialized properly when using fastopen, from Neal Cardwell.
| | | |
| | | |   8) Don't use uninitialized flow keys in flow dissector, from Gao Feng.
| | | |
| | | |   9) Use after free in l2tp module unload, from Sabrina Dubroca.
| | | |
| | | |  10) Fix interrupt registry ordering issues in smsc911x driver, from Jeremy Linton.
| | | |
| | | |  11) Fix crashes in bonding having to do with enslaving and rx_handler, from Mahesh Bandewar.
| | | |
| | | |  12) AF_UNIX deadlock fixes from Linus.
| | | |
| | | |  13) In mlx5 driver, don't read skb->xmit_more after it might have been freed from the TX reclaim path. From Tariq Toukan.
| | | |
| | | |  14) Fix a bug from 2015 in TCP Yeah where the congestion window does not increase, from Artem Germanov.
| | | |
| | | |  15) Don't pad frames on receive in NFP driver, from Jakub Kicinski.
| | | |
| | | |  16) Fix chunk fragmenting in SCTP wrt. GSO, from Marcelo Ricardo Leitner.
| | | |
| | | |  17) Fix deletion of VRF routes, from Mark Tomlinson.
| | | |
| | | |  18) Fix device refcount leak when DAD fails in ipv6, from Wei Yongjun"
| | | |
| | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits)
| | | |   net/mlx4_en: Fix panic on xmit while port is down
| | | |   net/mlx4_en: Fixes for DCBX
| | | |   net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_state()
| | | |   net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_all()
| | | |   net: ethernet: renesas: sh_eth: add POST registers for rz
| | | |   drivers: net: phy: mdio-xgene: Add hardware dependency
| | | |   dwc_eth_qos: do not register semi-initialized device
| | | |   sctp: identify chunks that need to be fragmented at IP level
| | | |   mlxsw: spectrum: Set port type before setting its address
| | | |   mlxsw: spectrum_router: Fix error path in mlxsw_sp_router_init
| | | |   nfp: don't pad frames on receive
| | | |   nfp: drop support for old firmware ABIs
| | | |   nfp: remove linux/version.h includes
| | | |   tcp: cwnd does not increase in TCP YeAH
| | | |   net/mlx5e: Fix parsing of vlan packets when updating lro header
| | | |   net/mlx5e: Fix global PFC counters replication
| | | |   net/mlx5e: Prevent casting overflow
| | | |   net/mlx5e: Move an_disable_cap bit to a new position
| | | |   net/mlx5e: Fix xmit_more counter race issue
| | | |   tcp: fastopen: avoid negative sk_forward_alloc
| | | |   ...
| * | | net/mlx5e: Move an_disable_cap bit to a new position  Bodong Wang  2016-09-09  1  -2/+3
| | | |
| | | | The previous an_disable_cap position, bit31, is deprecated for use in the driver with newer firmware. New firmware will advertise the same capability in bit29.
| | | |
| | | | The old capability didn't allow setting more than one protocol for a specific speed when autoneg is off, while newer firmware will allow this and it is indicated in the new capability location.
| | | |
| | | | Signed-off-by: Bodong Wang <bodong@mellanox.com>
| | | | Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Don't delete routes in different VRFs  Mark Tomlinson  2016-09-06  1  -1/+2
| | | |
| | | | When deleting an IP address from an interface, there is a clean-up of routes which refer to this local address. However, there was no check to see that the VRF matched. This meant that deletion wasn't confined to the VRF it should have been.
| | | |
| | | | To solve this, a new field has been added to fib_info to hold a table id. When removing fib entries corresponding to a local ip address, this table id is also used in the comparison.
| | | |
| | | | The table id is populated when the fib_info is created. This was already done in some places, but not in ip_rt_ioctl(). This has now been fixed.
| | | |
| | | | Fixes: 021dd3b8a142 ("net: Add routes to the table associated with the device")
| | | | Acked-by: David Ahern <dsa@cumulusnetworks.com>
| | | | Tested-by: David Ahern <dsa@cumulusnetworks.com>
| | | | Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
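The extra comparison described in the VRF commit above amounts to filtering by table id before matching on the address. A simplified sketch follows; `struct demo_route` and `demo_sync_down_addr` are hypothetical and much smaller than the kernel's fib_info handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified route record; not the kernel's struct fib_info. */
struct demo_route {
    uint32_t prefsrc;   /* preferred source address of the route */
    uint32_t tb_id;     /* routing table (VRF) the route lives in */
    bool dead;
};

/*
 * Mark routes that use the removed local address as dead, but only inside
 * the table the address belonged to. Returns the number of routes marked.
 */
static int demo_sync_down_addr(struct demo_route *r, int n,
                               uint32_t tb_id, uint32_t local)
{
    int marked = 0;

    for (int i = 0; i < n; i++) {
        if (r[i].tb_id != tb_id)    /* the check the fix adds */
            continue;
        if (r[i].prefsrc == local) {
            r[i].dead = true;
            marked++;
        }
    }
    return marked;
}
```

Without the table id test, a route in another VRF that happens to use the same source address would be marked as well.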
| * | | af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'  Linus Torvalds  2016-09-04  1  -1/+1
| | | |
| | | | Right now we use the 'readlock' both for protecting some of the af_unix IO path and for making the bind be single-threaded.
| | | |
| | | | The two are independent, but using the same lock makes for a nasty deadlock due to ordering with regards to filesystem locking. The bind locking would want to nest outside the VFS pathname locking, but the IO locking wants to nest inside some of those same locks.
| | | |
| | | | We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix splice-bind deadlock") which moved the readlock inside the vfs locks, but that caused problems with overlayfs that will then call back into filesystem routines that take the lock in the wrong order anyway.
| | | |
| | | | Splitting the locks means that we can go back to having the bind lock be the outermost lock, and we don't have any deadlocks with lock ordering.
| | | |
| | | | Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
| | | | Acked-by: Al Viro <viro@zeniv.linux.org.uk>
| | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | | | Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bonding: Fix bonding crash  Mahesh Bandewar  2016-09-04  1  -0/+1
| | | |
| | | | The following few steps will crash the kernel:
| | | |
| | | |   (a) Create bonding master
| | | |       > modprobe bonding miimon=50
| | | |
| | | |   (b) Create macvlan bridge on eth2
| | | |       > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
| | | |            type macvlan
| | | |
| | | |   (c) Now try adding eth2 into the bond
| | | |       > echo +eth2 > /sys/class/net/bond0/bonding/slaves
| | | |       <crash>
| | | |
| | | | Bonding does lots of things before checking if the device enslaved is busy or not.
| | | |
| | | | In this case when the notifier call-chain sends notifications, the bond_netdev_event() assumes that the rx_handler /rx_handler_data is registered while the bond_enslave() hasn't progressed far enough to register rx_handler for the new slave.
| | | |
| | | | This patch adds a rx_handler check that can be performed right at the beginning of the enslave code to avoid getting into this situation.
| | | |
| | | | Signed-off-by: Mahesh Bandewar <maheshb@google.com>
| | | | Acked-by: Eric Dumazet <edumazet@google.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf  David S. Miller  2016-08-31  2  -0/+8
| |\ \ \
| | | | | Pablo Neira Ayuso says:
| | | | |
| | | | | ====================
| | | | | Netfilter fixes for net
| | | | |
| | | | | The following patchset contains Netfilter fixes for your net tree, they are:
| | | | |
| | | | | 1) Allow nf_tables reject expression from input, forward and output hooks, since only there the routing information is available, otherwise we crash.
| | | | |
| | | | | 2) Fix unsafe list iteration when flushing timeout and accounting objects.
| | | | |
| | | | | 3) Fix refcount leak on timeout policy parsing failure.
| | | | |
| | | | | 4) Unlink timeout object for unconfirmed conntracks too
| | | | |
| | | | | 5) Missing validation of pkttype mangling from bridge family.
| | | | |
| | | | | 6) Fix refcount leak on ebtables on second lookup for the specific bridge match extension, this patch from Sabrina Dubroca.
| | | | |
| | | | | 7) Remove unnecessary ip_hdr() in nf_tables_netdev family.
| | | | |
| | | | | Patches 1-5 and 7 are from Liping Zhang.
| | | | | ====================
| | | | |
| | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | netfilter: nft_meta: improve the validity check of pkttype set expr  Liping Zhang  2016-08-25  1  -0/+4
    "meta pkttype set" is only supported on prerouting chain with bridge
    family and ingress chain with netdev family.

    But the validate check is incomplete, and the user can add the nft
    rules on input chain with bridge family, for example:

      # nft add table bridge filter
      # nft add chain bridge filter input {type filter hook input \
        priority 0 \;}
      # nft add chain bridge filter test
      # nft add rule bridge filter test meta pkttype set unicast
      # nft add rule bridge filter input jump test

    This patch fixes the problem.

    Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | netfilter: nft_reject: restrict to INPUT/FORWARD/OUTPUT  Liping Zhang  2016-08-25  1  -0/+4
    After I add the nft rule "nft add rule filter prerouting reject with
    tcp reset", a kernel panic happened on my system:

      NULL pointer dereference at ...
      IP: [<ffffffff81b9db2f>] nf_send_reset+0xaf/0x400
      Call Trace:
      [<ffffffff81b9da80>] ? nf_reject_ip_tcphdr_get+0x160/0x160
      [<ffffffffa0928061>] nft_reject_ipv4_eval+0x61/0xb0 [nft_reject_ipv4]
      [<ffffffffa08e836a>] nft_do_chain+0x1fa/0x890 [nf_tables]
      [<ffffffffa08e8170>] ? __nft_trace_packet+0x170/0x170 [nf_tables]
      [<ffffffffa06e0900>] ? nf_ct_invert_tuple+0xb0/0xc0 [nf_conntrack]
      [<ffffffffa07224d4>] ? nf_nat_setup_info+0x5d4/0x650 [nf_nat]
      [...]

    Because routing information does not exist in the PREROUTING chain,
    we dereference a NULL pointer and an oops happens.

    So we restrict the reject expression to the INPUT, FORWARD and OUTPUT
    chains. This is consistent with the iptables REJECT target.

    Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | Merge tag 'mac80211-for-davem-2016-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211  David S. Miller  2016-08-31  1  -0/+9
| |\ \ \ \
    Johannes Berg says:

    ====================
    Three little fixes:
     * revert a recent wext patch, which Ben Hutchings noticed was wrong,
       and it turns out not to be necessary for any driver
     * fix an infinite loop that can occur under certain conditions in
       mac80211's TDLS code (depending on regulatory information)
     * add a cfg80211_get_station() static inline when cfg80211 isn't
       built, to allow other modules to not have to depend on it for it
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | cfg80211: Add stub for cfg80211_get_station()  Linus Lüssing  2016-08-30  1  -0/+9
    This allows modules using this function (currently: batman-adv) to
    compile even if cfg80211 is not built at all, thus relaxing
    dependencies.

    Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | | | | | Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4  Linus Torvalds  2016-09-10  1  -3/+2
|\ \ \ \ \ \
    Pull fscrypto fixes from Ted Ts'o:
     "Fix some brown-paper-bag bugs for fscrypto, including one which
      allows a malicious user to set an encryption policy on an empty
      directory which they do not own"

    * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
      fscrypto: require write access to mount to set encryption policy
      fscrypto: only allow setting encryption policy on directories
      fscrypto: add authorization check for setting encryption policy
| * | | | | | fscrypto: require write access to mount to set encryption policy  Eric Biggers  2016-09-10  1  -3/+2
| | |_|_|/ /
| |/| | | |
    Since setting an encryption policy requires writing metadata to the
    filesystem, it should be guarded by mnt_want_write/mnt_drop_write.
    Otherwise, a user could cause a write to a frozen or readonly
    filesystem. This was handled correctly by f2fs but not by ext4. Make
    fscrypt_process_policy() handle it rather than relying on the
    filesystem to get it right.

    Signed-off-by: Eric Biggers <ebiggers@google.com>
    Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | | | | | usercopy: force check_object_size() inline  Kees Cook  2016-09-07  1  -2/+2
    Just for good measure, make sure that check_object_size() is always
    inlined too, as already done for copy_*_user() and __copy_*_user().

    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | | usercopy: fold builtin_const check into inline function  Kees Cook  2016-09-06  1  -1/+2
    Instead of having each caller of check_object_size() need to remember
    to check for a const size parameter, move the check into
    check_object_size() itself. This actually matches the original
    implementation in PaX, though this commit cleans up the now-redundant
    builtin_const() calls in the various architectures.

    Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | | Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi  Linus Torvalds  2016-09-06  1  -3/+2
|\ \ \ \ \ \
| |/ / / / /
|/| | | | |
    Pull SCSI fixes from James Bottomley:
     "This is really three fixes, but the SES one comes in a bundle of
      three (making the replacement API available properly, using it and
      removing the non-working one). The SES problem causes an oops on
      hpsa devices because they attach virtual disks to the host which
      aren't SAS attached (the replacement API ignores them).

      The other two fixes are fairly minor: the sense key one means we
      actually resolve a newly added sense key and the RDAC device
      blacklisting is needed to prevent us annoying the universal XPORT
      lun of various RDAC arrays"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
      scsi: sas: remove is_sas_attached()
      scsi: ses: use scsi_is_sas_rphy instead of is_sas_attached
      scsi: sas: provide stub implementation for scsi_is_sas_rphy
      scsi: blacklist all RDAC devices for BLIST_NO_ULD_ATTACH
      scsi: fix upper bounds check of sense key in scsi_sense_key_string()
| * | | | | Merge remote-tracking branch 'mkp-scsi/4.8/scsi-fixes' into fixes  James Bottomley  2016-08-19  1  -3/+2
| |\ \ \ \ \
| | * | | | | scsi: sas: remove is_sas_attached()  Johannes Thumshirn  2016-08-19  1  -6/+0
    As there are no more users of is_sas_attached() left, remove it.

    Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
    Reviewed-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * | | | | scsi: sas: provide stub implementation for scsi_is_sas_rphy  Johannes Thumshirn  2016-08-19  1  -1/+6
    Provide a stub implementation for scsi_is_sas_rphy for kernel
    configurations which do not have CONFIG_SCSI_SAS_ATTRS defined.

    Reported-by: kbuild test robot <lkp@intel.com>
    Suggested-by: James Bottomley <jejb@linux.vnet.ibm.com>
    Reviewed-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
    Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | | | | Merge tag 'staging-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging  Linus Torvalds  2016-09-03  3  -7/+5
|\ \ \ \ \ \ \
    Pull staging/IIO driver fixes from Greg KH:
     "Here are a number of small fixes for staging and IIO drivers that
      resolve reported problems.

      Full details are in the shortlog. All of these have been in
      linux-next with no reported issues"

    * tag 'staging-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (35 commits)
      arm: dts: rockchip: add reset node for the exist saradc SoCs
      arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
      iio: adc: rockchip_saradc: reset saradc controller before programming it
      iio: accel: kxsd9: Fix raw read return
      iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
      iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
      include/linux: fix excess fence.h kernel-doc notation
      staging: wilc1000: correctly check if associatedsta has not been found
      staging: wilc1000: NULL dereference on error
      staging: wilc1000: txq_event: Fix coding error
      MAINTAINERS: Add file patterns for ion device tree bindings
      MAINTAINERS: Update maintainer entry for wilc1000
      iio: chemical: atlas-ph-sensor: fix typo in val assignment
      iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING"
      staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility
      staging: comedi: dt2811: fix a precedence bug
      staging: comedi: adv_pci1760: Do not return EINVAL for CMDF_ROUND_DOWN.
      staging: comedi: ni_mio_common: fix wrong insn_write handler
      staging: comedi: comedi_test: fix timer race conditions
      staging: comedi: daqboard2000: bug fix board type matching code
      ...