diff options
156 files changed, 2757 insertions, 1482 deletions
diff --git a/Documentation/kernel-per-CPU-kthreads.txt b/Documentation/kernel-per-CPU-kthreads.txt new file mode 100644 index 000000000000..cbf7ae412da4 --- /dev/null +++ b/Documentation/kernel-per-CPU-kthreads.txt @@ -0,0 +1,202 @@ +REDUCING OS JITTER DUE TO PER-CPU KTHREADS + +This document lists per-CPU kthreads in the Linux kernel and presents +options to control their OS jitter. Note that non-per-CPU kthreads are +not listed here. To reduce OS jitter from non-per-CPU kthreads, bind +them to a "housekeeping" CPU dedicated to such work. + + +REFERENCES + +o Documentation/IRQ-affinity.txt: Binding interrupts to sets of CPUs. + +o Documentation/cgroups: Using cgroups to bind tasks to sets of CPUs. + +o man taskset: Using the taskset command to bind tasks to sets + of CPUs. + +o man sched_setaffinity: Using the sched_setaffinity() system + call to bind tasks to sets of CPUs. + +o /sys/devices/system/cpu/cpuN/online: Control CPU N's hotplug state, + writing "0" to offline and "1" to online. + +o In order to locate kernel-generated OS jitter on CPU N: + + cd /sys/kernel/debug/tracing + echo 1 > max_graph_depth # Increase the "1" for more detail + echo function_graph > current_tracer + # run workload + cat per_cpu/cpuN/trace + + +KTHREADS + +Name: ehca_comp/%u +Purpose: Periodically process Infiniband-related work. +To reduce its OS jitter, do any of the following: +1. Don't use eHCA Infiniband hardware, instead choosing hardware + that does not require per-CPU kthreads. This will prevent these + kthreads from being created in the first place. (This will + work for most people, as this hardware, though important, is + relatively old and is produced in relatively low unit volumes.) +2. Do all eHCA-Infiniband-related work on other CPUs, including + interrupts. +3. Rework the eHCA driver so that its per-CPU kthreads are + provisioned only on selected CPUs. + + +Name: irq/%d-%s +Purpose: Handle threaded interrupts. +To reduce its OS jitter, do the following: +1. Use irq affinity to force the irq threads to execute on + some other CPU. + +Name: kcmtpd_ctr_%d +Purpose: Handle Bluetooth work. +To reduce its OS jitter, do one of the following: +1. Don't use Bluetooth, in which case these kthreads won't be + created in the first place. +2. Use irq affinity to force Bluetooth-related interrupts to + occur on some other CPU and furthermore initiate all + Bluetooth activity on some other CPU. + +Name: ksoftirqd/%u +Purpose: Execute softirq handlers when threaded or when under heavy load. +To reduce its OS jitter, each softirq vector must be handled +separately as follows: +TIMER_SOFTIRQ: Do all of the following: +1. To the extent possible, keep the CPU out of the kernel when it + is non-idle, for example, by avoiding system calls and by forcing + both kernel threads and interrupts to execute elsewhere. +2. Build with CONFIG_HOTPLUG_CPU=y. After boot completes, force + the CPU offline, then bring it back online. This forces + recurring timers to migrate elsewhere. If you are concerned + with multiple CPUs, force them all offline before bringing the + first one back online. Once you have onlined the CPUs in question, + do not offline any other CPUs, because doing so could force the + timer back onto one of the CPUs in question. +NET_TX_SOFTIRQ and NET_RX_SOFTIRQ: Do all of the following: +1. Force networking interrupts onto other CPUs. +2. Initiate any network I/O on other CPUs. +3. Once your application has started, prevent CPU-hotplug operations + from being initiated from tasks that might run on the CPU to + be de-jittered. (It is OK to force this CPU offline and then + bring it back online before you start your application.) +BLOCK_SOFTIRQ: Do all of the following: +1. Force block-device interrupts onto some other CPU. +2. Initiate any block I/O on other CPUs. +3. Once your application has started, prevent CPU-hotplug operations + from being initiated from tasks that might run on the CPU to + be de-jittered. (It is OK to force this CPU offline and then + bring it back online before you start your application.) +BLOCK_IOPOLL_SOFTIRQ: Do all of the following: +1. Force block-device interrupts onto some other CPU. +2. Initiate any block I/O and block-I/O polling on other CPUs. +3. Once your application has started, prevent CPU-hotplug operations + from being initiated from tasks that might run on the CPU to + be de-jittered. (It is OK to force this CPU offline and then + bring it back online before you start your application.) +TASKLET_SOFTIRQ: Do one or more of the following: +1. Avoid use of drivers that use tasklets. (Such drivers will contain + calls to things like tasklet_schedule().) +2. Convert all drivers that you must use from tasklets to workqueues. +3. Force interrupts for drivers using tasklets onto other CPUs, + and also do I/O involving these drivers on other CPUs. +SCHED_SOFTIRQ: Do all of the following: +1. Avoid sending scheduler IPIs to the CPU to be de-jittered, + for example, ensure that at most one runnable kthread is present + on that CPU. If a thread that expects to run on the de-jittered + CPU awakens, the scheduler will send an IPI that can result in + a subsequent SCHED_SOFTIRQ. +2. Build with CONFIG_RCU_NOCB_CPU=y, CONFIG_RCU_NOCB_CPU_ALL=y, + CONFIG_NO_HZ_FULL=y, and, in addition, ensure that the CPU + to be de-jittered is marked as an adaptive-ticks CPU using the + "nohz_full=" boot parameter. This reduces the number of + scheduler-clock interrupts that the de-jittered CPU receives, + minimizing its chances of being selected to do the load balancing + work that runs in SCHED_SOFTIRQ context. +3. To the extent possible, keep the CPU out of the kernel when it + is non-idle, for example, by avoiding system calls and by + forcing both kernel threads and interrupts to execute elsewhere. + This further reduces the number of scheduler-clock interrupts + received by the de-jittered CPU. +HRTIMER_SOFTIRQ: Do all of the following: +1. To the extent possible, keep the CPU out of the kernel when it + is non-idle. For example, avoid system calls and force both + kernel threads and interrupts to execute elsewhere. +2. Build with CONFIG_HOTPLUG_CPU=y. Once boot completes, force the + CPU offline, then bring it back online. This forces recurring + timers to migrate elsewhere. If you are concerned with multiple + CPUs, force them all offline before bringing the first one + back online. Once you have onlined the CPUs in question, do not + offline any other CPUs, because doing so could force the timer + back onto one of the CPUs in question. +RCU_SOFTIRQ: Do at least one of the following: +1. Offload callbacks and keep the CPU in either dyntick-idle or + adaptive-ticks state by doing all of the following: + a. Build with CONFIG_RCU_NOCB_CPU=y, CONFIG_RCU_NOCB_CPU_ALL=y, + CONFIG_NO_HZ_FULL=y, and, in addition ensure that the CPU + to be de-jittered is marked as an adaptive-ticks CPU using + the "nohz_full=" boot parameter. Bind the rcuo kthreads + to housekeeping CPUs, which can tolerate OS jitter. + b. To the extent possible, keep the CPU out of the kernel + when it is non-idle, for example, by avoiding system + calls and by forcing both kernel threads and interrupts + to execute elsewhere. +2. Enable RCU to do its processing remotely via dyntick-idle by + doing all of the following: + a. Build with CONFIG_NO_HZ=y and CONFIG_RCU_FAST_NO_HZ=y. + b. Ensure that the CPU goes idle frequently, allowing other + CPUs to detect that it has passed through an RCU quiescent + state. If the kernel is built with CONFIG_NO_HZ_FULL=y, + userspace execution also allows other CPUs to detect that + the CPU in question has passed through a quiescent state. + c. To the extent possible, keep the CPU out of the kernel + when it is non-idle, for example, by avoiding system + calls and by forcing both kernel threads and interrupts + to execute elsewhere. + +Name: rcuc/%u +Purpose: Execute RCU callbacks in CONFIG_RCU_BOOST=y kernels. +To reduce its OS jitter, do at least one of the following: +1. Build the kernel with CONFIG_PREEMPT=n. This prevents these + kthreads from being created in the first place, and also obviates + the need for RCU priority boosting. This approach is feasible + for workloads that do not require high degrees of responsiveness. +2. Build the kernel with CONFIG_RCU_BOOST=n. This prevents these + kthreads from being created in the first place. This approach + is feasible only if your workload never requires RCU priority + boosting, for example, if you ensure frequent idle time on all + CPUs that might execute within the kernel. +3. Build with CONFIG_RCU_NOCB_CPU=y and CONFIG_RCU_NOCB_CPU_ALL=y, + which offloads all RCU callbacks to kthreads that can be moved + off of CPUs susceptible to OS jitter. This approach prevents the + rcuc/%u kthreads from having any work to do, so that they are + never awakened. +4. Ensure that the CPU never enters the kernel, and, in particular, + avoid initiating any CPU hotplug operations on this CPU. This is + another way of preventing any callbacks from being queued on the + CPU, again preventing the rcuc/%u kthreads from having any work + to do. + +Name: rcuob/%d, rcuop/%d, and rcuos/%d +Purpose: Offload RCU callbacks from the corresponding CPU. +To reduce its OS jitter, do at least one of the following: +1. Use affinity, cgroups, or other mechanism to force these kthreads + to execute on some other CPU. +2. Build with CONFIG_RCU_NOCB_CPUS=n, which will prevent these + kthreads from being created in the first place. However, please + note that this will not eliminate OS jitter, but will instead + shift it to RCU_SOFTIRQ. + +Name: watchdog/%u +Purpose: Detect software lockups on each CPU. +To reduce its OS jitter, do at least one of the following: +1. Build with CONFIG_LOCKUP_DETECTOR=n, which will prevent these + kthreads from being created in the first place. +2. Echo a zero to /proc/sys/kernel/watchdog to disable the + watchdog timer. +3. Echo a large number of /proc/sys/kernel/watchdog_thresh in + order to reduce the frequency of OS jitter due to the watchdog + timer down to a level that is acceptable for your workload. diff --git a/arch/Kconfig b/arch/Kconfig index dd0e8eb8042f..a4429bcd609e 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -213,6 +213,9 @@ config USE_GENERIC_SMP_HELPERS config GENERIC_SMP_IDLE_THREAD bool +config GENERIC_IDLE_POLL_SETUP + bool + # Select if arch init_task initializer is different to init/init_task.c config ARCH_INIT_TASK bool diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index d423d58f938d..49d993cee512 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -38,6 +38,7 @@ config ARM select HAVE_GENERIC_HARDIRQS select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7)) select HAVE_IDE if PCI || ISA || PCMCIA + select HAVE_IRQ_TIME_ACCOUNTING select HAVE_KERNEL_GZIP select HAVE_KERNEL_LZMA select HAVE_KERNEL_LZO @@ -488,7 +489,7 @@ config ARCH_IXP4XX config ARCH_DOVE bool "Marvell Dove" select ARCH_REQUIRE_GPIOLIB - select CPU_V7 + select CPU_PJ4 select GENERIC_CLOCKEVENTS select MIGHT_HAVE_PCI select PINCTRL diff --git a/arch/arm/Makefile b/arch/arm/Makefile index 47374085befd..1ba358ba16b8 100644 --- a/arch/arm/Makefile +++ b/arch/arm/Makefile @@ -309,7 +309,7 @@ define archhelp echo ' Image - Uncompressed kernel image (arch/$(ARCH)/boot/Image)' echo '* xipImage - XIP kernel image, if configured (arch/$(ARCH)/boot/xipImage)' echo ' uImage - U-Boot wrapped zImage' - echo ' bootpImage - Combined zImage and initial RAM disk' + echo ' bootpImage - Combined zImage and initial RAM disk' echo ' (supply initrd image via make variable INITRD=<path>)' echo '* dtbs - Build device tree blobs for enabled boards' echo ' install - Install uncompressed kernel' diff --git a/arch/arm/common/mcpm_platsmp.c b/arch/arm/common/mcpm_platsmp.c index 52b88d81b7bb..3caed0db6986 100644 --- a/arch/arm/common/mcpm_platsmp.c +++ b/arch/arm/common/mcpm_platsmp.c @@ -15,8 +15,6 @@ #include <linux/smp.h> #include <linux/spinlock.h> -#include <linux/irqchip/arm-gic.h> - #include <asm/mcpm.h> #include <asm/smp.h> #include <asm/smp_plat.h> @@ -49,7 +47,6 @@ static int __cpuinit mcpm_boot_secondary(unsigned int cpu, struct task_struct *i static void __cpuinit mcpm_secondary_init(unsigned int cpu) { mcpm_cpu_powered_up(); - gic_secondary_init(0); } #ifdef CONFIG_HOTPLUG_CPU diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h index 7eb18c1d8d6c..4f009c10540d 100644 --- a/arch/arm/include/asm/cmpxchg.h +++ b/arch/arm/include/asm/cmpxchg.h @@ -233,15 +233,15 @@ static inline unsigned long __cmpxchg_local(volatile void *ptr, ((__typeof__(*(ptr)))atomic64_cmpxchg(container_of((ptr), \ atomic64_t, \ counter), \ - (unsigned long)(o), \ - (unsigned long)(n))) + (unsigned long long)(o), \ + (unsigned long long)(n))) #define cmpxchg64_local(ptr, o, n) \ ((__typeof__(*(ptr)))local64_cmpxchg(container_of((ptr), \ local64_t, \ a), \ - (unsigned long)(o), \ - (unsigned long)(n))) + (unsigned long long)(o), \ + (unsigned long long)(n))) #endif /* __LINUX_ARM_ARCH__ >= 6 */ diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c index d30042e39974..13609e01f4b7 100644 --- a/arch/arm/xen/enlighten.c +++ b/arch/arm/xen/enlighten.c @@ -152,11 +152,12 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, } EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); -static int __init xen_secondary_init(unsigned int cpu) +static void __init xen_percpu_init(void *unused) { struct vcpu_register_vcpu_info info; struct vcpu_info *vcpup; int err; + int cpu = get_cpu(); pr_info("Xen: initializing cpu%d\n", cpu); vcpup = per_cpu_ptr(xen_vcpu_info, cpu); @@ -165,14 +166,10 @@ static int __init xen_secondary_init(unsigned int cpu) info.offset = offset_in_page(vcpup); err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info); - if (err) { - pr_debug("register_vcpu_info failed: err=%d\n", err); - } else { - /* This cpu is using the registered vcpu info, even if - later ones fail to. */ - per_cpu(xen_vcpu, cpu) = vcpup; - } - return 0; + BUG_ON(err); + per_cpu(xen_vcpu, cpu) = vcpup; + + enable_percpu_irq(xen_events_irq, 0); } static void xen_restart(char str, const char *cmd) @@ -208,7 +205,6 @@ static int __init xen_guest_init(void) const char *version = NULL; const char *xen_prefix = "xen,xen-"; struct resource res; - int i; node = of_find_compatible_node(NULL, NULL, "xen,xen"); if (!node) { @@ -265,19 +261,23 @@ static int __init xen_guest_init(void) sizeof(struct vcpu_info)); if (xen_vcpu_info == NULL) return -ENOMEM; - for_each_online_cpu(i) - xen_secondary_init(i); gnttab_init(); if (!xen_initial_domain()) xenbus_probe(NULL); + return 0; +} +core_initcall(xen_guest_init); + +static int __init xen_pm_init(void) +{ pm_power_off = xen_power_off; arm_pm_restart = xen_restart; return 0; } -core_initcall(xen_guest_init); +subsys_initcall(xen_pm_init); static irqreturn_t xen_arm_callback(int irq, void *arg) { @@ -285,11 +285,6 @@ static irqreturn_t xen_arm_callback(int irq, void *arg) return IRQ_HANDLED; } -static __init void xen_percpu_enable_events(void *unused) -{ - enable_percpu_irq(xen_events_irq, 0); -} - static int __init xen_init_events(void) { if (!xen_domain() || xen_events_irq < 0) @@ -303,7 +298,7 @@ static int __init xen_init_events(void) return -EINVAL; } - on_each_cpu(xen_percpu_enable_events, NULL, 0); + on_each_cpu(xen_percpu_init, NULL, 0); return 0; } diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index cad060f288cf..6507dabdd5dd 100644 --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@ -245,7 +245,7 @@ config SMP config IRQSTACKS bool "Use separate kernel stacks when processing interrupts" - default n + default y help If you say Y here the kernel will use separate kernel stacks for handling hard and soft interrupts. This can help avoid diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile index 2f967cc6649e..197690068f88 100644 --- a/arch/parisc/Makefile +++ b/arch/parisc/Makefile @@ -23,24 +23,21 @@ NM = sh $(srctree)/arch/parisc/nm CHECKFLAGS += -D__hppa__=1 LIBGCC = $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name) -MACHINE := $(shell uname -m) -NATIVE := $(if $(filter parisc%,$(MACHINE)),1,0) - ifdef CONFIG_64BIT UTS_MACHINE := parisc64 CHECKFLAGS += -D__LP64__=1 -m64 -WIDTH := 64 +CC_ARCHES = hppa64 else # 32-bit -WIDTH := +CC_ARCHES = hppa hppa2.0 hppa1.1 endif -# attempt to help out folks who are cross-compiling -ifeq ($(NATIVE),1) -CROSS_COMPILE := hppa$(WIDTH)-linux- -else - ifeq ($(CROSS_COMPILE),) - CROSS_COMPILE := hppa$(WIDTH)-linux-gnu- - endif +ifneq ($(SUBARCH),$(UTS_MACHINE)) + ifeq ($(CROSS_COMPILE),) + CC_SUFFIXES = linux linux-gnu unknown-linux-gnu + CROSS_COMPILE := $(call cc-cross-prefix, \ + $(foreach a,$(CC_ARCHES), \ + $(foreach s,$(CC_SUFFIXES),$(a)-$(s)-))) + endif endif OBJCOPY_FLAGS =-O binary -R .note -R .comment -S diff --git a/arch/parisc/include/asm/hardirq.h b/arch/parisc/include/asm/hardirq.h index 12373c4dabab..c19f7138ba48 100644 --- a/arch/parisc/include/asm/hardirq.h +++ b/arch/parisc/include/asm/hardirq.h @@ -11,10 +11,18 @@ #include <linux/threads.h> #include <linux/irq.h> +#ifdef CONFIG_IRQSTACKS +#define __ARCH_HAS_DO_SOFTIRQ +#endif + typedef struct { unsigned int __softirq_pending; #ifdef CONFIG_DEBUG_STACKOVERFLOW unsigned int kernel_stack_usage; +#ifdef CONFIG_IRQSTACKS + unsigned int irq_stack_usage; + unsigned int irq_stack_counter; +#endif #endif #ifdef CONFIG_SMP unsigned int irq_resched_count; @@ -28,6 +36,7 @@ DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); #define __ARCH_IRQ_STAT #define __IRQ_STAT(cpu, member) (irq_stat[cpu].member) #define inc_irq_stat(member) this_cpu_inc(irq_stat.member) +#define __inc_irq_stat(member) __this_cpu_inc(irq_stat.member) #define local_softirq_pending() this_cpu_read(irq_stat.__softirq_pending) #define __ARCH_SET_SOFTIRQ_PENDING diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h index 064015547d1e..cfbc43929cf6 100644 --- a/arch/parisc/include/asm/processor.h +++ b/arch/parisc/include/asm/processor.h @@ -63,10 +63,13 @@ */ #ifdef __KERNEL__ +#include <linux/spinlock_types.h> + #define IRQ_STACK_SIZE (4096 << 2) /* 16k irq stack size */ union irq_stack_union { unsigned long stack[IRQ_STACK_SIZE/sizeof(unsigned long)]; + raw_spinlock_t lock; }; DECLARE_PER_CPU(union irq_stack_union, irq_stack_union); diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S index 4bb96ad9b0b1..ae27cb6ce19a 100644 --- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -452,9 +452,41 @@ L2_ptep \pgd,\pte,\index,\va,\fault .endm + /* Acquire pa_dbit_lock lock. */ + .macro dbit_lock spc,tmp,tmp1 +#ifdef CONFIG_SMP + cmpib,COND(=),n 0,\spc,2f + load32 PA(pa_dbit_lock),\tmp +1: LDCW 0(\tmp),\tmp1 + cmpib,COND(=) 0,\tmp1,1b + nop +2: +#endif + .endm + + /* Release pa_dbit_lock lock without reloading lock address. */ + .macro dbit_unlock0 spc,tmp +#ifdef CONFIG_SMP + or,COND(=) %r0,\spc,%r0 + stw \spc,0(\tmp) +#endif + .endm + + /* Release pa_dbit_lock lock. */ + .macro dbit_unlock1 spc,tmp +#ifdef CONFIG_SMP + load32 PA(pa_dbit_lock),\tmp + dbit_unlock0 \spc,\tmp +#endif + .endm + /* Set the _PAGE_ACCESSED bit of the PTE. Be clever and * don't needlessly dirty the cache line if it was already set */ - .macro update_ptep ptep,pte,tmp,tmp1 + .macro update_ptep spc,ptep,pte,tmp,tmp1 +#ifdef CONFIG_SMP + or,COND(=) %r0,\spc,%r0 + LDREG 0(\ptep),\pte +#endif ldi _PAGE_ACCESSED,\tmp1 or \tmp1,\pte,\tmp and,COND(<>) \tmp1,\pte,%r0 @@ -463,7 +495,11 @@ /* Set the dirty bit (and accessed bit). No need to be * clever, this is only used from the dirty fault */ - .macro update_dirty ptep,pte,tmp + .macro update_dirty spc,ptep,pte,tmp +#ifdef CONFIG_SMP + or,COND(=) %r0,\spc,%r0 + LDREG 0(\ptep),\pte +#endif ldi _PAGE_ACCESSED|_PAGE_DIRTY,\tmp or \tmp,\pte,\pte STREG \pte,0(\ptep) @@ -1111,11 +1147,13 @@ dtlb_miss_20w: L3_ptep ptp,pte,t0,va,dtlb_check_alias_20w - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot idtlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1135,11 +1173,13 @@ nadtlb_miss_20w: L3_ptep ptp,pte,t0,va,nadtlb_check_alias_20w - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot idtlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1161,7 +1201,8 @@ dtlb_miss_11: L2_ptep ptp,pte,t0,va,dtlb_check_alias_11 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb_11 spc,pte,prot @@ -1172,6 +1213,7 @@ dtlb_miss_11: idtlbp prot,(%sr1,va) mtsp t0, %sr1 /* Restore sr1 */ + dbit_unlock1 spc,t0 rfir nop @@ -1192,7 +1234,8 @@ nadtlb_miss_11: L2_ptep ptp,pte,t0,va,nadtlb_check_alias_11 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb_11 spc,pte,prot @@ -1204,6 +1247,7 @@ nadtlb_miss_11: idtlbp prot,(%sr1,va) mtsp t0, %sr1 /* Restore sr1 */ + dbit_unlock1 spc,t0 rfir nop @@ -1224,13 +1268,15 @@ dtlb_miss_20: L2_ptep ptp,pte,t0,va,dtlb_check_alias_20 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot f_extend pte,t0 idtlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1250,13 +1296,15 @@ nadtlb_miss_20: L2_ptep ptp,pte,t0,va,nadtlb_check_alias_20 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot f_extend pte,t0 idtlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1357,11 +1405,13 @@ itlb_miss_20w: L3_ptep ptp,pte,t0,va,itlb_fault - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot iitlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1379,11 +1429,13 @@ naitlb_miss_20w: L3_ptep ptp,pte,t0,va,naitlb_check_alias_20w - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot iitlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1405,7 +1457,8 @@ itlb_miss_11: L2_ptep ptp,pte,t0,va,itlb_fault - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb_11 spc,pte,prot @@ -1416,6 +1469,7 @@ itlb_miss_11: iitlbp prot,(%sr1,va) mtsp t0, %sr1 /* Restore sr1 */ + dbit_unlock1 spc,t0 rfir nop @@ -1427,7 +1481,8 @@ naitlb_miss_11: L2_ptep ptp,pte,t0,va,naitlb_check_alias_11 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb_11 spc,pte,prot @@ -1438,6 +1493,7 @@ naitlb_miss_11: iitlbp prot,(%sr1,va) mtsp t0, %sr1 /* Restore sr1 */ + dbit_unlock1 spc,t0 rfir nop @@ -1459,13 +1515,15 @@ itlb_miss_20: L2_ptep ptp,pte,t0,va,itlb_fault - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot f_extend pte,t0 iitlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1477,13 +1535,15 @@ naitlb_miss_20: L2_ptep ptp,pte,t0,va,naitlb_check_alias_20 - update_ptep ptp,pte,t0,t1 + dbit_lock spc,t0,t1 + update_ptep spc,ptp,pte,t0,t1 make_insert_tlb spc,pte,prot f_extend pte,t0 iitlbt pte,prot + dbit_unlock1 spc,t0 rfir nop @@ -1507,29 +1567,13 @@ dbit_trap_20w: L3_ptep ptp,pte,t0,va,dbit_fault -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nolock_20w - load32 PA(pa_dbit_lock),t0 - -dbit_spin_20w: - LDCW 0(t0),t1 - cmpib,COND(=) 0,t1,dbit_spin_20w - nop - -dbit_nolock_20w: -#endif - update_dirty ptp,pte,t1 + dbit_lock spc,t0,t1 + update_dirty spc,ptp,pte,t1 make_insert_tlb spc,pte,prot idtlbt pte,prot -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nounlock_20w - ldi 1,t1 - stw t1,0(t0) - -dbit_nounlock_20w: -#endif + dbit_unlock0 spc,t0 rfir nop @@ -1543,18 +1587,8 @@ dbit_trap_11: L2_ptep ptp,pte,t0,va,dbit_fault -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nolock_11 - load32 PA(pa_dbit_lock),t0 - -dbit_spin_11: - LDCW 0(t0),t1 - cmpib,= 0,t1,dbit_spin_11 - nop - -dbit_nolock_11: -#endif - update_dirty ptp,pte,t1 + dbit_lock spc,t0,t1 + update_dirty spc,ptp,pte,t1 make_insert_tlb_11 spc,pte,prot @@ -1565,13 +1599,7 @@ dbit_nolock_11: idtlbp prot,(%sr1,va) mtsp t1, %sr1 /* Restore sr1 */ -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nounlock_11 - ldi 1,t1 - stw t1,0(t0) - -dbit_nounlock_11: -#endif + dbit_unlock0 spc,t0 rfir nop @@ -1583,32 +1611,15 @@ dbit_trap_20: L2_ptep ptp,pte,t0,va,dbit_fault -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nolock_20 - load32 PA(pa_dbit_lock),t0 - -dbit_spin_20: - LDCW 0(t0),t1 - cmpib,= 0,t1,dbit_spin_20 - nop - -dbit_nolock_20: -#endif - update_dirty ptp,pte,t1 + dbit_lock spc,t0,t1 + update_dirty spc,ptp,pte,t1 make_insert_tlb spc,pte,prot f_extend pte,t1 idtlbt pte,prot - -#ifdef CONFIG_SMP - cmpib,COND(=),n 0,spc,dbit_nounlock_20 - ldi 1,t1 - stw t1,0(t0) - -dbit_nounlock_20: -#endif + dbit_unlock0 spc,t0 rfir nop diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c index e255db0bb761..55237a70e197 100644 --- a/arch/parisc/kernel/irq.c +++ b/arch/parisc/kernel/irq.c @@ -166,22 +166,32 @@ int arch_show_interrupts(struct seq_file *p, int prec) seq_printf(p, "%*s: ", prec, "STK"); for_each_online_cpu(j) seq_printf(p, "%10u ", irq_stats(j)->kernel_stack_usage); - seq_printf(p, " Kernel stack usage\n"); + seq_puts(p, " Kernel stack usage\n"); +# ifdef CONFIG_IRQSTACKS + seq_printf(p, "%*s: ", prec, "IST"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", irq_stats(j)->irq_stack_usage); + seq_puts(p, " Interrupt stack usage\n"); + seq_printf(p, "%*s: ", prec, "ISC"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", irq_stats(j)->irq_stack_counter); + seq_puts(p, " Interrupt stack usage counter\n"); +# endif #endif #ifdef CONFIG_SMP seq_printf(p, "%*s: ", prec, "RES"); for_each_online_cpu(j) seq_printf(p, "%10u ", irq_stats(j)->irq_resched_count); - seq_printf(p, " Rescheduling interrupts\n"); + seq_puts(p, " Rescheduling interrupts\n"); seq_printf(p, "%*s: ", prec, "CAL"); for_each_online_cpu(j) seq_printf(p, "%10u ", irq_stats(j)->irq_call_count); - seq_printf(p, " Function call interrupts\n"); + seq_puts(p, " Function call interrupts\n"); #endif seq_printf(p, "%*s: ", prec, "TLB"); for_each_online_cpu(j) seq_printf(p, "%10u ", irq_stats(j)->irq_tlb_count); - seq_printf(p, " TLB shootdowns\n"); + seq_puts(p, " TLB shootdowns\n"); return 0; } @@ -378,6 +388,7 @@ static inline void stack_overflow_check(struct pt_regs *regs) unsigned long sp = regs->gr[30]; unsigned long stack_usage; unsigned int *last_usage; + int cpu = smp_processor_id(); /* if sr7 != 0, we interrupted a userspace process which we do not want * to check for stack overflow. We will only check the kernel stack. */ @@ -386,7 +397,31 @@ static inline void stack_overflow_check(struct pt_regs *regs) /* calculate kernel stack usage */ stack_usage = sp - stack_start; - last_usage = &per_cpu(irq_stat.kernel_stack_usage, smp_processor_id()); +#ifdef CONFIG_IRQSTACKS + if (likely(stack_usage <= THREAD_SIZE)) + goto check_kernel_stack; /* found kernel stack */ + + /* check irq stack usage */ + stack_start = (unsigned long) &per_cpu(irq_stack_union, cpu).stack; + stack_usage = sp - stack_start; + + last_usage = &per_cpu(irq_stat.irq_stack_usage, cpu); + if (unlikely(stack_usage > *last_usage)) + *last_usage = stack_usage; + + if (likely(stack_usage < (IRQ_STACK_SIZE - STACK_MARGIN))) + return; + + pr_emerg("stackcheck: %s will most likely overflow irq stack " + "(sp:%lx, stk bottom-top:%lx-%lx)\n", + current->comm, sp, stack_start, stack_start + IRQ_STACK_SIZE); + goto panic_check; + +check_kernel_stack: +#endif + + /* check kernel stack usage */ + last_usage = &per_cpu(irq_stat.kernel_stack_usage, cpu); if (unlikely(stack_usage > *last_usage)) *last_usage = stack_usage; @@ -398,31 +433,69 @@ static inline void stack_overflow_check(struct pt_regs *regs) "(sp:%lx, stk bottom-top:%lx-%lx)\n", current->comm, sp, stack_start, stack_start + THREAD_SIZE); +#ifdef CONFIG_IRQSTACKS +panic_check: +#endif if (sysctl_panic_on_stackoverflow) panic("low stack detected by irq handler - check messages\n"); #endif } #ifdef CONFIG_IRQSTACKS -DEFINE_PER_CPU(union irq_stack_union, irq_stack_union); +DEFINE_PER_CPU(union irq_stack_union, irq_stack_union) = { + .lock = __RAW_SPIN_LOCK_UNLOCKED((irq_stack_union).lock) + }; static void execute_on_irq_stack(void *func, unsigned long param1) { - unsigned long *irq_stack_start; + union irq_stack_union *union_ptr; unsigned long irq_stack; - int cpu = smp_processor_id(); + raw_spinlock_t *irq_stack_in_use; - irq_stack_start = &per_cpu(irq_stack_union, cpu).stack[0]; - irq_stack = (unsigned long) irq_stack_start; - irq_stack = ALIGN(irq_stack, 16); /* align for stack frame usage */ + union_ptr = &per_cpu(irq_stack_union, smp_processor_id()); + irq_stack = (unsigned long) &union_ptr->stack; + irq_stack = ALIGN(irq_stack + sizeof(irq_stack_union.lock), + 64); /* align for stack frame usage */ - BUG_ON(*irq_stack_start); /* report bug if we were called recursive. */ - *irq_stack_start = 1; + /* We may be called recursive. If we are already using the irq stack, + * just continue to use it. Use spinlocks to serialize + * the irq stack usage. + */ + irq_stack_in_use = &union_ptr->lock; + if (!raw_spin_trylock(irq_stack_in_use)) { + void (*direct_call)(unsigned long p1) = func; + + /* We are using the IRQ stack already. + * Do direct call on current stack. */ + direct_call(param1); + return; + } /* This is where we switch to the IRQ stack. */ call_on_stack(param1, func, irq_stack); - *irq_stack_start = 0; + __inc_irq_stat(irq_stack_counter); + + /* free up irq stack usage. */ + do_raw_spin_unlock(irq_stack_in_use); +} + +asmlinkage void do_softirq(void) +{ + __u32 pending; + unsigned long flags; + + if (in_interrupt()) + return; + + local_irq_save(flags); + + pending = local_softirq_pending(); + + if (pending) + execute_on_irq_stack(__do_softirq, 0); + + local_irq_restore(flags); } #endif /* CONFIG_IRQSTACKS */ diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c index ce939ac8622b..1c965642068b 100644 --- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -1069,7 +1069,7 @@ void flush_tlb_all(void) { int do_recycle; - inc_irq_stat(irq_tlb_count); + __inc_irq_stat(irq_tlb_count); do_recycle = 0; spin_lock(&sid_lock); if (dirty_space_ids > RECYCLE_THRESHOLD) { @@ -1090,7 +1090,7 @@ void flush_tlb_all(void) #else void flush_tlb_all(void) { - inc_irq_stat(irq_tlb_count); + __inc_irq_stat(irq_tlb_count); spin_lock(&sid_lock); flush_tlb_all_local(NULL); recycle_sids(); diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug index 5416e28a7538..863d877e0b5f 100644 --- a/arch/powerpc/Kconfig.debug +++ b/arch/powerpc/Kconfig.debug @@ -262,8 +262,31 @@ config PPC_EARLY_DEBUG_OPAL_HVSI Select this to enable early debugging for the PowerNV platform using an "hvsi" console +config PPC_EARLY_DEBUG_MEMCONS + bool "In memory console" + help + Select this to enable early debugging using an in memory console. + This console provides input and output buffers stored within the + kernel BSS and should be safe to select on any system. A debugger + can then be used to read kernel output or send input to the console. endchoice +config PPC_MEMCONS_OUTPUT_SIZE + int "In memory console output buffer size" + depends on PPC_EARLY_DEBUG_MEMCONS + default 4096 + help + Selects the size of the output buffer (in bytes) of the in memory + console. + +config PPC_MEMCONS_INPUT_SIZE + int "In memory console input buffer size" + depends on PPC_EARLY_DEBUG_MEMCONS + default 128 + help + Selects the size of the input buffer (in bytes) of the in memory + console. + config PPC_EARLY_DEBUG_OPAL def_bool y depends on PPC_EARLY_DEBUG_OPAL_RAW || PPC_EARLY_DEBUG_OPAL_HVSI diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h new file mode 100644 index 000000000000..b6f5a33b8ee2 --- /dev/null +++ b/arch/powerpc/include/asm/context_tracking.h @@ -0,0 +1,10 @@ +#ifndef _ASM_POWERPC_CONTEXT_TRACKING_H +#define _ASM_POWERPC_CONTEXT_TRACKING_H + +#ifdef CONFIG_CONTEXT_TRACKING +#define SCHEDULE_USER bl .schedule_user +#else +#define SCHEDULE_USER bl .schedule +#endif + +#endif diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index 0df54646f968..681bc0314b6b 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -52,6 +52,7 @@ #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x0000000080000000) #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0000000100000000) #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) +#define FW_FEATURE_OPALv3 ASM_CONST(0x0000000400000000) #ifndef __ASSEMBLY__ @@ -69,7 +70,8 @@ enum { FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN, FW_FEATURE_PSERIES_ALWAYS = 0, - FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_OPALv2, + FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_OPALv2 | + FW_FEATURE_OPALv3, FW_FEATURE_POWERNV_ALWAYS = 0, FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1, FW_FEATURE_PS3_ALWAYS = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1, diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index d615b28dda82..ba713f166fa5 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -96,11 +96,12 @@ static inline bool arch_irqs_disabled(void) #endif #define hard_irq_disable() do { \ + u8 _was_enabled = get_paca()->soft_enabled; \ __hard_irq_disable(); \ - if (local_paca->soft_enabled) \ - trace_hardirqs_off(); \ get_paca()->soft_enabled = 0; \ get_paca()->irq_happened |= PACA_IRQ_HARD_DIS; \ + if (_was_enabled) \ + trace_hardirqs_off(); \ } while(0) static inline bool lazy_irq_pending(void) diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index b6c8b58b1d76..cbb9305ab15a 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -243,7 +243,8 @@ enum OpalMCE_TlbErrorType { enum OpalThreadStatus { OPAL_THREAD_INACTIVE = 0x0, - OPAL_THREAD_STARTED = 0x1 + OPAL_THREAD_STARTED = 0x1, + OPAL_THREAD_UNAVAILABLE = 0x2 /* opal-v3 */ }; enum OpalPciBusCompare { @@ -563,6 +564,8 @@ extern void opal_nvram_init(void); extern int opal_machine_check(struct pt_regs *regs); +extern void opal_shutdown(void); + #endif /* __ASSEMBLY__ */ #endif /* __OPAL_H */ diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h index 91acb12bac92..b66ae722a8e9 100644 --- a/arch/powerpc/include/asm/pgalloc-64.h +++ b/arch/powerpc/include/asm/pgalloc-64.h @@ -186,7 +186,7 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, static inline pgtable_t pmd_pgtable(pmd_t pmd) { - return (pgtable_t)(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE); + return (pgtable_t)(pmd_val(pmd) & ~PMD_MASKED_BITS); } static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm, diff --git a/arch/powerpc/include/asm/pte-hash64-64k.h b/arch/powerpc/include/asm/pte-hash64-64k.h index 3e13e23e4fdf..d836d945068d 100644 --- a/arch/powerpc/include/asm/pte-hash64-64k.h +++ b/arch/powerpc/include/asm/pte-hash64-64k.h @@ -47,7 +47,7 @@ * generic accessors and iterators here */ #define __real_pte(e,p) ((real_pte_t) { \ - (e), ((e) & _PAGE_COMBO) ? \ + (e), (pte_val(e) & _PAGE_COMBO) ? \ (pte_val(*((p) + PTRS_PER_PTE))) : 0 }) #define __rpte_to_hidx(r,index) ((pte_val((r).pte) & _PAGE_COMBO) ? \ (((r).hidx >> ((index)<<2)) & 0xf) : ((pte_val((r).pte) >> 12) & 0xf)) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index a8bc2bb4adc9..34fd70488d83 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -264,6 +264,8 @@ extern void rtas_progress(char *s, unsigned short hex); extern void rtas_initialize(void); extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data); extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data); +extern int rtas_online_cpus_mask(cpumask_var_t cpus); +extern int rtas_offline_cpus_mask(cpumask_var_t cpus); extern int rtas_ibm_suspend_me(struct rtas_args *); struct rtc_time; diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index 8ceea14d6fe4..ba7b1973866e 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -97,7 +97,7 @@ static inline struct thread_info *current_thread_info(void) #define TIF_PERFMON_CTXSW 6 /* perfmon needs ctxsw calls */ #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ #define TIF_SINGLESTEP 8 /* singlestepping active */ -#define TIF_MEMDIE 9 /* is terminating due to OOM killer */ +#define TIF_NOHZ 9 /* in adaptive nohz mode */ #define TIF_SECCOMP 10 /* secure computing */ #define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */ #define TIF_NOERROR 12 /* Force successful syscall return */ @@ -106,6 +106,7 @@ static inline struct thread_info *current_thread_info(void) #define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */ #define TIF_EMULATE_STACK_STORE 16 /* Is an instruction emulation for stack store? */ +#define TIF_MEMDIE 17 /* is terminating due to OOM killer */ /* as above, but as bit values */ #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) @@ -124,8 +125,10 @@ static inline struct thread_info *current_thread_info(void) #define _TIF_UPROBE (1<<TIF_UPROBE) #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) #define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE) +#define _TIF_NOHZ (1<<TIF_NOHZ) #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ - _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT) + _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \ + _TIF_NOHZ) #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ _TIF_NOTIFY_RESUME | _TIF_UPROBE) diff --git a/arch/powerpc/include/asm/udbg.h b/arch/powerpc/include/asm/udbg.h index 5a7510e9d09d..dc590919f8eb 100644 --- a/arch/powerpc/include/asm/udbg.h +++ b/arch/powerpc/include/asm/udbg.h @@ -52,6 +52,7 @@ extern void __init udbg_init_40x_realmode(void); extern void __init udbg_init_cpm(void); extern void __init udbg_init_usbgecko(void); extern void __init udbg_init_wsp(void); +extern void __init udbg_init_memcons(void); extern void __init udbg_init_ehv_bc(void); extern void __init udbg_init_ps3gelic(void); extern void __init udbg_init_debug_opal_raw(void); diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index e514de57a125..d22e73e4618b 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -439,8 +439,6 @@ ret_from_fork: ret_from_kernel_thread: REST_NVGPRS(r1) bl schedule_tail - li r3,0 - stw r3,0(r1) mtlr r14 mr r3,r15 PPC440EP_ERR42 diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 915fbb4fc2fe..51cfb8fc301f 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -33,6 +33,7 @@ #include <asm/irqflags.h> #include <asm/ftrace.h> #include <asm/hw_irq.h> +#include <asm/context_tracking.h> /* * System calls. @@ -376,8 +377,6 @@ _GLOBAL(ret_from_fork) _GLOBAL(ret_from_kernel_thread) bl .schedule_tail REST_NVGPRS(r1) - li r3,0 - std r3,0(r1) ld r14, 0(r14) mtlr r14 mr r3,r15 @@ -634,7 +633,7 @@ _GLOBAL(ret_from_except_lite) andi. r0,r4,_TIF_NEED_RESCHED beq 1f bl .restore_interrupts - bl .schedule + SCHEDULE_USER b .ret_from_except_lite 1: bl .save_nvgprs diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index 42a756eec9ff..645170a07ada 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -489,7 +489,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC) */ mfspr r14,SPRN_DBSR /* check single-step/branch taken */ - andis. r15,r14,DBSR_IC@h + andis. r15,r14,(DBSR_IC|DBSR_BT)@h beq+ 1f LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) @@ -500,7 +500,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC) bge+ cr1,1f /* here it looks like we got an inappropriate debug exception. */ - lis r14,DBSR_IC@h /* clear the IC event */ + lis r14,(DBSR_IC|DBSR_BT)@h /* clear the event */ rlwinm r11,r11,0,~MSR_DE /* clear DE in the CSRR1 value */ mtspr SPRN_DBSR,r14 mtspr SPRN_CSRR1,r11 @@ -555,7 +555,7 @@ kernel_dbg_exc: */ mfspr r14,SPRN_DBSR /* check single-step/branch taken */ - andis. r15,r14,DBSR_IC@h + andis. r15,r14,(DBSR_IC|DBSR_BT)@h beq+ 1f LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) @@ -566,7 +566,7 @@ kernel_dbg_exc: bge+ cr1,1f /* here it looks like we got an inappropriate debug exception. */ - lis r14,DBSR_IC@h /* clear the IC event */ + lis r14,(DBSR_IC|DBSR_BT)@h /* clear the event */ rlwinm r11,r11,0,~MSR_DE /* clear DE in the DSRR1 value */ mtspr SPRN_DBSR,r14 mtspr SPRN_DSRR1,r11 diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c index 466a2908bb63..611acdf30096 100644 --- a/arch/powerpc/kernel/machine_kexec_64.c +++ b/arch/powerpc/kernel/machine_kexec_64.c @@ -17,6 +17,7 @@ #include <linux/errno.h> #include <linux/kernel.h> #include <linux/cpu.h> +#include <linux/hardirq.h> #include <asm/page.h> #include <asm/current.h> @@ -335,10 +336,13 @@ void default_machine_kexec(struct kimage *image) pr_debug("kexec: Starting switchover sequence.\n"); /* switch to a staticly allocated stack. Based on irq stack code. + * We setup preempt_count to avoid using VMX in memcpy. * XXX: the task struct will likely be invalid once we do the copy! */ kexec_stack.thread_info.task = current_thread_info()->task; kexec_stack.thread_info.flags = 0; + kexec_stack.thread_info.preempt_count = HARDIRQ_OFFSET; + kexec_stack.thread_info.cpu = current_thread_info()->cpu; /* We need a static PACA, too; copy this CPU's PACA over and switch to * it. Also poison per_cpu_offset to catch anyone using non-static diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 19e096bd0e73..e469f30e6eeb 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -657,6 +657,17 @@ _GLOBAL(__ucmpdi2) li r3,2 blr +_GLOBAL(__bswapdi2) + rotlwi r9,r4,8 + rotlwi r10,r3,8 + rlwimi r9,r4,24,0,7 + rlwimi r10,r3,24,0,7 + rlwimi r9,r4,24,16,23 + rlwimi r10,r3,24,16,23 + mr r3,r9 + mr r4,r10 + blr + _GLOBAL(abs) srawi r4,r3,31 xor r3,r3,r4 diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S index 5cfa8008693b..6820e45f557b 100644 --- a/arch/powerpc/kernel/misc_64.S +++ b/arch/powerpc/kernel/misc_64.S @@ -234,6 +234,17 @@ _GLOBAL(__flush_dcache_icache) isync blr +_GLOBAL(__bswapdi2) + srdi r8,r3,32 + rlwinm r7,r3,8,0xffffffff + rlwimi r7,r3,24,0,7 + rlwinm r9,r8,8,0xffffffff + rlwimi r7,r3,24,16,23 + rlwimi r9,r8,24,0,7 + rlwimi r9,r8,24,16,23 + sldi r7,r7,32 + or r3,r7,r9 + blr #if defined(CONFIG_PPC_PMAC) || defined(CONFIG_PPC_MAPLE) /* diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index f5c5c90799a7..6053f037ef0a 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -359,7 +359,6 @@ static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp, enum pci_mmap_state mmap_state, int write_combine) { - unsigned long prot = pgprot_val(protection); /* Write combine is always 0 on non-memory space mappings. On * memory space, if the user didn't pass 1, we check for a @@ -376,9 +375,9 @@ static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp, /* XXX would be nice to have a way to ask for write-through */ if (write_combine) - return pgprot_noncached_wc(prot); + return pgprot_noncached_wc(protection); else - return pgprot_noncached(prot); + return pgprot_noncached(protection); } /* diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c index 78b8766fd79e..c29666586998 100644 --- a/arch/powerpc/kernel/ppc_ksyms.c +++ b/arch/powerpc/kernel/ppc_ksyms.c @@ -143,7 +143,8 @@ EXPORT_SYMBOL(__lshrdi3); int __ucmpdi2(unsigned long long, unsigned long long); EXPORT_SYMBOL(__ucmpdi2); #endif - +long long __bswapdi2(long long); +EXPORT_SYMBOL(__bswapdi2); EXPORT_SYMBOL(memcpy); EXPORT_SYMBOL(memset); EXPORT_SYMBOL(memmove); diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index ceb4e7b62cf4..a902723fdc69 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -339,6 +339,13 @@ static void set_debug_reg_defaults(struct thread_struct *thread) static void prime_debug_regs(struct thread_struct *thread) { + /* + * We could have inherited MSR_DE from userspace, since + * it doesn't get cleared on exception entry. Make sure + * MSR_DE is clear before we enable any debug events. + */ + mtmsr(mfmsr() & ~MSR_DE); + mtspr(SPRN_IAC1, thread->iac1); mtspr(SPRN_IAC2, thread->iac2); #if CONFIG_PPC_ADV_DEBUG_IACS > 2 @@ -971,6 +978,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, * do some house keeping and then return from the fork or clone * system call, using the stack frame created above. */ + ((unsigned long *)sp)[0] = 0; sp -= sizeof(struct pt_regs); kregs = (struct pt_regs *) sp; sp -= STACK_FRAME_OVERHEAD; diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index 3b14d320e69f..98c2fc198712 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -32,6 +32,7 @@ #include <trace/syscall.h> #include <linux/hw_breakpoint.h> #include <linux/perf_event.h> +#include <linux/context_tracking.h> #include <asm/uaccess.h> #include <asm/page.h> @@ -1788,6 +1789,8 @@ long do_syscall_trace_enter(struct pt_regs *regs) { long ret = 0; + user_exit(); + secure_computing_strict(regs->gpr[0]); if (test_thread_flag(TIF_SYSCALL_TRACE) && @@ -1832,4 +1835,6 @@ void do_syscall_trace_leave(struct pt_regs *regs) step = test_thread_flag(TIF_SINGLESTEP); if (step || test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall_exit(regs, step); + + user_enter(); } diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 1fd6e7b2f390..52add6f3e201 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -19,6 +19,7 @@ #include <linux/init.h> #include <linux/capability.h> #include <linux/delay.h> +#include <linux/cpu.h> #include <linux/smp.h> #include <linux/completion.h> #include <linux/cpumask.h> @@ -807,6 +808,95 @@ static void rtas_percpu_suspend_me(void *info) __rtas_suspend_cpu((struct rtas_suspend_me_data *)info, 1); } +enum rtas_cpu_state { + DOWN, + UP, +}; + +#ifndef CONFIG_SMP +static int rtas_cpu_state_change_mask(enum rtas_cpu_state state, + cpumask_var_t cpus) +{ + if (!cpumask_empty(cpus)) { + cpumask_clear(cpus); + return -EINVAL; + } else + return 0; +} +#else +/* On return cpumask will be altered to indicate CPUs changed. + * CPUs with states changed will be set in the mask, + * CPUs with status unchanged will be unset in the mask. */ +static int rtas_cpu_state_change_mask(enum rtas_cpu_state state, + cpumask_var_t cpus) +{ + int cpu; + int cpuret = 0; + int ret = 0; + + if (cpumask_empty(cpus)) + return 0; + + for_each_cpu(cpu, cpus) { + switch (state) { + case DOWN: + cpuret = cpu_down(cpu); + break; + case UP: + cpuret = cpu_up(cpu); + break; + } + if (cpuret) { + pr_debug("%s: cpu_%s for cpu#%d returned %d.\n", + __func__, + ((state == UP) ? "up" : "down"), + cpu, cpuret); + if (!ret) + ret = cpuret; + if (state == UP) { + /* clear bits for unchanged cpus, return */ + cpumask_shift_right(cpus, cpus, cpu); + cpumask_shift_left(cpus, cpus, cpu); + break; + } else { + /* clear bit for unchanged cpu, continue */ + cpumask_clear_cpu(cpu, cpus); + } + } + } + + return ret; +} +#endif + +int rtas_online_cpus_mask(cpumask_var_t cpus) +{ + int ret; + + ret = rtas_cpu_state_change_mask(UP, cpus); + + if (ret) { + cpumask_var_t tmp_mask; + + if (!alloc_cpumask_var(&tmp_mask, GFP_TEMPORARY)) + return ret; + + /* Use tmp_mask to preserve cpus mask from first failure */ + cpumask_copy(tmp_mask, cpus); + rtas_offline_cpus_mask(tmp_mask); + free_cpumask_var(tmp_mask); + } + + return ret; +} +EXPORT_SYMBOL(rtas_online_cpus_mask); + +int rtas_offline_cpus_mask(cpumask_var_t cpus) +{ + return rtas_cpu_state_change_mask(DOWN, cpus); +} +EXPORT_SYMBOL(rtas_offline_cpus_mask); + int rtas_ibm_suspend_me(struct rtas_args *args) { long state; @@ -814,6 +904,8 @@ int rtas_ibm_suspend_me(struct rtas_args *args) unsigned long retbuf[PLPAR_HCALL_BUFSIZE]; struct rtas_suspend_me_data data; DECLARE_COMPLETION_ONSTACK(done); + cpumask_var_t offline_mask; + int cpuret; if (!rtas_service_present("ibm,suspend-me")) return -ENOSYS; @@ -837,11 +929,24 @@ int rtas_ibm_suspend_me(struct rtas_args *args) return 0; } + if (!alloc_cpumask_var(&offline_mask, GFP_TEMPORARY)) + return -ENOMEM; + atomic_set(&data.working, 0); atomic_set(&data.done, 0); atomic_set(&data.error, 0); data.token = rtas_token("ibm,suspend-me"); data.complete = &done; + + /* All present CPUs must be online */ + cpumask_andnot(offline_mask, cpu_present_mask, cpu_online_mask); + cpuret = rtas_online_cpus_mask(offline_mask); + if (cpuret) { + pr_err("%s: Could not bring present CPUs online.\n", __func__); + atomic_set(&data.error, cpuret); + goto out; + } + stop_topology_update(); /* Call function on all CPUs. One of us will make the @@ -857,6 +962,14 @@ int rtas_ibm_suspend_me(struct rtas_args *args) start_topology_update(); + /* Take down CPUs not online prior to suspend */ + cpuret = rtas_offline_cpus_mask(offline_mask); + if (cpuret) + pr_warn("%s: Could not restore CPUs to offline state.\n", + __func__); + +out: + free_cpumask_var(offline_mask); return atomic_read(&data.error); } #else /* CONFIG_PPC_PSERIES */ diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c index 5b3022470126..2f3cdb01506d 100644 --- a/arch/powerpc/kernel/rtas_flash.c +++ b/arch/powerpc/kernel/rtas_flash.c @@ -89,6 +89,7 @@ /* Array sizes */ #define VALIDATE_BUF_SIZE 4096 +#define VALIDATE_MSG_LEN 256 #define RTAS_MSG_MAXLEN 64 /* Quirk - RTAS requires 4k list length and block size */ @@ -466,7 +467,7 @@ static void validate_flash(struct rtas_validate_flash_t *args_buf) } static int get_validate_flash_msg(struct rtas_validate_flash_t *args_buf, - char *msg) + char *msg, int msglen) { int n; @@ -474,7 +475,8 @@ static int get_validate_flash_msg(struct rtas_validate_flash_t *args_buf, n = sprintf(msg, "%d\n", args_buf->update_results); if ((args_buf->update_results >= VALIDATE_CUR_UNKNOWN) || (args_buf->update_results == VALIDATE_TMP_UPDATE)) - n += sprintf(msg + n, "%s\n", args_buf->buf); + n += snprintf(msg + n, msglen - n, "%s\n", + args_buf->buf); } else { n = sprintf(msg, "%d\n", args_buf->status); } @@ -486,11 +488,11 @@ static ssize_t validate_flash_read(struct file *file, char __user *buf, { struct rtas_validate_flash_t *const args_buf = &rtas_validate_flash_data; - char msg[RTAS_MSG_MAXLEN]; + char msg[VALIDATE_MSG_LEN]; int msglen; mutex_lock(&rtas_validate_flash_mutex); - msglen = get_validate_flash_msg(args_buf, msg); + msglen = get_validate_flash_msg(args_buf, msg, VALIDATE_MSG_LEN); mutex_unlock(&rtas_validate_flash_mutex); return simple_read_from_buffer(buf, count, ppos, msg, msglen); diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c index cf12eae02de5..577a8aa69c6e 100644 --- a/arch/powerpc/kernel/signal.c +++ b/arch/powerpc/kernel/signal.c @@ -13,6 +13,7 @@ #include <linux/signal.h> #include <linux/uprobes.h> #include <linux/key.h> +#include <linux/context_tracking.h> #include <asm/hw_breakpoint.h> #include <asm/uaccess.h> #include <asm/unistd.h> @@ -24,7 +25,7 @@ * through debug.exception-trace sysctl. */ -int show_unhandled_signals = 0; +int show_unhandled_signals = 1; /* * Allocate space for the signal frame @@ -159,6 +160,8 @@ static int do_signal(struct pt_regs *regs) void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) { + user_exit(); + if (thread_info_flags & _TIF_UPROBE) uprobe_notify_resume(regs); @@ -169,4 +172,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) clear_thread_flag(TIF_NOTIFY_RESUME); tracehook_notify_resume(regs); } + + user_enter(); } diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 83efa2f7d926..a7a648f6b750 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -35,6 +35,7 @@ #include <linux/kdebug.h> #include <linux/debugfs.h> #include <linux/ratelimit.h> +#include <linux/context_tracking.h> #include <asm/emulated_ops.h> #include <asm/pgtable.h> @@ -667,6 +668,7 @@ int machine_check_generic(struct pt_regs *regs) void machine_check_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); int recover = 0; __get_cpu_var(irq_stat).mce_exceptions++; @@ -683,7 +685,7 @@ void machine_check_exception(struct pt_regs *regs) recover = cur_cpu_spec->machine_check(regs); if (recover > 0) - return; + goto bail; #if defined(CONFIG_8xx) && defined(CONFIG_PCI) /* the qspan pci read routines can cause machine checks -- Cort @@ -693,20 +695,23 @@ void machine_check_exception(struct pt_regs *regs) * -- BenH */ bad_page_fault(regs, regs->dar, SIGBUS); - return; + goto bail; #endif if (debugger_fault_handler(regs)) - return; + goto bail; if (check_io_access(regs)) - return; + goto bail; die("Machine check", regs, SIGBUS); /* Must die if the interrupt is not recoverable */ if (!(regs->msr & MSR_RI)) panic("Unrecoverable Machine check"); + +bail: + exception_exit(prev_state); } void SMIException(struct pt_regs *regs) @@ -716,20 +721,29 @@ void SMIException(struct pt_regs *regs) void unknown_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); + printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n", regs->nip, regs->msr, regs->trap); _exception(SIGTRAP, regs, 0, 0); + + exception_exit(prev_state); } void instruction_breakpoint_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); + if (notify_die(DIE_IABR_MATCH, "iabr_match", regs, 5, 5, SIGTRAP) == NOTIFY_STOP) - return; + goto bail; if (debugger_iabr_match(regs)) - return; + goto bail; _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); + +bail: + exception_exit(prev_state); } void RunModeException(struct pt_regs *regs) @@ -739,15 +753,20 @@ void RunModeException(struct pt_regs *regs) void __kprobes single_step_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); + clear_single_step(regs); if (notify_die(DIE_SSTEP, "single_step", regs, 5, 5, SIGTRAP) == NOTIFY_STOP) - return; + goto bail; if (debugger_sstep(regs)) - return; + goto bail; _exception(SIGTRAP, regs, TRAP_TRACE, regs->nip); + +bail: + exception_exit(prev_state); } /* @@ -1005,6 +1024,7 @@ int is_valid_bugaddr(unsigned long addr) void __kprobes program_check_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); unsigned int reason = get_reason(regs); extern int do_mathemu(struct pt_regs *regs); @@ -1014,26 +1034,26 @@ void __kprobes program_check_exception(struct pt_regs *regs) if (reason & REASON_FP) { /* IEEE FP exception */ parse_fpe(regs); - return; + goto bail; } if (reason & REASON_TRAP) { /* Debugger is first in line to stop recursive faults in * rcu_lock, notify_die, or atomic_notifier_call_chain */ if (debugger_bpt(regs)) - return; + goto bail; /* trap exception */ if (notify_die(DIE_BPT, "breakpoint", regs, 5, 5, SIGTRAP) == NOTIFY_STOP) - return; + goto bail; if (!(regs->msr & MSR_PR) && /* not user-mode */ report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) { regs->nip += 4; - return; + goto bail; } _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); - return; + goto bail; } #ifdef CONFIG_PPC_TRANSACTIONAL_MEM if (reason & REASON_TM) { @@ -1049,7 +1069,7 @@ void __kprobes program_check_exception(struct pt_regs *regs) if (!user_mode(regs) && report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) { regs->nip += 4; - return; + goto bail; } /* If usermode caused this, it's done something illegal and * gets a SIGILL slap on the wrist. We call it an illegal @@ -1059,7 +1079,7 @@ void __kprobes program_check_exception(struct pt_regs *regs) */ if (user_mode(regs)) { _exception(SIGILL, regs, ILL_ILLOPN, regs->nip); - return; + goto bail; } else { printk(KERN_EMERG "Unexpected TM Bad Thing exception " "at %lx (msr 0x%x)\n", regs->nip, reason); @@ -1083,16 +1103,16 @@ void __kprobes program_check_exception(struct pt_regs *regs) switch (do_mathemu(regs)) { case 0: emulate_single_step(regs); - return; + goto bail; case 1: { int code = 0; code = __parse_fpscr(current->thread.fpscr.val); _exception(SIGFPE, regs, code, regs->nip); - return; + goto bail; } case -EFAULT: _exception(SIGSEGV, regs, SEGV_MAPERR, regs->nip); - return; + goto bail; } /* fall through on any other errors */ #endif /* CONFIG_MATH_EMULATION */ @@ -1103,10 +1123,10 @@ void __kprobes program_check_exception(struct pt_regs *regs) case 0: regs->nip += 4; emulate_single_step(regs); - return; + goto bail; case -EFAULT: _exception(SIGSEGV, regs, SEGV_MAPERR, regs->nip); - return; + goto bail; } } @@ -1114,10 +1134,14 @@ void __kprobes program_check_exception(struct pt_regs *regs) _exception(SIGILL, regs, ILL_PRVOPC, regs->nip); else _exception(SIGILL, regs, ILL_ILLOPC, regs->nip); + +bail: + exception_exit(prev_state); } void alignment_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); int sig, code, fixed = 0; /* We restore the interrupt state now */ @@ -1131,7 +1155,7 @@ void alignment_exception(struct pt_regs *regs) if (fixed == 1) { regs->nip += 4; /* skip over emulated instruction */ emulate_single_step(regs); - return; + goto bail; } /* Operand address was bad */ @@ -1146,6 +1170,9 @@ void alignment_exception(struct pt_regs *regs) _exception(sig, regs, code, regs->dar); else bad_page_fault(regs, regs->dar, sig); + +bail: + exception_exit(prev_state); } void StackOverflow(struct pt_regs *regs) @@ -1174,23 +1201,32 @@ void trace_syscall(struct pt_regs *regs) void kernel_fp_unavailable_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); + printk(KERN_EMERG "Unrecoverable FP Unavailable Exception " "%lx at %lx\n", regs->trap, regs->nip); die("Unrecoverable FP Unavailable Exception", regs, SIGABRT); + + exception_exit(prev_state); } void altivec_unavailable_exception(struct pt_regs *regs) { + enum ctx_state prev_state = exception_enter(); + if (user_mode(regs)) { /* A user program has executed an altivec instruction, but this kernel doesn't support altivec. */ _exception(SIGILL, regs, ILL_ILLOPC, regs->nip); - return; + goto bail; } printk(KERN_EMERG "Unrecoverable VMX/Altivec Unavailable Exception " "%lx at %lx\n", regs->trap, regs->nip); die("Unrecoverable VMX/Altivec Unavailable Exception", regs, SIGABRT); + +bail: + exception_exit(prev_state); } void vsx_unavailable_exception(struct pt_regs *regs) diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c index 13b867093499..9d3fdcd66290 100644 --- a/arch/powerpc/kernel/udbg.c +++ b/arch/powerpc/kernel/udbg.c @@ -64,6 +64,9 @@ void __init udbg_early_init(void) udbg_init_usbgecko(); #elif defined(CONFIG_PPC_EARLY_DEBUG_WSP) udbg_init_wsp(); +#elif defined(CONFIG_PPC_EARLY_DEBUG_MEMCONS) + /* In memory console */ + udbg_init_memcons(); #elif defined(CONFIG_PPC_EARLY_DEBUG_EHV_BC) udbg_init_ehv_bc(); #elif defined(CONFIG_PPC_EARLY_DEBUG_PS3GELIC) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 229951ffc351..8726779e1409 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -32,6 +32,7 @@ #include <linux/perf_event.h> #include <linux/magic.h> #include <linux/ratelimit.h> +#include <linux/context_tracking.h> #include <asm/firmware.h> #include <asm/page.h> @@ -196,6 +197,7 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code) { + enum ctx_state prev_state = exception_enter(); struct vm_area_struct * vma; struct mm_struct *mm = current->mm; unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; @@ -204,6 +206,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, int trap = TRAP(regs); int is_exec = trap == 0x400; int fault; + int rc = 0; #if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) /* @@ -230,28 +233,30 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, * look at it */ if (error_code & ICSWX_DSI_UCT) { - int rc = acop_handle_fault(regs, address, error_code); + rc = acop_handle_fault(regs, address, error_code); if (rc) - return rc; + goto bail; } #endif /* CONFIG_PPC_ICSWX */ if (notify_page_fault(regs)) - return 0; + goto bail; if (unlikely(debugger_fault_handler(regs))) - return 0; + goto bail; /* On a kernel SLB miss we can only check for a valid exception entry */ - if (!user_mode(regs) && (address >= TASK_SIZE)) - return SIGSEGV; + if (!user_mode(regs) && (address >= TASK_SIZE)) { + rc = SIGSEGV; + goto bail; + } #if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE) || \ defined(CONFIG_PPC_BOOK3S_64)) if (error_code & DSISR_DABRMATCH) { /* breakpoint match */ do_break(regs, address, error_code); - return 0; + goto bail; } #endif @@ -260,8 +265,10 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, local_irq_enable(); if (in_atomic() || mm == NULL) { - if (!user_mode(regs)) - return SIGSEGV; + if (!user_mode(regs)) { + rc = SIGSEGV; + goto bail; + } /* in_atomic() in user mode is really bad, as is current->mm == NULL. */ printk(KERN_EMERG "Page fault in user mode with " @@ -417,9 +424,11 @@ good_area: */ fault = handle_mm_fault(mm, vma, address, flags); if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) { - int rc = mm_fault_error(regs, address, fault); + rc = mm_fault_error(regs, address, fault); if (rc >= MM_FAULT_RETURN) - return rc; + goto bail; + else + rc = 0; } /* @@ -454,7 +463,7 @@ good_area: } up_read(&mm->mmap_sem); - return 0; + goto bail; bad_area: up_read(&mm->mmap_sem); @@ -463,7 +472,7 @@ bad_area_nosemaphore: /* User mode accesses cause a SIGSEGV */ if (user_mode(regs)) { _exception(SIGSEGV, regs, code, address); - return 0; + goto bail; } if (is_exec && (error_code & DSISR_PROTFAULT)) @@ -471,7 +480,11 @@ bad_area_nosemaphore: " page (%lx) - exploit attempt? (uid: %d)\n", address, from_kuid(&init_user_ns, current_uid())); - return SIGSEGV; + rc = SIGSEGV; + +bail: + exception_exit(prev_state); + return rc; } diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 88ac0eeaadde..e303a6d74e3a 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -33,6 +33,7 @@ #include <linux/init.h> #include <linux/signal.h> #include <linux/memblock.h> +#include <linux/context_tracking.h> #include <asm/processor.h> #include <asm/pgtable.h> @@ -954,6 +955,7 @@ void hash_failure_debug(unsigned long ea, unsigned long access, */ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) { + enum ctx_state prev_state = exception_enter(); pgd_t *pgdir; unsigned long vsid; struct mm_struct *mm; @@ -973,7 +975,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) mm = current->mm; if (! mm) { DBG_LOW(" user region with no mm !\n"); - return 1; + rc = 1; + goto bail; } psize = get_slice_psize(mm, ea); ssize = user_segment_size(ea); @@ -992,19 +995,23 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) /* Not a valid range * Send the problem up to do_page_fault */ - return 1; + rc = 1; + goto bail; } DBG_LOW(" mm=%p, mm->pgdir=%p, vsid=%016lx\n", mm, mm->pgd, vsid); /* Bad address. */ if (!vsid) { DBG_LOW("Bad address!\n"); - return 1; + rc = 1; + goto bail; } /* Get pgdir */ pgdir = mm->pgd; - if (pgdir == NULL) - return 1; + if (pgdir == NULL) { + rc = 1; + goto bail; + } /* Check CPU locality */ tmp = cpumask_of(smp_processor_id()); @@ -1027,7 +1034,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) ptep = find_linux_pte_or_hugepte(pgdir, ea, &hugeshift); if (ptep == NULL || !pte_present(*ptep)) { DBG_LOW(" no PTE !\n"); - return 1; + rc = 1; + goto bail; } /* Add _PAGE_PRESENT to the required access perm */ @@ -1038,13 +1046,16 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) */ if (access & ~pte_val(*ptep)) { DBG_LOW(" no access !\n"); - return 1; + rc = 1; + goto bail; } #ifdef CONFIG_HUGETLB_PAGE - if (hugeshift) - return __hash_page_huge(ea, access, vsid, ptep, trap, local, + if (hugeshift) { + rc = __hash_page_huge(ea, access, vsid, ptep, trap, local, ssize, hugeshift, psize); + goto bail; + } #endif /* CONFIG_HUGETLB_PAGE */ #ifndef CONFIG_PPC_64K_PAGES @@ -1124,6 +1135,9 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap) pte_val(*(ptep + PTRS_PER_PTE))); #endif DBG_LOW(" -> rc=%d\n", rc); + +bail: + exception_exit(prev_state); return rc; } EXPORT_SYMBOL_GPL(hash_page); @@ -1259,6 +1273,8 @@ void flush_hash_range(unsigned long number, int local) */ void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc) { + enum ctx_state prev_state = exception_enter(); + if (user_mode(regs)) { #ifdef CONFIG_PPC_SUBPAGE_PROT if (rc == -2) @@ -1268,6 +1284,8 @@ void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc) _exception(SIGBUS, regs, BUS_ADRERR, address); } else bad_page_fault(regs, address, SIGBUS); + + exception_exit(prev_state); } long hpte_insert_repeating(unsigned long hash, unsigned long vpn, diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c index c2787bf779ca..a90b9c458990 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -215,7 +215,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start, unsigned long phys) { int mapped = htab_bolt_mapping(start, start + page_size, phys, - PAGE_KERNEL, mmu_vmemmap_psize, + pgprot_val(PAGE_KERNEL), + mmu_vmemmap_psize, mmu_kernel_ssize); BUG_ON(mapped < 0); } diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index c627843c5b2e..426180b84978 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -13,11 +13,13 @@ #include <linux/perf_event.h> #include <linux/percpu.h> #include <linux/hardirq.h> +#include <linux/uaccess.h> #include <asm/reg.h> #include <asm/pmc.h> #include <asm/machdep.h> #include <asm/firmware.h> #include <asm/ptrace.h> +#include <asm/code-patching.h> #define BHRB_MAX_ENTRIES 32 #define BHRB_TARGET 0x0000000000000002 @@ -100,6 +102,10 @@ static inline int siar_valid(struct pt_regs *regs) return 1; } +static inline void power_pmu_bhrb_enable(struct perf_event *event) {} +static inline void power_pmu_bhrb_disable(struct perf_event *event) {} +void power_pmu_flush_branch_stack(void) {} +static inline void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw) {} #endif /* CONFIG_PPC32 */ static bool regs_use_siar(struct pt_regs *regs) @@ -308,6 +314,159 @@ static inline int siar_valid(struct pt_regs *regs) return 1; } + +/* Reset all possible BHRB entries */ +static void power_pmu_bhrb_reset(void) +{ + asm volatile(PPC_CLRBHRB); +} + +static void power_pmu_bhrb_enable(struct perf_event *event) +{ + struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); + + if (!ppmu->bhrb_nr) + return; + + /* Clear BHRB if we changed task context to avoid data leaks */ + if (event->ctx->task && cpuhw->bhrb_context != event->ctx) { + power_pmu_bhrb_reset(); + cpuhw->bhrb_context = event->ctx; + } + cpuhw->bhrb_users++; +} + +static void power_pmu_bhrb_disable(struct perf_event *event) +{ + struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); + + if (!ppmu->bhrb_nr) + return; + + cpuhw->bhrb_users--; + WARN_ON_ONCE(cpuhw->bhrb_users < 0); + + if (!cpuhw->disabled && !cpuhw->bhrb_users) { + /* BHRB cannot be turned off when other + * events are active on the PMU. + */ + + /* avoid stale pointer */ + cpuhw->bhrb_context = NULL; + } +} + +/* Called from ctxsw to prevent one process's branch entries to + * mingle with the other process's entries during context switch. + */ +void power_pmu_flush_branch_stack(void) +{ + if (ppmu->bhrb_nr) + power_pmu_bhrb_reset(); +} +/* Calculate the to address for a branch */ +static __u64 power_pmu_bhrb_to(u64 addr) +{ + unsigned int instr; + int ret; + __u64 target; + + if (is_kernel_addr(addr)) + return branch_target((unsigned int *)addr); + + /* Userspace: need copy instruction here then translate it */ + pagefault_disable(); + ret = __get_user_inatomic(instr, (unsigned int __user *)addr); + if (ret) { + pagefault_enable(); + return 0; + } + pagefault_enable(); + + target = branch_target(&instr); + if ((!target) || (instr & BRANCH_ABSOLUTE)) + return target; + + /* Translate relative branch target from kernel to user address */ + return target - (unsigned long)&instr + addr; +} + +/* Processing BHRB entries */ +void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw) +{ + u64 val; + u64 addr; + int r_index, u_index, pred; + + r_index = 0; + u_index = 0; + while (r_index < ppmu->bhrb_nr) { + /* Assembly read function */ + val = read_bhrb(r_index++); + if (!val) + /* Terminal marker: End of valid BHRB entries */ + break; + else { + addr = val & BHRB_EA; + pred = val & BHRB_PREDICTION; + + if (!addr) + /* invalid entry */ + continue; + + /* Branches are read most recent first (ie. mfbhrb 0 is + * the most recent branch). + * There are two types of valid entries: + * 1) a target entry which is the to address of a + * computed goto like a blr,bctr,btar. The next + * entry read from the bhrb will be branch + * corresponding to this target (ie. the actual + * blr/bctr/btar instruction). + * 2) a from address which is an actual branch. If a + * target entry proceeds this, then this is the + * matching branch for that target. If this is not + * following a target entry, then this is a branch + * where the target is given as an immediate field + * in the instruction (ie. an i or b form branch). + * In this case we need to read the instruction from + * memory to determine the target/to address. + */ + + if (val & BHRB_TARGET) { + /* Target branches use two entries + * (ie. computed gotos/XL form) + */ + cpuhw->bhrb_entries[u_index].to = addr; + cpuhw->bhrb_entries[u_index].mispred = pred; + cpuhw->bhrb_entries[u_index].predicted = ~pred; + + /* Get from address in next entry */ + val = read_bhrb(r_index++); + addr = val & BHRB_EA; + if (val & BHRB_TARGET) { + /* Shouldn't have two targets in a + row.. Reset index and try again */ + r_index--; + addr = 0; + } + cpuhw->bhrb_entries[u_index].from = addr; + } else { + /* Branches to immediate field + (ie I or B form) */ + cpuhw->bhrb_entries[u_index].from = addr; + cpuhw->bhrb_entries[u_index].to = + power_pmu_bhrb_to(addr); + cpuhw->bhrb_entries[u_index].mispred = pred; + cpuhw->bhrb_entries[u_index].predicted = ~pred; + } + u_index++; + + } + } + cpuhw->bhrb_stack.nr = u_index; + return; +} + #endif /* CONFIG_PPC64 */ static void perf_event_interrupt(struct pt_regs *regs); @@ -904,47 +1063,6 @@ static int collect_events(struct perf_event *group, int max_count, return n; } -/* Reset all possible BHRB entries */ -static void power_pmu_bhrb_reset(void) -{ - asm volatile(PPC_CLRBHRB); -} - -void power_pmu_bhrb_enable(struct perf_event *event) -{ - struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); - - if (!ppmu->bhrb_nr) - return; - - /* Clear BHRB if we changed task context to avoid data leaks */ - if (event->ctx->task && cpuhw->bhrb_context != event->ctx) { - power_pmu_bhrb_reset(); - cpuhw->bhrb_context = event->ctx; - } - cpuhw->bhrb_users++; -} - -void power_pmu_bhrb_disable(struct perf_event *event) -{ - struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); - - if (!ppmu->bhrb_nr) - return; - - cpuhw->bhrb_users--; - WARN_ON_ONCE(cpuhw->bhrb_users < 0); - - if (!cpuhw->disabled && !cpuhw->bhrb_users) { - /* BHRB cannot be turned off when other - * events are active on the PMU. - */ - - /* avoid stale pointer */ - cpuhw->bhrb_context = NULL; - } -} - /* * Add a event to the PMU. * If all events are not already frozen, then we disable and @@ -1180,15 +1298,6 @@ int power_pmu_commit_txn(struct pmu *pmu) return 0; } -/* Called from ctxsw to prevent one process's branch entries to - * mingle with the other process's entries during context switch. - */ -void power_pmu_flush_branch_stack(void) -{ - if (ppmu->bhrb_nr) - power_pmu_bhrb_reset(); -} - /* * Return 1 if we might be able to put event on a limited PMC, * or 0 if not. @@ -1458,77 +1567,6 @@ struct pmu power_pmu = { .flush_branch_stack = power_pmu_flush_branch_stack, }; -/* Processing BHRB entries */ -void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw) -{ - u64 val; - u64 addr; - int r_index, u_index, target, pred; - - r_index = 0; - u_index = 0; - while (r_index < ppmu->bhrb_nr) { - /* Assembly read function */ - val = read_bhrb(r_index); - - /* Terminal marker: End of valid BHRB entries */ - if (val == 0) { - break; - } else { - /* BHRB field break up */ - addr = val & BHRB_EA; - pred = val & BHRB_PREDICTION; - target = val & BHRB_TARGET; - - /* Probable Missed entry: Not applicable for POWER8 */ - if ((addr == 0) && (target == 0) && (pred == 1)) { - r_index++; - continue; - } - - /* Real Missed entry: Power8 based missed entry */ - if ((addr == 0) && (target == 1) && (pred == 1)) { - r_index++; - continue; - } - - /* Reserved condition: Not a valid entry */ - if ((addr == 0) && (target == 1) && (pred == 0)) { - r_index++; - continue; - } - - /* Is a target address */ - if (val & BHRB_TARGET) { - /* First address cannot be a target address */ - if (r_index == 0) { - r_index++; - continue; - } - - /* Update target address for the previous entry */ - cpuhw->bhrb_entries[u_index - 1].to = addr; - cpuhw->bhrb_entries[u_index - 1].mispred = pred; - cpuhw->bhrb_entries[u_index - 1].predicted = ~pred; - - /* Dont increment u_index */ - r_index++; - } else { - /* Update address, flags for current entry */ - cpuhw->bhrb_entries[u_index].from = addr; - cpuhw->bhrb_entries[u_index].mispred = pred; - cpuhw->bhrb_entries[u_index].predicted = ~pred; - - /* Successfully popullated one entry */ - u_index++; - r_index++; - } - } - } - cpuhw->bhrb_stack.nr = u_index; - return; -} - /* * A counter has overflowed; update its count and record * things if requested. Note that interrupts are hard-disabled diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig index a881232a3cce..b62aab3e22ec 100644 --- a/arch/powerpc/platforms/Kconfig +++ b/arch/powerpc/platforms/Kconfig @@ -128,7 +128,7 @@ config PPC_RTAS_DAEMON config RTAS_PROC bool "Proc interface to RTAS" - depends on PPC_RTAS + depends on PPC_RTAS && PROC_FS default y config RTAS_FLASH diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c index ade4463226c6..628c564ceadb 100644 --- a/arch/powerpc/platforms/powernv/opal.c +++ b/arch/powerpc/platforms/powernv/opal.c @@ -15,6 +15,7 @@ #include <linux/of.h> #include <linux/of_platform.h> #include <linux/interrupt.h> +#include <linux/slab.h> #include <asm/opal.h> #include <asm/firmware.h> @@ -28,6 +29,8 @@ struct opal { static struct device_node *opal_node; static DEFINE_SPINLOCK(opal_write_lock); extern u64 opal_mc_secondary_handler[]; +static unsigned int *opal_irqs; +static unsigned int opal_irq_count; int __init early_init_dt_scan_opal(unsigned long node, const char *uname, int depth, void *data) @@ -53,7 +56,11 @@ int __init early_init_dt_scan_opal(unsigned long node, opal.entry, entryp, entrysz); powerpc_firmware_features |= FW_FEATURE_OPAL; - if (of_flat_dt_is_compatible(node, "ibm,opal-v2")) { + if (of_flat_dt_is_compatible(node, "ibm,opal-v3")) { + powerpc_firmware_features |= FW_FEATURE_OPALv2; + powerpc_firmware_features |= FW_FEATURE_OPALv3; + printk("OPAL V3 detected !\n"); + } else if (of_flat_dt_is_compatible(node, "ibm,opal-v2")) { powerpc_firmware_features |= FW_FEATURE_OPALv2; printk("OPAL V2 detected !\n"); } else { @@ -144,6 +151,13 @@ int opal_put_chars(uint32_t vtermno, const char *data, int total_len) rc == OPAL_BUSY_EVENT || rc == OPAL_SUCCESS)) { len = total_len; rc = opal_console_write(vtermno, &len, data); + + /* Closed or other error drop */ + if (rc != OPAL_SUCCESS && rc != OPAL_BUSY && + rc != OPAL_BUSY_EVENT) { + written = total_len; + break; + } if (rc == OPAL_SUCCESS) { total_len -= len; data += len; @@ -316,6 +330,8 @@ static int __init opal_init(void) irqs = of_get_property(opal_node, "opal-interrupts", &irqlen); pr_debug("opal: Found %d interrupts reserved for OPAL\n", irqs ? (irqlen / 4) : 0); + opal_irq_count = irqlen / 4; + opal_irqs = kzalloc(opal_irq_count * sizeof(unsigned int), GFP_KERNEL); for (i = 0; irqs && i < (irqlen / 4); i++, irqs++) { unsigned int hwirq = be32_to_cpup(irqs); unsigned int irq = irq_create_mapping(NULL, hwirq); @@ -327,7 +343,19 @@ static int __init opal_init(void) if (rc) pr_warning("opal: Error %d requesting irq %d" " (0x%x)\n", rc, irq, hwirq); + opal_irqs[i] = irq; } return 0; } subsys_initcall(opal_init); + +void opal_shutdown(void) +{ + unsigned int i; + + for (i = 0; i < opal_irq_count; i++) { + if (opal_irqs[i]) + free_irq(opal_irqs[i], 0); + opal_irqs[i] = 0; + } +} diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 1da578b7c1bf..3937aaae5bc4 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1048,6 +1048,12 @@ static u32 pnv_ioda_bdfn_to_pe(struct pnv_phb *phb, struct pci_bus *bus, return phb->ioda.pe_rmap[(bus->number << 8) | devfn]; } +static void pnv_pci_ioda_shutdown(struct pnv_phb *phb) +{ + opal_pci_reset(phb->opal_id, OPAL_PCI_IODA_TABLE_RESET, + OPAL_ASSERT_RESET); +} + void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type) { struct pci_controller *hose; @@ -1178,6 +1184,9 @@ void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type) /* Setup TCEs */ phb->dma_dev_setup = pnv_pci_ioda_dma_dev_setup; + /* Setup shutdown function for kexec */ + phb->shutdown = pnv_pci_ioda_shutdown; + /* Setup MSI support */ pnv_pci_init_ioda_msis(phb); diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c index 55dfca844ddf..163bd7422f1c 100644 --- a/arch/powerpc/platforms/powernv/pci.c +++ b/arch/powerpc/platforms/powernv/pci.c @@ -450,6 +450,18 @@ static void pnv_pci_dma_dev_setup(struct pci_dev *pdev) pnv_pci_dma_fallback_setup(hose, pdev); } +void pnv_pci_shutdown(void) +{ + struct pci_controller *hose; + + list_for_each_entry(hose, &hose_list, list_node) { + struct pnv_phb *phb = hose->private_data; + + if (phb && phb->shutdown) + phb->shutdown(phb); + } +} + /* Fixup wrong class code in p7ioc and p8 root complex */ static void pnv_p7ioc_rc_quirk(struct pci_dev *dev) { diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h index 48dc4bb856a1..25d76c4df50b 100644 --- a/arch/powerpc/platforms/powernv/pci.h +++ b/arch/powerpc/platforms/powernv/pci.h @@ -86,6 +86,7 @@ struct pnv_phb { void (*dma_dev_setup)(struct pnv_phb *phb, struct pci_dev *pdev); void (*fixup_phb)(struct pci_controller *hose); u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn); + void (*shutdown)(struct pnv_phb *phb); union { struct { @@ -158,4 +159,5 @@ extern void pnv_pci_init_ioda_hub(struct device_node *np); extern void pnv_pci_init_ioda2_phb(struct device_node *np); extern void pnv_pci_ioda_tce_invalidate(struct iommu_table *tbl, u64 *startp, u64 *endp); + #endif /* __POWERNV_PCI_H */ diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h index 8a9df7f9667e..a1c6f83fc391 100644 --- a/arch/powerpc/platforms/powernv/powernv.h +++ b/arch/powerpc/platforms/powernv/powernv.h @@ -9,8 +9,10 @@ static inline void pnv_smp_init(void) { } #ifdef CONFIG_PCI extern void pnv_pci_init(void); +extern void pnv_pci_shutdown(void); #else static inline void pnv_pci_init(void) { } +static inline void pnv_pci_shutdown(void) { } #endif #endif /* _POWERNV_H */ diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index db1ad1c8f68f..d4459bfc92f7 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -78,7 +78,9 @@ static void pnv_show_cpuinfo(struct seq_file *m) if (root) model = of_get_property(root, "model", NULL); seq_printf(m, "machine\t\t: PowerNV %s\n", model); - if (firmware_has_feature(FW_FEATURE_OPALv2)) + if (firmware_has_feature(FW_FEATURE_OPALv3)) + seq_printf(m, "firmware\t: OPAL v3\n"); + else if (firmware_has_feature(FW_FEATURE_OPALv2)) seq_printf(m, "firmware\t: OPAL v2\n"); else if (firmware_has_feature(FW_FEATURE_OPAL)) seq_printf(m, "firmware\t: OPAL v1\n"); @@ -126,6 +128,17 @@ static void pnv_progress(char *s, unsigned short hex) { } +static void pnv_shutdown(void) +{ + /* Let the PCI code clear up IODA tables */ + pnv_pci_shutdown(); + + /* And unregister all OPAL interrupts so they don't fire + * up while we kexec + */ + opal_shutdown(); +} + #ifdef CONFIG_KEXEC static void pnv_kexec_cpu_down(int crash_shutdown, int secondary) { @@ -187,6 +200,7 @@ define_machine(powernv) { .init_IRQ = pnv_init_IRQ, .show_cpuinfo = pnv_show_cpuinfo, .progress = pnv_progress, + .machine_shutdown = pnv_shutdown, .power_save = power7_idle, .calibrate_decr = generic_calibrate_decr, #ifdef CONFIG_KEXEC diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 6a3ecca5b725..88c9459c3e07 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -71,18 +71,68 @@ int pnv_smp_kick_cpu(int nr) BUG_ON(nr < 0 || nr >= NR_CPUS); - /* On OPAL v2 the CPU are still spinning inside OPAL itself, - * get them back now + /* + * If we already started or OPALv2 is not supported, we just + * kick the CPU via the PACA */ - if (!paca[nr].cpu_start && firmware_has_feature(FW_FEATURE_OPALv2)) { - pr_devel("OPAL: Starting CPU %d (HW 0x%x)...\n", nr, pcpu); - rc = opal_start_cpu(pcpu, start_here); + if (paca[nr].cpu_start || !firmware_has_feature(FW_FEATURE_OPALv2)) + goto kick; + + /* + * At this point, the CPU can either be spinning on the way in + * from kexec or be inside OPAL waiting to be started for the + * first time. OPAL v3 allows us to query OPAL to know if it + * has the CPUs, so we do that + */ + if (firmware_has_feature(FW_FEATURE_OPALv3)) { + uint8_t status; + + rc = opal_query_cpu_status(pcpu, &status); if (rc != OPAL_SUCCESS) { - pr_warn("OPAL Error %ld starting CPU %d\n", + pr_warn("OPAL Error %ld querying CPU %d state\n", rc, nr); return -ENODEV; } + + /* + * Already started, just kick it, probably coming from + * kexec and spinning + */ + if (status == OPAL_THREAD_STARTED) + goto kick; + + /* + * Available/inactive, let's kick it + */ + if (status == OPAL_THREAD_INACTIVE) { + pr_devel("OPAL: Starting CPU %d (HW 0x%x)...\n", + nr, pcpu); + rc = opal_start_cpu(pcpu, start_here); + if (rc != OPAL_SUCCESS) { + pr_warn("OPAL Error %ld starting CPU %d\n", + rc, nr); + return -ENODEV; + } + } else { + /* + * An unavailable CPU (or any other unknown status) + * shouldn't be started. It should also + * not be in the possible map but currently it can + * happen + */ + pr_devel("OPAL: CPU %d (HW 0x%x) is unavailable" + " (status %d)...\n", nr, pcpu, status); + return -ENODEV; + } + } else { + /* + * On OPAL v2, we just kick it and hope for the best, + * we must not test the error from opal_start_cpu() or + * we would fail to get CPUs from kexec. + */ + opal_start_cpu(pcpu, start_here); } + kick: return smp_generic_kick_cpu(nr); } diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 9a0941bc4d31..023b288f895b 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -18,6 +18,7 @@ config PPC_PSERIES select PPC_PCI_CHOICE if EXPERT select ZLIB_DEFLATE select PPC_DOORBELL + select HAVE_CONTEXT_TRACKING default y config PPC_SPLPAR diff --git a/arch/powerpc/platforms/pseries/suspend.c b/arch/powerpc/platforms/pseries/suspend.c index 47226e04126d..5f997e79d570 100644 --- a/arch/powerpc/platforms/pseries/suspend.c +++ b/arch/powerpc/platforms/pseries/suspend.c @@ -16,6 +16,7 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ +#include <linux/cpu.h> #include <linux/delay.h> #include <linux/suspend.h> #include <linux/stat.h> @@ -126,11 +127,15 @@ static ssize_t store_hibernate(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { + cpumask_var_t offline_mask; int rc; if (!capable(CAP_SYS_ADMIN)) return -EPERM; + if (!alloc_cpumask_var(&offline_mask, GFP_TEMPORARY)) + return -ENOMEM; + stream_id = simple_strtoul(buf, NULL, 16); do { @@ -140,15 +145,32 @@ static ssize_t store_hibernate(struct device *dev, } while (rc == -EAGAIN); if (!rc) { + /* All present CPUs must be online */ + cpumask_andnot(offline_mask, cpu_present_mask, + cpu_online_mask); + rc = rtas_online_cpus_mask(offline_mask); + if (rc) { + pr_err("%s: Could not bring present CPUs online.\n", + __func__); + goto out; + } + stop_topology_update(); rc = pm_suspend(PM_SUSPEND_MEM); start_topology_update(); + + /* Take down CPUs not online prior to suspend */ + if (!rtas_offline_cpus_mask(offline_mask)) + pr_warn("%s: Could not restore CPUs to offline " + "state.\n", __func__); } stream_id = 0; if (!rc) rc = count; +out: + free_cpumask_var(offline_mask); return rc; } diff --git a/arch/powerpc/platforms/wsp/ics.c b/arch/powerpc/platforms/wsp/ics.c index 97fe82ee8633..2d3b1dd9571d 100644 --- a/arch/powerpc/platforms/wsp/ics.c +++ b/arch/powerpc/platforms/wsp/ics.c @@ -361,7 +361,7 @@ static int wsp_chip_set_affinity(struct irq_data *d, xive = xive_set_server(xive, get_irq_server(ics, hw_irq)); wsp_ics_set_xive(ics, hw_irq, xive); - return 0; + return IRQ_SET_MASK_OK; } static struct irq_chip wsp_irq_chip = { diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile index b0a518e97599..99464a7bdb3b 100644 --- a/arch/powerpc/sysdev/Makefile +++ b/arch/powerpc/sysdev/Makefile @@ -64,6 +64,8 @@ endif obj-$(CONFIG_PPC_SCOM) += scom.o +obj-$(CONFIG_PPC_EARLY_DEBUG_MEMCONS) += udbg_memcons.o + subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror obj-$(CONFIG_PPC_XICS) += xics/ diff --git a/arch/powerpc/sysdev/ehv_pic.c b/arch/powerpc/sysdev/ehv_pic.c index 6e0e1005227f..9cd0e60716fe 100644 --- a/arch/powerpc/sysdev/ehv_pic.c +++ b/arch/powerpc/sysdev/ehv_pic.c @@ -81,7 +81,7 @@ int ehv_pic_set_affinity(struct irq_data *d, const struct cpumask *dest, ev_int_set_config(src, config, prio, cpuid); spin_unlock_irqrestore(&ehv_pic_lock, flags); - return 0; + return IRQ_SET_MASK_OK; } static unsigned int ehv_pic_type_to_vecpri(unsigned int type) diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c index ee21b5e71aec..0a13ecb270c7 100644 --- a/arch/powerpc/sysdev/mpic.c +++ b/arch/powerpc/sysdev/mpic.c @@ -836,7 +836,7 @@ int mpic_set_affinity(struct irq_data *d, const struct cpumask *cpumask, mpic_physmask(mask)); } - return 0; + return IRQ_SET_MASK_OK; } static unsigned int mpic_type_to_vecpri(struct mpic *mpic, unsigned int type) diff --git a/arch/powerpc/sysdev/udbg_memcons.c b/arch/powerpc/sysdev/udbg_memcons.c new file mode 100644 index 000000000000..ce5a7b489e4b --- /dev/null +++ b/arch/powerpc/sysdev/udbg_memcons.c @@ -0,0 +1,105 @@ +/* + * A udbg backend which logs messages and reads input from in memory + * buffers. + * + * The console output can be read from memcons_output which is a + * circular buffer whose next write position is stored in memcons.output_pos. + * + * Input may be passed by writing into the memcons_input buffer when it is + * empty. The input buffer is empty when both input_pos == input_start and + * *input_start == '\0'. + * + * Copyright (C) 2003-2005 Anton Blanchard and Milton Miller, IBM Corp + * Copyright (C) 2013 Alistair Popple, IBM Corp + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include <linux/init.h> +#include <linux/kernel.h> +#include <asm/barrier.h> +#include <asm/page.h> +#include <asm/processor.h> +#include <asm/udbg.h> + +struct memcons { + char *output_start; + char *output_pos; + char *output_end; + char *input_start; + char *input_pos; + char *input_end; +}; + +static char memcons_output[CONFIG_PPC_MEMCONS_OUTPUT_SIZE]; +static char memcons_input[CONFIG_PPC_MEMCONS_INPUT_SIZE]; + +struct memcons memcons = { + .output_start = memcons_output, + .output_pos = memcons_output, + .output_end = &memcons_output[CONFIG_PPC_MEMCONS_OUTPUT_SIZE], + .input_start = memcons_input, + .input_pos = memcons_input, + .input_end = &memcons_input[CONFIG_PPC_MEMCONS_INPUT_SIZE], +}; + +void memcons_putc(char c) +{ + char *new_output_pos; + + *memcons.output_pos = c; + wmb(); + new_output_pos = memcons.output_pos + 1; + if (new_output_pos >= memcons.output_end) + new_output_pos = memcons.output_start; + + memcons.output_pos = new_output_pos; +} + +int memcons_getc_poll(void) +{ + char c; + char *new_input_pos; + + if (*memcons.input_pos) { + c = *memcons.input_pos; + + new_input_pos = memcons.input_pos + 1; + if (new_input_pos >= memcons.input_end) + new_input_pos = memcons.input_start; + else if (*new_input_pos == '\0') + new_input_pos = memcons.input_start; + + *memcons.input_pos = '\0'; + wmb(); + memcons.input_pos = new_input_pos; + return c; + } + + return -1; +} + +int memcons_getc(void) +{ + int c; + + while (1) { + c = memcons_getc_poll(); + if (c == -1) + cpu_relax(); + else + break; + } + + return c; +} + +void udbg_init_memcons(void) +{ + udbg_putc = memcons_putc; + udbg_getc = memcons_getc; + udbg_getc_poll = memcons_getc_poll; +} diff --git a/arch/powerpc/sysdev/xics/ics-opal.c b/arch/powerpc/sysdev/xics/ics-opal.c index f7e8609df0d5..39d72212655e 100644 --- a/arch/powerpc/sysdev/xics/ics-opal.c +++ b/arch/powerpc/sysdev/xics/ics-opal.c @@ -148,7 +148,7 @@ static int ics_opal_set_affinity(struct irq_data *d, __func__, d->irq, hw_irq, server, rc); return -1; } - return 0; + return IRQ_SET_MASK_OK; } static struct irq_chip ics_opal_irq_chip = { diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6a154a91c7e7..685692c94f05 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -108,7 +108,6 @@ config X86 select GENERIC_CLOCKEVENTS_BROADCAST if X86_64 || (X86_32 && X86_LOCAL_APIC) select GENERIC_TIME_VSYSCALL if X86_64 select KTIME_SCALAR if X86_32 - select ALWAYS_USE_PERSISTENT_CLOCK select GENERIC_STRNCPY_FROM_USER select GENERIC_STRNLEN_USER select HAVE_CONTEXT_TRACKING if X86_64 diff --git a/arch/x86/kernel/microcode_intel_early.c b/arch/x86/kernel/microcode_intel_early.c index d893e8ed8ac9..2e9e12871c2b 100644 --- a/arch/x86/kernel/microcode_intel_early.c +++ b/arch/x86/kernel/microcode_intel_early.c @@ -487,6 +487,7 @@ static inline void show_saved_mc(void) #endif #if defined(CONFIG_MICROCODE_INTEL_EARLY) && defined(CONFIG_HOTPLUG_CPU) +static DEFINE_MUTEX(x86_cpu_microcode_mutex); /* * Save this mc into mc_saved_data. So it will be loaded early when a CPU is * hot added or resumes. @@ -507,7 +508,7 @@ int save_mc_for_early(u8 *mc) * Hold hotplug lock so mc_saved_data is not accessed by a CPU in * hotplug. */ - cpu_hotplug_driver_lock(); + mutex_lock(&x86_cpu_microcode_mutex); mc_saved_count_init = mc_saved_data.mc_saved_count; mc_saved_count = mc_saved_data.mc_saved_count; @@ -544,7 +545,7 @@ int save_mc_for_early(u8 *mc) } out: - cpu_hotplug_driver_unlock(); + mutex_unlock(&x86_cpu_microcode_mutex); return ret; } diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 607af0d4d5ef..4e7a37ff03ab 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -312,6 +312,8 @@ void arch_cpu_idle(void) { if (cpuidle_idle_call()) x86_idle(); + else + local_irq_enable(); } /* @@ -368,9 +370,6 @@ void amd_e400_remove_cpu(int cpu) */ static void amd_e400_idle(void) { - if (need_resched()) - return; - if (!amd_e400_c1e_detected) { u32 lo, hi; diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index fdc5dca14fb3..eaac1743def7 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -359,7 +359,17 @@ unsigned long __init_refok init_memory_mapping(unsigned long start, } /* - * would have hole in the middle or ends, and only ram parts will be mapped. + * We need to iterate through the E820 memory map and create direct mappings + * for only E820_RAM and E820_KERN_RESERVED regions. We cannot simply + * create direct mappings for all pfns from [0 to max_low_pfn) and + * [4GB to max_pfn) because of possible memory holes in high addresses + * that cannot be marked as UC by fixed/variable range MTRRs. + * Depending on the alignment of E820 ranges, this may possibly result + * in using smaller size (i.e. 4K instead of 2M or 1G) page tables. + * + * init_mem_mapping() calls init_range_memory_mapping() with big range. + * That range would have hole in the middle or ends, and only ram parts + * will be mapped in init_range_memory_mapping(). */ static unsigned long __init init_range_memory_mapping( unsigned long r_start, @@ -419,6 +429,13 @@ void __init init_mem_mapping(void) max_pfn_mapped = 0; /* will get exact value next */ min_pfn_mapped = real_end >> PAGE_SHIFT; last_start = start = real_end; + + /* + * We start from the top (end of memory) and go to the bottom. + * The memblock_find_in_range() gets us a block of RAM from the + * end of RAM in [min_pfn_mapped, max_pfn_mapped) used as new pages + * for page table. + */ while (last_start > ISA_END_ADDRESS) { if (last_start > step_size) { start = round_down(last_start - 1, step_size); diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index ca63104136e0..d6d314027b5d 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -55,6 +55,39 @@ #define SECTOR_SHIFT 9 #define SECTOR_SIZE (1ULL << SECTOR_SHIFT) +/* + * Increment the given counter and return its updated value. + * If the counter is already 0 it will not be incremented. + * If the counter is already at its maximum value returns + * -EINVAL without updating it. + */ +static int atomic_inc_return_safe(atomic_t *v) +{ + unsigned int counter; + + counter = (unsigned int)__atomic_add_unless(v, 1, 0); + if (counter <= (unsigned int)INT_MAX) + return (int)counter; + + atomic_dec(v); + + return -EINVAL; +} + +/* Decrement the counter. Return the resulting value, or -EINVAL */ +static int atomic_dec_return_safe(atomic_t *v) +{ + int counter; + + counter = atomic_dec_return(v); + if (counter >= 0) + return counter; + + atomic_inc(v); + + return -EINVAL; +} + #define RBD_DRV_NAME "rbd" #define RBD_DRV_NAME_LONG "rbd (rados block device)" @@ -100,21 +133,20 @@ * block device image metadata (in-memory version) */ struct rbd_image_header { - /* These four fields never change for a given rbd image */ + /* These six fields never change for a given rbd image */ char *object_prefix; - u64 features; __u8 obj_order; __u8 crypt_type; __u8 comp_type; + u64 stripe_unit; + u64 stripe_count; + u64 features; /* Might be changeable someday? */ /* The remaining fields need to be updated occasionally */ u64 image_size; struct ceph_snap_context *snapc; - char *snap_names; - u64 *snap_sizes; - - u64 stripe_unit; - u64 stripe_count; + char *snap_names; /* format 1 only */ + u64 *snap_sizes; /* format 1 only */ }; /* @@ -225,6 +257,7 @@ struct rbd_obj_request { }; }; struct page **copyup_pages; + u32 copyup_page_count; struct ceph_osd_request *osd_req; @@ -257,6 +290,7 @@ struct rbd_img_request { struct rbd_obj_request *obj_request; /* obj req initiator */ }; struct page **copyup_pages; + u32 copyup_page_count; spinlock_t completion_lock;/* protects next_completion */ u32 next_completion; rbd_img_callback_t callback; @@ -311,6 +345,7 @@ struct rbd_device { struct rbd_spec *parent_spec; u64 parent_overlap; + atomic_t parent_ref; struct rbd_device *parent; /* protects updating the header */ @@ -359,7 +394,8 @@ static ssize_t rbd_add(struct bus_type *bus, const char *buf, size_t count); static ssize_t rbd_remove(struct bus_type *bus, const char *buf, size_t count); -static int rbd_dev_image_probe(struct rbd_device *rbd_dev); +static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping); +static void rbd_spec_put(struct rbd_spec *spec); static struct bus_attribute rbd_bus_attrs[] = { __ATTR(add, S_IWUSR, NULL, rbd_add), @@ -426,7 +462,8 @@ static void rbd_img_parent_read(struct rbd_obj_request *obj_request); static void rbd_dev_remove_parent(struct rbd_device *rbd_dev); static int rbd_dev_refresh(struct rbd_device *rbd_dev); -static int rbd_dev_v2_refresh(struct rbd_device *rbd_dev); +static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev); +static int rbd_dev_v2_header_info(struct rbd_device *rbd_dev); static const char *rbd_dev_v2_snap_name(struct rbd_device *rbd_dev, u64 snap_id); static int _rbd_dev_v2_snap_size(struct rbd_device *rbd_dev, u64 snap_id, @@ -726,88 +763,123 @@ static bool rbd_dev_ondisk_valid(struct rbd_image_header_ondisk *ondisk) } /* - * Create a new header structure, translate header format from the on-disk - * header. + * Fill an rbd image header with information from the given format 1 + * on-disk header. */ -static int rbd_header_from_disk(struct rbd_image_header *header, +static int rbd_header_from_disk(struct rbd_device *rbd_dev, struct rbd_image_header_ondisk *ondisk) { + struct rbd_image_header *header = &rbd_dev->header; + bool first_time = header->object_prefix == NULL; + struct ceph_snap_context *snapc; + char *object_prefix = NULL; + char *snap_names = NULL; + u64 *snap_sizes = NULL; u32 snap_count; - size_t len; size_t size; + int ret = -ENOMEM; u32 i; - memset(header, 0, sizeof (*header)); + /* Allocate this now to avoid having to handle failure below */ - snap_count = le32_to_cpu(ondisk->snap_count); + if (first_time) { + size_t len; - len = strnlen(ondisk->object_prefix, sizeof (ondisk->object_prefix)); - header->object_prefix = kmalloc(len + 1, GFP_KERNEL); - if (!header->object_prefix) - return -ENOMEM; - memcpy(header->object_prefix, ondisk->object_prefix, len); - header->object_prefix[len] = '\0'; + len = strnlen(ondisk->object_prefix, + sizeof (ondisk->object_prefix)); + object_prefix = kmalloc(len + 1, GFP_KERNEL); + if (!object_prefix) + return -ENOMEM; + memcpy(object_prefix, ondisk->object_prefix, len); + object_prefix[len] = '\0'; + } + /* Allocate the snapshot context and fill it in */ + + snap_count = le32_to_cpu(ondisk->snap_count); + snapc = ceph_create_snap_context(snap_count, GFP_KERNEL); + if (!snapc) + goto out_err; + snapc->seq = le64_to_cpu(ondisk->snap_seq); if (snap_count) { + struct rbd_image_snap_ondisk *snaps; u64 snap_names_len = le64_to_cpu(ondisk->snap_names_len); - /* Save a copy of the snapshot names */ + /* We'll keep a copy of the snapshot names... */ - if (snap_names_len > (u64) SIZE_MAX) - return -EIO; - header->snap_names = kmalloc(snap_names_len, GFP_KERNEL); - if (!header->snap_names) + if (snap_names_len > (u64)SIZE_MAX) + goto out_2big; + snap_names = kmalloc(snap_names_len, GFP_KERNEL); + if (!snap_names) goto out_err; + + /* ...as well as the array of their sizes. */ + + size = snap_count * sizeof (*header->snap_sizes); + snap_sizes = kmalloc(size, GFP_KERNEL); + if (!snap_sizes) + goto out_err; + /* - * Note that rbd_dev_v1_header_read() guarantees - * the ondisk buffer we're working with has + * Copy the names, and fill in each snapshot's id + * and size. + * + * Note that rbd_dev_v1_header_info() guarantees the + * ondisk buffer we're working with has * snap_names_len bytes beyond the end of the * snapshot id array, this memcpy() is safe. */ - memcpy(header->snap_names, &ondisk->snaps[snap_count], - snap_names_len); + memcpy(snap_names, &ondisk->snaps[snap_count], snap_names_len); + snaps = ondisk->snaps; + for (i = 0; i < snap_count; i++) { + snapc->snaps[i] = le64_to_cpu(snaps[i].id); + snap_sizes[i] = le64_to_cpu(snaps[i].image_size); + } + } - /* Record each snapshot's size */ + /* We won't fail any more, fill in the header */ - size = snap_count * sizeof (*header->snap_sizes); - header->snap_sizes = kmalloc(size, GFP_KERNEL); - if (!header->snap_sizes) - goto out_err; - for (i = 0; i < snap_count; i++) - header->snap_sizes[i] = - le64_to_cpu(ondisk->snaps[i].image_size); + down_write(&rbd_dev->header_rwsem); + if (first_time) { + header->object_prefix = object_prefix; + header->obj_order = ondisk->options.order; + header->crypt_type = ondisk->options.crypt_type; + header->comp_type = ondisk->options.comp_type; + /* The rest aren't used for format 1 images */ + header->stripe_unit = 0; + header->stripe_count = 0; + header->features = 0; } else { - header->snap_names = NULL; - header->snap_sizes = NULL; + ceph_put_snap_context(header->snapc); + kfree(header->snap_names); + kfree(header->snap_sizes); } - header->features = 0; /* No features support in v1 images */ - header->obj_order = ondisk->options.order; - header->crypt_type = ondisk->options.crypt_type; - header->comp_type = ondisk->options.comp_type; - - /* Allocate and fill in the snapshot context */ + /* The remaining fields always get updated (when we refresh) */ header->image_size = le64_to_cpu(ondisk->image_size); + header->snapc = snapc; + header->snap_names = snap_names; + header->snap_sizes = snap_sizes; - header->snapc = ceph_create_snap_context(snap_count, GFP_KERNEL); - if (!header->snapc) - goto out_err; - header->snapc->seq = le64_to_cpu(ondisk->snap_seq); - for (i = 0; i < snap_count; i++) - header->snapc->snaps[i] = le64_to_cpu(ondisk->snaps[i].id); + /* Make sure mapping size is consistent with header info */ - return 0; + if (rbd_dev->spec->snap_id == CEPH_NOSNAP || first_time) + if (rbd_dev->mapping.size != header->image_size) + rbd_dev->mapping.size = header->image_size; + + up_write(&rbd_dev->header_rwsem); + return 0; +out_2big: + ret = -EIO; out_err: - kfree(header->snap_sizes); - header->snap_sizes = NULL; - kfree(header->snap_names); - header->snap_names = NULL; - kfree(header->object_prefix); - header->object_prefix = NULL; + kfree(snap_sizes); + kfree(snap_names); + ceph_put_snap_context(snapc); + kfree(object_prefix); - return -ENOMEM; + return ret; } static const char *_rbd_dev_v1_snap_name(struct rbd_device *rbd_dev, u32 which) @@ -934,20 +1006,11 @@ static int rbd_snap_features(struct rbd_device *rbd_dev, u64 snap_id, static int rbd_dev_mapping_set(struct rbd_device *rbd_dev) { - const char *snap_name = rbd_dev->spec->snap_name; - u64 snap_id; + u64 snap_id = rbd_dev->spec->snap_id; u64 size = 0; u64 features = 0; int ret; - if (strcmp(snap_name, RBD_SNAP_HEAD_NAME)) { - snap_id = rbd_snap_id_by_name(rbd_dev, snap_name); - if (snap_id == CEPH_NOSNAP) - return -ENOENT; - } else { - snap_id = CEPH_NOSNAP; - } - ret = rbd_snap_size(rbd_dev, snap_id, &size); if (ret) return ret; @@ -958,11 +1021,6 @@ static int rbd_dev_mapping_set(struct rbd_device *rbd_dev) rbd_dev->mapping.size = size; rbd_dev->mapping.features = features; - /* If we are mapping a snapshot it must be marked read-only */ - - if (snap_id != CEPH_NOSNAP) - rbd_dev->mapping.read_only = true; - return 0; } @@ -970,14 +1028,6 @@ static void rbd_dev_mapping_clear(struct rbd_device *rbd_dev) { rbd_dev->mapping.size = 0; rbd_dev->mapping.features = 0; - rbd_dev->mapping.read_only = true; -} - -static void rbd_dev_clear_mapping(struct rbd_device *rbd_dev) -{ - rbd_dev->mapping.size = 0; - rbd_dev->mapping.features = 0; - rbd_dev->mapping.read_only = true; } static const char *rbd_segment_name(struct rbd_device *rbd_dev, u64 offset) @@ -1342,20 +1392,18 @@ static void rbd_obj_request_put(struct rbd_obj_request *obj_request) kref_put(&obj_request->kref, rbd_obj_request_destroy); } -static void rbd_img_request_get(struct rbd_img_request *img_request) -{ - dout("%s: img %p (was %d)\n", __func__, img_request, - atomic_read(&img_request->kref.refcount)); - kref_get(&img_request->kref); -} - +static bool img_request_child_test(struct rbd_img_request *img_request); +static void rbd_parent_request_destroy(struct kref *kref); static void rbd_img_request_destroy(struct kref *kref); static void rbd_img_request_put(struct rbd_img_request *img_request) { rbd_assert(img_request != NULL); dout("%s: img %p (was %d)\n", __func__, img_request, atomic_read(&img_request->kref.refcount)); - kref_put(&img_request->kref, rbd_img_request_destroy); + if (img_request_child_test(img_request)) + kref_put(&img_request->kref, rbd_parent_request_destroy); + else + kref_put(&img_request->kref, rbd_img_request_destroy); } static inline void rbd_img_obj_request_add(struct rbd_img_request *img_request, @@ -1472,6 +1520,12 @@ static void img_request_child_set(struct rbd_img_request *img_request) smp_mb(); } +static void img_request_child_clear(struct rbd_img_request *img_request) +{ + clear_bit(IMG_REQ_CHILD, &img_request->flags); + smp_mb(); +} + static bool img_request_child_test(struct rbd_img_request *img_request) { smp_mb(); @@ -1484,6 +1538,12 @@ static void img_request_layered_set(struct rbd_img_request *img_request) smp_mb(); } +static void img_request_layered_clear(struct rbd_img_request *img_request) +{ + clear_bit(IMG_REQ_LAYERED, &img_request->flags); + smp_mb(); +} + static bool img_request_layered_test(struct rbd_img_request *img_request) { smp_mb(); @@ -1827,6 +1887,74 @@ static void rbd_obj_request_destroy(struct kref *kref) kmem_cache_free(rbd_obj_request_cache, obj_request); } +/* It's OK to call this for a device with no parent */ + +static void rbd_spec_put(struct rbd_spec *spec); +static void rbd_dev_unparent(struct rbd_device *rbd_dev) +{ + rbd_dev_remove_parent(rbd_dev); + rbd_spec_put(rbd_dev->parent_spec); + rbd_dev->parent_spec = NULL; + rbd_dev->parent_overlap = 0; +} + +/* + * Parent image reference counting is used to determine when an + * image's parent fields can be safely torn down--after there are no + * more in-flight requests to the parent image. When the last + * reference is dropped, cleaning them up is safe. + */ +static void rbd_dev_parent_put(struct rbd_device *rbd_dev) +{ + int counter; + + if (!rbd_dev->parent_spec) + return; + + counter = atomic_dec_return_safe(&rbd_dev->parent_ref); + if (counter > 0) + return; + + /* Last reference; clean up parent data structures */ + + if (!counter) + rbd_dev_unparent(rbd_dev); + else + rbd_warn(rbd_dev, "parent reference underflow\n"); +} + +/* + * If an image has a non-zero parent overlap, get a reference to its + * parent. + * + * We must get the reference before checking for the overlap to + * coordinate properly with zeroing the parent overlap in + * rbd_dev_v2_parent_info() when an image gets flattened. We + * drop it again if there is no overlap. + * + * Returns true if the rbd device has a parent with a non-zero + * overlap and a reference for it was successfully taken, or + * false otherwise. + */ +static bool rbd_dev_parent_get(struct rbd_device *rbd_dev) +{ + int counter; + + if (!rbd_dev->parent_spec) + return false; + + counter = atomic_inc_return_safe(&rbd_dev->parent_ref); + if (counter > 0 && rbd_dev->parent_overlap) + return true; + + /* Image was flattened, but parent is not yet torn down */ + + if (counter < 0) + rbd_warn(rbd_dev, "parent reference overflow\n"); + + return false; +} + /* * Caller is responsible for filling in the list of object requests * that comprises the image request, and the Linux request pointer @@ -1835,8 +1963,7 @@ static void rbd_obj_request_destroy(struct kref *kref) static struct rbd_img_request *rbd_img_request_create( struct rbd_device *rbd_dev, u64 offset, u64 length, - bool write_request, - bool child_request) + bool write_request) { struct rbd_img_request *img_request; @@ -1861,9 +1988,7 @@ static struct rbd_img_request *rbd_img_request_create( } else { img_request->snap_id = rbd_dev->spec->snap_id; } - if (child_request) - img_request_child_set(img_request); - if (rbd_dev->parent_spec) + if (rbd_dev_parent_get(rbd_dev)) img_request_layered_set(img_request); spin_lock_init(&img_request->completion_lock); img_request->next_completion = 0; @@ -1873,9 +1998,6 @@ static struct rbd_img_request *rbd_img_request_create( INIT_LIST_HEAD(&img_request->obj_requests); kref_init(&img_request->kref); - rbd_img_request_get(img_request); /* Avoid a warning */ - rbd_img_request_put(img_request); /* TEMPORARY */ - dout("%s: rbd_dev %p %s %llu/%llu -> img %p\n", __func__, rbd_dev, write_request ? "write" : "read", offset, length, img_request); @@ -1897,15 +2019,54 @@ static void rbd_img_request_destroy(struct kref *kref) rbd_img_obj_request_del(img_request, obj_request); rbd_assert(img_request->obj_request_count == 0); + if (img_request_layered_test(img_request)) { + img_request_layered_clear(img_request); + rbd_dev_parent_put(img_request->rbd_dev); + } + if (img_request_write_test(img_request)) ceph_put_snap_context(img_request->snapc); - if (img_request_child_test(img_request)) - rbd_obj_request_put(img_request->obj_request); - kmem_cache_free(rbd_img_request_cache, img_request); } +static struct rbd_img_request *rbd_parent_request_create( + struct rbd_obj_request *obj_request, + u64 img_offset, u64 length) +{ + struct rbd_img_request *parent_request; + struct rbd_device *rbd_dev; + + rbd_assert(obj_request->img_request); + rbd_dev = obj_request->img_request->rbd_dev; + + parent_request = rbd_img_request_create(rbd_dev->parent, + img_offset, length, false); + if (!parent_request) + return NULL; + + img_request_child_set(parent_request); + rbd_obj_request_get(obj_request); + parent_request->obj_request = obj_request; + + return parent_request; +} + +static void rbd_parent_request_destroy(struct kref *kref) +{ + struct rbd_img_request *parent_request; + struct rbd_obj_request *orig_request; + + parent_request = container_of(kref, struct rbd_img_request, kref); + orig_request = parent_request->obj_request; + + parent_request->obj_request = NULL; + rbd_obj_request_put(orig_request); + img_request_child_clear(parent_request); + + rbd_img_request_destroy(kref); +} + static bool rbd_img_obj_end_request(struct rbd_obj_request *obj_request) { struct rbd_img_request *img_request; @@ -2114,7 +2275,7 @@ rbd_img_obj_copyup_callback(struct rbd_obj_request *obj_request) { struct rbd_img_request *img_request; struct rbd_device *rbd_dev; - u64 length; + struct page **pages; u32 page_count; rbd_assert(obj_request->type == OBJ_REQUEST_BIO); @@ -2124,12 +2285,14 @@ rbd_img_obj_copyup_callback(struct rbd_obj_request *obj_request) rbd_dev = img_request->rbd_dev; rbd_assert(rbd_dev); - length = (u64)1 << rbd_dev->header.obj_order; - page_count = (u32)calc_pages_for(0, length); - rbd_assert(obj_request->copyup_pages); - ceph_release_page_vector(obj_request->copyup_pages, page_count); + pages = obj_request->copyup_pages; + rbd_assert(pages != NULL); obj_request->copyup_pages = NULL; + page_count = obj_request->copyup_page_count; + rbd_assert(page_count); + obj_request->copyup_page_count = 0; + ceph_release_page_vector(pages, page_count); /* * We want the transfer count to reflect the size of the @@ -2153,9 +2316,11 @@ rbd_img_obj_parent_read_full_callback(struct rbd_img_request *img_request) struct ceph_osd_client *osdc; struct rbd_device *rbd_dev; struct page **pages; - int result; - u64 obj_size; - u64 xferred; + u32 page_count; + int img_result; + u64 parent_length; + u64 offset; + u64 length; rbd_assert(img_request_child_test(img_request)); @@ -2164,46 +2329,74 @@ rbd_img_obj_parent_read_full_callback(struct rbd_img_request *img_request) pages = img_request->copyup_pages; rbd_assert(pages != NULL); img_request->copyup_pages = NULL; + page_count = img_request->copyup_page_count; + rbd_assert(page_count); + img_request->copyup_page_count = 0; orig_request = img_request->obj_request; rbd_assert(orig_request != NULL); - rbd_assert(orig_request->type == OBJ_REQUEST_BIO); - result = img_request->result; - obj_size = img_request->length; - xferred = img_request->xferred; + rbd_assert(obj_request_type_valid(orig_request->type)); + img_result = img_request->result; + parent_length = img_request->length; + rbd_assert(parent_length == img_request->xferred); + rbd_img_request_put(img_request); - rbd_dev = img_request->rbd_dev; + rbd_assert(orig_request->img_request); + rbd_dev = orig_request->img_request->rbd_dev; rbd_assert(rbd_dev); - rbd_assert(obj_size == (u64)1 << rbd_dev->header.obj_order); - rbd_img_request_put(img_request); + /* + * If the overlap has become 0 (most likely because the + * image has been flattened) we need to free the pages + * and re-submit the original write request. + */ + if (!rbd_dev->parent_overlap) { + struct ceph_osd_client *osdc; - if (result) - goto out_err; + ceph_release_page_vector(pages, page_count); + osdc = &rbd_dev->rbd_client->client->osdc; + img_result = rbd_obj_request_submit(osdc, orig_request); + if (!img_result) + return; + } - /* Allocate the new copyup osd request for the original request */ + if (img_result) + goto out_err; - result = -ENOMEM; - rbd_assert(!orig_request->osd_req); + /* + * The original osd request is of no use to use any more. + * We need a new one that can hold the two ops in a copyup + * request. Allocate the new copyup osd request for the + * original request, and release the old one. + */ + img_result = -ENOMEM; osd_req = rbd_osd_req_create_copyup(orig_request); if (!osd_req) goto out_err; + rbd_osd_req_destroy(orig_request->osd_req); orig_request->osd_req = osd_req; orig_request->copyup_pages = pages; + orig_request->copyup_page_count = page_count; /* Initialize the copyup op */ osd_req_op_cls_init(osd_req, 0, CEPH_OSD_OP_CALL, "rbd", "copyup"); - osd_req_op_cls_request_data_pages(osd_req, 0, pages, obj_size, 0, + osd_req_op_cls_request_data_pages(osd_req, 0, pages, parent_length, 0, false, false); /* Then the original write request op */ + offset = orig_request->offset; + length = orig_request->length; osd_req_op_extent_init(osd_req, 1, CEPH_OSD_OP_WRITE, - orig_request->offset, - orig_request->length, 0, 0); - osd_req_op_extent_osd_data_bio(osd_req, 1, orig_request->bio_list, - orig_request->length); + offset, length, 0, 0); + if (orig_request->type == OBJ_REQUEST_BIO) + osd_req_op_extent_osd_data_bio(osd_req, 1, + orig_request->bio_list, length); + else + osd_req_op_extent_osd_data_pages(osd_req, 1, + orig_request->pages, length, + offset & ~PAGE_MASK, false, false); rbd_osd_req_format_write(orig_request); @@ -2211,13 +2404,13 @@ rbd_img_obj_parent_read_full_callback(struct rbd_img_request *img_request) orig_request->callback = rbd_img_obj_copyup_callback; osdc = &rbd_dev->rbd_client->client->osdc; - result = rbd_obj_request_submit(osdc, orig_request); - if (!result) + img_result = rbd_obj_request_submit(osdc, orig_request); + if (!img_result) return; out_err: /* Record the error code and complete the request */ - orig_request->result = result; + orig_request->result = img_result; orig_request->xferred = 0; obj_request_done_set(orig_request); rbd_obj_request_complete(orig_request); @@ -2249,7 +2442,7 @@ static int rbd_img_obj_parent_read_full(struct rbd_obj_request *obj_request) int result; rbd_assert(obj_request_img_data_test(obj_request)); - rbd_assert(obj_request->type == OBJ_REQUEST_BIO); + rbd_assert(obj_request_type_valid(obj_request->type)); img_request = obj_request->img_request; rbd_assert(img_request != NULL); @@ -2257,15 +2450,6 @@ static int rbd_img_obj_parent_read_full(struct rbd_obj_request *obj_request) rbd_assert(rbd_dev->parent != NULL); /* - * First things first. The original osd request is of no - * use to use any more, we'll need a new one that can hold - * the two ops in a copyup request. We'll get that later, - * but for now we can release the old one. - */ - rbd_osd_req_destroy(obj_request->osd_req); - obj_request->osd_req = NULL; - - /* * Determine the byte range covered by the object in the * child image to which the original request was to be sent. */ @@ -2295,18 +2479,16 @@ static int rbd_img_obj_parent_read_full(struct rbd_obj_request *obj_request) } result = -ENOMEM; - parent_request = rbd_img_request_create(rbd_dev->parent, - img_offset, length, - false, true); + parent_request = rbd_parent_request_create(obj_request, + img_offset, length); if (!parent_request) goto out_err; - rbd_obj_request_get(obj_request); - parent_request->obj_request = obj_request; result = rbd_img_request_fill(parent_request, OBJ_REQUEST_PAGES, pages); if (result) goto out_err; parent_request->copyup_pages = pages; + parent_request->copyup_page_count = page_count; parent_request->callback = rbd_img_obj_parent_read_full_callback; result = rbd_img_request_submit(parent_request); @@ -2314,6 +2496,7 @@ static int rbd_img_obj_parent_read_full(struct rbd_obj_request *obj_request) return 0; parent_request->copyup_pages = NULL; + parent_request->copyup_page_count = 0; parent_request->obj_request = NULL; rbd_obj_request_put(obj_request); out_err: @@ -2331,6 +2514,7 @@ out_err: static void rbd_img_obj_exists_callback(struct rbd_obj_request *obj_request) { struct rbd_obj_request *orig_request; + struct rbd_device *rbd_dev; int result; rbd_assert(!obj_request_img_data_test(obj_request)); @@ -2353,8 +2537,21 @@ static void rbd_img_obj_exists_callback(struct rbd_obj_request *obj_request) obj_request->xferred, obj_request->length); rbd_obj_request_put(obj_request); - rbd_assert(orig_request); - rbd_assert(orig_request->img_request); + /* + * If the overlap has become 0 (most likely because the + * image has been flattened) we need to free the pages + * and re-submit the original write request. + */ + rbd_dev = orig_request->img_request->rbd_dev; + if (!rbd_dev->parent_overlap) { + struct ceph_osd_client *osdc; + + rbd_obj_request_put(orig_request); + osdc = &rbd_dev->rbd_client->client->osdc; + result = rbd_obj_request_submit(osdc, orig_request); + if (!result) + return; + } /* * Our only purpose here is to determine whether the object @@ -2512,14 +2709,36 @@ static void rbd_img_parent_read_callback(struct rbd_img_request *img_request) struct rbd_obj_request *obj_request; struct rbd_device *rbd_dev; u64 obj_end; + u64 img_xferred; + int img_result; rbd_assert(img_request_child_test(img_request)); + /* First get what we need from the image request and release it */ + obj_request = img_request->obj_request; + img_xferred = img_request->xferred; + img_result = img_request->result; + rbd_img_request_put(img_request); + + /* + * If the overlap has become 0 (most likely because the + * image has been flattened) we need to re-submit the + * original request. + */ rbd_assert(obj_request); rbd_assert(obj_request->img_request); + rbd_dev = obj_request->img_request->rbd_dev; + if (!rbd_dev->parent_overlap) { + struct ceph_osd_client *osdc; + + osdc = &rbd_dev->rbd_client->client->osdc; + img_result = rbd_obj_request_submit(osdc, obj_request); + if (!img_result) + return; + } - obj_request->result = img_request->result; + obj_request->result = img_result; if (obj_request->result) goto out; @@ -2532,7 +2751,6 @@ static void rbd_img_parent_read_callback(struct rbd_img_request *img_request) */ rbd_assert(obj_request->img_offset < U64_MAX - obj_request->length); obj_end = obj_request->img_offset + obj_request->length; - rbd_dev = obj_request->img_request->rbd_dev; if (obj_end > rbd_dev->parent_overlap) { u64 xferred = 0; @@ -2540,43 +2758,39 @@ static void rbd_img_parent_read_callback(struct rbd_img_request *img_request) xferred = rbd_dev->parent_overlap - obj_request->img_offset; - obj_request->xferred = min(img_request->xferred, xferred); + obj_request->xferred = min(img_xferred, xferred); } else { - obj_request->xferred = img_request->xferred; + obj_request->xferred = img_xferred; } out: - rbd_img_request_put(img_request); rbd_img_obj_request_read_callback(obj_request); rbd_obj_request_complete(obj_request); } static void rbd_img_parent_read(struct rbd_obj_request *obj_request) { - struct rbd_device *rbd_dev; struct rbd_img_request *img_request; int result; rbd_assert(obj_request_img_data_test(obj_request)); rbd_assert(obj_request->img_request != NULL); rbd_assert(obj_request->result == (s32) -ENOENT); - rbd_assert(obj_request->type == OBJ_REQUEST_BIO); + rbd_assert(obj_request_type_valid(obj_request->type)); - rbd_dev = obj_request->img_request->rbd_dev; - rbd_assert(rbd_dev->parent != NULL); /* rbd_read_finish(obj_request, obj_request->length); */ - img_request = rbd_img_request_create(rbd_dev->parent, + img_request = rbd_parent_request_create(obj_request, obj_request->img_offset, - obj_request->length, - false, true); + obj_request->length); result = -ENOMEM; if (!img_request) goto out_err; - rbd_obj_request_get(obj_request); - img_request->obj_request = obj_request; - - result = rbd_img_request_fill(img_request, OBJ_REQUEST_BIO, - obj_request->bio_list); + if (obj_request->type == OBJ_REQUEST_BIO) + result = rbd_img_request_fill(img_request, OBJ_REQUEST_BIO, + obj_request->bio_list); + else + result = rbd_img_request_fill(img_request, OBJ_REQUEST_PAGES, + obj_request->pages); if (result) goto out_err; @@ -2626,6 +2840,7 @@ out: static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data) { struct rbd_device *rbd_dev = (struct rbd_device *)data; + int ret; if (!rbd_dev) return; @@ -2633,7 +2848,9 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data) dout("%s: \"%s\" notify_id %llu opcode %u\n", __func__, rbd_dev->header_name, (unsigned long long)notify_id, (unsigned int)opcode); - (void)rbd_dev_refresh(rbd_dev); + ret = rbd_dev_refresh(rbd_dev); + if (ret) + rbd_warn(rbd_dev, ": header refresh error (%d)\n", ret); rbd_obj_notify_ack(rbd_dev, notify_id); } @@ -2642,7 +2859,7 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data) * Request sync osd watch/unwatch. The value of "start" determines * whether a watch request is being initiated or torn down. */ -static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, int start) +static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, bool start) { struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc; struct rbd_obj_request *obj_request; @@ -2676,7 +2893,7 @@ static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev, int start) rbd_dev->watch_request->osd_req); osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH, - rbd_dev->watch_event->cookie, 0, start); + rbd_dev->watch_event->cookie, 0, start ? 1 : 0); rbd_osd_req_format_write(obj_request); ret = rbd_obj_request_submit(osdc, obj_request); @@ -2869,9 +3086,16 @@ static void rbd_request_fn(struct request_queue *q) goto end_request; /* Shouldn't happen */ } + result = -EIO; + if (offset + length > rbd_dev->mapping.size) { + rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)\n", + offset, length, rbd_dev->mapping.size); + goto end_request; + } + result = -ENOMEM; img_request = rbd_img_request_create(rbd_dev, offset, length, - write_request, false); + write_request); if (!img_request) goto end_request; @@ -3022,17 +3246,11 @@ out: } /* - * Read the complete header for the given rbd device. - * - * Returns a pointer to a dynamically-allocated buffer containing - * the complete and validated header. Caller can pass the address - * of a variable that will be filled in with the version of the - * header object at the time it was read. - * - * Returns a pointer-coded errno if a failure occurs. + * Read the complete header for the given rbd device. On successful + * return, the rbd_dev->header field will contain up-to-date + * information about the image. */ -static struct rbd_image_header_ondisk * -rbd_dev_v1_header_read(struct rbd_device *rbd_dev) +static int rbd_dev_v1_header_info(struct rbd_device *rbd_dev) { struct rbd_image_header_ondisk *ondisk = NULL; u32 snap_count = 0; @@ -3057,22 +3275,22 @@ rbd_dev_v1_header_read(struct rbd_device *rbd_dev) size += names_size; ondisk = kmalloc(size, GFP_KERNEL); if (!ondisk) - return ERR_PTR(-ENOMEM); + return -ENOMEM; ret = rbd_obj_read_sync(rbd_dev, rbd_dev->header_name, 0, size, ondisk); if (ret < 0) - goto out_err; + goto out; if ((size_t)ret < size) { ret = -ENXIO; rbd_warn(rbd_dev, "short header read (want %zd got %d)", size, ret); - goto out_err; + goto out; } if (!rbd_dev_ondisk_valid(ondisk)) { ret = -ENXIO; rbd_warn(rbd_dev, "invalid header"); - goto out_err; + goto out; } names_size = le64_to_cpu(ondisk->snap_names_len); @@ -3080,85 +3298,13 @@ rbd_dev_v1_header_read(struct rbd_device *rbd_dev) snap_count = le32_to_cpu(ondisk->snap_count); } while (snap_count != want_count); - return ondisk; - -out_err: - kfree(ondisk); - - return ERR_PTR(ret); -} - -/* - * reload the ondisk the header - */ -static int rbd_read_header(struct rbd_device *rbd_dev, - struct rbd_image_header *header) -{ - struct rbd_image_header_ondisk *ondisk; - int ret; - - ondisk = rbd_dev_v1_header_read(rbd_dev); - if (IS_ERR(ondisk)) - return PTR_ERR(ondisk); - ret = rbd_header_from_disk(header, ondisk); + ret = rbd_header_from_disk(rbd_dev, ondisk); +out: kfree(ondisk); return ret; } -static void rbd_update_mapping_size(struct rbd_device *rbd_dev) -{ - if (rbd_dev->spec->snap_id != CEPH_NOSNAP) - return; - - if (rbd_dev->mapping.size != rbd_dev->header.image_size) { - sector_t size; - - rbd_dev->mapping.size = rbd_dev->header.image_size; - size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE; - dout("setting size to %llu sectors", (unsigned long long)size); - set_capacity(rbd_dev->disk, size); - } -} - -/* - * only read the first part of the ondisk header, without the snaps info - */ -static int rbd_dev_v1_refresh(struct rbd_device *rbd_dev) -{ - int ret; - struct rbd_image_header h; - - ret = rbd_read_header(rbd_dev, &h); - if (ret < 0) - return ret; - - down_write(&rbd_dev->header_rwsem); - - /* Update image size, and check for resize of mapped image */ - rbd_dev->header.image_size = h.image_size; - rbd_update_mapping_size(rbd_dev); - - /* rbd_dev->header.object_prefix shouldn't change */ - kfree(rbd_dev->header.snap_sizes); - kfree(rbd_dev->header.snap_names); - /* osd requests may still refer to snapc */ - ceph_put_snap_context(rbd_dev->header.snapc); - - rbd_dev->header.image_size = h.image_size; - rbd_dev->header.snapc = h.snapc; - rbd_dev->header.snap_names = h.snap_names; - rbd_dev->header.snap_sizes = h.snap_sizes; - /* Free the extra copy of the object prefix */ - if (strcmp(rbd_dev->header.object_prefix, h.object_prefix)) - rbd_warn(rbd_dev, "object prefix changed (ignoring)"); - kfree(h.object_prefix); - - up_write(&rbd_dev->header_rwsem); - - return ret; -} - /* * Clear the rbd device's EXISTS flag if the snapshot it's mapped to * has disappeared from the (just updated) snapshot context. @@ -3180,26 +3326,29 @@ static void rbd_exists_validate(struct rbd_device *rbd_dev) static int rbd_dev_refresh(struct rbd_device *rbd_dev) { - u64 image_size; + u64 mapping_size; int ret; rbd_assert(rbd_image_format_valid(rbd_dev->image_format)); - image_size = rbd_dev->header.image_size; + mapping_size = rbd_dev->mapping.size; mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING); if (rbd_dev->image_format == 1) - ret = rbd_dev_v1_refresh(rbd_dev); + ret = rbd_dev_v1_header_info(rbd_dev); else - ret = rbd_dev_v2_refresh(rbd_dev); + ret = rbd_dev_v2_header_info(rbd_dev); /* If it's a mapped snapshot, validate its EXISTS flag */ rbd_exists_validate(rbd_dev); mutex_unlock(&ctl_mutex); - if (ret) - rbd_warn(rbd_dev, "got notification but failed to " - " update snaps: %d\n", ret); - if (image_size != rbd_dev->header.image_size) + if (mapping_size != rbd_dev->mapping.size) { + sector_t size; + + size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE; + dout("setting size to %llu sectors", (unsigned long long)size); + set_capacity(rbd_dev->disk, size); revalidate_disk(rbd_dev->disk); + } return ret; } @@ -3403,6 +3552,8 @@ static ssize_t rbd_image_refresh(struct device *dev, int ret; ret = rbd_dev_refresh(rbd_dev); + if (ret) + rbd_warn(rbd_dev, ": manual header refresh error (%d)\n", ret); return ret < 0 ? ret : size; } @@ -3501,6 +3652,7 @@ static struct rbd_device *rbd_dev_create(struct rbd_client *rbdc, spin_lock_init(&rbd_dev->lock); rbd_dev->flags = 0; + atomic_set(&rbd_dev->parent_ref, 0); INIT_LIST_HEAD(&rbd_dev->node); init_rwsem(&rbd_dev->header_rwsem); @@ -3650,6 +3802,7 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) __le64 snapid; void *p; void *end; + u64 pool_id; char *image_id; u64 overlap; int ret; @@ -3680,18 +3833,37 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) p = reply_buf; end = reply_buf + ret; ret = -ERANGE; - ceph_decode_64_safe(&p, end, parent_spec->pool_id, out_err); - if (parent_spec->pool_id == CEPH_NOPOOL) + ceph_decode_64_safe(&p, end, pool_id, out_err); + if (pool_id == CEPH_NOPOOL) { + /* + * Either the parent never existed, or we have + * record of it but the image got flattened so it no + * longer has a parent. When the parent of a + * layered image disappears we immediately set the + * overlap to 0. The effect of this is that all new + * requests will be treated as if the image had no + * parent. + */ + if (rbd_dev->parent_overlap) { + rbd_dev->parent_overlap = 0; + smp_mb(); + rbd_dev_parent_put(rbd_dev); + pr_info("%s: clone image has been flattened\n", + rbd_dev->disk->disk_name); + } + goto out; /* No parent? No problem. */ + } /* The ceph file layout needs to fit pool id in 32 bits */ ret = -EIO; - if (parent_spec->pool_id > (u64)U32_MAX) { + if (pool_id > (u64)U32_MAX) { rbd_warn(NULL, "parent pool id too large (%llu > %u)\n", - (unsigned long long)parent_spec->pool_id, U32_MAX); + (unsigned long long)pool_id, U32_MAX); goto out_err; } + parent_spec->pool_id = pool_id; image_id = ceph_extract_encoded_string(&p, end, NULL, GFP_KERNEL); if (IS_ERR(image_id)) { @@ -3702,9 +3874,14 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) ceph_decode_64_safe(&p, end, parent_spec->snap_id, out_err); ceph_decode_64_safe(&p, end, overlap, out_err); - rbd_dev->parent_overlap = overlap; - rbd_dev->parent_spec = parent_spec; - parent_spec = NULL; /* rbd_dev now owns this */ + if (overlap) { + rbd_spec_put(rbd_dev->parent_spec); + rbd_dev->parent_spec = parent_spec; + parent_spec = NULL; /* rbd_dev now owns this */ + rbd_dev->parent_overlap = overlap; + } else { + rbd_warn(rbd_dev, "ignoring parent of clone with overlap 0\n"); + } out: ret = 0; out_err: @@ -4002,6 +4179,7 @@ static int rbd_dev_v2_snap_context(struct rbd_device *rbd_dev) for (i = 0; i < snap_count; i++) snapc->snaps[i] = ceph_decode_64(&p); + ceph_put_snap_context(rbd_dev->header.snapc); rbd_dev->header.snapc = snapc; dout(" snap context seq = %llu, snap_count = %u\n", @@ -4053,21 +4231,56 @@ out: return snap_name; } -static int rbd_dev_v2_refresh(struct rbd_device *rbd_dev) +static int rbd_dev_v2_header_info(struct rbd_device *rbd_dev) { + bool first_time = rbd_dev->header.object_prefix == NULL; int ret; down_write(&rbd_dev->header_rwsem); + if (first_time) { + ret = rbd_dev_v2_header_onetime(rbd_dev); + if (ret) + goto out; + } + + /* + * If the image supports layering, get the parent info. We + * need to probe the first time regardless. Thereafter we + * only need to if there's a parent, to see if it has + * disappeared due to the mapped image getting flattened. + */ + if (rbd_dev->header.features & RBD_FEATURE_LAYERING && + (first_time || rbd_dev->parent_spec)) { + bool warn; + + ret = rbd_dev_v2_parent_info(rbd_dev); + if (ret) + goto out; + + /* + * Print a warning if this is the initial probe and + * the image has a parent. Don't print it if the + * image now being probed is itself a parent. We + * can tell at this point because we won't know its + * pool name yet (just its pool id). + */ + warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name; + if (first_time && warn) + rbd_warn(rbd_dev, "WARNING: kernel layering " + "is EXPERIMENTAL!"); + } + ret = rbd_dev_v2_image_size(rbd_dev); if (ret) goto out; - rbd_update_mapping_size(rbd_dev); + + if (rbd_dev->spec->snap_id == CEPH_NOSNAP) + if (rbd_dev->mapping.size != rbd_dev->header.image_size) + rbd_dev->mapping.size = rbd_dev->header.image_size; ret = rbd_dev_v2_snap_context(rbd_dev); dout("rbd_dev_v2_snap_context returned %d\n", ret); - if (ret) - goto out; out: up_write(&rbd_dev->header_rwsem); @@ -4490,10 +4703,10 @@ static void rbd_dev_unprobe(struct rbd_device *rbd_dev) { struct rbd_image_header *header; - rbd_dev_remove_parent(rbd_dev); - rbd_spec_put(rbd_dev->parent_spec); - rbd_dev->parent_spec = NULL; - rbd_dev->parent_overlap = 0; + /* Drop parent reference unless it's already been done (or none) */ + + if (rbd_dev->parent_overlap) + rbd_dev_parent_put(rbd_dev); /* Free dynamic fields from the header, then zero it out */ @@ -4505,72 +4718,22 @@ static void rbd_dev_unprobe(struct rbd_device *rbd_dev) memset(header, 0, sizeof (*header)); } -static int rbd_dev_v1_probe(struct rbd_device *rbd_dev) +static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev) { int ret; - /* Populate rbd image metadata */ - - ret = rbd_read_header(rbd_dev, &rbd_dev->header); - if (ret < 0) - goto out_err; - - /* Version 1 images have no parent (no layering) */ - - rbd_dev->parent_spec = NULL; - rbd_dev->parent_overlap = 0; - - dout("discovered version 1 image, header name is %s\n", - rbd_dev->header_name); - - return 0; - -out_err: - kfree(rbd_dev->header_name); - rbd_dev->header_name = NULL; - kfree(rbd_dev->spec->image_id); - rbd_dev->spec->image_id = NULL; - - return ret; -} - -static int rbd_dev_v2_probe(struct rbd_device *rbd_dev) -{ - int ret; - - ret = rbd_dev_v2_image_size(rbd_dev); - if (ret) - goto out_err; - - /* Get the object prefix (a.k.a. block_name) for the image */ - ret = rbd_dev_v2_object_prefix(rbd_dev); if (ret) goto out_err; - /* Get the and check features for the image */ - + /* + * Get the and check features for the image. Currently the + * features are assumed to never change. + */ ret = rbd_dev_v2_features(rbd_dev); if (ret) goto out_err; - /* If the image supports layering, get the parent info */ - - if (rbd_dev->header.features & RBD_FEATURE_LAYERING) { - ret = rbd_dev_v2_parent_info(rbd_dev); - if (ret) - goto out_err; - - /* - * Don't print a warning for parent images. We can - * tell this point because we won't know its pool - * name yet (just its pool id). - */ - if (rbd_dev->spec->pool_name) - rbd_warn(rbd_dev, "WARNING: kernel layering " - "is EXPERIMENTAL!"); - } - /* If the image supports fancy striping, get its parameters */ if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) { @@ -4578,28 +4741,11 @@ static int rbd_dev_v2_probe(struct rbd_device *rbd_dev) if (ret < 0) goto out_err; } - - /* crypto and compression type aren't (yet) supported for v2 images */ - - rbd_dev->header.crypt_type = 0; - rbd_dev->header.comp_type = 0; - - /* Get the snapshot context, plus the header version */ - - ret = rbd_dev_v2_snap_context(rbd_dev); - if (ret) - goto out_err; - - dout("discovered version 2 image, header name is %s\n", - rbd_dev->header_name); + /* No support for crypto and compression type format 2 images */ return 0; out_err: - rbd_dev->parent_overlap = 0; - rbd_spec_put(rbd_dev->parent_spec); - rbd_dev->parent_spec = NULL; - kfree(rbd_dev->header_name); - rbd_dev->header_name = NULL; + rbd_dev->header.features = 0; kfree(rbd_dev->header.object_prefix); rbd_dev->header.object_prefix = NULL; @@ -4628,15 +4774,16 @@ static int rbd_dev_probe_parent(struct rbd_device *rbd_dev) if (!parent) goto out_err; - ret = rbd_dev_image_probe(parent); + ret = rbd_dev_image_probe(parent, false); if (ret < 0) goto out_err; rbd_dev->parent = parent; + atomic_set(&rbd_dev->parent_ref, 1); return 0; out_err: if (parent) { - rbd_spec_put(rbd_dev->parent_spec); + rbd_dev_unparent(rbd_dev); kfree(rbd_dev->header_name); rbd_dev_destroy(parent); } else { @@ -4651,10 +4798,6 @@ static int rbd_dev_device_setup(struct rbd_device *rbd_dev) { int ret; - ret = rbd_dev_mapping_set(rbd_dev); - if (ret) - return ret; - /* generate unique id: find highest unique id, add one */ rbd_dev_id_get(rbd_dev); @@ -4676,13 +4819,17 @@ static int rbd_dev_device_setup(struct rbd_device *rbd_dev) if (ret) goto err_out_blkdev; - ret = rbd_bus_add_dev(rbd_dev); + ret = rbd_dev_mapping_set(rbd_dev); if (ret) goto err_out_disk; + set_capacity(rbd_dev->disk, rbd_dev->mapping.size / SECTOR_SIZE); + + ret = rbd_bus_add_dev(rbd_dev); + if (ret) + goto err_out_mapping; /* Everything's ready. Announce the disk to the world. */ - set_capacity(rbd_dev->disk, rbd_dev->mapping.size / SECTOR_SIZE); set_bit(RBD_DEV_FLAG_EXISTS, &rbd_dev->flags); add_disk(rbd_dev->disk); @@ -4691,6 +4838,8 @@ static int rbd_dev_device_setup(struct rbd_device *rbd_dev) return ret; +err_out_mapping: + rbd_dev_mapping_clear(rbd_dev); err_out_disk: rbd_free_disk(rbd_dev); err_out_blkdev: @@ -4731,12 +4880,7 @@ static int rbd_dev_header_name(struct rbd_device *rbd_dev) static void rbd_dev_image_release(struct rbd_device *rbd_dev) { - int ret; - rbd_dev_unprobe(rbd_dev); - ret = rbd_dev_header_watch_sync(rbd_dev, 0); - if (ret) - rbd_warn(rbd_dev, "failed to cancel watch event (%d)\n", ret); kfree(rbd_dev->header_name); rbd_dev->header_name = NULL; rbd_dev->image_format = 0; @@ -4748,10 +4892,11 @@ static void rbd_dev_image_release(struct rbd_device *rbd_dev) /* * Probe for the existence of the header object for the given rbd - * device. For format 2 images this includes determining the image - * id. + * device. If this image is the one being mapped (i.e., not a + * parent), initiate a watch on its header object before using that + * object to get detailed information about the rbd image. */ -static int rbd_dev_image_probe(struct rbd_device *rbd_dev) +static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) { int ret; int tmp; @@ -4771,14 +4916,16 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev) if (ret) goto err_out_format; - ret = rbd_dev_header_watch_sync(rbd_dev, 1); - if (ret) - goto out_header_name; + if (mapping) { + ret = rbd_dev_header_watch_sync(rbd_dev, true); + if (ret) + goto out_header_name; + } if (rbd_dev->image_format == 1) - ret = rbd_dev_v1_probe(rbd_dev); + ret = rbd_dev_v1_header_info(rbd_dev); else - ret = rbd_dev_v2_probe(rbd_dev); + ret = rbd_dev_v2_header_info(rbd_dev); if (ret) goto err_out_watch; @@ -4787,15 +4934,22 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev) goto err_out_probe; ret = rbd_dev_probe_parent(rbd_dev); - if (!ret) - return 0; + if (ret) + goto err_out_probe; + + dout("discovered format %u image, header name is %s\n", + rbd_dev->image_format, rbd_dev->header_name); + return 0; err_out_probe: rbd_dev_unprobe(rbd_dev); err_out_watch: - tmp = rbd_dev_header_watch_sync(rbd_dev, 0); - if (tmp) - rbd_warn(rbd_dev, "unable to tear down watch request\n"); + if (mapping) { + tmp = rbd_dev_header_watch_sync(rbd_dev, false); + if (tmp) + rbd_warn(rbd_dev, "unable to tear down " + "watch request (%d)\n", tmp); + } out_header_name: kfree(rbd_dev->header_name); rbd_dev->header_name = NULL; @@ -4819,6 +4973,7 @@ static ssize_t rbd_add(struct bus_type *bus, struct rbd_spec *spec = NULL; struct rbd_client *rbdc; struct ceph_osd_client *osdc; + bool read_only; int rc = -ENOMEM; if (!try_module_get(THIS_MODULE)) @@ -4828,6 +4983,9 @@ static ssize_t rbd_add(struct bus_type *bus, rc = rbd_add_parse_args(buf, &ceph_opts, &rbd_opts, &spec); if (rc < 0) goto err_out_module; + read_only = rbd_opts->read_only; + kfree(rbd_opts); + rbd_opts = NULL; /* done with this */ rbdc = rbd_get_client(ceph_opts); if (IS_ERR(rbdc)) { @@ -4858,14 +5016,16 @@ static ssize_t rbd_add(struct bus_type *bus, rbdc = NULL; /* rbd_dev now owns this */ spec = NULL; /* rbd_dev now owns this */ - rbd_dev->mapping.read_only = rbd_opts->read_only; - kfree(rbd_opts); - rbd_opts = NULL; /* done with this */ - - rc = rbd_dev_image_probe(rbd_dev); + rc = rbd_dev_image_probe(rbd_dev, true); if (rc < 0) goto err_out_rbd_dev; + /* If we are mapping a snapshot it must be marked read-only */ + + if (rbd_dev->spec->snap_id != CEPH_NOSNAP) + read_only = true; + rbd_dev->mapping.read_only = read_only; + rc = rbd_dev_device_setup(rbd_dev); if (!rc) return count; @@ -4911,7 +5071,7 @@ static void rbd_dev_device_release(struct device *dev) rbd_free_disk(rbd_dev); clear_bit(RBD_DEV_FLAG_EXISTS, &rbd_dev->flags); - rbd_dev_clear_mapping(rbd_dev); + rbd_dev_mapping_clear(rbd_dev); unregister_blkdev(rbd_dev->major, rbd_dev->name); rbd_dev->major = 0; rbd_dev_id_put(rbd_dev); @@ -4978,10 +5138,13 @@ static ssize_t rbd_remove(struct bus_type *bus, spin_unlock_irq(&rbd_dev->lock); if (ret < 0) goto done; - ret = count; rbd_bus_del_dev(rbd_dev); + ret = rbd_dev_header_watch_sync(rbd_dev, false); + if (ret) + rbd_warn(rbd_dev, "failed to cancel watch event (%d)\n", ret); rbd_dev_image_release(rbd_dev); module_put(THIS_MODULE); + ret = count; done: mutex_unlock(&ctl_mutex); diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 3a8f7e6db295..e7e92429d10f 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -78,6 +78,10 @@ void drm_warn_on_modeset_not_all_locked(struct drm_device *dev) { struct drm_crtc *crtc; + /* Locking is currently fubar in the panic handler. */ + if (oops_in_progress) + return; + list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) WARN_ON(!mutex_is_locked(&crtc->mutex)); @@ -246,6 +250,7 @@ char *drm_get_connector_status_name(enum drm_connector_status status) else return "unknown"; } +EXPORT_SYMBOL(drm_get_connector_status_name); /** * drm_mode_object_get - allocate a new modeset identifier diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c index e974f9309b72..ed1334e27c33 100644 --- a/drivers/gpu/drm/drm_crtc_helper.c +++ b/drivers/gpu/drm/drm_crtc_helper.c @@ -121,6 +121,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector, connector->helper_private; int count = 0; int mode_flags = 0; + bool verbose_prune = true; DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id, drm_get_connector_name(connector)); @@ -149,6 +150,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector, DRM_DEBUG_KMS("[CONNECTOR:%d:%s] disconnected\n", connector->base.id, drm_get_connector_name(connector)); drm_mode_connector_update_edid_property(connector, NULL); + verbose_prune = false; goto prune; } @@ -182,7 +184,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector, } prune: - drm_mode_prune_invalid(dev, &connector->modes, true); + drm_mode_prune_invalid(dev, &connector->modes, verbose_prune); if (list_empty(&connector->modes)) return 0; @@ -1005,12 +1007,20 @@ static void output_poll_execute(struct work_struct *work) continue; connector->status = connector->funcs->detect(connector, false); - DRM_DEBUG_KMS("[CONNECTOR:%d:%s] status updated from %d to %d\n", - connector->base.id, - drm_get_connector_name(connector), - old_status, connector->status); - if (old_status != connector->status) + if (old_status != connector->status) { + const char *old, *new; + + old = drm_get_connector_status_name(old_status); + new = drm_get_connector_status_name(connector->status); + + DRM_DEBUG_KMS("[CONNECTOR:%d:%s] " + "status updated from %s to %s\n", + connector->base.id, + drm_get_connector_name(connector), + old, new); + changed = true; + } } mutex_unlock(&dev->mode_config.mutex); @@ -1083,10 +1093,11 @@ void drm_helper_hpd_irq_event(struct drm_device *dev) old_status = connector->status; connector->status = connector->funcs->detect(connector, false); - DRM_DEBUG_KMS("[CONNECTOR:%d:%s] status updated from %d to %d\n", + DRM_DEBUG_KMS("[CONNECTOR:%d:%s] status updated from %s to %s\n", connector->base.id, drm_get_connector_name(connector), - old_status, connector->status); + drm_get_connector_status_name(old_status), + drm_get_connector_status_name(connector->status)); if (old_status != connector->status) changed = true; } diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 8d4f29075af5..9cc247f55502 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -57,7 +57,7 @@ static int drm_version(struct drm_device *dev, void *data, struct drm_file *file_priv); #define DRM_IOCTL_DEF(ioctl, _func, _flags) \ - [DRM_IOCTL_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0} + [DRM_IOCTL_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0, .name = #ioctl} /** Ioctl table */ static const struct drm_ioctl_desc drm_ioctls[] = { @@ -375,7 +375,7 @@ long drm_ioctl(struct file *filp, { struct drm_file *file_priv = filp->private_data; struct drm_device *dev; - const struct drm_ioctl_desc *ioctl; + const struct drm_ioctl_desc *ioctl = NULL; drm_ioctl_t *func; unsigned int nr = DRM_IOCTL_NR(cmd); int retcode = -EINVAL; @@ -392,11 +392,6 @@ long drm_ioctl(struct file *filp, atomic_inc(&dev->counts[_DRM_STAT_IOCTLS]); ++file_priv->ioctl_count; - DRM_DEBUG("pid=%d, cmd=0x%02x, nr=0x%02x, dev 0x%lx, auth=%d\n", - task_pid_nr(current), cmd, nr, - (long)old_encode_dev(file_priv->minor->device), - file_priv->authenticated); - if ((nr >= DRM_CORE_IOCTL_COUNT) && ((nr < DRM_COMMAND_BASE) || (nr >= DRM_COMMAND_END))) goto err_i1; @@ -417,6 +412,11 @@ long drm_ioctl(struct file *filp, } else goto err_i1; + DRM_DEBUG("pid=%d, dev=0x%lx, auth=%d, %s\n", + task_pid_nr(current), + (long)old_encode_dev(file_priv->minor->device), + file_priv->authenticated, ioctl->name); + /* Do not trust userspace, use our own definition */ func = ioctl->func; /* is there a local override? */ @@ -471,6 +471,12 @@ long drm_ioctl(struct file *filp, } err_i1: + if (!ioctl) + DRM_DEBUG("invalid iotcl: pid=%d, dev=0x%lx, auth=%d, cmd=0x%02x, nr=0x%02x\n", + task_pid_nr(current), + (long)old_encode_dev(file_priv->minor->device), + file_priv->authenticated, cmd, nr); + if (kdata != stack_kdata) kfree(kdata); atomic_dec(&dev->ioctl_count); diff --git a/drivers/gpu/drm/drm_encoder_slave.c b/drivers/gpu/drm/drm_encoder_slave.c index 48c52f7df4e6..0cfb60f54766 100644 --- a/drivers/gpu/drm/drm_encoder_slave.c +++ b/drivers/gpu/drm/drm_encoder_slave.c @@ -54,16 +54,12 @@ int drm_i2c_encoder_init(struct drm_device *dev, struct i2c_adapter *adap, const struct i2c_board_info *info) { - char modalias[sizeof(I2C_MODULE_PREFIX) - + I2C_NAME_SIZE]; struct module *module = NULL; struct i2c_client *client; struct drm_i2c_encoder_driver *encoder_drv; int err = 0; - snprintf(modalias, sizeof(modalias), - "%s%s", I2C_MODULE_PREFIX, info->type); - request_module(modalias); + request_module("%s%s", I2C_MODULE_PREFIX, info->type); client = i2c_new_device(adap, info); if (!client) { diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index db1e2d6f90d7..07cf99cc8862 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -755,33 +755,35 @@ void drm_mm_debug_table(struct drm_mm *mm, const char *prefix) EXPORT_SYMBOL(drm_mm_debug_table); #if defined(CONFIG_DEBUG_FS) -int drm_mm_dump_table(struct seq_file *m, struct drm_mm *mm) +static unsigned long drm_mm_dump_hole(struct seq_file *m, struct drm_mm_node *entry) { - struct drm_mm_node *entry; - unsigned long total_used = 0, total_free = 0, total = 0; unsigned long hole_start, hole_end, hole_size; - hole_start = drm_mm_hole_node_start(&mm->head_node); - hole_end = drm_mm_hole_node_end(&mm->head_node); - hole_size = hole_end - hole_start; - if (hole_size) + if (entry->hole_follows) { + hole_start = drm_mm_hole_node_start(entry); + hole_end = drm_mm_hole_node_end(entry); + hole_size = hole_end - hole_start; seq_printf(m, "0x%08lx-0x%08lx: 0x%08lx: free\n", hole_start, hole_end, hole_size); - total_free += hole_size; + return hole_size; + } + + return 0; +} + +int drm_mm_dump_table(struct seq_file *m, struct drm_mm *mm) +{ + struct drm_mm_node *entry; + unsigned long total_used = 0, total_free = 0, total = 0; + + total_free += drm_mm_dump_hole(m, &mm->head_node); drm_mm_for_each_node(entry, mm) { seq_printf(m, "0x%08lx-0x%08lx: 0x%08lx: used\n", entry->start, entry->start + entry->size, entry->size); total_used += entry->size; - if (entry->hole_follows) { - hole_start = drm_mm_hole_node_start(entry); - hole_end = drm_mm_hole_node_end(entry); - hole_size = hole_end - hole_start; - seq_printf(m, "0x%08lx-0x%08lx: 0x%08lx: free\n", - hole_start, hole_end, hole_size); - total_free += hole_size; - } + total_free += drm_mm_dump_hole(m, entry); } total = total_free + total_used; diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c index faa79df02648..a371ff865a88 100644 --- a/drivers/gpu/drm/drm_modes.c +++ b/drivers/gpu/drm/drm_modes.c @@ -1143,6 +1143,7 @@ bool drm_mode_parse_command_line_for_connector(const char *mode_option, was_digit = false; } else goto done; + break; case '0' ... '9': was_digit = true; break; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 6be940effefd..6165535d15f0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1045,6 +1045,8 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, if (timeout) { struct timespec sleep_time = timespec_sub(now, before); *timeout = timespec_sub(*timeout, sleep_time); + if (!timespec_valid(timeout)) /* i.e. negative time remains */ + set_normalized_timespec(timeout, 0, 0); } switch (end) { @@ -1053,8 +1055,6 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, case -ERESTARTSYS: /* Signal */ return (int)end; case 0: /* Timeout */ - if (timeout) - set_normalized_timespec(timeout, 0, 0); return -ETIME; default: /* Completed */ WARN_ON(end < 0); /* We're not aware of other errors */ @@ -2377,10 +2377,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) mutex_unlock(&dev->struct_mutex); ret = __wait_seqno(ring, seqno, reset_counter, true, timeout); - if (timeout) { - WARN_ON(!timespec_valid(timeout)); + if (timeout) args->timeout_ns = timespec_to_ns(timeout); - } return ret; out: diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index dca614de71b6..bdb0d7717bc7 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -709,15 +709,6 @@ static inline size_t gen6_get_stolen_size(u16 snb_gmch_ctl) return snb_gmch_ctl << 25; /* 32 MB units */ } -static inline size_t gen7_get_stolen_size(u16 snb_gmch_ctl) -{ - static const int stolen_decoder[] = { - 0, 0, 0, 0, 0, 32, 48, 64, 128, 256, 96, 160, 224, 352}; - snb_gmch_ctl >>= IVB_GMCH_GMS_SHIFT; - snb_gmch_ctl &= IVB_GMCH_GMS_MASK; - return stolen_decoder[snb_gmch_ctl] << 20; -} - static int gen6_gmch_probe(struct drm_device *dev, size_t *gtt_total, size_t *stolen, @@ -747,11 +738,7 @@ static int gen6_gmch_probe(struct drm_device *dev, pci_read_config_word(dev->pdev, SNB_GMCH_CTRL, &snb_gmch_ctl); gtt_size = gen6_get_total_gtt_size(snb_gmch_ctl); - if (IS_GEN7(dev) && !IS_VALLEYVIEW(dev)) - *stolen = gen7_get_stolen_size(snb_gmch_ctl); - else - *stolen = gen6_get_stolen_size(snb_gmch_ctl); - + *stolen = gen6_get_stolen_size(snb_gmch_ctl); *gtt_total = (gtt_size / sizeof(gen6_gtt_pte_t)) << PAGE_SHIFT; /* For Modern GENs the PTEs and register space are split in the BAR */ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 83f9c26e1adb..2d6b62e42daf 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -46,8 +46,6 @@ #define SNB_GMCH_GGMS_MASK 0x3 #define SNB_GMCH_GMS_SHIFT 3 /* Graphics Mode Select */ #define SNB_GMCH_GMS_MASK 0x1f -#define IVB_GMCH_GMS_SHIFT 4 -#define IVB_GMCH_GMS_MASK 0xf /* PCI config space */ diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c index 26a0a570f92e..fb961bb81903 100644 --- a/drivers/gpu/drm/i915/intel_ddi.c +++ b/drivers/gpu/drm/i915/intel_ddi.c @@ -1265,6 +1265,8 @@ static void intel_ddi_pre_enable(struct intel_encoder *intel_encoder) intel_dp_sink_dpms(intel_dp, DRM_MODE_DPMS_ON); intel_dp_start_link_train(intel_dp); intel_dp_complete_link_train(intel_dp); + if (port != PORT_A) + intel_dp_stop_link_train(intel_dp); } } @@ -1326,6 +1328,9 @@ static void intel_enable_ddi(struct intel_encoder *intel_encoder) } else if (type == INTEL_OUTPUT_EDP) { struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + if (port == PORT_A) + intel_dp_stop_link_train(intel_dp); + ironlake_edp_backlight_on(intel_dp); } diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index fb2fbc1e08b9..3d704b706a8d 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -702,6 +702,9 @@ intel_dp_compute_config(struct intel_encoder *encoder, /* Walk through all bpp values. Luckily they're all nicely spaced with 2 * bpc in between. */ bpp = min_t(int, 8*3, pipe_config->pipe_bpp); + if (is_edp(intel_dp) && dev_priv->edp.bpp) + bpp = min_t(int, bpp, dev_priv->edp.bpp); + for (; bpp >= 6*3; bpp -= 2*3) { mode_rate = intel_dp_link_required(target_clock, bpp); @@ -739,6 +742,7 @@ found: intel_dp->link_bw = bws[clock]; intel_dp->lane_count = lane_count; adjusted_mode->clock = drm_dp_bw_code_to_link_rate(intel_dp->link_bw); + pipe_config->pipe_bpp = bpp; pipe_config->pixel_target_clock = target_clock; DRM_DEBUG_KMS("DP link bw %02x lane count %d clock %d bpp %d\n", @@ -751,20 +755,6 @@ found: target_clock, adjusted_mode->clock, &pipe_config->dp_m_n); - /* - * XXX: We have a strange regression where using the vbt edp bpp value - * for the link bw computation results in black screens, the panel only - * works when we do the computation at the usual 24bpp (but still - * requires us to use 18bpp). Until that's fully debugged, stay - * bug-for-bug compatible with the old code. - */ - if (is_edp(intel_dp) && dev_priv->edp.bpp) { - DRM_DEBUG_KMS("clamping display bpc (was %d) to eDP (%d)\n", - bpp, dev_priv->edp.bpp); - bpp = min_t(int, bpp, dev_priv->edp.bpp); - } - pipe_config->pipe_bpp = bpp; - return true; } @@ -1389,6 +1379,7 @@ static void intel_enable_dp(struct intel_encoder *encoder) ironlake_edp_panel_on(intel_dp); ironlake_edp_panel_vdd_off(intel_dp, true); intel_dp_complete_link_train(intel_dp); + intel_dp_stop_link_train(intel_dp); ironlake_edp_backlight_on(intel_dp); } @@ -1711,10 +1702,9 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, struct drm_i915_private *dev_priv = dev->dev_private; enum port port = intel_dig_port->port; int ret; - uint32_t temp; if (HAS_DDI(dev)) { - temp = I915_READ(DP_TP_CTL(port)); + uint32_t temp = I915_READ(DP_TP_CTL(port)); if (dp_train_pat & DP_LINK_SCRAMBLING_DISABLE) temp |= DP_TP_CTL_SCRAMBLE_DISABLE; @@ -1724,18 +1714,6 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, temp &= ~DP_TP_CTL_LINK_TRAIN_MASK; switch (dp_train_pat & DP_TRAINING_PATTERN_MASK) { case DP_TRAINING_PATTERN_DISABLE: - - if (port != PORT_A) { - temp |= DP_TP_CTL_LINK_TRAIN_IDLE; - I915_WRITE(DP_TP_CTL(port), temp); - - if (wait_for((I915_READ(DP_TP_STATUS(port)) & - DP_TP_STATUS_IDLE_DONE), 1)) - DRM_ERROR("Timed out waiting for DP idle patterns\n"); - - temp &= ~DP_TP_CTL_LINK_TRAIN_MASK; - } - temp |= DP_TP_CTL_LINK_TRAIN_NORMAL; break; @@ -1811,6 +1789,37 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, return true; } +static void intel_dp_set_idle_link_train(struct intel_dp *intel_dp) +{ + struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp); + struct drm_device *dev = intel_dig_port->base.base.dev; + struct drm_i915_private *dev_priv = dev->dev_private; + enum port port = intel_dig_port->port; + uint32_t val; + + if (!HAS_DDI(dev)) + return; + + val = I915_READ(DP_TP_CTL(port)); + val &= ~DP_TP_CTL_LINK_TRAIN_MASK; + val |= DP_TP_CTL_LINK_TRAIN_IDLE; + I915_WRITE(DP_TP_CTL(port), val); + + /* + * On PORT_A we can have only eDP in SST mode. There the only reason + * we need to set idle transmission mode is to work around a HW issue + * where we enable the pipe while not in idle link-training mode. + * In this case there is requirement to wait for a minimum number of + * idle patterns to be sent. + */ + if (port == PORT_A) + return; + + if (wait_for((I915_READ(DP_TP_STATUS(port)) & DP_TP_STATUS_IDLE_DONE), + 1)) + DRM_ERROR("Timed out waiting for DP idle patterns\n"); +} + /* Enable corresponding port and start training pattern 1 */ void intel_dp_start_link_train(struct intel_dp *intel_dp) @@ -1953,10 +1962,19 @@ intel_dp_complete_link_train(struct intel_dp *intel_dp) ++tries; } + intel_dp_set_idle_link_train(intel_dp); + + intel_dp->DP = DP; + if (channel_eq) DRM_DEBUG_KMS("Channel EQ done. DP Training successful\n"); - intel_dp_set_link_train(intel_dp, DP, DP_TRAINING_PATTERN_DISABLE); +} + +void intel_dp_stop_link_train(struct intel_dp *intel_dp) +{ + intel_dp_set_link_train(intel_dp, intel_dp->DP, + DP_TRAINING_PATTERN_DISABLE); } static void @@ -2164,6 +2182,7 @@ intel_dp_check_link_status(struct intel_dp *intel_dp) drm_get_encoder_name(&intel_encoder->base)); intel_dp_start_link_train(intel_dp); intel_dp_complete_link_train(intel_dp); + intel_dp_stop_link_train(intel_dp); } } diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index b5b6d19e6dd3..624a9e6b8d71 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -499,6 +499,7 @@ extern void intel_dp_init_connector(struct intel_digital_port *intel_dig_port, extern void intel_dp_init_link_config(struct intel_dp *intel_dp); extern void intel_dp_start_link_train(struct intel_dp *intel_dp); extern void intel_dp_complete_link_train(struct intel_dp *intel_dp); +extern void intel_dp_stop_link_train(struct intel_dp *intel_dp); extern void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode); extern void intel_dp_encoder_destroy(struct drm_encoder *encoder); extern void intel_dp_check_link_status(struct intel_dp *intel_dp); diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c index 0e19e575a1b4..6b7c3ca2c035 100644 --- a/drivers/gpu/drm/i915/intel_fb.c +++ b/drivers/gpu/drm/i915/intel_fb.c @@ -262,10 +262,22 @@ void intel_fbdev_fini(struct drm_device *dev) void intel_fbdev_set_suspend(struct drm_device *dev, int state) { drm_i915_private_t *dev_priv = dev->dev_private; - if (!dev_priv->fbdev) + struct intel_fbdev *ifbdev = dev_priv->fbdev; + struct fb_info *info; + + if (!ifbdev) return; - fb_set_suspend(dev_priv->fbdev->helper.fbdev, state); + info = ifbdev->helper.fbdev; + + /* On resume from hibernation: If the object is shmemfs backed, it has + * been restored from swap. If the object is stolen however, it will be + * full of whatever garbage was left in there. + */ + if (!state && ifbdev->ifb.obj->stolen) + memset_io(info->screen_base, 0, info->screen_size); + + fb_set_suspend(info, state); } MODULE_LICENSE("GPL and additional rights"); diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index de3b0dc5658b..aa01128ff192 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -1301,17 +1301,17 @@ static void valleyview_update_wm(struct drm_device *dev) vlv_update_drain_latency(dev); - if (g4x_compute_wm0(dev, 0, + if (g4x_compute_wm0(dev, PIPE_A, &valleyview_wm_info, latency_ns, &valleyview_cursor_wm_info, latency_ns, &planea_wm, &cursora_wm)) - enabled |= 1; + enabled |= 1 << PIPE_A; - if (g4x_compute_wm0(dev, 1, + if (g4x_compute_wm0(dev, PIPE_B, &valleyview_wm_info, latency_ns, &valleyview_cursor_wm_info, latency_ns, &planeb_wm, &cursorb_wm)) - enabled |= 2; + enabled |= 1 << PIPE_B; if (single_plane_enabled(enabled) && g4x_compute_srwm(dev, ffs(enabled) - 1, @@ -1357,17 +1357,17 @@ static void g4x_update_wm(struct drm_device *dev) int plane_sr, cursor_sr; unsigned int enabled = 0; - if (g4x_compute_wm0(dev, 0, + if (g4x_compute_wm0(dev, PIPE_A, &g4x_wm_info, latency_ns, &g4x_cursor_wm_info, latency_ns, &planea_wm, &cursora_wm)) - enabled |= 1; + enabled |= 1 << PIPE_A; - if (g4x_compute_wm0(dev, 1, + if (g4x_compute_wm0(dev, PIPE_B, &g4x_wm_info, latency_ns, &g4x_cursor_wm_info, latency_ns, &planeb_wm, &cursorb_wm)) - enabled |= 2; + enabled |= 1 << PIPE_B; if (single_plane_enabled(enabled) && g4x_compute_srwm(dev, ffs(enabled) - 1, @@ -1716,7 +1716,7 @@ static void ironlake_update_wm(struct drm_device *dev) unsigned int enabled; enabled = 0; - if (g4x_compute_wm0(dev, 0, + if (g4x_compute_wm0(dev, PIPE_A, &ironlake_display_wm_info, ILK_LP0_PLANE_LATENCY, &ironlake_cursor_wm_info, @@ -1727,10 +1727,10 @@ static void ironlake_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe A -" " plane %d, " "cursor: %d\n", plane_wm, cursor_wm); - enabled |= 1; + enabled |= 1 << PIPE_A; } - if (g4x_compute_wm0(dev, 1, + if (g4x_compute_wm0(dev, PIPE_B, &ironlake_display_wm_info, ILK_LP0_PLANE_LATENCY, &ironlake_cursor_wm_info, @@ -1741,7 +1741,7 @@ static void ironlake_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe B -" " plane %d, cursor: %d\n", plane_wm, cursor_wm); - enabled |= 2; + enabled |= 1 << PIPE_B; } /* @@ -1801,7 +1801,7 @@ static void sandybridge_update_wm(struct drm_device *dev) unsigned int enabled; enabled = 0; - if (g4x_compute_wm0(dev, 0, + if (g4x_compute_wm0(dev, PIPE_A, &sandybridge_display_wm_info, latency, &sandybridge_cursor_wm_info, latency, &plane_wm, &cursor_wm)) { @@ -1812,10 +1812,10 @@ static void sandybridge_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe A -" " plane %d, " "cursor: %d\n", plane_wm, cursor_wm); - enabled |= 1; + enabled |= 1 << PIPE_A; } - if (g4x_compute_wm0(dev, 1, + if (g4x_compute_wm0(dev, PIPE_B, &sandybridge_display_wm_info, latency, &sandybridge_cursor_wm_info, latency, &plane_wm, &cursor_wm)) { @@ -1826,7 +1826,7 @@ static void sandybridge_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe B -" " plane %d, cursor: %d\n", plane_wm, cursor_wm); - enabled |= 2; + enabled |= 1 << PIPE_B; } /* @@ -1904,7 +1904,7 @@ static void ivybridge_update_wm(struct drm_device *dev) unsigned int enabled; enabled = 0; - if (g4x_compute_wm0(dev, 0, + if (g4x_compute_wm0(dev, PIPE_A, &sandybridge_display_wm_info, latency, &sandybridge_cursor_wm_info, latency, &plane_wm, &cursor_wm)) { @@ -1915,10 +1915,10 @@ static void ivybridge_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe A -" " plane %d, " "cursor: %d\n", plane_wm, cursor_wm); - enabled |= 1; + enabled |= 1 << PIPE_A; } - if (g4x_compute_wm0(dev, 1, + if (g4x_compute_wm0(dev, PIPE_B, &sandybridge_display_wm_info, latency, &sandybridge_cursor_wm_info, latency, &plane_wm, &cursor_wm)) { @@ -1929,10 +1929,10 @@ static void ivybridge_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe B -" " plane %d, cursor: %d\n", plane_wm, cursor_wm); - enabled |= 2; + enabled |= 1 << PIPE_B; } - if (g4x_compute_wm0(dev, 2, + if (g4x_compute_wm0(dev, PIPE_C, &sandybridge_display_wm_info, latency, &sandybridge_cursor_wm_info, latency, &plane_wm, &cursor_wm)) { @@ -1943,7 +1943,7 @@ static void ivybridge_update_wm(struct drm_device *dev) DRM_DEBUG_KMS("FIFO watermarks For pipe C -" " plane %d, cursor: %d\n", plane_wm, cursor_wm); - enabled |= 3; + enabled |= 1 << PIPE_C; } /* diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c index f9889658329b..77b8a45fb10a 100644 --- a/drivers/gpu/drm/mgag200/mgag200_mode.c +++ b/drivers/gpu/drm/mgag200/mgag200_mode.c @@ -46,29 +46,26 @@ static void mga_crtc_load_lut(struct drm_crtc *crtc) static inline void mga_wait_vsync(struct mga_device *mdev) { - unsigned int count = 0; + unsigned long timeout = jiffies + HZ/10; unsigned int status = 0; do { status = RREG32(MGAREG_Status); - count++; - } while ((status & 0x08) && (count < 250000)); - count = 0; + } while ((status & 0x08) && time_before(jiffies, timeout)); + timeout = jiffies + HZ/10; status = 0; do { status = RREG32(MGAREG_Status); - count++; - } while (!(status & 0x08) && (count < 250000)); + } while (!(status & 0x08) && time_before(jiffies, timeout)); } static inline void mga_wait_busy(struct mga_device *mdev) { - unsigned int count = 0; + unsigned long timeout = jiffies + HZ; unsigned int status = 0; do { status = RREG8(MGAREG_Status + 2); - count++; - } while ((status & 0x01) && (count < 500000)); + } while ((status & 0x01) && time_before(jiffies, timeout)); } /* @@ -189,12 +186,12 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL_CLK_DIS, tmp); + WREG8(DAC_DATA, tmp); WREG8(DAC_INDEX, MGA1064_REMHEADCTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_REMHEADCTL_CLKDIS; - WREG_DAC(MGA1064_REMHEADCTL, tmp); + WREG8(DAC_DATA, tmp); /* select PLL Set C */ tmp = RREG8(MGAREG_MEM_MISC_READ); @@ -204,7 +201,7 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_POW_DOWN | 0x80; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); udelay(500); @@ -212,7 +209,7 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_VREF_CTL); tmp = RREG8(DAC_DATA); tmp &= ~0x04; - WREG_DAC(MGA1064_VREF_CTL, tmp); + WREG8(DAC_DATA, tmp); udelay(50); @@ -236,13 +233,13 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, long clock) tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_SEL_MSK; tmp |= MGA1064_PIX_CLK_CTL_SEL_PLL; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); WREG8(DAC_INDEX, MGA1064_REMHEADCTL); tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_REMHEADCTL_CLKSL_MSK; tmp |= MGA1064_REMHEADCTL_CLKSL_PLL; - WREG_DAC(MGA1064_REMHEADCTL, tmp); + WREG8(DAC_DATA, tmp); /* reset dotclock rate bit */ WREG8(MGAREG_SEQ_INDEX, 1); @@ -253,7 +250,7 @@ static int mga_g200wb_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); vcount = RREG8(MGAREG_VCOUNT); @@ -318,7 +315,7 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL_CLK_DIS, tmp); + WREG8(DAC_DATA, tmp); tmp = RREG8(MGAREG_MEM_MISC_READ); tmp |= 0x3 << 2; @@ -326,12 +323,12 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_PLL_STAT); tmp = RREG8(DAC_DATA); - WREG_DAC(MGA1064_PIX_PLL_STAT, tmp & ~0x40); + WREG8(DAC_DATA, tmp & ~0x40); WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_POW_DOWN; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); WREG_DAC(MGA1064_EV_PIX_PLLC_M, m); WREG_DAC(MGA1064_EV_PIX_PLLC_N, n); @@ -342,7 +339,7 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_CLK_POW_DOWN; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); udelay(500); @@ -350,11 +347,11 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, long clock) tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_SEL_MSK; tmp |= MGA1064_PIX_CLK_CTL_SEL_PLL; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); WREG8(DAC_INDEX, MGA1064_PIX_PLL_STAT); tmp = RREG8(DAC_DATA); - WREG_DAC(MGA1064_PIX_PLL_STAT, tmp | 0x40); + WREG8(DAC_DATA, tmp | 0x40); tmp = RREG8(MGAREG_MEM_MISC_READ); tmp |= (0x3 << 2); @@ -363,7 +360,7 @@ static int mga_g200ev_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); return 0; } @@ -416,7 +413,7 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL_CLK_DIS, tmp); + WREG8(DAC_DATA, tmp); tmp = RREG8(MGAREG_MEM_MISC_READ); tmp |= 0x3 << 2; @@ -425,7 +422,7 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_POW_DOWN; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); udelay(500); @@ -439,13 +436,13 @@ static int mga_g200eh_set_plls(struct mga_device *mdev, long clock) tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_SEL_MSK; tmp |= MGA1064_PIX_CLK_CTL_SEL_PLL; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_CLK_DIS; tmp &= ~MGA1064_PIX_CLK_CTL_CLK_POW_DOWN; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); vcount = RREG8(MGAREG_VCOUNT); @@ -515,12 +512,12 @@ static int mga_g200er_set_plls(struct mga_device *mdev, long clock) WREG8(DAC_INDEX, MGA1064_PIX_CLK_CTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_PIX_CLK_CTL_CLK_DIS; - WREG_DAC(MGA1064_PIX_CLK_CTL_CLK_DIS, tmp); + WREG8(DAC_DATA, tmp); WREG8(DAC_INDEX, MGA1064_REMHEADCTL); tmp = RREG8(DAC_DATA); tmp |= MGA1064_REMHEADCTL_CLKDIS; - WREG_DAC(MGA1064_REMHEADCTL, tmp); + WREG8(DAC_DATA, tmp); tmp = RREG8(MGAREG_MEM_MISC_READ); tmp |= (0x3<<2) | 0xc0; @@ -530,7 +527,7 @@ static int mga_g200er_set_plls(struct mga_device *mdev, long clock) tmp = RREG8(DAC_DATA); tmp &= ~MGA1064_PIX_CLK_CTL_CLK_DIS; tmp |= MGA1064_PIX_CLK_CTL_CLK_POW_DOWN; - WREG_DAC(MGA1064_PIX_CLK_CTL, tmp); + WREG8(DAC_DATA, tmp); udelay(500); @@ -657,12 +654,26 @@ static void mga_g200wb_commit(struct drm_crtc *crtc) WREG_DAC(MGA1064_GEN_IO_DATA, tmp); } - +/* + This is how the framebuffer base address is stored in g200 cards: + * Assume @offset is the gpu_addr variable of the framebuffer object + * Then addr is the number of _pixels_ (not bytes) from the start of + VRAM to the first pixel we want to display. (divided by 2 for 32bit + framebuffers) + * addr is stored in the CRTCEXT0, CRTCC and CRTCD registers + addr<20> -> CRTCEXT0<6> + addr<19-16> -> CRTCEXT0<3-0> + addr<15-8> -> CRTCC<7-0> + addr<7-0> -> CRTCD<7-0> + CRTCEXT0 has to be programmed last to trigger an update and make the + new addr variable take effect. + */ void mga_set_start_address(struct drm_crtc *crtc, unsigned offset) { struct mga_device *mdev = crtc->dev->dev_private; u32 addr; int count; + u8 crtcext0; while (RREG8(0x1fda) & 0x08); while (!(RREG8(0x1fda) & 0x08)); @@ -670,10 +681,17 @@ void mga_set_start_address(struct drm_crtc *crtc, unsigned offset) count = RREG8(MGAREG_VCOUNT) + 2; while (RREG8(MGAREG_VCOUNT) < count); - addr = offset >> 2; + WREG8(MGAREG_CRTCEXT_INDEX, 0); + crtcext0 = RREG8(MGAREG_CRTCEXT_DATA); + crtcext0 &= 0xB0; + addr = offset / 8; + /* Can't store addresses any higher than that... + but we also don't have more than 16MB of memory, so it should be fine. */ + WARN_ON(addr > 0x1fffff); + crtcext0 |= (!!(addr & (1<<20)))<<6; WREG_CRT(0x0d, (u8)(addr & 0xff)); WREG_CRT(0x0c, (u8)(addr >> 8) & 0xff); - WREG_CRT(0xaf, (u8)(addr >> 16) & 0xf); + WREG_ECRT(0x0, ((u8)(addr >> 16) & 0xf) | crtcext0); } @@ -829,11 +847,7 @@ static int mga_crtc_mode_set(struct drm_crtc *crtc, for (i = 0; i < sizeof(dacvalue); i++) { - if ((i <= 0x03) || - (i == 0x07) || - (i == 0x0b) || - (i == 0x0f) || - ((i >= 0x13) && (i <= 0x17)) || + if ((i <= 0x17) || (i == 0x1b) || (i == 0x1c) || ((i >= 0x1f) && (i <= 0x29)) || diff --git a/drivers/lguest/page_tables.c b/drivers/lguest/page_tables.c index 699187ab3800..5b9ac32801c7 100644 --- a/drivers/lguest/page_tables.c +++ b/drivers/lguest/page_tables.c @@ -1002,6 +1002,7 @@ void guest_set_pgd(struct lguest *lg, unsigned long gpgdir, u32 idx) kill_guest(&lg->cpus[0], "Cannot populate switcher mapping"); } + lg->pgdirs[pgdir].last_host_cpu = -1; } } diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c index 375c109607ff..f4f3038c1df0 100644 --- a/drivers/mmc/host/mmci.c +++ b/drivers/mmc/host/mmci.c @@ -1130,6 +1130,7 @@ static void mmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios) struct variant_data *variant = host->variant; u32 pwr = 0; unsigned long flags; + int ret; pm_runtime_get_sync(mmc_dev(mmc)); @@ -1161,8 +1162,12 @@ static void mmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios) break; case MMC_POWER_ON: if (!IS_ERR(mmc->supply.vqmmc) && - !regulator_is_enabled(mmc->supply.vqmmc)) - regulator_enable(mmc->supply.vqmmc); + !regulator_is_enabled(mmc->supply.vqmmc)) { + ret = regulator_enable(mmc->supply.vqmmc); + if (ret < 0) + dev_err(mmc_dev(mmc), + "failed to enable vqmmc regulator\n"); + } pwr |= MCI_PWR_ON; break; diff --git a/drivers/net/caif/Kconfig b/drivers/net/caif/Kconfig index 7ffc756131a2..547098086773 100644 --- a/drivers/net/caif/Kconfig +++ b/drivers/net/caif/Kconfig @@ -43,7 +43,7 @@ config CAIF_HSI config CAIF_VIRTIO tristate "CAIF virtio transport driver" - depends on CAIF + depends on CAIF && HAS_DMA select VHOST_RING select VIRTIO select GENERIC_ALLOCATOR diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c index de570a8f8967..072c6f14e8fc 100644 --- a/drivers/net/ethernet/3com/3c59x.c +++ b/drivers/net/ethernet/3com/3c59x.c @@ -632,7 +632,6 @@ struct vortex_private { pm_state_valid:1, /* pci_dev->saved_config_space has sane contents */ open:1, medialock:1, - must_free_region:1, /* Flag: if zero, Cardbus owns the I/O region */ large_frames:1, /* accept large frames */ handling_irq:1; /* private in_irq indicator */ /* {get|set}_wol operations are already serialized by rtnl. @@ -1012,6 +1011,12 @@ static int vortex_init_one(struct pci_dev *pdev, if (rc < 0) goto out; + rc = pci_request_regions(pdev, DRV_NAME); + if (rc < 0) { + pci_disable_device(pdev); + goto out; + } + unit = vortex_cards_found; if (global_use_mmio < 0 && (unit >= MAX_UNITS || use_mmio[unit] < 0)) { @@ -1027,6 +1032,7 @@ static int vortex_init_one(struct pci_dev *pdev, if (!ioaddr) /* If mapping fails, fall-back to BAR 0... */ ioaddr = pci_iomap(pdev, 0, 0); if (!ioaddr) { + pci_release_regions(pdev); pci_disable_device(pdev); rc = -ENOMEM; goto out; @@ -1036,6 +1042,7 @@ static int vortex_init_one(struct pci_dev *pdev, ent->driver_data, unit); if (rc < 0) { pci_iounmap(pdev, ioaddr); + pci_release_regions(pdev); pci_disable_device(pdev); goto out; } @@ -1178,11 +1185,6 @@ static int vortex_probe1(struct device *gendev, void __iomem *ioaddr, int irq, /* PCI-only startup logic */ if (pdev) { - /* EISA resources already marked, so only PCI needs to do this here */ - /* Ignore return value, because Cardbus drivers already allocate for us */ - if (request_region(dev->base_addr, vci->io_size, print_name) != NULL) - vp->must_free_region = 1; - /* enable bus-mastering if necessary */ if (vci->flags & PCI_USES_MASTER) pci_set_master(pdev); @@ -1220,7 +1222,7 @@ static int vortex_probe1(struct device *gendev, void __iomem *ioaddr, int irq, &vp->rx_ring_dma); retval = -ENOMEM; if (!vp->rx_ring) - goto free_region; + goto free_device; vp->tx_ring = (struct boom_tx_desc *)(vp->rx_ring + RX_RING_SIZE); vp->tx_ring_dma = vp->rx_ring_dma + sizeof(struct boom_rx_desc) * RX_RING_SIZE; @@ -1484,9 +1486,7 @@ free_ring: + sizeof(struct boom_tx_desc) * TX_RING_SIZE, vp->rx_ring, vp->rx_ring_dma); -free_region: - if (vp->must_free_region) - release_region(dev->base_addr, vci->io_size); +free_device: free_netdev(dev); pr_err(PFX "vortex_probe1 fails. Returns %d\n", retval); out: @@ -3254,8 +3254,9 @@ static void vortex_remove_one(struct pci_dev *pdev) + sizeof(struct boom_tx_desc) * TX_RING_SIZE, vp->rx_ring, vp->rx_ring_dma); - if (vp->must_free_region) - release_region(dev->base_addr, vp->io_size); + + pci_release_regions(pdev); + free_netdev(dev); } diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c index ce4a030d3d0c..07f7ef05c3f2 100644 --- a/drivers/net/ethernet/brocade/bna/bnad.c +++ b/drivers/net/ethernet/brocade/bna/bnad.c @@ -3236,9 +3236,10 @@ bnad_init(struct bnad *bnad, sprintf(bnad->wq_name, "%s_wq_%d", BNAD_NAME, bnad->id); bnad->work_q = create_singlethread_workqueue(bnad->wq_name); - - if (!bnad->work_q) + if (!bnad->work_q) { + iounmap(bnad->bar0); return -ENOMEM; + } return 0; } diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig index 1194446f859a..768285ec10f4 100644 --- a/drivers/net/ethernet/cadence/Kconfig +++ b/drivers/net/ethernet/cadence/Kconfig @@ -22,7 +22,7 @@ if NET_CADENCE config ARM_AT91_ETHER tristate "AT91RM9200 Ethernet support" - depends on GENERIC_HARDIRQS + depends on GENERIC_HARDIRQS && HAS_DMA select NET_CORE select MACB ---help--- @@ -31,6 +31,7 @@ config ARM_AT91_ETHER config MACB tristate "Cadence MACB/GEM support" + depends on HAS_DMA select PHYLIB ---help--- The Cadence MACB ethernet interface is found on many Atmel AT32 and diff --git a/drivers/net/ethernet/calxeda/Kconfig b/drivers/net/ethernet/calxeda/Kconfig index aba435c3d4ae..184a063bed5f 100644 --- a/drivers/net/ethernet/calxeda/Kconfig +++ b/drivers/net/ethernet/calxeda/Kconfig @@ -1,6 +1,6 @@ config NET_CALXEDA_XGMAC tristate "Calxeda 1G/10G XGMAC Ethernet driver" - depends on HAS_IOMEM + depends on HAS_IOMEM && HAS_DMA select CRC32 help This is the driver for the XGMAC Ethernet IP block found on Calxeda diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index aff0310a778b..ca9825ca88c9 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -87,6 +87,8 @@ #define FEC_QUIRK_HAS_GBIT (1 << 3) /* Controller has extend desc buffer */ #define FEC_QUIRK_HAS_BUFDESC_EX (1 << 4) +/* Controller has hardware checksum support */ +#define FEC_QUIRK_HAS_CSUM (1 << 5) static struct platform_device_id fec_devtype[] = { { @@ -105,7 +107,7 @@ static struct platform_device_id fec_devtype[] = { }, { .name = "imx6q-fec", .driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT | - FEC_QUIRK_HAS_BUFDESC_EX, + FEC_QUIRK_HAS_BUFDESC_EX | FEC_QUIRK_HAS_CSUM, }, { .name = "mvf-fec", .driver_data = FEC_QUIRK_ENET_MAC, @@ -1744,6 +1746,8 @@ static const struct net_device_ops fec_netdev_ops = { static int fec_enet_init(struct net_device *ndev) { struct fec_enet_private *fep = netdev_priv(ndev); + const struct platform_device_id *id_entry = + platform_get_device_id(fep->pdev); struct bufdesc *cbd_base; /* Allocate memory for buffer descriptors. */ @@ -1775,12 +1779,14 @@ static int fec_enet_init(struct net_device *ndev) writel(FEC_RX_DISABLED_IMASK, fep->hwp + FEC_IMASK); netif_napi_add(ndev, &fep->napi, fec_enet_rx_napi, FEC_NAPI_WEIGHT); - /* enable hw accelerator */ - ndev->features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM - | NETIF_F_RXCSUM); - ndev->hw_features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM - | NETIF_F_RXCSUM); - fep->csum_flags |= FLAG_RX_CSUM_ENABLED; + if (id_entry->driver_data & FEC_QUIRK_HAS_CSUM) { + /* enable hw accelerator */ + ndev->features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM + | NETIF_F_RXCSUM); + ndev->hw_features |= (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM + | NETIF_F_RXCSUM); + fep->csum_flags |= FLAG_RX_CSUM_ENABLED; + } fec_restart(ndev, 0); diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c index 4989481c19f0..d300a0c0eafc 100644 --- a/drivers/net/ethernet/ibm/emac/core.c +++ b/drivers/net/ethernet/ibm/emac/core.c @@ -359,10 +359,26 @@ static int emac_reset(struct emac_instance *dev) } #ifdef CONFIG_PPC_DCR_NATIVE - /* Enable internal clock source */ - if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) - dcri_clrset(SDR0, SDR0_ETH_CFG, - 0, SDR0_ETH_CFG_ECS << dev->cell_index); + /* + * PPC460EX/GT Embedded Processor Advanced User's Manual + * section 28.10.1 Mode Register 0 (EMACx_MR0) states: + * Note: The PHY must provide a TX Clk in order to perform a soft reset + * of the EMAC. If none is present, select the internal clock + * (SDR0_ETH_CFG[EMACx_PHY_CLK] = 1). + * After a soft reset, select the external clock. + */ + if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) { + if (dev->phy_address == 0xffffffff && + dev->phy_map == 0xffffffff) { + /* No PHY: select internal loop clock before reset */ + dcri_clrset(SDR0, SDR0_ETH_CFG, + 0, SDR0_ETH_CFG_ECS << dev->cell_index); + } else { + /* PHY present: select external clock before reset */ + dcri_clrset(SDR0, SDR0_ETH_CFG, + SDR0_ETH_CFG_ECS << dev->cell_index, 0); + } + } #endif out_be32(&p->mr0, EMAC_MR0_SRST); @@ -370,10 +386,14 @@ static int emac_reset(struct emac_instance *dev) --n; #ifdef CONFIG_PPC_DCR_NATIVE - /* Enable external clock source */ - if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) - dcri_clrset(SDR0, SDR0_ETH_CFG, - SDR0_ETH_CFG_ECS << dev->cell_index, 0); + if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) { + if (dev->phy_address == 0xffffffff && + dev->phy_map == 0xffffffff) { + /* No PHY: restore external clock source after reset */ + dcri_clrset(SDR0, SDR0_ETH_CFG, + SDR0_ETH_CFG_ECS << dev->cell_index, 0); + } + } #endif if (n) { diff --git a/drivers/net/ethernet/mellanox/mlx4/en_resources.c b/drivers/net/ethernet/mellanox/mlx4/en_resources.c index 91f2b2c43c12..d3f508697a3d 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_resources.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_resources.c @@ -60,7 +60,7 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride, context->pri_path.sched_queue = 0x83 | (priv->port - 1) << 6; if (user_prio >= 0) { context->pri_path.sched_queue |= user_prio << 3; - context->pri_path.feup = 1 << 6; + context->pri_path.feup = MLX4_FEUP_FORCE_ETH_UP; } context->pri_path.counter_index = 0xff; context->cqn_send = cpu_to_be32(cqn); diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c index b147bdd40768..58a8e535d698 100644 --- a/drivers/net/ethernet/mellanox/mlx4/fw.c +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c @@ -131,7 +131,9 @@ static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags) [2] = "RSS XOR Hash Function support", [3] = "Device manage flow steering support", [4] = "Automatic MAC reassignment support", - [5] = "Time stamping support" + [5] = "Time stamping support", + [6] = "VST (control vlan insertion/stripping) support", + [7] = "FSM (MAC anti-spoofing) support" }; int i; diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c index e12e0d2e0ee0..1157f028a90f 100644 --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c @@ -372,24 +372,29 @@ static int update_vport_qp_param(struct mlx4_dev *dev, if (MLX4_QP_ST_RC == qp_type) return -EINVAL; + /* force strip vlan by clear vsd */ + qpc->param3 &= ~cpu_to_be32(MLX4_STRIP_VLAN); + if (0 != vp_oper->state.default_vlan) { + qpc->pri_path.vlan_control = + MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED | + MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED | + MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED; + } else { /* priority tagged */ + qpc->pri_path.vlan_control = + MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED | + MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED; + } + + qpc->pri_path.fvl_rx |= MLX4_FVL_RX_FORCE_ETH_VLAN; qpc->pri_path.vlan_index = vp_oper->vlan_idx; - qpc->pri_path.fl = (1 << 6) | (1 << 2); /* set cv bit and hide_cqe_vlan bit*/ - qpc->pri_path.feup |= 1 << 3; /* set fvl bit */ + qpc->pri_path.fl |= MLX4_FL_CV | MLX4_FL_ETH_HIDE_CQE_VLAN; + qpc->pri_path.feup |= MLX4_FEUP_FORCE_ETH_UP | MLX4_FVL_FORCE_ETH_VLAN; qpc->pri_path.sched_queue &= 0xC7; qpc->pri_path.sched_queue |= (vp_oper->state.default_qos) << 3; - mlx4_dbg(dev, "qp %d port %d Q 0x%x set vlan to %d vidx %d feup %x fl %x\n", - be32_to_cpu(qpc->local_qpn) & 0xffffff, port, - (int)(qpc->pri_path.sched_queue), vp_oper->state.default_vlan, - vp_oper->vlan_idx, (int)(qpc->pri_path.feup), - (int)(qpc->pri_path.fl)); } if (vp_oper->state.spoofchk) { - qpc->pri_path.feup |= 1 << 5; /* set fsm bit */; + qpc->pri_path.feup |= MLX4_FSM_FORCE_ETH_SRC_MAC; qpc->pri_path.grh_mylmc = (0x80 & qpc->pri_path.grh_mylmc) + vp_oper->mac_idx; - mlx4_dbg(dev, "spoof qp %d port %d feup 0x%x, myLmc 0x%x mindx %d\n", - be32_to_cpu(qpc->local_qpn) & 0xffffff, port, - (int)qpc->pri_path.feup, (int)qpc->pri_path.grh_mylmc, - vp_oper->mac_idx); } return 0; } diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h index 90c253b145ef..019c5f78732e 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -429,6 +429,7 @@ struct qlcnic_hardware_context { u16 port_type; u16 board_type; + u16 supported_type; u16 link_speed; u16 link_duplex; @@ -1514,6 +1515,7 @@ void qlcnic_create_diag_entries(struct qlcnic_adapter *adapter); void qlcnic_remove_diag_entries(struct qlcnic_adapter *adapter); void qlcnic_82xx_add_sysfs(struct qlcnic_adapter *adapter); void qlcnic_82xx_remove_sysfs(struct qlcnic_adapter *adapter); +int qlcnic_82xx_get_settings(struct qlcnic_adapter *, struct ethtool_cmd *); int qlcnicvf_config_bridged_mode(struct qlcnic_adapter *, u32); int qlcnicvf_config_led(struct qlcnic_adapter *, u32, u32); diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index ea790a93ee7c..b4ff1e35a11d 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -696,15 +696,14 @@ u32 qlcnic_83xx_mac_rcode(struct qlcnic_adapter *adapter) return 1; } -u32 qlcnic_83xx_mbx_poll(struct qlcnic_adapter *adapter) +u32 qlcnic_83xx_mbx_poll(struct qlcnic_adapter *adapter, u32 *wait_time) { u32 data; - unsigned long wait_time = 0; struct qlcnic_hardware_context *ahw = adapter->ahw; /* wait for mailbox completion */ do { data = QLCRDX(ahw, QLCNIC_FW_MBX_CTRL); - if (++wait_time > QLCNIC_MBX_TIMEOUT) { + if (++(*wait_time) > QLCNIC_MBX_TIMEOUT) { data = QLCNIC_RCODE_TIMEOUT; break; } @@ -720,8 +719,8 @@ int qlcnic_83xx_mbx_op(struct qlcnic_adapter *adapter, u16 opcode; u8 mbx_err_code; unsigned long flags; - u32 rsp, mbx_val, fw_data, rsp_num, mbx_cmd; struct qlcnic_hardware_context *ahw = adapter->ahw; + u32 rsp, mbx_val, fw_data, rsp_num, mbx_cmd, wait_time = 0; opcode = LSW(cmd->req.arg[0]); if (!test_bit(QLC_83XX_MBX_READY, &adapter->ahw->idc.status)) { @@ -754,15 +753,13 @@ int qlcnic_83xx_mbx_op(struct qlcnic_adapter *adapter, /* Signal FW about the impending command */ QLCWRX(ahw, QLCNIC_HOST_MBX_CTRL, QLCNIC_SET_OWNER); poll: - rsp = qlcnic_83xx_mbx_poll(adapter); + rsp = qlcnic_83xx_mbx_poll(adapter, &wait_time); if (rsp != QLCNIC_RCODE_TIMEOUT) { /* Get the FW response data */ fw_data = readl(QLCNIC_MBX_FW(ahw, 0)); if (fw_data & QLCNIC_MBX_ASYNC_EVENT) { __qlcnic_83xx_process_aen(adapter); - mbx_val = QLCRDX(ahw, QLCNIC_HOST_MBX_CTRL); - if (mbx_val) - goto poll; + goto poll; } mbx_err_code = QLCNIC_MBX_STATUS(fw_data); rsp_num = QLCNIC_MBX_NUM_REGS(fw_data); @@ -1276,11 +1273,13 @@ out: return err; } -static int qlcnic_83xx_diag_alloc_res(struct net_device *netdev, int test) +static int qlcnic_83xx_diag_alloc_res(struct net_device *netdev, int test, + int num_sds_ring) { struct qlcnic_adapter *adapter = netdev_priv(netdev); struct qlcnic_host_sds_ring *sds_ring; struct qlcnic_host_rds_ring *rds_ring; + u16 adapter_state = adapter->is_up; u8 ring; int ret; @@ -1304,6 +1303,10 @@ static int qlcnic_83xx_diag_alloc_res(struct net_device *netdev, int test) ret = qlcnic_fw_create_ctx(adapter); if (ret) { qlcnic_detach(adapter); + if (adapter_state == QLCNIC_ADAPTER_UP_MAGIC) { + adapter->max_sds_rings = num_sds_ring; + qlcnic_attach(adapter); + } netif_device_attach(netdev); return ret; } @@ -1596,7 +1599,8 @@ int qlcnic_83xx_loopback_test(struct net_device *netdev, u8 mode) if (test_and_set_bit(__QLCNIC_RESETTING, &adapter->state)) return -EBUSY; - ret = qlcnic_83xx_diag_alloc_res(netdev, QLCNIC_LOOPBACK_TEST); + ret = qlcnic_83xx_diag_alloc_res(netdev, QLCNIC_LOOPBACK_TEST, + max_sds_rings); if (ret) goto fail_diag_alloc; @@ -2830,6 +2834,23 @@ int qlcnic_83xx_test_link(struct qlcnic_adapter *adapter) break; } config = cmd.rsp.arg[3]; + if (QLC_83XX_SFP_PRESENT(config)) { + switch (ahw->module_type) { + case LINKEVENT_MODULE_OPTICAL_UNKNOWN: + case LINKEVENT_MODULE_OPTICAL_SRLR: + case LINKEVENT_MODULE_OPTICAL_LRM: + case LINKEVENT_MODULE_OPTICAL_SFP_1G: + ahw->supported_type = PORT_FIBRE; + break; + case LINKEVENT_MODULE_TWINAX_UNSUPPORTED_CABLE: + case LINKEVENT_MODULE_TWINAX_UNSUPPORTED_CABLELEN: + case LINKEVENT_MODULE_TWINAX: + ahw->supported_type = PORT_TP; + break; + default: + ahw->supported_type = PORT_OTHER; + } + } if (config & 1) err = 1; } @@ -2838,7 +2859,8 @@ out: return config; } -int qlcnic_83xx_get_settings(struct qlcnic_adapter *adapter) +int qlcnic_83xx_get_settings(struct qlcnic_adapter *adapter, + struct ethtool_cmd *ecmd) { u32 config = 0; int status = 0; @@ -2851,6 +2873,54 @@ int qlcnic_83xx_get_settings(struct qlcnic_adapter *adapter) ahw->module_type = QLC_83XX_SFP_MODULE_TYPE(config); /* hard code until there is a way to get it from flash */ ahw->board_type = QLCNIC_BRDTYPE_83XX_10G; + + if (netif_running(adapter->netdev) && ahw->has_link_events) { + ethtool_cmd_speed_set(ecmd, ahw->link_speed); + ecmd->duplex = ahw->link_duplex; + ecmd->autoneg = ahw->link_autoneg; + } else { + ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN); + ecmd->duplex = DUPLEX_UNKNOWN; + ecmd->autoneg = AUTONEG_DISABLE; + } + + if (ahw->port_type == QLCNIC_XGBE) { + ecmd->supported = SUPPORTED_1000baseT_Full; + ecmd->advertising = ADVERTISED_1000baseT_Full; + } else { + ecmd->supported = (SUPPORTED_10baseT_Half | + SUPPORTED_10baseT_Full | + SUPPORTED_100baseT_Half | + SUPPORTED_100baseT_Full | + SUPPORTED_1000baseT_Half | + SUPPORTED_1000baseT_Full); + ecmd->advertising = (ADVERTISED_100baseT_Half | + ADVERTISED_100baseT_Full | + ADVERTISED_1000baseT_Half | + ADVERTISED_1000baseT_Full); + } + + switch (ahw->supported_type) { + case PORT_FIBRE: + ecmd->supported |= SUPPORTED_FIBRE; + ecmd->advertising |= ADVERTISED_FIBRE; + ecmd->port = PORT_FIBRE; + ecmd->transceiver = XCVR_EXTERNAL; + break; + case PORT_TP: + ecmd->supported |= SUPPORTED_TP; + ecmd->advertising |= ADVERTISED_TP; + ecmd->port = PORT_TP; + ecmd->transceiver = XCVR_INTERNAL; + break; + default: + ecmd->supported |= SUPPORTED_FIBRE; + ecmd->advertising |= ADVERTISED_FIBRE; + ecmd->port = PORT_OTHER; + ecmd->transceiver = XCVR_EXTERNAL; + break; + } + ecmd->phy_address = ahw->physical_port; return status; } @@ -3046,7 +3116,8 @@ int qlcnic_83xx_interrupt_test(struct net_device *netdev) if (test_and_set_bit(__QLCNIC_RESETTING, &adapter->state)) return -EIO; - ret = qlcnic_83xx_diag_alloc_res(netdev, QLCNIC_INTERRUPT_TEST); + ret = qlcnic_83xx_diag_alloc_res(netdev, QLCNIC_INTERRUPT_TEST, + max_sds_rings); if (ret) goto fail_diag_irq; diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h index 1f1d85e6f2af..f5db67fc9f55 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h @@ -603,7 +603,7 @@ int qlcnic_83xx_get_vnic_pf_info(struct qlcnic_adapter *, struct qlcnic_info *); void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *); void qlcnic_83xx_get_stats(struct qlcnic_adapter *adapter, u64 *data); -int qlcnic_83xx_get_settings(struct qlcnic_adapter *); +int qlcnic_83xx_get_settings(struct qlcnic_adapter *, struct ethtool_cmd *); int qlcnic_83xx_set_settings(struct qlcnic_adapter *, struct ethtool_cmd *); void qlcnic_83xx_get_pauseparam(struct qlcnic_adapter *, struct ethtool_pauseparam *); @@ -620,7 +620,7 @@ int qlcnic_83xx_flash_test(struct qlcnic_adapter *); int qlcnic_83xx_enable_flash_write(struct qlcnic_adapter *); int qlcnic_83xx_disable_flash_write(struct qlcnic_adapter *); u32 qlcnic_83xx_mac_rcode(struct qlcnic_adapter *); -u32 qlcnic_83xx_mbx_poll(struct qlcnic_adapter *); +u32 qlcnic_83xx_mbx_poll(struct qlcnic_adapter *, u32 *); void qlcnic_83xx_enable_mbx_poll(struct qlcnic_adapter *); void qlcnic_83xx_disable_mbx_poll(struct qlcnic_adapter *); #endif diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c index ab1d8d99cbd5..c67d1eb35e8f 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c @@ -435,10 +435,6 @@ static void qlcnic_83xx_idc_attach_driver(struct qlcnic_adapter *adapter) } done: netif_device_attach(netdev); - if (netif_running(netdev)) { - netif_carrier_on(netdev); - netif_wake_queue(netdev); - } } static int qlcnic_83xx_idc_enter_failed_state(struct qlcnic_adapter *adapter, @@ -642,15 +638,21 @@ static int qlcnic_83xx_idc_reattach_driver(struct qlcnic_adapter *adapter) static void qlcnic_83xx_idc_update_idc_params(struct qlcnic_adapter *adapter) { + struct qlcnic_hardware_context *ahw = adapter->ahw; + qlcnic_83xx_idc_update_drv_presence_reg(adapter, 1, 1); - clear_bit(__QLCNIC_RESETTING, &adapter->state); set_bit(QLC_83XX_MBX_READY, &adapter->ahw->idc.status); qlcnic_83xx_idc_update_audit_reg(adapter, 0, 1); set_bit(QLC_83XX_MODULE_LOADED, &adapter->ahw->idc.status); - adapter->ahw->idc.quiesce_req = 0; - adapter->ahw->idc.delay = QLC_83XX_IDC_FW_POLL_DELAY; - adapter->ahw->idc.err_code = 0; - adapter->ahw->idc.collect_dump = 0; + + ahw->idc.quiesce_req = 0; + ahw->idc.delay = QLC_83XX_IDC_FW_POLL_DELAY; + ahw->idc.err_code = 0; + ahw->idc.collect_dump = 0; + ahw->reset_context = 0; + adapter->tx_timeo_cnt = 0; + + clear_bit(__QLCNIC_RESETTING, &adapter->state); } /** @@ -851,6 +853,7 @@ static int qlcnic_83xx_idc_ready_state(struct qlcnic_adapter *adapter) /* Check for soft reset request */ if (ahw->reset_context && !(val & QLC_83XX_IDC_DISABLE_FW_RESET_RECOVERY)) { + adapter->ahw->reset_context = 0; qlcnic_83xx_idc_tx_soft_reset(adapter); return ret; } @@ -914,6 +917,7 @@ static int qlcnic_83xx_idc_need_quiesce_state(struct qlcnic_adapter *adapter) static int qlcnic_83xx_idc_failed_state(struct qlcnic_adapter *adapter) { dev_err(&adapter->pdev->dev, "%s: please restart!!\n", __func__); + clear_bit(__QLCNIC_RESETTING, &adapter->state); adapter->ahw->idc.err_code = -EIO; return 0; diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c index 08efb4635007..f67652de5a63 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c @@ -131,12 +131,13 @@ static const char qlcnic_83xx_rx_stats_strings[][ETH_GSTRING_LEN] = { "ctx_lro_pkt_cnt", "ctx_ip_csum_error", "ctx_rx_pkts_wo_ctx", - "ctx_rx_pkts_dropped_wo_sts", + "ctx_rx_pkts_drop_wo_sds_on_card", + "ctx_rx_pkts_drop_wo_sds_on_host", "ctx_rx_osized_pkts", "ctx_rx_pkts_dropped_wo_rds", "ctx_rx_unexpected_mcast_pkts", "ctx_invalid_mac_address", - "ctx_rx_rds_ring_prim_attemoted", + "ctx_rx_rds_ring_prim_attempted", "ctx_rx_rds_ring_prim_success", "ctx_num_lro_flows_added", "ctx_num_lro_flows_removed", @@ -251,6 +252,18 @@ static int qlcnic_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd) { struct qlcnic_adapter *adapter = netdev_priv(dev); + + if (qlcnic_82xx_check(adapter)) + return qlcnic_82xx_get_settings(adapter, ecmd); + else if (qlcnic_83xx_check(adapter)) + return qlcnic_83xx_get_settings(adapter, ecmd); + + return -EIO; +} + +int qlcnic_82xx_get_settings(struct qlcnic_adapter *adapter, + struct ethtool_cmd *ecmd) +{ struct qlcnic_hardware_context *ahw = adapter->ahw; u32 speed, reg; int check_sfp_module = 0; @@ -276,10 +289,7 @@ qlcnic_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd) } else if (adapter->ahw->port_type == QLCNIC_XGBE) { u32 val = 0; - if (qlcnic_83xx_check(adapter)) - qlcnic_83xx_get_settings(adapter); - else - val = QLCRD32(adapter, QLCNIC_PORT_MODE_ADDR); + val = QLCRD32(adapter, QLCNIC_PORT_MODE_ADDR); if (val == QLCNIC_PORT_MODE_802_3_AP) { ecmd->supported = SUPPORTED_1000baseT_Full; @@ -289,16 +299,13 @@ qlcnic_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd) ecmd->advertising = ADVERTISED_10000baseT_Full; } - if (netif_running(dev) && adapter->ahw->has_link_events) { - if (qlcnic_82xx_check(adapter)) { - reg = QLCRD32(adapter, - P3P_LINK_SPEED_REG(pcifn)); - speed = P3P_LINK_SPEED_VAL(pcifn, reg); - ahw->link_speed = speed * P3P_LINK_SPEED_MHZ; - } - ethtool_cmd_speed_set(ecmd, adapter->ahw->link_speed); - ecmd->autoneg = adapter->ahw->link_autoneg; - ecmd->duplex = adapter->ahw->link_duplex; + if (netif_running(adapter->netdev) && ahw->has_link_events) { + reg = QLCRD32(adapter, P3P_LINK_SPEED_REG(pcifn)); + speed = P3P_LINK_SPEED_VAL(pcifn, reg); + ahw->link_speed = speed * P3P_LINK_SPEED_MHZ; + ethtool_cmd_speed_set(ecmd, ahw->link_speed); + ecmd->autoneg = ahw->link_autoneg; + ecmd->duplex = ahw->link_duplex; goto skip; } @@ -340,8 +347,8 @@ skip: case QLCNIC_BRDTYPE_P3P_10G_SFP_QT: ecmd->advertising |= ADVERTISED_TP; ecmd->supported |= SUPPORTED_TP; - check_sfp_module = netif_running(dev) && - adapter->ahw->has_link_events; + check_sfp_module = netif_running(adapter->netdev) && + ahw->has_link_events; case QLCNIC_BRDTYPE_P3P_10G_XFP: ecmd->supported |= SUPPORTED_FIBRE; ecmd->advertising |= ADVERTISED_FIBRE; @@ -355,8 +362,8 @@ skip: ecmd->advertising |= (ADVERTISED_FIBRE | ADVERTISED_TP); ecmd->port = PORT_FIBRE; - check_sfp_module = netif_running(dev) && - adapter->ahw->has_link_events; + check_sfp_module = netif_running(adapter->netdev) && + ahw->has_link_events; } else { ecmd->autoneg = AUTONEG_ENABLE; ecmd->supported |= (SUPPORTED_TP | SUPPORTED_Autoneg); @@ -365,13 +372,6 @@ skip: ecmd->port = PORT_TP; } break; - case QLCNIC_BRDTYPE_83XX_10G: - ecmd->autoneg = AUTONEG_DISABLE; - ecmd->supported |= (SUPPORTED_FIBRE | SUPPORTED_TP); - ecmd->advertising |= (ADVERTISED_FIBRE | ADVERTISED_TP); - ecmd->port = PORT_FIBRE; - check_sfp_module = netif_running(dev) && ahw->has_link_events; - break; default: dev_err(&adapter->pdev->dev, "Unsupported board model %d\n", adapter->ahw->board_type); diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h index 95b1b5732838..b6818f4356b9 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h @@ -134,7 +134,7 @@ struct qlcnic_mailbox_metadata { #define QLCNIC_SET_OWNER 1 #define QLCNIC_CLR_OWNER 0 -#define QLCNIC_MBX_TIMEOUT 10000 +#define QLCNIC_MBX_TIMEOUT 5000 #define QLCNIC_MBX_RSP_OK 1 #define QLCNIC_MBX_PORT_RSP_OK 0x1a diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c index 264d5a4f8153..8fb836d4129f 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c @@ -37,24 +37,24 @@ MODULE_PARM_DESC(qlcnic_mac_learn, "Mac Filter (0=learning is disabled, 1=Driver learning is enabled, 2=FDB learning is enabled)"); int qlcnic_use_msi = 1; -MODULE_PARM_DESC(use_msi, "MSI interrupt (0=disabled, 1=enabled"); +MODULE_PARM_DESC(use_msi, "MSI interrupt (0=disabled, 1=enabled)"); module_param_named(use_msi, qlcnic_use_msi, int, 0444); int qlcnic_use_msi_x = 1; -MODULE_PARM_DESC(use_msi_x, "MSI-X interrupt (0=disabled, 1=enabled"); +MODULE_PARM_DESC(use_msi_x, "MSI-X interrupt (0=disabled, 1=enabled)"); module_param_named(use_msi_x, qlcnic_use_msi_x, int, 0444); int qlcnic_auto_fw_reset = 1; -MODULE_PARM_DESC(auto_fw_reset, "Auto firmware reset (0=disabled, 1=enabled"); +MODULE_PARM_DESC(auto_fw_reset, "Auto firmware reset (0=disabled, 1=enabled)"); module_param_named(auto_fw_reset, qlcnic_auto_fw_reset, int, 0644); int qlcnic_load_fw_file; -MODULE_PARM_DESC(load_fw_file, "Load firmware from (0=flash, 1=file"); +MODULE_PARM_DESC(load_fw_file, "Load firmware from (0=flash, 1=file)"); module_param_named(load_fw_file, qlcnic_load_fw_file, int, 0444); int qlcnic_config_npars; module_param(qlcnic_config_npars, int, 0444); -MODULE_PARM_DESC(qlcnic_config_npars, "Configure NPARs (0=disabled, 1=enabled"); +MODULE_PARM_DESC(qlcnic_config_npars, "Configure NPARs (0=disabled, 1=enabled)"); static int qlcnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent); static void qlcnic_remove(struct pci_dev *pdev); @@ -308,6 +308,23 @@ int qlcnic_read_mac_addr(struct qlcnic_adapter *adapter) return 0; } +static void qlcnic_delete_adapter_mac(struct qlcnic_adapter *adapter) +{ + struct qlcnic_mac_list_s *cur; + struct list_head *head; + + list_for_each(head, &adapter->mac_list) { + cur = list_entry(head, struct qlcnic_mac_list_s, list); + if (!memcmp(adapter->mac_addr, cur->mac_addr, ETH_ALEN)) { + qlcnic_sre_macaddr_change(adapter, cur->mac_addr, + 0, QLCNIC_MAC_DEL); + list_del(&cur->list); + kfree(cur); + return; + } + } +} + static int qlcnic_set_mac(struct net_device *netdev, void *p) { struct qlcnic_adapter *adapter = netdev_priv(netdev); @@ -322,11 +339,15 @@ static int qlcnic_set_mac(struct net_device *netdev, void *p) if (!is_valid_ether_addr(addr->sa_data)) return -EINVAL; + if (!memcmp(adapter->mac_addr, addr->sa_data, ETH_ALEN)) + return 0; + if (test_bit(__QLCNIC_DEV_UP, &adapter->state)) { netif_device_detach(netdev); qlcnic_napi_disable(adapter); } + qlcnic_delete_adapter_mac(adapter); memcpy(adapter->mac_addr, addr->sa_data, netdev->addr_len); memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len); qlcnic_set_multi(adapter->netdev); @@ -2481,12 +2502,17 @@ static void qlcnic_tx_timeout(struct net_device *netdev) if (test_bit(__QLCNIC_RESETTING, &adapter->state)) return; - dev_err(&netdev->dev, "transmit timeout, resetting.\n"); - - if (++adapter->tx_timeo_cnt >= QLCNIC_MAX_TX_TIMEOUTS) - adapter->need_fw_reset = 1; - else + if (++adapter->tx_timeo_cnt >= QLCNIC_MAX_TX_TIMEOUTS) { + netdev_info(netdev, "Tx timeout, reset the adapter.\n"); + if (qlcnic_82xx_check(adapter)) + adapter->need_fw_reset = 1; + else if (qlcnic_83xx_check(adapter)) + qlcnic_83xx_idc_request_reset(adapter, + QLCNIC_FORCE_FW_DUMP_KEY); + } else { + netdev_info(netdev, "Tx timeout, reset adapter context.\n"); adapter->ahw->reset_context = 1; + } } static struct net_device_stats *qlcnic_get_stats(struct net_device *netdev) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c index 44d547d78b84..3869c3864deb 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c @@ -280,9 +280,9 @@ void qlcnic_sriov_cleanup(struct qlcnic_adapter *adapter) static int qlcnic_sriov_post_bc_msg(struct qlcnic_adapter *adapter, u32 *hdr, u32 *pay, u8 pci_func, u8 size) { + u32 rsp, mbx_val, fw_data, rsp_num, mbx_cmd, val, wait_time = 0; struct qlcnic_hardware_context *ahw = adapter->ahw; unsigned long flags; - u32 rsp, mbx_val, fw_data, rsp_num, mbx_cmd, val; u16 opcode; u8 mbx_err_code; int i, j; @@ -330,15 +330,13 @@ static int qlcnic_sriov_post_bc_msg(struct qlcnic_adapter *adapter, u32 *hdr, * assume something is wrong. */ poll: - rsp = qlcnic_83xx_mbx_poll(adapter); + rsp = qlcnic_83xx_mbx_poll(adapter, &wait_time); if (rsp != QLCNIC_RCODE_TIMEOUT) { /* Get the FW response data */ fw_data = readl(QLCNIC_MBX_FW(ahw, 0)); if (fw_data & QLCNIC_MBX_ASYNC_EVENT) { __qlcnic_83xx_process_aen(adapter); - mbx_val = QLCRDX(ahw, QLCNIC_HOST_MBX_CTRL); - if (mbx_val) - goto poll; + goto poll; } mbx_err_code = QLCNIC_MBX_STATUS(fw_data); rsp_num = QLCNIC_MBX_NUM_REGS(fw_data); diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c index c81be2da119b..1a66ccded235 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c @@ -1133,9 +1133,6 @@ static int qlcnic_sriov_validate_linkevent(struct qlcnic_vf_info *vf, if ((cmd->req.arg[1] >> 16) != vf->rx_ctx_id) return -EINVAL; - if (!(cmd->req.arg[1] & BIT_8)) - return -EINVAL; - return 0; } diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c index 87463bc701a6..50235d201592 100644 --- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c @@ -1106,6 +1106,7 @@ static int ql_get_next_chunk(struct ql_adapter *qdev, struct rx_ring *rx_ring, if (pci_dma_mapping_error(qdev->pdev, map)) { __free_pages(rx_ring->pg_chunk.page, qdev->lbq_buf_order); + rx_ring->pg_chunk.page = NULL; netif_err(qdev, drv, qdev->ndev, "PCI mapping failed.\n"); return -ENOMEM; @@ -2777,6 +2778,12 @@ static void ql_free_lbq_buffers(struct ql_adapter *qdev, struct rx_ring *rx_ring curr_idx = 0; } + if (rx_ring->pg_chunk.page) { + pci_unmap_page(qdev->pdev, rx_ring->pg_chunk.map, + ql_lbq_block_size(qdev), PCI_DMA_FROMDEVICE); + put_page(rx_ring->pg_chunk.page); + rx_ring->pg_chunk.page = NULL; + } } static void ql_free_sbq_buffers(struct ql_adapter *qdev, struct rx_ring *rx_ring) diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig index f695a50bac47..43c1f3223322 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig @@ -1,6 +1,6 @@ config STMMAC_ETH tristate "STMicroelectronics 10/100/1000 Ethernet driver" - depends on HAS_IOMEM + depends on HAS_IOMEM && HAS_DMA select NET_CORE select MII select PHYLIB diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index d5a141c7c4e7..1c502bb0c916 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -229,7 +229,8 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb) } if (port->passthru) - vlan = list_first_entry(&port->vlans, struct macvlan_dev, list); + vlan = list_first_or_null_rcu(&port->vlans, + struct macvlan_dev, list); else vlan = macvlan_hash_lookup(port, eth->h_dest); if (vlan == NULL) @@ -814,7 +815,7 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, if (err < 0) goto upper_dev_unlink; - list_add_tail(&vlan->list, &port->vlans); + list_add_tail_rcu(&vlan->list, &port->vlans); netif_stacked_transfer_operstate(lowerdev, dev); return 0; @@ -842,7 +843,7 @@ void macvlan_dellink(struct net_device *dev, struct list_head *head) { struct macvlan_dev *vlan = netdev_priv(dev); - list_del(&vlan->list); + list_del_rcu(&vlan->list); unregister_netdevice_queue(dev, head); netdev_upper_dev_unlink(vlan->lowerdev, dev); } diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 3c23fdc27bf0..655bb25eed2b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -28,7 +28,7 @@ #include <linux/slab.h> #include <linux/cpu.h> -static int napi_weight = 128; +static int napi_weight = NAPI_POLL_WEIGHT; module_param(napi_weight, int, 0444); static bool csum = true, gso = true; diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c index 9b20d9ee2719..7f702fe3ecc2 100644 --- a/drivers/net/wireless/ath/ath5k/base.c +++ b/drivers/net/wireless/ath/ath5k/base.c @@ -2369,6 +2369,9 @@ ath5k_tx_complete_poll_work(struct work_struct *work) int i; bool needreset = false; + if (!test_bit(ATH_STAT_STARTED, ah->status)) + return; + mutex_lock(&ah->lock); for (i = 0; i < ARRAY_SIZE(ah->txqs); i++) { @@ -2676,6 +2679,7 @@ done: mmiowb(); mutex_unlock(&ah->lock); + set_bit(ATH_STAT_STARTED, ah->status); ieee80211_queue_delayed_work(ah->hw, &ah->tx_complete_work, msecs_to_jiffies(ATH5K_TX_COMPLETE_POLL_INT)); @@ -2737,6 +2741,7 @@ void ath5k_stop(struct ieee80211_hw *hw) ath5k_stop_tasklets(ah); + clear_bit(ATH_STAT_STARTED, ah->status); cancel_delayed_work_sync(&ah->tx_complete_work); if (!ath5k_modparam_no_hw_rfkill_switch) diff --git a/drivers/net/wireless/ath/ath9k/Kconfig b/drivers/net/wireless/ath/ath9k/Kconfig index 17507dc8a1e7..f3dc124c60c7 100644 --- a/drivers/net/wireless/ath/ath9k/Kconfig +++ b/drivers/net/wireless/ath/ath9k/Kconfig @@ -17,7 +17,7 @@ config ATH9K_BTCOEX_SUPPORT config ATH9K tristate "Atheros 802.11n wireless cards support" - depends on MAC80211 + depends on MAC80211 && HAS_DMA select ATH9K_HW select MAC80211_LEDS select LEDS_CLASS diff --git a/drivers/net/wireless/ath/ath9k/ar9565_1p0_initvals.h b/drivers/net/wireless/ath/ath9k/ar9565_1p0_initvals.h index 0c2ac0c6dc89..e85a8b076c22 100644 --- a/drivers/net/wireless/ath/ath9k/ar9565_1p0_initvals.h +++ b/drivers/net/wireless/ath/ath9k/ar9565_1p0_initvals.h @@ -233,9 +233,9 @@ static const u32 ar9565_1p0_baseband_core[][2] = { {0x00009d10, 0x01834061}, {0x00009d14, 0x00c00400}, {0x00009d18, 0x00000000}, - {0x00009e08, 0x0078230c}, - {0x00009e24, 0x990bb515}, - {0x00009e28, 0x126f0000}, + {0x00009e08, 0x0038230c}, + {0x00009e24, 0x9907b515}, + {0x00009e28, 0x126f0600}, {0x00009e30, 0x06336f77}, {0x00009e34, 0x6af6532f}, {0x00009e38, 0x0cc80c00}, @@ -337,7 +337,7 @@ static const u32 ar9565_1p0_baseband_core[][2] = { static const u32 ar9565_1p0_baseband_postamble[][5] = { /* Addr 5G_HT20 5G_HT40 2G_HT40 2G_HT20 */ - {0x00009810, 0xd00a8005, 0xd00a8005, 0xd00a8005, 0xd00a800d}, + {0x00009810, 0xd00a8005, 0xd00a8005, 0xd00a8005, 0xd00a8009}, {0x00009820, 0x206a022e, 0x206a022e, 0x206a012e, 0x206a01ae}, {0x00009824, 0x5ac640d0, 0x5ac640d0, 0x5ac640d0, 0x63c640da}, {0x00009828, 0x06903081, 0x06903081, 0x06903881, 0x09143c81}, @@ -345,9 +345,9 @@ static const u32 ar9565_1p0_baseband_postamble[][5] = { {0x00009830, 0x0000059c, 0x0000059c, 0x0000059c, 0x0000059c}, {0x00009c00, 0x000000c4, 0x000000c4, 0x000000c4, 0x000000c4}, {0x00009e00, 0x0372111a, 0x0372111a, 0x037216a0, 0x037216a0}, - {0x00009e04, 0x00802020, 0x00802020, 0x00802020, 0x00802020}, - {0x00009e0c, 0x6c4000e2, 0x6d4000e2, 0x6d4000e2, 0x6c4000d8}, - {0x00009e10, 0x7ec88d2e, 0x7ec88d2e, 0x7ec84d2e, 0x7ec86d2e}, + {0x00009e04, 0x00802020, 0x00802020, 0x00142020, 0x00142020}, + {0x00009e0c, 0x6c4000e2, 0x6d4000e2, 0x6d4000e2, 0x6c4000e2}, + {0x00009e10, 0x7ec88d2e, 0x7ec88d2e, 0x7ec84d2e, 0x7ec84d2e}, {0x00009e14, 0x37b95d5e, 0x37b9605e, 0x3379605e, 0x33795d5e}, {0x00009e18, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, {0x00009e1c, 0x0001cf9c, 0x0001cf9c, 0x00021f9c, 0x00021f9c}, @@ -450,6 +450,8 @@ static const u32 ar9565_1p0_soc_postamble[][5] = { static const u32 ar9565_1p0_Common_rx_gain_table[][2] = { /* Addr allmodes */ + {0x00004050, 0x00300300}, + {0x0000406c, 0x00100000}, {0x0000a000, 0x00010000}, {0x0000a004, 0x00030002}, {0x0000a008, 0x00050004}, @@ -498,27 +500,27 @@ static const u32 ar9565_1p0_Common_rx_gain_table[][2] = { {0x0000a0b4, 0x00000000}, {0x0000a0b8, 0x00000000}, {0x0000a0bc, 0x00000000}, - {0x0000a0c0, 0x001f0000}, - {0x0000a0c4, 0x01000101}, - {0x0000a0c8, 0x011e011f}, - {0x0000a0cc, 0x011c011d}, - {0x0000a0d0, 0x02030204}, - {0x0000a0d4, 0x02010202}, - {0x0000a0d8, 0x021f0200}, - {0x0000a0dc, 0x0302021e}, - {0x0000a0e0, 0x03000301}, - {0x0000a0e4, 0x031e031f}, - {0x0000a0e8, 0x0402031d}, - {0x0000a0ec, 0x04000401}, - {0x0000a0f0, 0x041e041f}, - {0x0000a0f4, 0x0502041d}, - {0x0000a0f8, 0x05000501}, - {0x0000a0fc, 0x051e051f}, - {0x0000a100, 0x06010602}, - {0x0000a104, 0x061f0600}, - {0x0000a108, 0x061d061e}, - {0x0000a10c, 0x07020703}, - {0x0000a110, 0x07000701}, + {0x0000a0c0, 0x00bf00a0}, + {0x0000a0c4, 0x11a011a1}, + {0x0000a0c8, 0x11be11bf}, + {0x0000a0cc, 0x11bc11bd}, + {0x0000a0d0, 0x22632264}, + {0x0000a0d4, 0x22612262}, + {0x0000a0d8, 0x227f2260}, + {0x0000a0dc, 0x4322227e}, + {0x0000a0e0, 0x43204321}, + {0x0000a0e4, 0x433e433f}, + {0x0000a0e8, 0x4462433d}, + {0x0000a0ec, 0x44604461}, + {0x0000a0f0, 0x447e447f}, + {0x0000a0f4, 0x5582447d}, + {0x0000a0f8, 0x55805581}, + {0x0000a0fc, 0x559e559f}, + {0x0000a100, 0x66816682}, + {0x0000a104, 0x669f6680}, + {0x0000a108, 0x669d669e}, + {0x0000a10c, 0x77627763}, + {0x0000a110, 0x77607761}, {0x0000a114, 0x00000000}, {0x0000a118, 0x00000000}, {0x0000a11c, 0x00000000}, @@ -530,27 +532,27 @@ static const u32 ar9565_1p0_Common_rx_gain_table[][2] = { {0x0000a134, 0x00000000}, {0x0000a138, 0x00000000}, {0x0000a13c, 0x00000000}, - {0x0000a140, 0x001f0000}, - {0x0000a144, 0x01000101}, - {0x0000a148, 0x011e011f}, - {0x0000a14c, 0x011c011d}, - {0x0000a150, 0x02030204}, - {0x0000a154, 0x02010202}, - {0x0000a158, 0x021f0200}, - {0x0000a15c, 0x0302021e}, - {0x0000a160, 0x03000301}, - {0x0000a164, 0x031e031f}, - {0x0000a168, 0x0402031d}, - {0x0000a16c, 0x04000401}, - {0x0000a170, 0x041e041f}, - {0x0000a174, 0x0502041d}, - {0x0000a178, 0x05000501}, - {0x0000a17c, 0x051e051f}, - {0x0000a180, 0x06010602}, - {0x0000a184, 0x061f0600}, - {0x0000a188, 0x061d061e}, - {0x0000a18c, 0x07020703}, - {0x0000a190, 0x07000701}, + {0x0000a140, 0x00bf00a0}, + {0x0000a144, 0x11a011a1}, + {0x0000a148, 0x11be11bf}, + {0x0000a14c, 0x11bc11bd}, + {0x0000a150, 0x22632264}, + {0x0000a154, 0x22612262}, + {0x0000a158, 0x227f2260}, + {0x0000a15c, 0x4322227e}, + {0x0000a160, 0x43204321}, + {0x0000a164, 0x433e433f}, + {0x0000a168, 0x4462433d}, + {0x0000a16c, 0x44604461}, + {0x0000a170, 0x447e447f}, + {0x0000a174, 0x5582447d}, + {0x0000a178, 0x55805581}, + {0x0000a17c, 0x559e559f}, + {0x0000a180, 0x66816682}, + {0x0000a184, 0x669f6680}, + {0x0000a188, 0x669d669e}, + {0x0000a18c, 0x77e677e7}, + {0x0000a190, 0x77e477e5}, {0x0000a194, 0x00000000}, {0x0000a198, 0x00000000}, {0x0000a19c, 0x00000000}, @@ -770,7 +772,7 @@ static const u32 ar9565_1p0_Modes_lowest_ob_db_tx_gain_table[][5] = { static const u32 ar9565_1p0_pciephy_clkreq_disable_L1[][2] = { /* Addr allmodes */ - {0x00018c00, 0x18213ede}, + {0x00018c00, 0x18212ede}, {0x00018c04, 0x000801d8}, {0x00018c08, 0x0003780c}, }; @@ -889,8 +891,8 @@ static const u32 ar9565_1p0_common_wo_xlna_rx_gain_table[][2] = { {0x0000a180, 0x66816682}, {0x0000a184, 0x669f6680}, {0x0000a188, 0x669d669e}, - {0x0000a18c, 0x77627763}, - {0x0000a190, 0x77607761}, + {0x0000a18c, 0x77e677e7}, + {0x0000a190, 0x77e477e5}, {0x0000a194, 0x00000000}, {0x0000a198, 0x00000000}, {0x0000a19c, 0x00000000}, @@ -1114,7 +1116,7 @@ static const u32 ar9565_1p0_modes_high_ob_db_tx_gain_table[][5] = { {0x0000a2e0, 0xffecec00, 0xffecec00, 0xfd339c84, 0xfd339c84}, {0x0000a2e4, 0xfc0f0000, 0xfc0f0000, 0xfec3e000, 0xfec3e000}, {0x0000a2e8, 0xfc100000, 0xfc100000, 0xfffc0000, 0xfffc0000}, - {0x0000a410, 0x000050d9, 0x000050d9, 0x000050d9, 0x000050d9}, + {0x0000a410, 0x000050d9, 0x000050d9, 0x000050df, 0x000050df}, {0x0000a500, 0x00002220, 0x00002220, 0x00000000, 0x00000000}, {0x0000a504, 0x06002223, 0x06002223, 0x04000002, 0x04000002}, {0x0000a508, 0x0b022220, 0x0b022220, 0x08000004, 0x08000004}, @@ -1140,13 +1142,13 @@ static const u32 ar9565_1p0_modes_high_ob_db_tx_gain_table[][5] = { {0x0000a558, 0x69027f56, 0x69027f56, 0x53001ce5, 0x53001ce5}, {0x0000a55c, 0x6d029f56, 0x6d029f56, 0x57001ce9, 0x57001ce9}, {0x0000a560, 0x73049f56, 0x73049f56, 0x5b001ceb, 0x5b001ceb}, - {0x0000a564, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a568, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a56c, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a570, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a574, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a578, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, - {0x0000a57c, 0x7804ff56, 0x7804ff56, 0x5d001eec, 0x5d001eec}, + {0x0000a564, 0x7804ff56, 0x7804ff56, 0x60001cf0, 0x60001cf0}, + {0x0000a568, 0x7804ff56, 0x7804ff56, 0x61001cf1, 0x61001cf1}, + {0x0000a56c, 0x7804ff56, 0x7804ff56, 0x62001cf2, 0x62001cf2}, + {0x0000a570, 0x7804ff56, 0x7804ff56, 0x63001cf3, 0x63001cf3}, + {0x0000a574, 0x7804ff56, 0x7804ff56, 0x64001cf4, 0x64001cf4}, + {0x0000a578, 0x7804ff56, 0x7804ff56, 0x66001ff6, 0x66001ff6}, + {0x0000a57c, 0x7804ff56, 0x7804ff56, 0x66001ff6, 0x66001ff6}, {0x0000a600, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, {0x0000a604, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, {0x0000a608, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, @@ -1174,7 +1176,7 @@ static const u32 ar9565_1p0_modes_high_power_tx_gain_table[][5] = { {0x0000a2e0, 0xffecec00, 0xffecec00, 0xfd339c84, 0xfd339c84}, {0x0000a2e4, 0xfc0f0000, 0xfc0f0000, 0xfec3e000, 0xfec3e000}, {0x0000a2e8, 0xfc100000, 0xfc100000, 0xfffc0000, 0xfffc0000}, - {0x0000a410, 0x000050d9, 0x000050d9, 0x000050d9, 0x000050d9}, + {0x0000a410, 0x000050d9, 0x000050d9, 0x000050df, 0x000050df}, {0x0000a500, 0x00002220, 0x00002220, 0x00000000, 0x00000000}, {0x0000a504, 0x06002223, 0x06002223, 0x04000002, 0x04000002}, {0x0000a508, 0x0a022220, 0x0a022220, 0x08000004, 0x08000004}, @@ -1200,13 +1202,13 @@ static const u32 ar9565_1p0_modes_high_power_tx_gain_table[][5] = { {0x0000a558, 0x66027f56, 0x66027f56, 0x4c001ce5, 0x4c001ce5}, {0x0000a55c, 0x6a029f56, 0x6a029f56, 0x50001ce9, 0x50001ce9}, {0x0000a560, 0x70049f56, 0x70049f56, 0x54001ceb, 0x54001ceb}, - {0x0000a564, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a568, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a56c, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a570, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a574, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a578, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, - {0x0000a57c, 0x7504ff56, 0x7504ff56, 0x56001eec, 0x56001eec}, + {0x0000a564, 0x7504ff56, 0x7504ff56, 0x59001cf0, 0x59001cf0}, + {0x0000a568, 0x7504ff56, 0x7504ff56, 0x5a001cf1, 0x5a001cf1}, + {0x0000a56c, 0x7504ff56, 0x7504ff56, 0x5b001cf2, 0x5b001cf2}, + {0x0000a570, 0x7504ff56, 0x7504ff56, 0x5c001cf3, 0x5c001cf3}, + {0x0000a574, 0x7504ff56, 0x7504ff56, 0x5d001cf4, 0x5d001cf4}, + {0x0000a578, 0x7504ff56, 0x7504ff56, 0x5f001ff6, 0x5f001ff6}, + {0x0000a57c, 0x7504ff56, 0x7504ff56, 0x5f001ff6, 0x5f001ff6}, {0x0000a600, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, {0x0000a604, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, {0x0000a608, 0x00000000, 0x00000000, 0x00000000, 0x00000000}, diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 6963862a1872..a18414b5948b 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -227,13 +227,13 @@ static bool ath_complete_reset(struct ath_softc *sc, bool start) if (!test_bit(SC_OP_BEACONS, &sc->sc_flags)) goto work; - ath9k_set_beacon(sc); - if (ah->opmode == NL80211_IFTYPE_STATION && test_bit(SC_OP_PRIM_STA_VIF, &sc->sc_flags)) { spin_lock_irqsave(&sc->sc_pm_lock, flags); sc->ps_flags |= PS_BEACON_SYNC | PS_WAIT_FOR_BEACON; spin_unlock_irqrestore(&sc->sc_pm_lock, flags); + } else { + ath9k_set_beacon(sc); } work: ath_restart_work(sc); @@ -1332,6 +1332,7 @@ static int ath9k_sta_add(struct ieee80211_hw *hw, struct ath_common *common = ath9k_hw_common(sc->sc_ah); struct ath_node *an = (struct ath_node *) sta->drv_priv; struct ieee80211_key_conf ps_key = { }; + int key; ath_node_attach(sc, sta, vif); @@ -1339,7 +1340,9 @@ static int ath9k_sta_add(struct ieee80211_hw *hw, vif->type != NL80211_IFTYPE_AP_VLAN) return 0; - an->ps_key = ath_key_config(common, vif, sta, &ps_key); + key = ath_key_config(common, vif, sta, &ps_key); + if (key > 0) + an->ps_key = key; return 0; } @@ -1356,6 +1359,7 @@ static void ath9k_del_ps_key(struct ath_softc *sc, return; ath_key_delete(common, &ps_key); + an->ps_key = 0; } static int ath9k_sta_remove(struct ieee80211_hw *hw, diff --git a/drivers/net/wireless/b43/dma.c b/drivers/net/wireless/b43/dma.c index 523355b87659..f7c70b3a6ea9 100644 --- a/drivers/net/wireless/b43/dma.c +++ b/drivers/net/wireless/b43/dma.c @@ -1728,6 +1728,25 @@ drop_recycle_buffer: sync_descbuffer_for_device(ring, dmaaddr, ring->rx_buffersize); } +void b43_dma_handle_rx_overflow(struct b43_dmaring *ring) +{ + int current_slot, previous_slot; + + B43_WARN_ON(ring->tx); + + /* Device has filled all buffers, drop all packets and let TCP + * decrease speed. + * Decrement RX index by one will let the device to see all slots + * as free again + */ + /* + *TODO: How to increase rx_drop in mac80211? + */ + current_slot = ring->ops->get_current_rxslot(ring); + previous_slot = prev_slot(ring, current_slot); + ring->ops->set_current_rxslot(ring, previous_slot); +} + void b43_dma_rx(struct b43_dmaring *ring) { const struct b43_dma_ops *ops = ring->ops; diff --git a/drivers/net/wireless/b43/dma.h b/drivers/net/wireless/b43/dma.h index 9fdd1983079c..df8c8cdcbdb5 100644 --- a/drivers/net/wireless/b43/dma.h +++ b/drivers/net/wireless/b43/dma.h @@ -9,7 +9,7 @@ /* DMA-Interrupt reasons. */ #define B43_DMAIRQ_FATALMASK ((1 << 10) | (1 << 11) | (1 << 12) \ | (1 << 14) | (1 << 15)) -#define B43_DMAIRQ_NONFATALMASK (1 << 13) +#define B43_DMAIRQ_RDESC_UFLOW (1 << 13) #define B43_DMAIRQ_RX_DONE (1 << 16) /*** 32-bit DMA Engine. ***/ @@ -295,6 +295,8 @@ int b43_dma_tx(struct b43_wldev *dev, void b43_dma_handle_txstatus(struct b43_wldev *dev, const struct b43_txstatus *status); +void b43_dma_handle_rx_overflow(struct b43_dmaring *ring); + void b43_dma_rx(struct b43_dmaring *ring); void b43_dma_direct_fifo_rx(struct b43_wldev *dev, diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c index d377f77d30b5..6dd07e2ec595 100644 --- a/drivers/net/wireless/b43/main.c +++ b/drivers/net/wireless/b43/main.c @@ -1902,30 +1902,18 @@ static void b43_do_interrupt_thread(struct b43_wldev *dev) } } - if (unlikely(merged_dma_reason & (B43_DMAIRQ_FATALMASK | - B43_DMAIRQ_NONFATALMASK))) { - if (merged_dma_reason & B43_DMAIRQ_FATALMASK) { - b43err(dev->wl, "Fatal DMA error: " - "0x%08X, 0x%08X, 0x%08X, " - "0x%08X, 0x%08X, 0x%08X\n", - dma_reason[0], dma_reason[1], - dma_reason[2], dma_reason[3], - dma_reason[4], dma_reason[5]); - b43err(dev->wl, "This device does not support DMA " + if (unlikely(merged_dma_reason & (B43_DMAIRQ_FATALMASK))) { + b43err(dev->wl, + "Fatal DMA error: 0x%08X, 0x%08X, 0x%08X, 0x%08X, 0x%08X, 0x%08X\n", + dma_reason[0], dma_reason[1], + dma_reason[2], dma_reason[3], + dma_reason[4], dma_reason[5]); + b43err(dev->wl, "This device does not support DMA " "on your system. It will now be switched to PIO.\n"); - /* Fall back to PIO transfers if we get fatal DMA errors! */ - dev->use_pio = true; - b43_controller_restart(dev, "DMA error"); - return; - } - if (merged_dma_reason & B43_DMAIRQ_NONFATALMASK) { - b43err(dev->wl, "DMA error: " - "0x%08X, 0x%08X, 0x%08X, " - "0x%08X, 0x%08X, 0x%08X\n", - dma_reason[0], dma_reason[1], - dma_reason[2], dma_reason[3], - dma_reason[4], dma_reason[5]); - } + /* Fall back to PIO transfers if we get fatal DMA errors! */ + dev->use_pio = true; + b43_controller_restart(dev, "DMA error"); + return; } if (unlikely(reason & B43_IRQ_UCODE_DEBUG)) @@ -1944,6 +1932,11 @@ static void b43_do_interrupt_thread(struct b43_wldev *dev) handle_irq_noise(dev); /* Check the DMA reason registers for received data. */ + if (dma_reason[0] & B43_DMAIRQ_RDESC_UFLOW) { + if (B43_DEBUG) + b43warn(dev->wl, "RX descriptor underrun\n"); + b43_dma_handle_rx_overflow(dev->dma.rx_ring); + } if (dma_reason[0] & B43_DMAIRQ_RX_DONE) { if (b43_using_pio_transfers(dev)) b43_pio_rx(dev->pio.rx_queue); @@ -2001,7 +1994,7 @@ static irqreturn_t b43_do_interrupt(struct b43_wldev *dev) return IRQ_NONE; dev->dma_reason[0] = b43_read32(dev, B43_MMIO_DMA0_REASON) - & 0x0001DC00; + & 0x0001FC00; dev->dma_reason[1] = b43_read32(dev, B43_MMIO_DMA1_REASON) & 0x0000DC00; dev->dma_reason[2] = b43_read32(dev, B43_MMIO_DMA2_REASON) @@ -3130,7 +3123,7 @@ static int b43_chip_init(struct b43_wldev *dev) b43_write32(dev, 0x018C, 0x02000000); } b43_write32(dev, B43_MMIO_GEN_IRQ_REASON, 0x00004000); - b43_write32(dev, B43_MMIO_DMA0_IRQ_MASK, 0x0001DC00); + b43_write32(dev, B43_MMIO_DMA0_IRQ_MASK, 0x0001FC00); b43_write32(dev, B43_MMIO_DMA1_IRQ_MASK, 0x0000DC00); b43_write32(dev, B43_MMIO_DMA2_IRQ_MASK, 0x0000DC00); b43_write32(dev, B43_MMIO_DMA3_IRQ_MASK, 0x0001DC00); diff --git a/drivers/net/wireless/iwlegacy/4965-mac.c b/drivers/net/wireless/iwlegacy/4965-mac.c index b8f82e688c72..9a95045c97b6 100644 --- a/drivers/net/wireless/iwlegacy/4965-mac.c +++ b/drivers/net/wireless/iwlegacy/4965-mac.c @@ -5741,8 +5741,7 @@ il4965_mac_setup_register(struct il_priv *il, u32 max_probe_length) hw->flags = IEEE80211_HW_SIGNAL_DBM | IEEE80211_HW_AMPDU_AGGREGATION | IEEE80211_HW_NEED_DTIM_BEFORE_ASSOC | IEEE80211_HW_SPECTRUM_MGMT | - IEEE80211_HW_REPORTS_TX_ACK_STATUS | IEEE80211_HW_SUPPORTS_PS | - IEEE80211_HW_SUPPORTS_DYNAMIC_PS; + IEEE80211_HW_SUPPORTS_PS | IEEE80211_HW_SUPPORTS_DYNAMIC_PS; if (il->cfg->sku & IL_SKU_N) hw->flags |= IEEE80211_HW_SUPPORTS_DYNAMIC_SMPS | diff --git a/drivers/net/wireless/mwifiex/cfg80211.c b/drivers/net/wireless/mwifiex/cfg80211.c index d3c8ece980d8..e42b266a023a 100644 --- a/drivers/net/wireless/mwifiex/cfg80211.c +++ b/drivers/net/wireless/mwifiex/cfg80211.c @@ -2234,9 +2234,6 @@ int mwifiex_del_virtual_intf(struct wiphy *wiphy, struct wireless_dev *wdev) if (wdev->netdev->reg_state == NETREG_REGISTERED) unregister_netdevice(wdev->netdev); - if (wdev->netdev->reg_state == NETREG_UNREGISTERED) - free_netdev(wdev->netdev); - /* Clear the priv in adapter */ priv->netdev = NULL; diff --git a/drivers/net/wireless/mwifiex/cmdevt.c b/drivers/net/wireless/mwifiex/cmdevt.c index 74db0d24a579..26755d9acb55 100644 --- a/drivers/net/wireless/mwifiex/cmdevt.c +++ b/drivers/net/wireless/mwifiex/cmdevt.c @@ -1191,6 +1191,7 @@ mwifiex_process_hs_config(struct mwifiex_adapter *adapter) adapter->if_ops.wakeup(adapter); adapter->hs_activated = false; adapter->is_hs_configured = false; + adapter->is_suspended = false; mwifiex_hs_activated_event(mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY), false); diff --git a/drivers/net/wireless/mwifiex/main.c b/drivers/net/wireless/mwifiex/main.c index 121443a0f2a1..2eb88ea9acf7 100644 --- a/drivers/net/wireless/mwifiex/main.c +++ b/drivers/net/wireless/mwifiex/main.c @@ -655,6 +655,7 @@ void mwifiex_init_priv_params(struct mwifiex_private *priv, struct net_device *dev) { dev->netdev_ops = &mwifiex_netdev_ops; + dev->destructor = free_netdev; /* Initialize private structure */ priv->current_key_index = 0; priv->media_connected = false; diff --git a/drivers/net/wireless/mwifiex/sta_ioctl.c b/drivers/net/wireless/mwifiex/sta_ioctl.c index 311d0b26b81c..1a8a19dbd635 100644 --- a/drivers/net/wireless/mwifiex/sta_ioctl.c +++ b/drivers/net/wireless/mwifiex/sta_ioctl.c @@ -96,7 +96,7 @@ int mwifiex_request_set_multicast_list(struct mwifiex_private *priv, } else { /* Multicast */ priv->curr_pkt_filter &= ~HostCmd_ACT_MAC_PROMISCUOUS_ENABLE; - if (mcast_list->mode == MWIFIEX_MULTICAST_MODE) { + if (mcast_list->mode == MWIFIEX_ALL_MULTI_MODE) { dev_dbg(priv->adapter->dev, "info: Enabling All Multicast!\n"); priv->curr_pkt_filter |= @@ -108,20 +108,11 @@ int mwifiex_request_set_multicast_list(struct mwifiex_private *priv, dev_dbg(priv->adapter->dev, "info: Set multicast list=%d\n", mcast_list->num_multicast_addr); - /* Set multicast addresses to firmware */ - if (old_pkt_filter == priv->curr_pkt_filter) { - /* Send request to firmware */ - ret = mwifiex_send_cmd_async(priv, - HostCmd_CMD_MAC_MULTICAST_ADR, - HostCmd_ACT_GEN_SET, 0, - mcast_list); - } else { - /* Send request to firmware */ - ret = mwifiex_send_cmd_async(priv, - HostCmd_CMD_MAC_MULTICAST_ADR, - HostCmd_ACT_GEN_SET, 0, - mcast_list); - } + /* Send multicast addresses to firmware */ + ret = mwifiex_send_cmd_async(priv, + HostCmd_CMD_MAC_MULTICAST_ADR, + HostCmd_ACT_GEN_SET, 0, + mcast_list); } } } diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig index 0c81915b1997..b9838130a7b0 100644 --- a/drivers/rtc/Kconfig +++ b/drivers/rtc/Kconfig @@ -20,7 +20,6 @@ if RTC_CLASS config RTC_HCTOSYS bool "Set system time from RTC on startup and resume" default y - depends on !ALWAYS_USE_PERSISTENT_CLOCK help If you say yes here, the system time (wall clock) will be set using the value read from a specified RTC device. This is useful to avoid @@ -29,7 +28,6 @@ config RTC_HCTOSYS config RTC_SYSTOHC bool "Set the RTC time based on NTP synchronization" default y - depends on !ALWAYS_USE_PERSISTENT_CLOCK help If you say yes here, the system time (wall clock) will be stored in the RTC specified by RTC_HCTOSYS_DEVICE approximately every 11 diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c index 787bd2c22bca..380387a47b1d 100644 --- a/drivers/spi/spi-atmel.c +++ b/drivers/spi/spi-atmel.c @@ -526,13 +526,17 @@ static void atmel_spi_next_xfer_pio(struct spi_master *master, } if (xfer->tx_buf) - spi_writel(as, TDR, *(u8 *)(xfer->tx_buf)); + if (xfer->bits_per_word > 8) + spi_writel(as, TDR, *(u16 *)(xfer->tx_buf)); + else + spi_writel(as, TDR, *(u8 *)(xfer->tx_buf)); else spi_writel(as, TDR, 0); dev_dbg(master->dev.parent, - " start pio xfer %p: len %u tx %p rx %p\n", - xfer, xfer->len, xfer->tx_buf, xfer->rx_buf); + " start pio xfer %p: len %u tx %p rx %p bitpw %d\n", + xfer, xfer->len, xfer->tx_buf, xfer->rx_buf, + xfer->bits_per_word); /* Enable relevant interrupts */ spi_writel(as, IER, SPI_BIT(RDRF) | SPI_BIT(OVRES)); @@ -950,21 +954,39 @@ atmel_spi_pump_pio_data(struct atmel_spi *as, struct spi_transfer *xfer) { u8 *txp; u8 *rxp; + u16 *txp16; + u16 *rxp16; unsigned long xfer_pos = xfer->len - as->current_remaining_bytes; if (xfer->rx_buf) { - rxp = ((u8 *)xfer->rx_buf) + xfer_pos; - *rxp = spi_readl(as, RDR); + if (xfer->bits_per_word > 8) { + rxp16 = (u16 *)(((u8 *)xfer->rx_buf) + xfer_pos); + *rxp16 = spi_readl(as, RDR); + } else { + rxp = ((u8 *)xfer->rx_buf) + xfer_pos; + *rxp = spi_readl(as, RDR); + } } else { spi_readl(as, RDR); } - - as->current_remaining_bytes--; + if (xfer->bits_per_word > 8) { + as->current_remaining_bytes -= 2; + if (as->current_remaining_bytes < 0) + as->current_remaining_bytes = 0; + } else { + as->current_remaining_bytes--; + } if (as->current_remaining_bytes) { if (xfer->tx_buf) { - txp = ((u8 *)xfer->tx_buf) + xfer_pos + 1; - spi_writel(as, TDR, *txp); + if (xfer->bits_per_word > 8) { + txp16 = (u16 *)(((u8 *)xfer->tx_buf) + + xfer_pos + 2); + spi_writel(as, TDR, *txp16); + } else { + txp = ((u8 *)xfer->tx_buf) + xfer_pos + 1; + spi_writel(as, TDR, *txp); + } } else { spi_writel(as, TDR, 0); } @@ -1378,9 +1400,16 @@ static int atmel_spi_transfer(struct spi_device *spi, struct spi_message *msg) } } + if (xfer->bits_per_word > 8) { + if (xfer->len % 2) { + dev_dbg(&spi->dev, "buffer len should be 16 bits aligned\n"); + return -EINVAL; + } + } + /* FIXME implement these protocol options!! */ - if (xfer->speed_hz) { - dev_dbg(&spi->dev, "no protocol options yet\n"); + if (xfer->speed_hz < spi->max_speed_hz) { + dev_dbg(&spi->dev, "can't change speed in transfer\n"); return -ENOPROTOOPT; } diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c index 2e8f24a1fb95..50b13c9b1ab6 100644 --- a/drivers/spi/spi-davinci.c +++ b/drivers/spi/spi-davinci.c @@ -784,7 +784,7 @@ static const struct of_device_id davinci_spi_of_match[] = { }, { }, }; -MODULE_DEVICE_TABLE(of, davini_spi_of_match); +MODULE_DEVICE_TABLE(of, davinci_spi_of_match); /** * spi_davinci_get_pdata - Get platform data from DTS binding diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 163fd802b7ac..32b7bb111eb6 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -334,7 +334,7 @@ struct spi_device *spi_alloc_device(struct spi_master *master) spi->dev.parent = &master->dev; spi->dev.bus = &spi_bus_type; spi->dev.release = spidev_release; - spi->cs_gpio = -EINVAL; + spi->cs_gpio = -ENOENT; device_initialize(&spi->dev); return spi; } @@ -1067,8 +1067,11 @@ static int of_spi_register_master(struct spi_master *master) nb = of_gpio_named_count(np, "cs-gpios"); master->num_chipselect = max(nb, (int)master->num_chipselect); - if (nb < 1) + /* Return error only for an incorrectly formed cs-gpios property */ + if (nb == 0 || nb == -ENOENT) return 0; + else if (nb < 0) + return nb; cs = devm_kzalloc(&master->dev, sizeof(int) * master->num_chipselect, @@ -1079,7 +1082,7 @@ static int of_spi_register_master(struct spi_master *master) return -ENOMEM; for (i = 0; i < master->num_chipselect; i++) - cs[i] = -EINVAL; + cs[i] = -ENOENT; for (i = 0; i < nb; i++) cs[i] = of_get_named_gpio(np, "cs-gpios", i); diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c index bff0775e258c..5174ebac288d 100644 --- a/drivers/vhost/vringh.c +++ b/drivers/vhost/vringh.c @@ -3,6 +3,7 @@ * * Since these may be in userspace, we use (inline) accessors. */ +#include <linux/module.h> #include <linux/vringh.h> #include <linux/virtio_ring.h> #include <linux/kernel.h> @@ -1005,3 +1006,5 @@ int vringh_need_notify_kern(struct vringh *vrh) return __vringh_need_notify(vrh, getu16_kern); } EXPORT_SYMBOL(vringh_need_notify_kern); + +MODULE_LICENSE("GPL"); diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 0aabb344b02e..5aae3d12d400 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -209,7 +209,6 @@ typedef struct ext4_io_end { ssize_t size; /* size of the extent */ struct kiocb *iocb; /* iocb struct for AIO */ int result; /* error value for AIO */ - atomic_t count; /* reference counter */ } ext4_io_end_t; struct ext4_io_submit { @@ -2651,14 +2650,11 @@ extern int ext4_move_extents(struct file *o_filp, struct file *d_filp, /* page-io.c */ extern int __init ext4_init_pageio(void); +extern void ext4_add_complete_io(ext4_io_end_t *io_end); extern void ext4_exit_pageio(void); extern void ext4_ioend_shutdown(struct inode *); +extern void ext4_free_io_end(ext4_io_end_t *io); extern ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags); -extern ext4_io_end_t *ext4_get_io_end(ext4_io_end_t *io_end); -extern int ext4_put_io_end(ext4_io_end_t *io_end); -extern void ext4_put_io_end_defer(ext4_io_end_t *io_end); -extern void ext4_io_submit_init(struct ext4_io_submit *io, - struct writeback_control *wbc); extern void ext4_end_io_work(struct work_struct *work); extern void ext4_io_submit(struct ext4_io_submit *io); extern int ext4_bio_write_page(struct ext4_io_submit *io, diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 107936db244e..bc0f1910b9cf 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3642,7 +3642,7 @@ int ext4_find_delalloc_range(struct inode *inode, { struct extent_status es; - ext4_es_find_delayed_extent(inode, lblk_start, &es); + ext4_es_find_delayed_extent_range(inode, lblk_start, lblk_end, &es); if (es.es_len == 0) return 0; /* there is no delay extent in this tree */ else if (es.es_lblk <= lblk_start && @@ -4608,9 +4608,10 @@ static int ext4_find_delayed_extent(struct inode *inode, struct extent_status es; ext4_lblk_t block, next_del; - ext4_es_find_delayed_extent(inode, newes->es_lblk, &es); - if (newes->es_pblk == 0) { + ext4_es_find_delayed_extent_range(inode, newes->es_lblk, + newes->es_lblk + newes->es_len - 1, &es); + /* * No extent in extent-tree contains block @newes->es_pblk, * then the block may stay in 1)a hole or 2)delayed-extent. @@ -4630,7 +4631,7 @@ static int ext4_find_delayed_extent(struct inode *inode, } block = newes->es_lblk + newes->es_len; - ext4_es_find_delayed_extent(inode, block, &es); + ext4_es_find_delayed_extent_range(inode, block, EXT_MAX_BLOCKS, &es); if (es.es_len == 0) next_del = EXT_MAX_BLOCKS; else diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index fe3337a85ede..e6941e622d31 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -232,14 +232,16 @@ static struct extent_status *__es_tree_search(struct rb_root *root, } /* - * ext4_es_find_delayed_extent: find the 1st delayed extent covering @es->lblk - * if it exists, otherwise, the next extent after @es->lblk. + * ext4_es_find_delayed_extent_range: find the 1st delayed extent covering + * @es->lblk if it exists, otherwise, the next extent after @es->lblk. * * @inode: the inode which owns delayed extents * @lblk: the offset where we start to search + * @end: the offset where we stop to search * @es: delayed extent that we found */ -void ext4_es_find_delayed_extent(struct inode *inode, ext4_lblk_t lblk, +void ext4_es_find_delayed_extent_range(struct inode *inode, + ext4_lblk_t lblk, ext4_lblk_t end, struct extent_status *es) { struct ext4_es_tree *tree = NULL; @@ -247,7 +249,8 @@ void ext4_es_find_delayed_extent(struct inode *inode, ext4_lblk_t lblk, struct rb_node *node; BUG_ON(es == NULL); - trace_ext4_es_find_delayed_extent_enter(inode, lblk); + BUG_ON(end < lblk); + trace_ext4_es_find_delayed_extent_range_enter(inode, lblk); read_lock(&EXT4_I(inode)->i_es_lock); tree = &EXT4_I(inode)->i_es_tree; @@ -270,6 +273,10 @@ out: if (es1 && !ext4_es_is_delayed(es1)) { while ((node = rb_next(&es1->rb_node)) != NULL) { es1 = rb_entry(node, struct extent_status, rb_node); + if (es1->es_lblk > end) { + es1 = NULL; + break; + } if (ext4_es_is_delayed(es1)) break; } @@ -285,7 +292,7 @@ out: read_unlock(&EXT4_I(inode)->i_es_lock); ext4_es_lru_add(inode); - trace_ext4_es_find_delayed_extent_exit(inode, es); + trace_ext4_es_find_delayed_extent_range_exit(inode, es); } static struct extent_status * diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h index d8e2d4dc311e..f740eb03b707 100644 --- a/fs/ext4/extents_status.h +++ b/fs/ext4/extents_status.h @@ -62,7 +62,8 @@ extern int ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, unsigned long long status); extern int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len); -extern void ext4_es_find_delayed_extent(struct inode *inode, ext4_lblk_t lblk, +extern void ext4_es_find_delayed_extent_range(struct inode *inode, + ext4_lblk_t lblk, ext4_lblk_t end, struct extent_status *es); extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk, struct extent_status *es); diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 4959e29573b6..b1b4d51b5d86 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -465,7 +465,7 @@ static loff_t ext4_seek_data(struct file *file, loff_t offset, loff_t maxsize) * If there is a delay extent at this offset, * it will be as a data. */ - ext4_es_find_delayed_extent(inode, last, &es); + ext4_es_find_delayed_extent_range(inode, last, last, &es); if (es.es_len != 0 && in_range(last, es.es_lblk, es.es_len)) { if (last != start) dataoff = last << blkbits; @@ -548,7 +548,7 @@ static loff_t ext4_seek_hole(struct file *file, loff_t offset, loff_t maxsize) * If there is a delay extent at this offset, * we will skip this extent. */ - ext4_es_find_delayed_extent(inode, last, &es); + ext4_es_find_delayed_extent_range(inode, last, last, &es); if (es.es_len != 0 && in_range(last, es.es_lblk, es.es_len)) { last = es.es_lblk + es.es_len; holeoff = last << blkbits; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 0723774bdfb5..d6382b89ecbd 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1488,10 +1488,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd, struct ext4_io_submit io_submit; BUG_ON(mpd->next_page <= mpd->first_page); - ext4_io_submit_init(&io_submit, mpd->wbc); - io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS); - if (!io_submit.io_end) - return -ENOMEM; + memset(&io_submit, 0, sizeof(io_submit)); /* * We need to start from the first_page to the next_page - 1 * to make sure we also write the mapped dirty buffer_heads. @@ -1579,8 +1576,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd, pagevec_release(&pvec); } ext4_io_submit(&io_submit); - /* Drop io_end reference we got from init */ - ext4_put_io_end_defer(io_submit.io_end); return ret; } @@ -2239,16 +2234,9 @@ static int ext4_writepage(struct page *page, */ return __ext4_journalled_writepage(page, len); - ext4_io_submit_init(&io_submit, wbc); - io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS); - if (!io_submit.io_end) { - redirty_page_for_writepage(wbc, page); - return -ENOMEM; - } + memset(&io_submit, 0, sizeof(io_submit)); ret = ext4_bio_write_page(&io_submit, page, len, wbc); ext4_io_submit(&io_submit); - /* Drop io_end reference we got from init */ - ext4_put_io_end_defer(io_submit.io_end); return ret; } @@ -3079,13 +3067,9 @@ static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset, struct inode *inode = file_inode(iocb->ki_filp); ext4_io_end_t *io_end = iocb->private; - /* if not async direct IO just return */ - if (!io_end) { - inode_dio_done(inode); - if (is_async) - aio_complete(iocb, ret, 0); - return; - } + /* if not async direct IO or dio with 0 bytes write, just return */ + if (!io_end || !size) + goto out; ext_debug("ext4_end_io_dio(): io_end 0x%p " "for inode %lu, iocb 0x%p, offset %llu, size %zd\n", @@ -3093,13 +3077,25 @@ static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset, size); iocb->private = NULL; + + /* if not aio dio with unwritten extents, just free io and return */ + if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) { + ext4_free_io_end(io_end); +out: + inode_dio_done(inode); + if (is_async) + aio_complete(iocb, ret, 0); + return; + } + io_end->offset = offset; io_end->size = size; if (is_async) { io_end->iocb = iocb; io_end->result = ret; } - ext4_put_io_end_defer(io_end); + + ext4_add_complete_io(io_end); } /* @@ -3133,7 +3129,6 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb, get_block_t *get_block_func = NULL; int dio_flags = 0; loff_t final_size = offset + count; - ext4_io_end_t *io_end = NULL; /* Use the old path for reads and writes beyond i_size. */ if (rw != WRITE || final_size > inode->i_size) @@ -3172,16 +3167,13 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb, iocb->private = NULL; ext4_inode_aio_set(inode, NULL); if (!is_sync_kiocb(iocb)) { - io_end = ext4_init_io_end(inode, GFP_NOFS); + ext4_io_end_t *io_end = ext4_init_io_end(inode, GFP_NOFS); if (!io_end) { ret = -ENOMEM; goto retake_lock; } io_end->flag |= EXT4_IO_END_DIRECT; - /* - * Grab reference for DIO. Will be dropped in ext4_end_io_dio() - */ - iocb->private = ext4_get_io_end(io_end); + iocb->private = io_end; /* * we save the io structure for current async direct * IO, so that later ext4_map_blocks() could flag the @@ -3205,27 +3197,26 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb, NULL, dio_flags); + if (iocb->private) + ext4_inode_aio_set(inode, NULL); /* - * Put our reference to io_end. This can free the io_end structure e.g. - * in sync IO case or in case of error. It can even perform extent - * conversion if all bios we submitted finished before we got here. - * Note that in that case iocb->private can be already set to NULL - * here. + * The io_end structure takes a reference to the inode, that + * structure needs to be destroyed and the reference to the + * inode need to be dropped, when IO is complete, even with 0 + * byte write, or failed. + * + * In the successful AIO DIO case, the io_end structure will + * be destroyed and the reference to the inode will be dropped + * after the end_io call back function is called. + * + * In the case there is 0 byte write, or error case, since VFS + * direct IO won't invoke the end_io call back function, we + * need to free the end_io structure here. */ - if (io_end) { - ext4_inode_aio_set(inode, NULL); - ext4_put_io_end(io_end); - /* - * In case of error or no write ext4_end_io_dio() was not - * called so we have to put iocb's reference. - */ - if (ret <= 0 && ret != -EIOCBQUEUED) { - WARN_ON(iocb->private != io_end); - ext4_put_io_end(io_end); - iocb->private = NULL; - } - } - if (ret > 0 && !overwrite && ext4_test_inode_state(inode, + if (ret != -EIOCBQUEUED && ret <= 0 && iocb->private) { + ext4_free_io_end(iocb->private); + iocb->private = NULL; + } else if (ret > 0 && !overwrite && ext4_test_inode_state(inode, EXT4_STATE_DIO_UNWRITTEN)) { int err; /* diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index b1ed9e07434b..def84082a9a9 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2105,7 +2105,11 @@ repeat: group = ac->ac_g_ex.fe_group; for (i = 0; i < ngroups; group++, i++) { - if (group == ngroups) + /* + * Artificially restricted ngroups for non-extent + * files makes group > ngroups possible on first loop. + */ + if (group >= ngroups) group = 0; /* This now checks without needing the buddy page */ diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index 19599bded62a..4acf1f78881b 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -62,28 +62,15 @@ void ext4_ioend_shutdown(struct inode *inode) cancel_work_sync(&EXT4_I(inode)->i_unwritten_work); } -static void ext4_release_io_end(ext4_io_end_t *io_end) +void ext4_free_io_end(ext4_io_end_t *io) { - BUG_ON(!list_empty(&io_end->list)); - BUG_ON(io_end->flag & EXT4_IO_END_UNWRITTEN); - - if (atomic_dec_and_test(&EXT4_I(io_end->inode)->i_ioend_count)) - wake_up_all(ext4_ioend_wq(io_end->inode)); - if (io_end->flag & EXT4_IO_END_DIRECT) - inode_dio_done(io_end->inode); - if (io_end->iocb) - aio_complete(io_end->iocb, io_end->result, 0); - kmem_cache_free(io_end_cachep, io_end); -} - -static void ext4_clear_io_unwritten_flag(ext4_io_end_t *io_end) -{ - struct inode *inode = io_end->inode; + BUG_ON(!io); + BUG_ON(!list_empty(&io->list)); + BUG_ON(io->flag & EXT4_IO_END_UNWRITTEN); - io_end->flag &= ~EXT4_IO_END_UNWRITTEN; - /* Wake up anyone waiting on unwritten extent conversion */ - if (atomic_dec_and_test(&EXT4_I(inode)->i_unwritten)) - wake_up_all(ext4_ioend_wq(inode)); + if (atomic_dec_and_test(&EXT4_I(io->inode)->i_ioend_count)) + wake_up_all(ext4_ioend_wq(io->inode)); + kmem_cache_free(io_end_cachep, io); } /* check a range of space and convert unwritten extents to written. */ @@ -106,8 +93,13 @@ static int ext4_end_io(ext4_io_end_t *io) "(inode %lu, offset %llu, size %zd, error %d)", inode->i_ino, offset, size, ret); } - ext4_clear_io_unwritten_flag(io); - ext4_release_io_end(io); + /* Wake up anyone waiting on unwritten extent conversion */ + if (atomic_dec_and_test(&EXT4_I(inode)->i_unwritten)) + wake_up_all(ext4_ioend_wq(inode)); + if (io->flag & EXT4_IO_END_DIRECT) + inode_dio_done(inode); + if (io->iocb) + aio_complete(io->iocb, io->result, 0); return ret; } @@ -138,7 +130,7 @@ static void dump_completed_IO(struct inode *inode) } /* Add the io_end to per-inode completed end_io list. */ -static void ext4_add_complete_io(ext4_io_end_t *io_end) +void ext4_add_complete_io(ext4_io_end_t *io_end) { struct ext4_inode_info *ei = EXT4_I(io_end->inode); struct workqueue_struct *wq; @@ -175,6 +167,8 @@ static int ext4_do_flush_completed_IO(struct inode *inode) err = ext4_end_io(io); if (unlikely(!ret && err)) ret = err; + io->flag &= ~EXT4_IO_END_UNWRITTEN; + ext4_free_io_end(io); } return ret; } @@ -206,43 +200,10 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags) atomic_inc(&EXT4_I(inode)->i_ioend_count); io->inode = inode; INIT_LIST_HEAD(&io->list); - atomic_set(&io->count, 1); } return io; } -void ext4_put_io_end_defer(ext4_io_end_t *io_end) -{ - if (atomic_dec_and_test(&io_end->count)) { - if (!(io_end->flag & EXT4_IO_END_UNWRITTEN) || !io_end->size) { - ext4_release_io_end(io_end); - return; - } - ext4_add_complete_io(io_end); - } -} - -int ext4_put_io_end(ext4_io_end_t *io_end) -{ - int err = 0; - - if (atomic_dec_and_test(&io_end->count)) { - if (io_end->flag & EXT4_IO_END_UNWRITTEN) { - err = ext4_convert_unwritten_extents(io_end->inode, - io_end->offset, io_end->size); - ext4_clear_io_unwritten_flag(io_end); - } - ext4_release_io_end(io_end); - } - return err; -} - -ext4_io_end_t *ext4_get_io_end(ext4_io_end_t *io_end) -{ - atomic_inc(&io_end->count); - return io_end; -} - /* * Print an buffer I/O error compatible with the fs/buffer.c. This * provides compatibility with dmesg scrapers that look for a specific @@ -325,7 +286,12 @@ static void ext4_end_bio(struct bio *bio, int error) bi_sector >> (inode->i_blkbits - 9)); } - ext4_put_io_end_defer(io_end); + if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) { + ext4_free_io_end(io_end); + return; + } + + ext4_add_complete_io(io_end); } void ext4_io_submit(struct ext4_io_submit *io) @@ -339,37 +305,40 @@ void ext4_io_submit(struct ext4_io_submit *io) bio_put(io->io_bio); } io->io_bio = NULL; -} - -void ext4_io_submit_init(struct ext4_io_submit *io, - struct writeback_control *wbc) -{ - io->io_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE); - io->io_bio = NULL; + io->io_op = 0; io->io_end = NULL; } -static int io_submit_init_bio(struct ext4_io_submit *io, - struct buffer_head *bh) +static int io_submit_init(struct ext4_io_submit *io, + struct inode *inode, + struct writeback_control *wbc, + struct buffer_head *bh) { + ext4_io_end_t *io_end; + struct page *page = bh->b_page; int nvecs = bio_get_nr_vecs(bh->b_bdev); struct bio *bio; + io_end = ext4_init_io_end(inode, GFP_NOFS); + if (!io_end) + return -ENOMEM; bio = bio_alloc(GFP_NOIO, min(nvecs, BIO_MAX_PAGES)); bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9); bio->bi_bdev = bh->b_bdev; + bio->bi_private = io->io_end = io_end; bio->bi_end_io = ext4_end_bio; - bio->bi_private = ext4_get_io_end(io->io_end); - if (!io->io_end->size) - io->io_end->offset = (bh->b_page->index << PAGE_CACHE_SHIFT) - + bh_offset(bh); + + io_end->offset = (page->index << PAGE_CACHE_SHIFT) + bh_offset(bh); + io->io_bio = bio; + io->io_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE); io->io_next_block = bh->b_blocknr; return 0; } static int io_submit_add_bh(struct ext4_io_submit *io, struct inode *inode, + struct writeback_control *wbc, struct buffer_head *bh) { ext4_io_end_t *io_end; @@ -380,18 +349,18 @@ submit_and_retry: ext4_io_submit(io); } if (io->io_bio == NULL) { - ret = io_submit_init_bio(io, bh); + ret = io_submit_init(io, inode, wbc, bh); if (ret) return ret; } - ret = bio_add_page(io->io_bio, bh->b_page, bh->b_size, bh_offset(bh)); - if (ret != bh->b_size) - goto submit_and_retry; io_end = io->io_end; if (test_clear_buffer_uninit(bh)) ext4_set_io_unwritten_flag(inode, io_end); - io_end->size += bh->b_size; + io->io_end->size += bh->b_size; io->io_next_block++; + ret = bio_add_page(io->io_bio, bh->b_page, bh->b_size, bh_offset(bh)); + if (ret != bh->b_size) + goto submit_and_retry; return 0; } @@ -463,7 +432,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io, do { if (!buffer_async_write(bh)) continue; - ret = io_submit_add_bh(io, inode, bh); + ret = io_submit_add_bh(io, inode, wbc, bh); if (ret) { /* * We only get here on ENOMEM. Not much else diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 61196592152e..63d17ee9eb48 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -316,6 +316,7 @@ struct drm_ioctl_desc { int flags; drm_ioctl_t *func; unsigned int cmd_drv; + const char *name; }; /** @@ -324,7 +325,7 @@ struct drm_ioctl_desc { */ #define DRM_IOCTL_DEF_DRV(ioctl, _func, _flags) \ - [DRM_IOCTL_NR(DRM_##ioctl)] = {.cmd = DRM_##ioctl, .func = _func, .flags = _flags, .cmd_drv = DRM_IOCTL_##ioctl} + [DRM_IOCTL_NR(DRM_##ioctl)] = {.cmd = DRM_##ioctl, .func = _func, .flags = _flags, .cmd_drv = DRM_IOCTL_##ioctl, .name = #ioctl} struct drm_magic_entry { struct list_head head; diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h index 8230b46fdd73..471f276ce8f7 100644 --- a/include/drm/drm_fb_helper.h +++ b/include/drm/drm_fb_helper.h @@ -50,13 +50,14 @@ struct drm_fb_helper_surface_size { /** * struct drm_fb_helper_funcs - driver callbacks for the fbdev emulation library - * @gamma_set: - Set the given gamma lut register on the given crtc. - * @gamma_get: - Read the given gamma lut register on the given crtc, used to - * save the current lut when force-restoring the fbdev for e.g. - * kdbg. - * @fb_probe: - Driver callback to allocate and initialize the fbdev info - * structure. Futhermore it also needs to allocate the drm - * framebuffer used to back the fbdev. + * @gamma_set: Set the given gamma lut register on the given crtc. + * @gamma_get: Read the given gamma lut register on the given crtc, used to + * save the current lut when force-restoring the fbdev for e.g. + * kdbg. + * @fb_probe: Driver callback to allocate and initialize the fbdev info + * structure. Futhermore it also needs to allocate the drm + * framebuffer used to back the fbdev. + * @initial_config: Setup an initial fbdev display configuration * * Driver callbacks used by the fbdev emulation helper library. */ diff --git a/include/linux/journal-head.h b/include/linux/journal-head.h index 13a3da25ff07..98cd41bb39c8 100644 --- a/include/linux/journal-head.h +++ b/include/linux/journal-head.h @@ -30,15 +30,19 @@ struct journal_head { /* * Journalling list for this buffer [jbd_lock_bh_state()] + * NOTE: We *cannot* combine this with b_modified into a bitfield + * as gcc would then (which the C standard allows but which is + * very unuseful) make 64-bit accesses to the bitfield and clobber + * b_jcount if its update races with bitfield modification. */ - unsigned b_jlist:4; + unsigned b_jlist; /* * This flag signals the buffer has been modified by * the currently running transaction * [jbd_lock_bh_state()] */ - unsigned b_modified:1; + unsigned b_modified; /* * Copy of the buffer data frozen for writing to the log. diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index 67f46ad6920a..352eec9df1b8 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -126,7 +126,7 @@ struct mlx4_rss_context { struct mlx4_qp_path { u8 fl; - u8 reserved1[1]; + u8 vlan_control; u8 disable_pkey_check; u8 pkey_index; u8 counter_index; @@ -141,11 +141,32 @@ struct mlx4_qp_path { u8 sched_queue; u8 vlan_index; u8 feup; - u8 reserved3; + u8 fvl_rx; u8 reserved4[2]; u8 dmac[6]; }; +enum { /* fl */ + MLX4_FL_CV = 1 << 6, + MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2 +}; +enum { /* vlan_control */ + MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED = 1 << 6, + MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED = 1 << 2, + MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED = 1 << 1, /* 802.1p priority tag */ + MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED = 1 << 0 +}; + +enum { /* feup */ + MLX4_FEUP_FORCE_ETH_UP = 1 << 6, /* force Eth UP */ + MLX4_FSM_FORCE_ETH_SRC_MAC = 1 << 5, /* force Source MAC */ + MLX4_FVL_FORCE_ETH_VLAN = 1 << 3 /* force Eth vlan */ +}; + +enum { /* fvl_rx */ + MLX4_FVL_RX_FORCE_ETH_VLAN = 1 << 0 /* enforce Eth rx vlan */ +}; + struct mlx4_qp_context { __be32 flags; __be32 pd; @@ -185,6 +206,10 @@ struct mlx4_qp_context { u32 reserved5[10]; }; +enum { /* param3 */ + MLX4_STRIP_VLAN = 1 << 30 +}; + /* Which firmware version adds support for NEC (NoErrorCompletion) bit */ #define MLX4_FW_VER_WQE_CTRL_NEC mlx4_fw_ver(2, 2, 232) diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index 733eb5ee31c5..6ff26c8db7b9 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -57,7 +57,7 @@ extern struct bus_type spi_bus_type; * @modalias: Name of the driver to use with this device, or an alias * for that name. This appears in the sysfs "modalias" attribute * for driver coldplugging, and in uevents used for hotplugging - * @cs_gpio: gpio number of the chipselect line (optional, -EINVAL when + * @cs_gpio: gpio number of the chipselect line (optional, -ENOENT when * when not using a GPIO line) * * A @spi_device is used to interchange data between an SPI slave @@ -266,7 +266,7 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * queue so the subsystem notifies the driver that it may relax the * hardware by issuing this call * @cs_gpios: Array of GPIOs to use as chip select lines; one per CS - * number. Any individual value may be -EINVAL for CS lines that + * number. Any individual value may be -ENOENT for CS lines that * are not GPIOs (driven by the SPI controller itself). * * Each SPI master controller can communicate with one or more @spi_device diff --git a/include/linux/time.h b/include/linux/time.h index 22d81b3c955b..d5d229b2e5af 100644 --- a/include/linux/time.h +++ b/include/linux/time.h @@ -117,14 +117,10 @@ static inline bool timespec_valid_strict(const struct timespec *ts) extern bool persistent_clock_exist; -#ifdef ALWAYS_USE_PERSISTENT_CLOCK -#define has_persistent_clock() true -#else static inline bool has_persistent_clock(void) { return persistent_clock_exist; } -#endif extern void read_persistent_clock(struct timespec *ts); extern void read_boot_clock(struct timespec *ts); diff --git a/include/net/sock.h b/include/net/sock.h index 5c97b0fc5623..66772cf8c3c5 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -866,6 +866,18 @@ struct inet_hashinfo; struct raw_hashinfo; struct module; +/* + * caches using SLAB_DESTROY_BY_RCU should let .next pointer from nulls nodes + * un-modified. Special care is taken when initializing object to zero. + */ +static inline void sk_prot_clear_nulls(struct sock *sk, int size) +{ + if (offsetof(struct sock, sk_node.next) != 0) + memset(sk, 0, offsetof(struct sock, sk_node.next)); + memset(&sk->sk_node.pprev, 0, + size - offsetof(struct sock, sk_node.pprev)); +} + /* Networking protocol blocks we attach to sockets. * socket layer -> transport layer interface * transport -> network interface is defined by struct inet_proto diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index d0e686402df8..8ee15b97cd38 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -2139,7 +2139,7 @@ TRACE_EVENT(ext4_es_remove_extent, __entry->lblk, __entry->len) ); -TRACE_EVENT(ext4_es_find_delayed_extent_enter, +TRACE_EVENT(ext4_es_find_delayed_extent_range_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk), TP_ARGS(inode, lblk), @@ -2161,7 +2161,7 @@ TRACE_EVENT(ext4_es_find_delayed_extent_enter, (unsigned long) __entry->ino, __entry->lblk) ); -TRACE_EVENT(ext4_es_find_delayed_extent_exit, +TRACE_EVENT(ext4_es_find_delayed_extent_range_exit, TP_PROTO(struct inode *inode, struct extent_status *es), TP_ARGS(inode, es), diff --git a/kernel/cpu/idle.c b/kernel/cpu/idle.c index 8b86c0c68edf..d5585f5e038e 100644 --- a/kernel/cpu/idle.c +++ b/kernel/cpu/idle.c @@ -40,11 +40,13 @@ __setup("hlt", cpu_idle_nopoll_setup); static inline int cpu_idle_poll(void) { + rcu_idle_enter(); trace_cpu_idle_rcuidle(0, smp_processor_id()); local_irq_enable(); while (!need_resched()) cpu_relax(); trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id()); + rcu_idle_exit(); return 1; } diff --git a/kernel/events/core.c b/kernel/events/core.c index 6b41c1899a8b..9dc297faf7c0 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4394,6 +4394,64 @@ perf_event_read_event(struct perf_event *event, perf_output_end(&handle); } +typedef int (perf_event_aux_match_cb)(struct perf_event *event, void *data); +typedef void (perf_event_aux_output_cb)(struct perf_event *event, void *data); + +static void +perf_event_aux_ctx(struct perf_event_context *ctx, + perf_event_aux_match_cb match, + perf_event_aux_output_cb output, + void *data) +{ + struct perf_event *event; + + list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { + if (event->state < PERF_EVENT_STATE_INACTIVE) + continue; + if (!event_filter_match(event)) + continue; + if (match(event, data)) + output(event, data); + } +} + +static void +perf_event_aux(perf_event_aux_match_cb match, + perf_event_aux_output_cb output, + void *data, + struct perf_event_context *task_ctx) +{ + struct perf_cpu_context *cpuctx; + struct perf_event_context *ctx; + struct pmu *pmu; + int ctxn; + + rcu_read_lock(); + list_for_each_entry_rcu(pmu, &pmus, entry) { + cpuctx = get_cpu_ptr(pmu->pmu_cpu_context); + if (cpuctx->unique_pmu != pmu) + goto next; + perf_event_aux_ctx(&cpuctx->ctx, match, output, data); + if (task_ctx) + goto next; + ctxn = pmu->task_ctx_nr; + if (ctxn < 0) + goto next; + ctx = rcu_dereference(current->perf_event_ctxp[ctxn]); + if (ctx) + perf_event_aux_ctx(ctx, match, output, data); +next: + put_cpu_ptr(pmu->pmu_cpu_context); + } + + if (task_ctx) { + preempt_disable(); + perf_event_aux_ctx(task_ctx, match, output, data); + preempt_enable(); + } + rcu_read_unlock(); +} + /* * task tracking -- fork/exit * @@ -4416,8 +4474,9 @@ struct perf_task_event { }; static void perf_event_task_output(struct perf_event *event, - struct perf_task_event *task_event) + void *data) { + struct perf_task_event *task_event = data; struct perf_output_handle handle; struct perf_sample_data sample; struct task_struct *task = task_event->task; @@ -4445,62 +4504,11 @@ out: task_event->event_id.header.size = size; } -static int perf_event_task_match(struct perf_event *event) -{ - if (event->state < PERF_EVENT_STATE_INACTIVE) - return 0; - - if (!event_filter_match(event)) - return 0; - - if (event->attr.comm || event->attr.mmap || - event->attr.mmap_data || event->attr.task) - return 1; - - return 0; -} - -static void perf_event_task_ctx(struct perf_event_context *ctx, - struct perf_task_event *task_event) +static int perf_event_task_match(struct perf_event *event, + void *data __maybe_unused) { - struct perf_event *event; - - list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { - if (perf_event_task_match(event)) - perf_event_task_output(event, task_event); - } -} - -static void perf_event_task_event(struct perf_task_event *task_event) -{ - struct perf_cpu_context *cpuctx; - struct perf_event_context *ctx; - struct pmu *pmu; - int ctxn; - - rcu_read_lock(); - list_for_each_entry_rcu(pmu, &pmus, entry) { - cpuctx = get_cpu_ptr(pmu->pmu_cpu_context); - if (cpuctx->unique_pmu != pmu) - goto next; - perf_event_task_ctx(&cpuctx->ctx, task_event); - - ctx = task_event->task_ctx; - if (!ctx) { - ctxn = pmu->task_ctx_nr; - if (ctxn < 0) - goto next; - ctx = rcu_dereference(current->perf_event_ctxp[ctxn]); - if (ctx) - perf_event_task_ctx(ctx, task_event); - } -next: - put_cpu_ptr(pmu->pmu_cpu_context); - } - if (task_event->task_ctx) - perf_event_task_ctx(task_event->task_ctx, task_event); - - rcu_read_unlock(); + return event->attr.comm || event->attr.mmap || + event->attr.mmap_data || event->attr.task; } static void perf_event_task(struct task_struct *task, @@ -4531,7 +4539,10 @@ static void perf_event_task(struct task_struct *task, }, }; - perf_event_task_event(&task_event); + perf_event_aux(perf_event_task_match, + perf_event_task_output, + &task_event, + task_ctx); } void perf_event_fork(struct task_struct *task) @@ -4557,8 +4568,9 @@ struct perf_comm_event { }; static void perf_event_comm_output(struct perf_event *event, - struct perf_comm_event *comm_event) + void *data) { + struct perf_comm_event *comm_event = data; struct perf_output_handle handle; struct perf_sample_data sample; int size = comm_event->event_id.header.size; @@ -4585,39 +4597,16 @@ out: comm_event->event_id.header.size = size; } -static int perf_event_comm_match(struct perf_event *event) -{ - if (event->state < PERF_EVENT_STATE_INACTIVE) - return 0; - - if (!event_filter_match(event)) - return 0; - - if (event->attr.comm) - return 1; - - return 0; -} - -static void perf_event_comm_ctx(struct perf_event_context *ctx, - struct perf_comm_event *comm_event) +static int perf_event_comm_match(struct perf_event *event, + void *data __maybe_unused) { - struct perf_event *event; - - list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { - if (perf_event_comm_match(event)) - perf_event_comm_output(event, comm_event); - } + return event->attr.comm; } static void perf_event_comm_event(struct perf_comm_event *comm_event) { - struct perf_cpu_context *cpuctx; - struct perf_event_context *ctx; char comm[TASK_COMM_LEN]; unsigned int size; - struct pmu *pmu; - int ctxn; memset(comm, 0, sizeof(comm)); strlcpy(comm, comm_event->task->comm, sizeof(comm)); @@ -4627,24 +4616,11 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event) comm_event->comm_size = size; comm_event->event_id.header.size = sizeof(comm_event->event_id) + size; - rcu_read_lock(); - list_for_each_entry_rcu(pmu, &pmus, entry) { - cpuctx = get_cpu_ptr(pmu->pmu_cpu_context); - if (cpuctx->unique_pmu != pmu) - goto next; - perf_event_comm_ctx(&cpuctx->ctx, comm_event); - ctxn = pmu->task_ctx_nr; - if (ctxn < 0) - goto next; - - ctx = rcu_dereference(current->perf_event_ctxp[ctxn]); - if (ctx) - perf_event_comm_ctx(ctx, comm_event); -next: - put_cpu_ptr(pmu->pmu_cpu_context); - } - rcu_read_unlock(); + perf_event_aux(perf_event_comm_match, + perf_event_comm_output, + comm_event, + NULL); } void perf_event_comm(struct task_struct *task) @@ -4706,8 +4682,9 @@ struct perf_mmap_event { }; static void perf_event_mmap_output(struct perf_event *event, - struct perf_mmap_event *mmap_event) + void *data) { + struct perf_mmap_event *mmap_event = data; struct perf_output_handle handle; struct perf_sample_data sample; int size = mmap_event->event_id.header.size; @@ -4734,46 +4711,24 @@ out: } static int perf_event_mmap_match(struct perf_event *event, - struct perf_mmap_event *mmap_event, - int executable) -{ - if (event->state < PERF_EVENT_STATE_INACTIVE) - return 0; - - if (!event_filter_match(event)) - return 0; - - if ((!executable && event->attr.mmap_data) || - (executable && event->attr.mmap)) - return 1; - - return 0; -} - -static void perf_event_mmap_ctx(struct perf_event_context *ctx, - struct perf_mmap_event *mmap_event, - int executable) + void *data) { - struct perf_event *event; + struct perf_mmap_event *mmap_event = data; + struct vm_area_struct *vma = mmap_event->vma; + int executable = vma->vm_flags & VM_EXEC; - list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { - if (perf_event_mmap_match(event, mmap_event, executable)) - perf_event_mmap_output(event, mmap_event); - } + return (!executable && event->attr.mmap_data) || + (executable && event->attr.mmap); } static void perf_event_mmap_event(struct perf_mmap_event *mmap_event) { - struct perf_cpu_context *cpuctx; - struct perf_event_context *ctx; struct vm_area_struct *vma = mmap_event->vma; struct file *file = vma->vm_file; unsigned int size; char tmp[16]; char *buf = NULL; const char *name; - struct pmu *pmu; - int ctxn; memset(tmp, 0, sizeof(tmp)); @@ -4829,27 +4784,10 @@ got_name: mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size; - rcu_read_lock(); - list_for_each_entry_rcu(pmu, &pmus, entry) { - cpuctx = get_cpu_ptr(pmu->pmu_cpu_context); - if (cpuctx->unique_pmu != pmu) - goto next; - perf_event_mmap_ctx(&cpuctx->ctx, mmap_event, - vma->vm_flags & VM_EXEC); - - ctxn = pmu->task_ctx_nr; - if (ctxn < 0) - goto next; - - ctx = rcu_dereference(current->perf_event_ctxp[ctxn]); - if (ctx) { - perf_event_mmap_ctx(ctx, mmap_event, - vma->vm_flags & VM_EXEC); - } -next: - put_cpu_ptr(pmu->pmu_cpu_context); - } - rcu_read_unlock(); + perf_event_aux(perf_event_mmap_match, + perf_event_mmap_output, + mmap_event, + NULL); kfree(buf); } diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig index e4c07b0692bb..70f27e89012b 100644 --- a/kernel/time/Kconfig +++ b/kernel/time/Kconfig @@ -12,11 +12,6 @@ config CLOCKSOURCE_WATCHDOG config ARCH_CLOCKSOURCE_DATA bool -# Platforms has a persistent clock -config ALWAYS_USE_PERSISTENT_CLOCK - bool - default n - # Timekeeping vsyscall support config GENERIC_TIME_VSYSCALL bool diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 206bbfb34e09..24938d577669 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -786,11 +786,11 @@ bool tick_broadcast_oneshot_available(void) void __init tick_broadcast_init(void) { - alloc_cpumask_var(&tick_broadcast_mask, GFP_NOWAIT); - alloc_cpumask_var(&tmpmask, GFP_NOWAIT); + zalloc_cpumask_var(&tick_broadcast_mask, GFP_NOWAIT); + zalloc_cpumask_var(&tmpmask, GFP_NOWAIT); #ifdef CONFIG_TICK_ONESHOT - alloc_cpumask_var(&tick_broadcast_oneshot_mask, GFP_NOWAIT); - alloc_cpumask_var(&tick_broadcast_pending_mask, GFP_NOWAIT); - alloc_cpumask_var(&tick_broadcast_force_mask, GFP_NOWAIT); + zalloc_cpumask_var(&tick_broadcast_oneshot_mask, GFP_NOWAIT); + zalloc_cpumask_var(&tick_broadcast_pending_mask, GFP_NOWAIT); + zalloc_cpumask_var(&tick_broadcast_force_mask, GFP_NOWAIT); #endif } diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index bc67d4245e1d..f4208138fbf4 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -717,6 +717,7 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) if (unlikely(!cpu_online(cpu))) { if (cpu == tick_do_timer_cpu) tick_do_timer_cpu = TICK_DO_TIMER_NONE; + return false; } if (unlikely(ts->nohz_mode == NOHZ_MODE_INACTIVE)) @@ -1168,7 +1169,7 @@ void tick_cancel_sched_timer(int cpu) hrtimer_cancel(&ts->sched_timer); # endif - ts->nohz_mode = NOHZ_MODE_INACTIVE; + memset(ts, 0, sizeof(*ts)); } #endif diff --git a/kernel/timer.c b/kernel/timer.c index a860bba34412..15ffdb3f1948 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -1539,12 +1539,12 @@ static int __cpuinit init_timers_cpu(int cpu) boot_done = 1; base = &boot_tvec_bases; } + spin_lock_init(&base->lock); tvec_base_done[cpu] = 1; } else { base = per_cpu(tvec_bases, cpu); } - spin_lock_init(&base->lock); for (j = 0; j < TVN_SIZE; j++) { INIT_LIST_HEAD(base->tv5.vec + j); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 4aa9f5bc6b2d..1ae602809efb 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -4311,6 +4311,12 @@ bool current_is_workqueue_rescuer(void) * no synchronization around this function and the test result is * unreliable and only useful as advisory hints or for debugging. * + * If @cpu is WORK_CPU_UNBOUND, the test is performed on the local CPU. + * Note that both per-cpu and unbound workqueues may be associated with + * multiple pool_workqueues which have separate congested states. A + * workqueue being congested on one CPU doesn't mean the workqueue is also + * contested on other CPUs / NUMA nodes. + * * RETURNS: * %true if congested, %false otherwise. */ @@ -4321,6 +4327,9 @@ bool workqueue_congested(int cpu, struct workqueue_struct *wq) rcu_read_lock_sched(); + if (cpu == WORK_CPU_UNBOUND) + cpu = smp_processor_id(); + if (!(wq->flags & WQ_UNBOUND)) pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); else diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c index 8e15d966d9b0..239992021b1d 100644 --- a/net/batman-adv/distributed-arp-table.c +++ b/net/batman-adv/distributed-arp-table.c @@ -837,6 +837,19 @@ bool batadv_dat_snoop_outgoing_arp_request(struct batadv_priv *bat_priv, dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_dst); if (dat_entry) { + /* If the ARP request is destined for a local client the local + * client will answer itself. DAT would only generate a + * duplicate packet. + * + * Moreover, if the soft-interface is enslaved into a bridge, an + * additional DAT answer may trigger kernel warnings about + * a packet coming from the wrong port. + */ + if (batadv_is_my_client(bat_priv, dat_entry->mac_addr)) { + ret = true; + goto out; + } + skb_new = arp_create(ARPOP_REPLY, ETH_P_ARP, ip_src, bat_priv->soft_iface, ip_dst, hw_src, dat_entry->mac_addr, hw_src); diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 3e30a0f1b908..1240f07ad31d 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -163,14 +163,22 @@ void batadv_mesh_free(struct net_device *soft_iface) batadv_vis_quit(bat_priv); batadv_gw_node_purge(bat_priv); - batadv_originator_free(bat_priv); batadv_nc_free(bat_priv); + batadv_dat_free(bat_priv); + batadv_bla_free(bat_priv); + /* Free the TT and the originator tables only after having terminated + * all the other depending components which may use these structures for + * their purposes. + */ batadv_tt_free(bat_priv); - batadv_bla_free(bat_priv); - - batadv_dat_free(bat_priv); + /* Since the originator table clean up routine is accessing the TT + * tables as well, it has to be invoked after the TT tables have been + * freed and marked as empty. This ensures that no cleanup RCU callbacks + * accessing the TT data are scheduled for later execution. + */ + batadv_originator_free(bat_priv); free_percpu(bat_priv->bat_counters); @@ -475,7 +483,7 @@ static int batadv_param_set_ra(const char *val, const struct kernel_param *kp) char *algo_name = (char *)val; size_t name_len = strlen(algo_name); - if (algo_name[name_len - 1] == '\n') + if (name_len > 0 && algo_name[name_len - 1] == '\n') algo_name[name_len - 1] = '\0'; bat_algo_ops = batadv_algo_get(algo_name); diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c index f7c54305a918..e84629ece9b7 100644 --- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -1514,6 +1514,7 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb, struct ethhdr *ethhdr, ethhdr_tmp; uint8_t *orig_dest, ttl, ttvn; unsigned int coding_len; + int err; /* Save headers temporarily */ memcpy(&coded_packet_tmp, skb->data, sizeof(coded_packet_tmp)); @@ -1568,8 +1569,11 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb, coding_len); /* Resize decoded skb if decoded with larger packet */ - if (nc_packet->skb->len > coding_len + h_size) - pskb_trim_rcsum(skb, coding_len + h_size); + if (nc_packet->skb->len > coding_len + h_size) { + err = pskb_trim_rcsum(skb, coding_len + h_size); + if (err) + return NULL; + } /* Create decoded unicast packet */ unicast_packet = (struct batadv_unicast_packet *)skb->data; diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index a3395fdfbd4f..d5953b87918c 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1204,6 +1204,7 @@ void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc, mutex_lock(&osdc->request_mutex); if (req->r_linger) { __unregister_linger_request(osdc, req); + req->r_linger = 0; ceph_osdc_put_request(req); } mutex_unlock(&osdc->request_mutex); @@ -2120,7 +2121,9 @@ int ceph_osdc_start_request(struct ceph_osd_client *osdc, down_read(&osdc->map_sem); mutex_lock(&osdc->request_mutex); __register_request(osdc, req); - WARN_ON(req->r_sent); + req->r_sent = 0; + req->r_got_reply = 0; + req->r_completed = 0; rc = __map_request(osdc, req, 0); if (rc < 0) { if (nofail) { diff --git a/net/core/sock.c b/net/core/sock.c index d4f4cea726e7..6ba327da79e1 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1217,18 +1217,6 @@ static void sock_copy(struct sock *nsk, const struct sock *osk) #endif } -/* - * caches using SLAB_DESTROY_BY_RCU should let .next pointer from nulls nodes - * un-modified. Special care is taken when initializing object to zero. - */ -static inline void sk_prot_clear_nulls(struct sock *sk, int size) -{ - if (offsetof(struct sock, sk_node.next) != 0) - memset(sk, 0, offsetof(struct sock, sk_node.next)); - memset(&sk->sk_node.pprev, 0, - size - offsetof(struct sock, sk_node.pprev)); -} - void sk_prot_clear_portaddr_nulls(struct sock *sk, int size) { unsigned long nulls1, nulls2; diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 147abf5275aa..4bcabf3ab4ca 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -84,7 +84,7 @@ int sysctl_ip_default_ttl __read_mostly = IPDEFTTL; EXPORT_SYMBOL(sysctl_ip_default_ttl); /* Generate a checksum for an outgoing IP datagram. */ -__inline__ void ip_send_check(struct iphdr *iph) +void ip_send_check(struct iphdr *iph) { iph->check = 0; iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index d3ddd8400354..ecd60733e5e2 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1081,6 +1081,7 @@ static int ip6gre_tunnel_ioctl(struct net_device *dev, } if (t == NULL) t = netdev_priv(dev); + memset(&p, 0, sizeof(p)); ip6gre_tnl_parm_to_user(&p, &t->parms); if (copy_to_user(ifr->ifr_ifru.ifru_data, &p, sizeof(p))) err = -EFAULT; @@ -1128,6 +1129,7 @@ static int ip6gre_tunnel_ioctl(struct net_device *dev, if (t) { err = 0; + memset(&p, 0, sizeof(p)); ip6gre_tnl_parm_to_user(&p, &t->parms); if (copy_to_user(ifr->ifr_ifru.ifru_data, &p, sizeof(p))) err = -EFAULT; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 71167069b394..0a17ed9eaf39 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1890,6 +1890,17 @@ void tcp6_proc_exit(struct net *net) } #endif +static void tcp_v6_clear_sk(struct sock *sk, int size) +{ + struct inet_sock *inet = inet_sk(sk); + + /* we do not want to clear pinet6 field, because of RCU lookups */ + sk_prot_clear_nulls(sk, offsetof(struct inet_sock, pinet6)); + + size -= offsetof(struct inet_sock, pinet6) + sizeof(inet->pinet6); + memset(&inet->pinet6 + 1, 0, size); +} + struct proto tcpv6_prot = { .name = "TCPv6", .owner = THIS_MODULE, @@ -1933,6 +1944,7 @@ struct proto tcpv6_prot = { #ifdef CONFIG_MEMCG_KMEM .proto_cgroup = tcp_proto_cgroup, #endif + .clear_sk = tcp_v6_clear_sk, }; static const struct inet6_protocol tcpv6_protocol = { diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index d4defdd44937..42923b14dfa6 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1432,6 +1432,17 @@ void udp6_proc_exit(struct net *net) { } #endif /* CONFIG_PROC_FS */ +void udp_v6_clear_sk(struct sock *sk, int size) +{ + struct inet_sock *inet = inet_sk(sk); + + /* we do not want to clear pinet6 field, because of RCU lookups */ + sk_prot_clear_portaddr_nulls(sk, offsetof(struct inet_sock, pinet6)); + + size -= offsetof(struct inet_sock, pinet6) + sizeof(inet->pinet6); + memset(&inet->pinet6 + 1, 0, size); +} + /* ------------------------------------------------------------------------ */ struct proto udpv6_prot = { @@ -1462,7 +1473,7 @@ struct proto udpv6_prot = { .compat_setsockopt = compat_udpv6_setsockopt, .compat_getsockopt = compat_udpv6_getsockopt, #endif - .clear_sk = sk_prot_clear_portaddr_nulls, + .clear_sk = udp_v6_clear_sk, }; static struct inet_protosw udpv6_protosw = { diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h index d7571046bfc4..4691ed50a928 100644 --- a/net/ipv6/udp_impl.h +++ b/net/ipv6/udp_impl.h @@ -31,6 +31,8 @@ extern int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk, extern int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb); extern void udpv6_destroy_sock(struct sock *sk); +extern void udp_v6_clear_sk(struct sock *sk, int size); + #ifdef CONFIG_PROC_FS extern int udp6_seq_show(struct seq_file *seq, void *v); #endif diff --git a/net/ipv6/udplite.c b/net/ipv6/udplite.c index 1d08e21d9f69..dfcc4be46898 100644 --- a/net/ipv6/udplite.c +++ b/net/ipv6/udplite.c @@ -56,7 +56,7 @@ struct proto udplitev6_prot = { .compat_setsockopt = compat_udpv6_setsockopt, .compat_getsockopt = compat_udpv6_getsockopt, #endif - .clear_sk = sk_prot_clear_portaddr_nulls, + .clear_sk = udp_v6_clear_sk, }; static struct inet_protosw udplite6_protosw = { diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 4ef7bdb65440..23ed03d786c8 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -103,8 +103,10 @@ static int xfrm6_fill_dst(struct xfrm_dst *xdst, struct net_device *dev, dev_hold(dev); xdst->u.rt6.rt6i_idev = in6_dev_get(dev); - if (!xdst->u.rt6.rt6i_idev) + if (!xdst->u.rt6.rt6i_idev) { + dev_put(dev); return -ENODEV; + } rt6_transfer_peer(&xdst->u.rt6, rt); |