summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* cpuidle: Quickly notice prediction failure for repeat modeYouquan Song2012-11-153-5/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prediction for future is difficult and when the cpuidle governor prediction fails and govenor possibly choose the shallower C-state than it should. How to quickly notice and find the failure becomes important for power saving. cpuidle menu governor has a method to predict the repeat pattern if there are 8 C-states residency which are continuous and the same or very close, so it will predict the next C-states residency will keep same residency time. There is a real case that turbostat utility (tools/power/x86/turbostat) at kernel 3.3 or early. turbostat utility will read 10 registers one by one at Sandybridge, so it will generate 10 IPIs to wake up idle CPUs. So cpuidle menu governor will predict it is repeat mode and there is another IPI wake up idle CPU soon, so it keeps idle CPU stay at C1 state even though CPU is totally idle. However, in the turbostat, following 10 registers reading is sleep 5 seconds by default, so the idle CPU will keep at C1 for a long time though it is idle until break event occurs. In a idle Sandybridge system, run "./turbostat -v", we will notice that deep C-state dangles between "70% ~ 99%". After patched the kernel, we will notice deep C-state stays at >99.98%. In the patch, a timer is added when menu governor detects a repeat mode and choose a shallow C-state. The timer is set to a time out value that greater than predicted time, and we conclude repeat mode prediction failure if timer is triggered. When repeat mode happens as expected, the timer is not triggered and CPU waken up from C-states and it will cancel the timer initiatively. When repeat mode does not happen, the timer will be time out and menu governor will quickly notice that the repeat mode prediction fails and then re-evaluates deeper C-states possibility. Below is another case which will clearly show the patch much benefit: #include <stdlib.h> #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/time.h> #include <time.h> #include <pthread.h> volatile int * shutdown; volatile long * count; int delay = 20; int loop = 8; void usage(void) { fprintf(stderr, "Usage: idle_predict [options]\n" " --help -h Print this help\n" " --thread -n Thread number\n" " --loop -l Loop times in shallow Cstate\n" " --delay -t Sleep time (uS)in shallow Cstate\n"); } void *simple_loop() { int idle_num = 1; while (!(*shutdown)) { *count = *count + 1; if (idle_num % loop) usleep(delay); else { /* sleep 1 second */ usleep(1000000); idle_num = 0; } idle_num++; } } static void sighand(int sig) { *shutdown = 1; } int main(int argc, char *argv[]) { sigset_t sigset; int signum = SIGALRM; int i, c, er = 0, thread_num = 8; pthread_t pt[1024]; static char optstr[] = "n:l:t:h:"; while ((c = getopt(argc, argv, optstr)) != EOF) switch (c) { case 'n': thread_num = atoi(optarg); break; case 'l': loop = atoi(optarg); break; case 't': delay = atoi(optarg); break; case 'h': default: usage(); exit(1); } printf("thread=%d,loop=%d,delay=%d\n",thread_num,loop,delay); count = malloc(sizeof(long)); shutdown = malloc(sizeof(int)); *count = 0; *shutdown = 0; sigemptyset(&sigset); sigaddset(&sigset, signum); sigprocmask (SIG_BLOCK, &sigset, NULL); signal(SIGINT, sighand); signal(SIGTERM, sighand); for(i = 0; i < thread_num ; i++) pthread_create(&pt[i], NULL, simple_loop, NULL); for (i = 0; i < thread_num; i++) pthread_join(pt[i], NULL); exit(0); } Get powertop V2 from git://github.com/fenrus75/powertop, build powertop. After build the above test application, then run it. Test plaform can be Intel Sandybridge or other recent platforms. #./idle_predict -l 10 & #./powertop We will find that deep C-state will dangle between 40%~100% and much time spent on C1 state. It is because menu governor wrongly predict that repeat mode is kept, so it will choose the C1 shallow C-state even though it has chance to sleep 1 second in deep C-state. While after patched the kernel, we find that deep C-state will keep >99.6%. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpuidle / sysfs: move kobj initialization in the syfs fileDaniel Lezcano2012-11-152-6/+5
| | | | | | | | Move the kobj initialization and completion in the sysfs.c and encapsulate the code more. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpuidle / sysfs: change function parameterDaniel Lezcano2012-11-153-16/+8
| | | | | | | | | | | | | | The function needs the cpuidle_device which is initially passed to the caller. The current code gets the struct device from the struct cpuidle_device, pass it the cpuidle_add_sysfs function. This function calls per_cpu(cpuidle_devices, cpu) to get the cpuidle_device. This patch pass the cpuidle_device instead and simplify the code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Linux 3.7-rc5v3.7-rc5Linus Torvalds2012-11-111-1/+1
|
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2012-11-1025-66/+131
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "Bug fixes galore, mostly in drivers as is often the case: 1) USB gadget and cdc_eem drivers need adjustments to their frame size lengths in order to handle VLANs correctly. From Ian Coolidge. 2) TIPC and several network drivers erroneously call tasklet_disable before tasklet_kill, fix from Xiaotian Feng. 3) r8169 driver needs to apply the WOL suspend quirk to more chipsets, fix from Cyril Brulebois. 4) Fix multicast filters on RTL_GIGA_MAC_VER_35 r8169 chips, from Nathan Walp. 5) FDB netlink dumps should use RTM_NEWNEIGH as the message type, not zero. From John Fastabend. 6) Fix smsc95xx tx checksum offload on big-endian, from Steve Glendinning. 7) __inet_diag_dump() needs to repsect and report the error value returned from inet_diag_lock_handler() rather than ignore it. Otherwise if an inet diag handler is not available for a particular protocol, we essentially report success instead of giving an error indication. Fix from Cyrill Gorcunov. 8) When the QFQ packet scheduler sees TSO/GSO packets it does not handle things properly, and in fact ends up corrupting it's datastructures as well as mis-schedule packets. Fix from Paolo Valente. 9) Fix oopser in skb_loop_sk(), from Eric Leblond. 10) CXGB4 passes partially uninitialized datastructures in to FW commands, fix from Vipul Pandya. 11) When we send unsolicited ipv6 neighbour advertisements, we should send them to the link-local allnodes multicast address, as per RFC4861. Fix from Hannes Frederic Sowa. 12) There is some kind of bug in the usbnet's kevent deferral mechanism, but more immediately when it triggers an uncontrolled stream of kernel messages spam the log. Rate limit the error log message triggered when this problem occurs, as sending thousands of error messages into the kernel log doesn't help matters at all, and in fact makes further diagnosis more difficult. From Steve Glendinning. 13) Fix gianfar restore from hibernation, from Wang Dongsheng. 14) The netlink message attribute sizes are wrong in the ipv6 GRE driver, it was using the size of ipv4 addresses instead of ipv6 ones :-) Fix from Nicolas Dichtel." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: gre6: fix rtnl dump messages gianfar: ethernet vanishes after restoring from hibernation usbnet: ratelimit kevent may have been dropped warnings ipv6: send unsolicited neighbour advertisements to all-nodes net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs usb: gadget: g_ether: fix frame size check for 802.1Q cxgb4: Fix initialization of SGE_CONTROL register isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICES cxgb4: Initialize data structures before using. af-packet: fix oops when socket is not present pkt_sched: enable QFQ to support TSO/GSO net: inet_diag -- Return error code if protocol handler is missed net: bnx2x: Fix typo in bnx2x driver smsc95xx: fix tx checksum offload for big endian rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump ptp: update adjfreq callback description r8169: allow multicast packets on sub-8168f chipset. r8169: Fix WoL on RTL8168d/8111d. drivers/net: use tasklet_kill in device remove/close process tipc: do not use tasklet_disable before tasklet_kill
| * gre6: fix rtnl dump messagesNicolas Dichtel2012-11-091-4/+4
| | | | | | | | | | | | | | | | | | | | Spotted after a code review. Introduced by c12b395a46646bab69089ce7016ac78177f6001f (gre: Support GRE over IPv6). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * gianfar: ethernet vanishes after restoring from hibernationWang Dongsheng2012-11-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a gianfar ethernet device is down prior to hibernating a system, it will no longer be present upon system restore. For example: ~# ifconfig eth0 down ~# echo disk > /sys/power/state <trigger a restore from hibernation> ~# ifconfig eth0 up SIOCSIFFLAGS: No such device This happens because the restore function bails out early upon finding devices that were not up at hibernation. In doing so, it never gets to the netif_device_attach call at the end of the restore function. Adding the netif_device_attach as done here also makes the gfar_restore code consistent with what is done in the gfar_resume code. Cc: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * usbnet: ratelimit kevent may have been dropped warningsSteve Glendinning2012-11-091-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when something goes wrong, a flood of these messages can be generated by usbnet (thousands per second). This doesn't generally *help* the condition so this patch ratelimits the rate of their generation. There's an underlying problem in usbnet's kevent deferral mechanism which needs fixing, specifically that events *can* get dropped and not handled. This patch doesn't address this, but just mitigates fallout caused by the current implemention. Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: send unsolicited neighbour advertisements to all-nodesHannes Frederic Sowa2012-11-091-2/+1
| | | | | | | | | | | | | | | | | | As documented in RFC4861 (Neighbor Discovery for IP version 6) 7.2.6., unsolicited neighbour advertisements should be sent to the all-nodes multicast address. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANsIan Coolidge2012-11-081-1/+2
| | | | | | | | | | | | | | | | | | cdc_eem frames might need to contain 802.1Q VLAN Ethernet frames. URB/skb sizing from usbnet will default to the hard_mtu, so account for the VLAN header by expanding that via hard_header_len Signed-off-by: Ian Coolidge <iancoolidge@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * usb: gadget: g_ether: fix frame size check for 802.1QIan Coolidge2012-11-081-1/+2
| | | | | | | | | | | | | | | | | | Checking skb->len against ETH_FRAME_LEN assumes a 1514 ethernet frame size. With an 802.1Q VLAN header, ethernet frame length can now be 1518. Validate frame length against that. Signed-off-by: Ian Coolidge <iancoolidge@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Fix initialization of SGE_CONTROL registerVipul Pandya2012-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | INGPADBOUNDARY_MASK is already shifted. No need to shift it again. On reloading a driver it was resulting in a bad SGE FL MTU sizes [1536, 9088] error. This only causes an issue on systems that have L1 cache size of 32B, 128B, 512B, 2048B or 4096B. Signed-off-by: Jay Hernandez <jay@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICESLee Jones2012-11-083-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It doesn't make much sense to enable ISDN services if you don't intend to connect to a network. Therefore insisting that ISDN depends on NETDEVICES seems logical. We can then remove any guards mentioning NETDEVICES inside all subordinate drivers. This also has the nice side-effect of fixing the warning below when ISDN_I4L && !CONFIG_NETDEVICES at compile time. This patch fixes: drivers/isdn/i4l/isdn_common.c: In function ‘isdn_ioctl’: drivers/isdn/i4l/isdn_common.c:1278:8: warning: unused variable ‘s’ [-Wunused-variable] Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: Initialize data structures before using.Vipul Pandya2012-11-071-0/+4
| | | | | | | | | | | | | | | | | | We should not assume reserve fields to be don't cares as fields may change. Clearing data structures before using. Signed-off-by: Jay Hernandez <jay@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * af-packet: fix oops when socket is not presentEric Leblond2012-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to a NULL dereference, the following patch is causing oops in normal trafic condition: commit c0de08d04215031d68fa13af36f347a6cfa252ca Author: Eric Leblond <eric@regit.org> Date:   Thu Aug 16 22:02:58 2012 +0000     af_packet: don't emit packet on orig fanout group This buggy patch was a feature fix and has reached most stable branches. When skb->sk is NULL and when packet fanout is used, there is a crash in match_fanout_group where skb->sk is accessed. This patch fixes the issue by returning false as soon as the socket is NULL: this correspond to the wanted behavior because the kernel as to resend the skb to all the listening socket in this case. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * pkt_sched: enable QFQ to support TSO/GSOPaolo Valente2012-11-071-30/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the max packet size for some class (configured through tc) is violated by the actual size of the packets of that class, then QFQ would not schedule classes correctly, and the data structures implementing the bucket lists may get corrupted. This problem occurs with TSO/GSO even if the max packet size is set to the MTU, and is, e.g., the cause of the failure reported in [1]. Two patches have been proposed to solve this problem in [2], one of them is a preliminary version of this patch. This patch addresses the above issues by: 1) setting QFQ parameters to proper values for supporting TSO/GSO (in particular, setting the maximum possible packet size to 64KB), 2) automatically increasing the max packet size for a class, lmax, when a packet with a larger size than the current value of lmax arrives. The drawback of the first point is that the maximum weight for a class is now limited to 4096, which is equal to 1/16 of the maximum weight sum. Finally, this patch also forcibly caps the timestamps of a class if they are too high to be stored in the bucket list. This capping, taken from QFQ+ [3], handles the unfrequent case described in the comment to the function slot_insert. [1] http://marc.info/?l=linux-netdev&m=134968777902077&w=2 [2] http://marc.info/?l=linux-netdev&m=135096573507936&w=2 [3] http://marc.info/?l=linux-netdev&m=134902691421670&w=2 Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Tested-by: Cong Wang <amwang@redhat.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: inet_diag -- Return error code if protocol handler is missedCyrill Gorcunov2012-11-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've observed that in case if UDP diag module is not supported in kernel the netlink returns NLMSG_DONE without notifying a caller that handler is missed. This patch makes __inet_diag_dump to return error code instead. So as example it become possible to detect such situation and handle it gracefully on userspace level. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: David Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Pavel Emelyanov <xemul@parallels.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bnx2x: Fix typo in bnx2x driverMasanari Iida2012-11-032-3/+3
| | | | | | | | | | | | | | Correct spelling typo in bnx2x driver Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * smsc95xx: fix tx checksum offload for big endianSteve Glendinning2012-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | f7b2927 introduced tx checksum offload support for smsc95xx, and enabled it by default. This feature doesn't take endianness into account, so causes most tx to fail on those platforms. This patch fixes the problem fully by adding the missing conversion. An alternate workaround is to disable TX checksum offload on those platforms. The cpu impact of this feature is very low. Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dumpJohn Fastabend2012-11-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the dflt fdb dump handler to use RTM_NEWNEIGH to be compatible with bridge dump routines. The dump reply from the network driver handlers should match the reply from bridge handler. The fact they were not in the ixgbe case was effectively a bug. This patch resolves it. Applications that were not checking the nlmsg type will continue to work. And now applications that do check the type will work as expected. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ptp: update adjfreq callback descriptionJacob Keller2012-11-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates the adjfreq callback description to include a note that the delta in ppb is always relative to the base frequency, and not to the current frequency of the hardware clock. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> CC: stable@vger.kernel.org [v3.5+] CC: Richard Cochran <richard.cochran@gmail.com> CC: John Stultz <john.stultz@linaro.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8169: allow multicast packets on sub-8168f chipset.Nathan Walp2012-11-031-0/+3
| | | | | | | | | | | | | | | | | | RTL_GIGA_MAC_VER_35 includes no multicast hardware filter. Signed-off-by: Nathan Walp <faceprint@faceprint.com> Suggested-by: Hayes Wang <hayeswang@realtek.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8169: Fix WoL on RTL8168d/8111d.Cyril Brulebois2012-11-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This regression was spotted between Debian squeeze and Debian wheezy kernels (respectively based on 2.6.32 and 3.2). More info about Wake-on-LAN issues with Realtek's 816x chipsets can be found in the following thread: http://marc.info/?t=132079219400004 Probable regression from d4ed95d796e5126bba51466dc07e287cebc8bd19; more chipsets are likely affected. Tested on top of a 3.2.23 kernel. Reported-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Tested-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Hinted-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Cyril Brulebois <kibi@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * drivers/net: use tasklet_kill in device remove/close processXiaotian Feng2012-11-035-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Some driver uses tasklet_disable in device remove/close process, tasklet_disable will inc tasklet->count and return. If the tasklet is not handled yet because some softirq pressure, the tasklet will placed on the tasklet_vec, never have a chance to excute. This might lead to ksoftirqd heavy loaded, wakeup with pending_softirq, but tasklet is disabled. tasklet_kill should be used in this case. Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: do not use tasklet_disable before tasklet_killXiaotian Feng2012-11-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | If tasklet_disable() is called before related tasklet handled, tasklet_kill will never be finished. tasklet_kill is enough. Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Allan Stephens <allan.stephens@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: tipc-discussion@lists.sourceforge.net Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds2012-11-1038-97/+291
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull sparc fixes from David Miller: "Several build/bug fixes for sparc, including: 1) Configuring a mix of static vs. modular sparc64 crypto modules didn't work, remove an ill-conceived attempt to only have to build the device match table for these drivers once to fix the problem. Reported by Meelis Roos. 2) Make the montgomery multiple/square and mpmul instructions actually usable in 32-bit tasks. Essentially this involves providing 32-bit userspace with a way to use a 64-bit stack when it needs to. 3) Our sparc64 atomic backoffs don't yield cpu strands properly on Niagara chips. Use pause instruction when available to achieve this, otherwise use a benign instruction we know blocks the strand for some time. 4) Wire up kcmp 5) Fix the build of various drivers by removing the unnecessary blocking of OF_GPIO when SPARC. 6) Fix unintended regression wherein of_address_to_resource stopped being provided. Fix from Andreas Larsson. 7) Fix NULL dereference in leon_handle_ext_irq(), also from Andreas Larsson." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix build with mix of modular vs. non-modular crypto drivers. sparc: Support atomic64_dec_if_positive properly. of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again sparc32, leon: Check for existent irq_map entry in leon_handle_ext_irq sparc: Add sparc support for platform_get_irq() sparc: Allow OF_GPIO on sparc. qlogicpti: Fix build warning. sparc: Wire up sys_kcmp. sparc64: Improvde documentation and readability of atomic backoff code. sparc64: Use pause instruction when available. sparc64: Fix cpu strand yielding. sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.
| * | sparc64: Fix build with mix of modular vs. non-modular crypto drivers.David S. Miller2012-11-109-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We tried linking in a single built object to hold the device table, but only works if all of the sparc64 crypto modules get built the same way (modular vs. non-modular). Just include the device ID stub into each driver source file so that the table gets compiled into the correct result in all cases. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc: Support atomic64_dec_if_positive properly.David S. Miller2012-11-104-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sparc32 already supported it, as a consequence of using the generic atomic64 implementation. And the sparc64 implementation is rather trivial. This allows us to set ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE for all of sparc, and avoid the annoying warning from lib/atomic64_test.c Signed-off-by: David S. Miller <davem@davemloft.net>
| * | of/address: sparc: Declare of_address_to_resource() as an extern function ↵Andreas Larsson2012-11-102-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for sparc again This bug-fix makes sure that of_address_to_resource is defined extern for sparc so that the sparc-specific implementation of of_address_to_resource() is once again used when including include/linux/of_address.h in a sparc context. A number of drivers in mainline relies on this function working for sparc. The bug was introduced in a850a7554442f08d3e910c6eeb4ee216868dda1e, "of/address: add empty static inlines for !CONFIG_OF". Contrary to that commit title, the static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never defined for sparc. This is good behavior for the other functions in include/linux/of_address.h, as the extern functions defined in drivers/of/address.c only gets linked when OF_ADDRESS is configured. However, for of_address_to_resource there exists a sparc-specific implementation in arch/sparc/arch/sparc/kernel/of_device_common.c Solution suggested by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc32, leon: Check for existent irq_map entry in leon_handle_ext_irqAndreas Larsson2012-11-101-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an irq is being unlinked concurrently with leon_handle_ext_irq, irq_map[eirq] might be null in leon_handle_ext_irq. Make sure that this is not dereferenced. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc: Add sparc support for platform_get_irq()Andreas Larsson2012-11-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds sparc support for platform_get_irq that in the normal case use platform_get_resource() to get an irq. This standard approach fails for sparc as there are no resources of type IORESOURCE_IRQ for irqs for sparc. Cross platform drivers can then use this standard platform function and work on sparc instead of having to have a special case for sparc. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc: Allow OF_GPIO on sparc.David S. Miller2012-11-071-1/+1
| | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qlogicpti: Fix build warning.David S. Miller2012-10-281-12/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The build warns: drivers/scsi/qlogicpti.c: In function 'qpti_sbus_probe': drivers/scsi/qlogicpti.c:1316:45: warning: passing argument 1 of 'scsi_host_alloc' discards 'const' qualifier from pointer target type [enabled by default] include/scsi/scsi_host.h:778:26: note: expected 'struct scsi_host_template *' but argument is of type 'const struct scsi_host_template *' The problem is that of_device_id->data is a const void pointer. This is pretty silly in this specific instance, because for all matched device IDs we set match->data to the same value, &qpti_template. So just use that directly instead of the unnecessary and improperly typed abstraction. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc: Wire up sys_kcmp.David S. Miller2012-10-283-3/+5
| | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Improvde documentation and readability of atomic backoff code.David S. Miller2012-10-285-10/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Document what's going on in asm/backoff.h with a large and descriptive comment. Refer to it above the cpu_relax() definition in asm/processor_64.h Rename the pause patching section to have "3insn" in it's name like the other patching sections do. Based upon feedback from Sam Ravnborg. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Use pause instruction when available.David S. Miller2012-10-285-16/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In atomic backoff and cpu_relax(), use the pause instruction found on SPARC-T4 and later. It makes the cpu strand unselectable for the given number of cycles, unless an intervening disrupting trap occurs. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Fix cpu strand yielding.David S. Miller2012-10-282-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For atomic backoff, we just loop over an exponentially backed off counter. This is extremely ineffective as it doesn't actually yield the cpu strand so that other competing strands can use the cpu core. In cpus previous to SPARC-T4 we have to do this in a slightly hackish way, by doing an operation with no side effects that also happens to mark the strand as unavailable. The mechanism we choose for this is three reads of the %ccr (condition-code) register into %g0 (the zero register). SPARC-T4 has an explicit "pause" instruction, and we'll make use of that in a subsequent commit. Yield strands also in cpu_relax(). We really should have done this a very long time ago. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.David S. Miller2012-10-2713-61/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Montgomery Multiply, Montgomery Square, and Multiple-Precision Multiply instructions work by loading a combination of the floating point and multiple register windows worth of integer registers with the inputs. These values are 64-bit. But for 32-bit userland processes we only save the low 32-bits of each integer register during a register spill. This is because the register window save area is in the user stack and has a fixed layout. Therefore, the only way to use these instruction in 32-bit mode is to perform the following sequence: 1) Load the top-32bits of a choosen integer register with a sentinel, say "-1". This will be in the outer-most register window. The idea is that we're trying to see if the outer-most register window gets spilled, and thus the 64-bit values were truncated. 2) Load all the inputs for the montmul/montsqr/mpmul instruction, down to the inner-most register window. 3) Execute the opcode. 4) Traverse back up to the outer-most register window. 5) Check the sentinel, if it's still "-1" store the results. Otherwise retry the entire sequence. This retry is extremely troublesome. If you're just unlucky and an interrupt or other trap happens, it'll push that outer-most window to the stack and clear the sentinel when we restore it. We could retry forever and never make forward progress if interrupts arrive at a fast enough rate (consider perf events as one example). So we have do limited retries and fallback to software which is extremely non-deterministic. Luckily it's very straightforward to provide a mechanism to let 32-bit applications use a 64-bit stack. Stacks in 64-bit mode are biased by 2047 bytes, which means that the lowest bit is set in the actual %sp register value. So if we see bit zero set in a 32-bit application's stack we treat it like a 64-bit stack. Runtime detection of such a facility is tricky, and cumbersome at best. For example, just trying to use a biased stack and seeing if it works is hard to recover from (the signal handler will need to use an alt stack, plus something along the lines of longjmp). Therefore, we add a system call to report a bitmask of arch specific features like this in a cheap and less hairy way. With help from Andy Polyakov. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2012-11-102-30/+30
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Jeff Layton. * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not lookup hashed negative dentry in cifs_atomic_open cifs: fix potential buffer overrun in cifs.idmap handling code
| * | | cifs: Do not lookup hashed negative dentry in cifs_atomic_openSachin Prabhu2012-11-051-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not need to lookup a hashed negative directory since we have already revalidated it before and have found it to be fine. This also prevents a crash in cifs_lookup() when it attempts to rehash the already hashed negative lookup dentry. The patch has been tested using the reproducer at https://bugzilla.redhat.com/show_bug.cgi?id=867344#c28 Cc: <stable@kernel.org> # 3.6.x Reported-by: Vit Zahradka <vit.zahradka@tiscali.cz> Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
| * | | cifs: fix potential buffer overrun in cifs.idmap handling codeJeff Layton2012-11-031-29/+20
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The userspace cifs.idmap program generally works with the wbclient libs to generate binary SIDs in userspace. That program defines the struct that holds these values as having a max of 15 subauthorities. The kernel idmapping code however limits that value to 5. When the kernel copies those values around though, it doesn't sanity check the num_subauths value handed back from userspace or from the server. It's possible therefore for userspace to hand us back a bogus num_subauths value (or one that's valid, but greater than 5) that could cause the kernel to walk off the end of the cifs_sid->sub_auths array. Fix this by defining a new routine for copying sids and using that in all of the places that copy it. If we end up with a sid that's longer than expected then this approach will just lop off the "extra" subauths, but that's basically what the code does today already. Better approaches might be to fix this code to reject SIDs with >5 subauths, or fix it to handle the subauths array dynamically. At the same time, change the kernel to check the length of the data returned by userspace. If it's shorter than struct cifs_sid, reject it and return -EIO. If that happens we'll end up with fields that are basically uninitialized. Long term, it might make sense to redefine cifs_sid using a flexarray at the end, to allow for variable-length subauth lists, and teach the code to handle the case where the subauths array being passed in from userspace is shorter than 5 elements. Note too, that I don't consider this a security issue since you'd need a compromised cifs.idmap program. If you have that, you can do all sorts of nefarious stuff. Still, this is probably reasonable for stable. Cc: stable@kernel.org Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>
* | | Merge tag 'arm64-fixes' of ↵Linus Torvalds2012-11-1011-47/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 fixes from Catalin Marinas: - correct argument type (pgprot_t) when calling __ioremap() - PCI_IOBASE virtual address change - use architected event for CPU cycle counter - fix ELF core dumping - select CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION - missing completion for secondary CPU boot - booting on systems with all memory beyond 4GB * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: arm64: mm: fix booting on systems with no memory below 4GB arm64: smp: add missing completion for secondary boot arm64: compat: select CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION arm64: elf: fix core dumping definitions for GP and FP registers arm64: perf: use architected event for CPU cycle counter arm64: Move PCI_IOBASE closer to MODULES_VADDR arm64: Use pgprot_t as the last argument when invoking __ioremap()
| * | | arm64: mm: fix booting on systems with no memory below 4GBWill Deacon2012-11-082-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Booting on a system with all of its memory above the 4GB boundary breaks for two reasons: (1) We still try to create a non-empty DMA32 zone (2) no-bootmem limits allocations to 0xffffffff This patch fixes these issues for ARM64. Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | arm64: smp: add missing completion for secondary bootWill Deacon2012-11-081-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 149c24151e85 ("ARM: SMP: use a timing out completion for cpu hotplug") modified arm's CPU up path to use completions. It seems that we only got half of this patch for arm64, so add the missing call to complete. Reported-by: Jon Brawn <jon.brawn@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | arm64: compat: select CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSIONWill Deacon2012-11-082-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c1d7e01d7877 ("ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION") replaced the __ARCH_WANT_COMPAT_IPC_PARSE_VERSION token with a corresponding Kconfig option instead. This patch updates arm64 to use the latter, rather than #define an unused token. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | arm64: elf: fix core dumping definitions for GP and FP registersWill Deacon2012-11-083-25/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct user_fp does not exist for arm64, so use struct user_fpsimd_state instead for the ELF core dumping definitions. Furthermore, since we use regset-based core dumping, we do not need definitions for dump_task_regs and dump_fpu. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | arm64: perf: use architected event for CPU cycle counterWill Deacon2012-11-081-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently use a fake event encoding (0xFF) to indicate CPU cycles so that we don't waste an event counter and can target the hardware cycle counter instead. The problem with this approach is that the event space defined by the architecture permits an implementation to allocate 0xFF for some other event. This patch uses the architected cycle counter encoding (0x11) so that we avoid potentially clashing with event encodings on future CPU implementations. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | Merge tag 'v3.7-rc4' into upstream-masterCatalin Marinas2012-11-05626-5016/+7780
| |\ \ \ | | | | | | | | | | | | | | | Linux 3.7-rc4
| * | | | arm64: Move PCI_IOBASE closer to MODULES_VADDRCatalin Marinas2012-10-232-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is to reuse the same pmd table that is sparsely populated with the modules space. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | arm64: Use pgprot_t as the last argument when invoking __ioremap()Catalin Marinas2012-10-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if it works with since the types have the same size, the correct type of the last __ioremap() argument is pgprot_t rather than pteval_t. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>