linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	sched: Try not to migrate higher priority RT tasks	Steven Rostedt	2010-09-21	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When first working on the RT scheduler design, we concentrated on keeping all CPUs running RT tasks instead of having multiple RT tasks on a single CPU waiting for the migration thread to move them. Instead we take a more proactive stance and push or pull RT tasks from one CPU to another on wakeup or scheduling. When an RT task wakes up on a CPU that is running another RT task, instead of preempting it and killing the cache of the running RT task, we look to see if we can migrate the RT task that is waking up, even if the RT task waking up is of higher priority. This may sound a bit odd, but RT tasks should be limited in migration by the user anyway. But in practice, people do not do this, which causes high prio RT tasks to bounce around the CPUs. This becomes even worse when we have priority inheritance, because a high prio task can block on a lower prio task and boost its priority. When the lower prio task wakes up the high prio task, if it happens to be on the same CPU it will migrate off of it. But in reality, the above does not happen much either, because the wake up of the lower prio task, which has already been boosted, if it was on the same CPU as the higher prio task, it would then migrate off of it. But anyway, we do not want to migrate them either. To examine the scheduling, I created a test program and examined it under kernelshark. The test program created CPU * 2 threads, where each thread had a different priority. The program takes different options. The options used in this change log was to have priority inheritance mutexes or not. All threads did the following loop: static void grab_lock(long id, int iter, int l) { ftrace_write("thread %ld iter %d, taking lock %d\n", id, iter, l); pthread_mutex_lock(&locks[l]); ftrace_write("thread %ld iter %d, took lock %d\n", id, iter, l); busy_loop(nr_tasks - id); ftrace_write("thread %ld iter %d, unlock lock %d\n", id, iter, l); pthread_mutex_unlock(&locks[l]); } void start_task(void id) { [...] while (!done) { for (l = 0; l < nr_locks; l++) { grab_lock(id, i, l); ftrace_write("thread %ld iter %d sleeping\n", id, i); ms_sleep(id); } i++; } [...] } The busy_loop(ms) keeps the CPU spinning for ms milliseconds. The ms_sleep(ms) sleeps for ms milliseconds. The ftrace_write() writes to the ftrace buffer to help analyze via ftrace. The higher the id, the higher the prio, the shorter it does the busy loop, but the longer it spins. This is usually the case with RT tasks, the lower priority tasks usually run longer than higher priority tasks. At the end of the test, it records the number of loops each thread took, as well as the number of voluntary preemptions, non-voluntary preemptions, and number of migrations each thread took, taking the information from /proc/$$/sched and /proc/$$/status. Running this on a 4 CPU processor, the results without changes to the kernel looked like this: Task vol nonvol migrated iterations ---- --- ------ -------- ---------- 0: 53 3220 1470 98 1: 562 773 724 98 2: 752 933 1375 98 3: 749 39 697 98 4: 758 5 515 98 5: 764 2 679 99 6: 761 2 535 99 7: 757 3 346 99 total: 5156 4977 6341 787 Each thread regardless of priority migrated a few hundred times. The higher priority tasks, were a little better but still took quite an impact. By letting higher priority tasks bump the lower prio task from the CPU, things changed a bit: Task vol nonvol migrated iterations ---- --- ------ -------- ---------- 0: 37 2835 1937 98 1: 666 1821 1865 98 2: 654 1003 1385 98 3: 664 635 973 99 4: 698 197 352 99 5: 703 101 159 99 6: 708 1 75 99 7: 713 1 2 99 total: 4843 6594 6748 789 The total # of migrations did not change (several runs showed the difference all within the noise). But we now see a dramatic improvement to the higher priority tasks. (kernelshark showed that the watchdog timer bumped the highest priority task to give it the 2 count. This was actually consistent with every run). Notice that the # of iterations did not change either. The above was with priority inheritance mutexes. That is, when the higher prority task blocked on a lower priority task, the lower priority task would inherit the higher priority task (which shows why task 6 was bumped so many times). When not using priority inheritance mutexes, the current kernel shows this: Task vol nonvol migrated iterations ---- --- ------ -------- ---------- 0: 56 3101 1892 95 1: 594 713 937 95 2: 625 188 618 95 3: 628 4 491 96 4: 640 7 468 96 5: 631 2 501 96 6: 641 1 466 96 7: 643 2 497 96 total: 4458 4018 5870 765 Not much changed with or without priority inheritance mutexes. But if we let the high priority task bump lower priority tasks on wakeup we see: Task vol nonvol migrated iterations ---- --- ------ -------- ---------- 0: 115 3439 2782 98 1: 633 1354 1583 99 2: 652 919 1218 99 3: 645 713 934 99 4: 690 3 3 99 5: 694 1 4 99 6: 720 3 4 99 7: 747 0 1 100 Which shows a even bigger change. The big difference between task 3 and task 4 is because we have only 4 CPUs on the machine, causing the 4 highest prio tasks to always have preference. Although I did not measure cache misses, and I'm sure there would be little to measure since the test was not data intensive, I could imagine large improvements for higher priority tasks when dealing with lower priority tasks. Thus, I'm satisfied with making the change and agreeing with what Gregory Haskins argued a few years ago when we first had this discussion. One final note. All tasks in the above tests were RT tasks. Any RT task will always preempt a non RT task that is running on the CPU the RT task wants to run on. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Gregory Haskins <ghaskins@novell.com> LKML-Reference: <20100921024138.605460343@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	sched: Increment cache_nice_tries only on periodic lb	Venkatesh Pallipadi	2010-09-21	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	scheduler uses cache_nice_tries as an indicator to do cache_hot and active load balance, when normal load balance fails. Currently, this value is changed on any failed load balance attempt. That ends up being not so nice to workloads that enter/exit idle often, as they do more frequent new_idle balance and that pretty soon results in cache hot tasks being pulled in. Making the cache_nice_tries ignore failed new_idle balance seems to make better sense. With that only the failed load balance in periodic load balance gets accounted and the rate of accumulation of cache_nice_tries will not depend on idle entry/exit (short running sleep-wakeup kind of tasks). This reduces movement of cache_hot tasks. schedstat diff (after-before) excerpt from a workload that has frequent and short wakeup-idle pattern (:2 in cpu col below refers to NEWIDLE idx) This snapshot was across ~400 seconds. Without this change: domainstats: domain0 cpu cnt bln fld imb gain hgain nobusyq nobusyg 0:2 306487 219575 73167 110069413 44583 19070 1172 218403 1:2 292139 194853 81421 120893383 50745 21902 1259 193594 2:2 283166 174607 91359 129699642 54931 23688 1287 173320 3:2 273998 161788 93991 132757146 57122 24351 1366 160422 4:2 289851 215692 62190 83398383 36377 13680 851 214841 5:2 316312 222146 77605 117582154 49948 20281 988 221158 6:2 297172 195596 83623 122133390 52801 21301 929 194667 7:2 283391 178078 86378 126622761 55122 22239 928 177150 8:2 297655 210359 72995 110246694 45798 19777 1125 209234 9:2 297357 202011 79363 119753474 50953 22088 1089 200922 10:2 278797 178703 83180 122514385 52969 22726 1128 177575 11:2 272661 167669 86978 127342327 55857 24342 1195 166474 12:2 293039 204031 73211 110282059 47285 19651 948 203083 13:2 289502 196762 76803 114712942 49339 20547 1016 195746 14:2 264446 169609 78292 115715605 50459 21017 982 168627 15:2 260968 163660 80142 116811793 51483 21281 1064 162596 With this change: domainstats: domain0 cpu cnt bln fld imb gain hgain nobusyq nobusyg 0:2 272347 187380 77455 105420270 24975 1 953 186427 1:2 267276 172360 86234 116242264 28087 6 1028 171332 2:2 259769 156777 93281 123243134 30555 1 1043 155734 3:2 250870 143129 97627 127370868 32026 6 1188 141941 4:2 248422 177116 64096 78261112 22202 2 757 176359 5:2 275595 180683 84950 116075022 29400 6 778 179905 6:2 262418 162609 88944 119256898 31056 4 817 161792 7:2 252204 147946 92646 122388300 32879 4 824 147122 8:2 262335 172239 81631 110477214 26599 4 864 171375 9:2 261563 164775 88016 117203621 28331 3 849 163926 10:2 243389 140949 93379 121353071 29585 2 909 140040 11:2 242795 134651 98310 124768957 30895 2 1016 133635 12:2 255234 166622 79843 104696912 26483 4 746 165876 13:2 244944 151595 83855 109808099 27787 3 801 150794 14:2 241301 140982 89935 116954383 30403 6 845 140137 15:2 232271 128564 92821 119185207 31207 4 1416 127148 Signed-off-by: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1284167957-3675-1-git-send-email-venki@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	Merge commit 'v2.6.36-rc5' into sched/core	Ingo Molnar	2010-09-21	622	-3977/+7024
\|\ \| \| \| \| \| \| \| \| \| \|	Merge reason: Pick up the latest fixes in -rc5. Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| *	Linux 2.6.36-rc5v2.6.36-rc5	Linus Torvalds	2010-09-21	1	-1/+1
\| \|
\| *	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6	Linus Torvalds	2010-09-21	3	-22/+10
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: Staging: vt6655: fix buffer overflow Revert: "Staging: batman-adv: Adding netfilter-bridge hooks"
\| \| *	Staging: vt6655: fix buffer overflow	Dan Carpenter	2010-09-21	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"param->u.wpa_associate.wpa_ie_len" comes from the user. We should check it so that the copy_from_user() doesn't overflow the buffer. Also further down in the function, we assume that if "param->u.wpa_associate.wpa_ie_len" is set then "abyWPAIE[0]" is initialized. To make that work, I changed the test here to say that if "wpa_ie_len" is set then "wpa_ie" has to be a valid pointer or we return -EINVAL. Oddly, we only use the first element of the abyWPAIE[] array. So I suspect there may be some other issues in this function. Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| *	Revert: "Staging: batman-adv: Adding netfilter-bridge hooks"	Sven Eckelmann	2010-09-21	2	-19/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 96d592ed599434d2d5f339a1d282871bc6377d2c. The netfilter hook seems to be misused and may leak skbs in situations when NF_HOOK returns NF_STOLEN. It may not filter everything as expected. Also the ethernet bridge tables are not yet capable to understand batman-adv packet correctly. It was only added for testing purposes and can be removed again. Reported-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| * \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6	Linus Torvalds	2010-09-21	6	-32/+66
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: musb: MAINTAINERS: Fix my mail address USB: serial/mos*: prevent reading uninitialized stack memory USB: otg: twl4030: fix phy initialization(v1) USB: EHCI: Disable langwell/penwell LPM capability usb: musb_debugfs: don't use the struct file private_data field with seq_files
\| \| * \|	USB: musb: MAINTAINERS: Fix my mail address	Felipe Balbi	2010-09-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we don't, contributors to musb and any USB OMAP code will be sending mails to an unexistent inbox. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| * \|	USB: serial/mos*: prevent reading uninitialized stack memory	Dan Rosenberg	2010-09-21	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The TIOCGICOUNT device ioctl in both mos7720.c and mos7840.c allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| * \|	USB: otg: twl4030: fix phy initialization(v1)	Ming Lei	2010-09-21	1	-27/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 461c317705eca5cac09a360f488715927fd0a927(into 2.6.36-v3) is put forward to power down phy if no usb cable is connected, but does introduce the two issues below: 1), phy is not into work state if usb cable is connected with PC during poweron, so musb device mode is not usable in such case, follows the reasons: -twl4030_phy_resume is not called, so regulators are not enabled i2c access are not enabled usb mode not configurated 2), The kernel warings[1] of regulators 'unbalanced disables' is caused if poweron without usb cable connected with PC or b-device. This patch fixes the two issues above: -power down phy only if no usb cable is connected with PC and b-device -do phy initialization(via __twl4030_phy_resume) if usb cable is connected with PC(vbus event) or another b-device(ID event) in twl4030_usb_probe. This patch also doesn't put VUSB3V1 LDO into active mode in twl4030_usb_ldo_init until VBUS/ID change detected, so we can save more power consumption than before. This patch is verified OK on Beagle board either connected with usb cable or not when poweron. [1]. warnings of 'unbalanced disables' of regulators. [root@OMAP3EVM /]# dmesg ------------[ cut here ]------------ WARNING: at drivers/regulator/core.c:1357 _regulator_disable+0x38/0x128() unbalanced disables for VUSB1V8 Modules linked in: Backtrace: [<c0030c48>] (dump_backtrace+0x0/0x110) from [<c034f5a8>] (dump_stack+0x18/0x1c) r7:c78179d8 r6:c01ed6b8 r5:c0410822 r4:0000054d [<c034f590>] (dump_stack+0x0/0x1c) from [<c0057da8>] (warn_slowpath_common+0x54/0x6c) [<c0057d54>] (warn_slowpath_common+0x0/0x6c) from [<c0057e64>] (warn_slowpath_fmt+0x38/0x40) r9:00000000 r8:00000000 r7:c78e6608 r6:00000000 r5:fffffffb r4:c78e6c00 [<c0057e2c>] (warn_slowpath_fmt+0x0/0x40) from [<c01ed6b8>] (_regulator_disable+0x38/0x128) r3:c0410e53 r2:c0410ad5 [<c01ed680>] (_regulator_disable+0x0/0x128) from [<c01ed87c>] (regulator_disable+0x24/0x38) r7:c78e6608 r6:00000000 r5:c78e6c40 r4:c78e6c00 [<c01ed858>] (regulator_disable+0x0/0x38) from [<c02382dc>] (twl4030_phy_power+0x15c/0x17c) r5:c78595c0 r4:00000000 [<c0238180>] (twl4030_phy_power+0x0/0x17c) from [<c023831c>] (twl4030_phy_suspend+0x20/0x2c) r6:00000000 r5:c78595c0 r4:c78595c0 [<c02382fc>] (twl4030_phy_suspend+0x0/0x2c) from [<c0238638>] (twl4030_usb_irq+0x11c/0x16c) r5:c78595c0 r4:00000040 [<c023851c>] (twl4030_usb_irq+0x0/0x16c) from [<c034ec18>] (twl4030_usb_probe+0x2c4/0x32c) r6:00000000 r5:00000000 r4:c78595c0 [<c034e954>] (twl4030_usb_probe+0x0/0x32c) from [<c02152a0>] (platform_drv_probe+0x20/0x24) r7:00000000 r6:c047d49c r5:c78e6608 r4:c047d49c [<c0215280>] (platform_drv_probe+0x0/0x24) from [<c0214244>] (driver_probe_device+0xd0/0x190) [<c0214174>] (driver_probe_device+0x0/0x190) from [<c02143d4>] (__device_attach+0x44/0x48) r7:00000000 r6:c78e6608 r5:c78e6608 r4:c047d49c [<c0214390>] (__device_attach+0x0/0x48) from [<c0213694>] (bus_for_each_drv+0x50/0x90) r5:c0214390 r4:00000000 [<c0213644>] (bus_for_each_drv+0x0/0x90) from [<c0214474>] (device_attach+0x70/0x94) r6:c78e663c r5:c78e6608 r4:c78e6608 [<c0214404>] (device_attach+0x0/0x94) from [<c02134fc>] (bus_probe_device+0x2c/0x48) r7:00000000 r6:00000002 r5:c78e6608 r4:c78e6600 [<c02134d0>] (bus_probe_device+0x0/0x48) from [<c0211e48>] (device_add+0x340/0x4b4) [<c0211b08>] (device_add+0x0/0x4b4) from [<c021597c>] (platform_device_add+0x110/0x16c) [<c021586c>] (platform_device_add+0x0/0x16c) from [<c0220cb0>] (add_numbered_child+0xd8/0x118) r7:00000000 r6:c045f15c r5:c78e6600 r4:00000000 [<c0220bd8>] (add_numbered_child+0x0/0x118) from [<c001c618>] (twl_probe+0x3a4/0x72c) [<c001c274>] (twl_probe+0x0/0x72c) from [<c02601ac>] (i2c_device_probe+0x7c/0xa4) [<c0260130>] (i2c_device_probe+0x0/0xa4) from [<c0214244>] (driver_probe_device+0xd0/0x190) r5:c7856e20 r4:c047c860 [<c0214174>] (driver_probe_device+0x0/0x190) from [<c02143d4>] (__device_attach+0x44/0x48) r7:c7856e04 r6:c7856e20 r5:c7856e20 r4:c047c860 [<c0214390>] (__device_attach+0x0/0x48) from [<c0213694>] (bus_for_each_drv+0x50/0x90) r5:c0214390 r4:00000000 [<c0213644>] (bus_for_each_drv+0x0/0x90) from [<c0214474>] (device_attach+0x70/0x94) r6:c7856e54 r5:c7856e20 r4:c7856e20 [<c0214404>] (device_attach+0x0/0x94) from [<c02134fc>] (bus_probe_device+0x2c/0x48) r7:c7856e04 r6:c78fd048 r5:c7856e20 r4:c7856e20 [<c02134d0>] (bus_probe_device+0x0/0x48) from [<c0211e48>] (device_add+0x340/0x4b4) [<c0211b08>] (device_add+0x0/0x4b4) from [<c0211fd8>] (device_register+0x1c/0x20) [<c0211fbc>] (device_register+0x0/0x20) from [<c0260aa8>] (i2c_new_device+0xec/0x150) r5:c7856e00 r4:c7856e20 [<c02609bc>] (i2c_new_device+0x0/0x150) from [<c0260dc0>] (i2c_register_adapter+0xa0/0x1c4) r7:00000000 r6:c78fd078 r5:c78fd048 r4:c781d5c0 [<c0260d20>] (i2c_register_adapter+0x0/0x1c4) from [<c0260f80>] (i2c_add_numbered_adapter+0x9c/0xb4) r7:00000a28 r6:c04600a8 r5:c78fd048 r4:00000000 [<c0260ee4>] (i2c_add_numbered_adapter+0x0/0xb4) from [<c034efa4>] (omap_i2c_probe+0x324/0x3e8) r5:00000000 r4:c78fd000 [<c034ec80>] (omap_i2c_probe+0x0/0x3e8) from [<c02152a0>] (platform_drv_probe+0x20/0x24) [<c0215280>] (platform_drv_probe+0x0/0x24) from [<c0214244>] (driver_probe_device+0xd0/0x190) [<c0214174>] (driver_probe_device+0x0/0x190) from [<c021436c>] (__driver_attach+0x68/0x8c) r7:c78b2140 r6:c047e214 r5:c04600e4 r4:c04600b0 [<c0214304>] (__driver_attach+0x0/0x8c) from [<c021399c>] (bus_for_each_dev+0x50/0x84) r7:c78b2140 r6:c047e214 r5:c0214304 r4:00000000 [<c021394c>] (bus_for_each_dev+0x0/0x84) from [<c0214068>] (driver_attach+0x20/0x28) r6:c047e214 r5:c047e214 r4:c00270d0 [<c0214048>] (driver_attach+0x0/0x28) from [<c0213274>] (bus_add_driver+0xa8/0x228) [<c02131cc>] (bus_add_driver+0x0/0x228) from [<c02146a4>] (driver_register+0xb0/0x13c) [<c02145f4>] (driver_register+0x0/0x13c) from [<c0215744>] (platform_driver_register+0x4c/0x60) r9:00000000 r8:c001f688 r7:00000013 r6:c005b6fc r5:c00083dc r4:c00270d0 [<c02156f8>] (platform_driver_register+0x0/0x60) from [<c001f69c>] (omap_i2c_init_driver+0x14/0x1c) [<c001f688>] (omap_i2c_init_driver+0x0/0x1c) from [<c002c460>] (do_one_initcall+0xd0/0x1a4) [<c002c390>] (do_one_initcall+0x0/0x1a4) from [<c0008478>] (kernel_init+0x9c/0x154) [<c00083dc>] (kernel_init+0x0/0x154) from [<c005b6fc>] (do_exit+0x0/0x688) r5:c00083dc r4:00000000 ---[ end trace 1b75b31a2719ed1d ]--- Signed-off-by: Ming Lei <tom.leiming@gmail.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Felipe Balbi <me@felipebalbi.com> Cc: Anand Gadiyar <gadiyar@ti.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| * \|	USB: EHCI: Disable langwell/penwell LPM capability	Alek Du	2010-09-21	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have to do so due to HW limitation. Signed-off-by: Alek Du <alek.du@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| * \|	usb: musb_debugfs: don't use the struct file private_data field with seq_files	Mathias Nyman	2010-09-21	1	-3/+2
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	seq_files use the private_data field of a file struct for storing a seq_file structure, data should be stored in seq_file's own private field (e.g. file->private_data->private) Otherwise seq_release() will free the private data when the file is closed. Signed-off-by: Mathias Nyman <mathias.nyman@nokia.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| * \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6	Linus Torvalds	2010-09-21	2	-15/+11
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: serial: mfd: fix bug in serial_hsu_remove() serial: amba-pl010: fix set_ldisc
\| \| * \|	serial: mfd: fix bug in serial_hsu_remove()	Feng Tang	2010-09-21	1	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Medfield HSU driver deal with 4 pci devices(3 uart ports + 1 dma controller), so in pci remove func, we need handle them differently Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| \| * \|	serial: amba-pl010: fix set_ldisc	Mika Westerberg	2010-09-21	1	-7/+2
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d87d9b7d1 ("tty: serial - fix tty referencing in set_ldisc") changed set_ldisc to take ldisc number as parameter. This patch fixes AMBA PL010 driver according the new prototype. Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi> Cc: Alan Cox <alan@linux.intel.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| * \|	frv: double syscall restarts, syscall restart in sigreturn()	Al Viro	2010-09-20	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to make sure that only the first do_signal() to be handled on the way out syscall will bother with syscall restarts; additionally, the check on the "signal has user handler" path had been wrong - compare with restart prevention in sigreturn()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \|	frv: handling of restart into restart_syscall is fscked	Al Viro	2010-09-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do_signal() should place the syscall number in gr7, not gr8 when handling ERESTART_WOULDBLOCK. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \|	frv: avoid infinite loop of SIGSEGV delivery	Al Viro	2010-09-20	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use force_sigsegv() rather than force_sig(SIGSEGV, ...) as the former resets the SEGV handler pointer which will kill the process, rather than leaving it open to an infinite loop if the SEGV handler itself caused a SEGV signal. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \|	frv: fix address verification holes in setup_frame/setup_rt_frame	Al Viro	2010-09-20	1	-16/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a) sa_handler might be maliciously set to point to kernel memory; blindly dereferencing it in FDPIC case is a Bad Idea(tm). b) I'm not sure you need that set_fs(USER_DS) there at all, but if you do, you'd better do it before checking the frame you've decided to use with access_ok(), lest sigaltstack() becomes a convenient roothole. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \|	frv: restart_block.fn needs to be reset on sigreturn	Al Viro	2010-09-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reset restart_block.fn on executing a sigreturn such that any currently pending system call restarts will be forced to return -EINTR. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \|	mm: further fix swapin race condition	Hugh Dickins	2010-09-20	1	-3/+5
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4969c1192d15 ("mm: fix swapin race condition") is now agreed to be incomplete. There's a race, not very much less likely than the original race envisaged, in which it is further necessary to check that the swapcache page's swap has not changed. Here's the reasoning: cast in terms of reuse_swap_page(), but probably could be reformulated to rely on try_to_free_swap() instead, or on swapoff+swapon. A, faults into do_swap_page(): does page1 = lookup_swap_cache(swap1) and comes through the lock_page(page1). B, a racing thread of the same process, faults on the same address: does page1 = lookup_swap_cache(swap1) and now waits in lock_page(page1), but for whatever reason is unlucky not to get the lock any time soon. A carries on through do_swap_page(), a write fault, but cannot reuse the swap page1 (another reference to swap1). Unlocks the page1 (but B doesn't get it yet), does COW in do_wp_page(), page2 now in that pte. C, perhaps the parent of A+B, comes in and write faults the same swap page1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed. kswapd comes in after some time (B still unlucky) and swaps out some pages from A+B and C: it allocates the original swap1 to page2 in A+B, and some other swap2 to the original page1 now in C. But does not immediately free page1 (actually it couldn't: B holds a reference), leaving it in swap cache for now. B at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes. Is pte_same(*page_table, orig_pte)? Yes, because page2 has now been given the swap1 which page1 used to have. So B proceeds to insert page1 into A+B's page_table, though its content now belongs to C, quite different from what A wrote there. B ought to have checked that page1's swap was still swap1. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| *	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-09-19	13	-131/+88
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6: alpha: deal with multiple simultaneously pending signals alpha: fix a 14 years old bug in sigreturn tracing alpha: unb0rk sigsuspend() and rt_sigsuspend() alpha: belated ERESTART_RESTARTBLOCK race fix alpha: Shift perf event pending work earlier in timer interrupt alpha: wire up fanotify and prlimit64 syscalls alpha: kill big kernel lock alpha: fix build breakage in asm/cacheflush.h alpha: remove unnecessary cast from void* in assignment. alpha: Use static const char * const where possible
\| \| *	alpha: deal with multiple simultaneously pending signals	Al Viro	2010-09-19	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unlike the other targets, alpha sets _one_ sigframe and buggers off until the next syscall/interrupt, even if more signals are pending. It leads to quite a few unpleasant inconsistencies, starting with SIGSEGV potentially arriving not where it should and including e.g. mess with sigsuspend(); consider two pending signals blocked until sigsuspend() unblocks them. We pick the first one; then, if we are hit by interrupt while in the handler, we process the second one as well. If we are not, and if no syscalls had been made, we get out of the first handler and leave the second signal pending; normally sigreturn() would've picked it anyway, but here it starts with restoring the original mask and voila - the second signal is blocked again. On everything else we get both delivered consistently. It's actually easy to fix; the only thing to watch out for is prevention of double syscall restart. Fortunately, the idea I've nicked from arm fix by rmk works just fine... Testcase demonstrating the behaviour in question; on alpha we get one or both flags set (usually one), on everything else both are always set. #include <signal.h> #include <stdio.h> int had1, had2; void f1(int sig) { had1 = 1; } void f2(int sig) { had2 = 1; } main() { sigset_t set1, set2; sigemptyset(&set1); sigemptyset(&set2); sigaddset(&set2, 1); sigaddset(&set2, 2); signal(1, f1); signal(2, f2); sigprocmask(SIG_SETMASK, &set2, NULL); raise(1); raise(2); sigsuspend(&set1); printf("had1:%d had2:%d\n", had1, had2); } Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: fix a 14 years old bug in sigreturn tracing	Al Viro	2010-09-19	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL, all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support. What happens is direct return to ret_from_syscall, in order to bypass mangling of a3 (error indicator) and prevent other mutilations of registers (e.g. by syscall restart). That's fine, but... the entire TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall stopping/notifying the tracer is after the syscall. And the normal path we are forcibly switching to doesn't have it. So we end up with one stop in traced sigreturn() vs. two in other syscalls. And yes, strace is visibly broken by that; try to strace the following #include <signal.h> #include <stdio.h> void f(int sig) {} main() { signal(SIGHUP, f); raise(SIGHUP); write(1, "eeeek\n", 6); } and watch the show. The close(1) = 405 in the end of strace output is coming from return value of write() (6 == __NR_close on alpha) and syscall number of exit_group() (__NR_exit_group == 405 there). The fix is fairly simple - the only thing we end up missing is the call of syscall_trace() and we can tell whether we'd been called from the SYSCALL_TRACE path by checking ra value. Since we are setting the switch_stack up (that's what sys_sigreturn() does), we have the right environment for calling syscall_trace() - just before we call undo_switch_stack() and return. Since undo_switch_stack() will overwrite s0 anyway, we can use it to store the result of "has it been called from SYSCALL_TRACE path?" check. The same thing applies in rt_sigreturn(). Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: unb0rk sigsuspend() and rt_sigsuspend()	Al Viro	2010-09-19	3	-69/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Old code used to set regs->r0 and regs->r19 to force the right return value. Leaving that after switch to ERESTARTNOHAND was a Bad Idea(tm), since now that screws the restart - if we hit the case when get_signal_to_deliver() returns 0, we will step back to syscall insn, with v0 set to EINTR and a3 to 1. The latter won't matter, since EINTR is 4, aka __NR_write. Testcase: #include <signal.h> #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> main() { sigset_t mask; sigemptyset(&mask); sigaddset(&mask, SIGCONT); sigprocmask(SIG_SETMASK, &mask, NULL); kill(0, SIGCONT); syscall(__NR_sigsuspend, 1, "b0rken\n", 7); } results on alpha in immediate message to stdout... Fix is obvious; moreover, since we don't need regs anymore, we can switch to normal prototypes for these guys and lose the wrappers. Even better, rt_sigsuspend() is identical to generic version in kernel/signal.c now. Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: belated ERESTART_RESTARTBLOCK race fix	Al Viro	2010-09-19	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	same thing as had been done on other targets back in 2003 - move setting ->restart_block.fn into {rt_,}sigreturn(). Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: Shift perf event pending work earlier in timer interrupt	Michael Cree	2010-09-19	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pending work from the performance event subsystem is executed in the timer interrupt. This patch shifts the call to perf_event_do_pending() before the call to update_process_times() as the latter may call back into the perf event subsystem and it is prudent to have the pending work executed first. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: wire up fanotify and prlimit64 syscalls	Mikael Pettersson	2010-09-19	2	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 2.6.36-rc kernel added three new system calls: fanotify_init, fanotify_mark, and prlimit64. This patch wires them up on Alpha. Built and booted on an XP900. Untested beyond that. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: kill big kernel lock	Arnd Bergmann	2010-09-19	2	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All uses of the BKL on alpha are totally bogus, nothing is really protected by this. Remove the remaining users so we don't have to mark alpha as 'depends on BKL'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: fix build breakage in asm/cacheflush.h	Tejun Heo	2010-09-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Alpha SMP flush_icache_user_range() is implemented as an inline function inside include/asm/cacheflush.h. It dereferences @current but doesn't include linux/sched.h and thus causes build failure if linux/sched.h wasn't included previously. Fix it by including the needed header file explicitly. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: remove unnecessary cast from void* in assignment.	matt mooney	2010-09-19	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Acked-by: Jan-Benedict Glaw <jbglaw@lug-owl.de> Signed-off-by: matt mooney <mfm@muteddisk.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| \| *	alpha: Use static const char * const where possible	Joe Perches	2010-09-19	4	-38/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
\| * \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6	Linus Torvalds	2010-09-19	1	-9/+3
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6: ide: Fix ordering of procfs registry.
\| \| * \|	ide: Fix ordering of procfs registry.	Wolfram Sang	2010-09-14	1	-9/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We must ensure that ide_proc_port_register_devices() occurs on an interface before ide_proc_register_driver() executes for that interfaces drives. Therefore defer the registry of the driver device objects backed by ide_bus_type until after ide_proc_port_register_devices() has run and thus all of the drive->proc procfs directory pointers have been setup. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2010-09-19	22	-35/+136
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) dca: disable dca on IOAT ver.3.0 multiple-IOH platforms netpoll: Disable IRQ around RCU dereference in netpoll_rx sctp: Do not reset the packet during sctp_packet_config(). net/llc: storing negative error codes in unsigned short MAINTAINERS: move atlx discussions to netdev drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory drivers/net/eql.c: prevent reading uninitialized stack memory drivers/net/usb/hso.c: prevent reading uninitialized memory xfrm: dont assume rcu_read_lock in xfrm_output_one() r8169: Handle rxfifo errors on 8168 chips 3c59x: Remove atomic context inside vortex_{set\|get}_wol tcp: Prevent overzealous packetization by SWS logic. net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS phylib: fix PAL state machine restart on resume net: use rcu_barrier() in rollback_registered_many bonding: correctly process non-linear skbs ipv4: enable getsockopt() for IP_NODEFRAG ipv4: force_igmp_version ignored when a IGMPv3 query received ppp: potential NULL dereference in ppp_mp_explode() net/llc: make opt unsigned in llc_ui_setsockopt() ...
\| \| * \| \|	dca: disable dca on IOAT ver.3.0 multiple-IOH platforms	Sosnowski, Maciej	2010-09-18	1	-6/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Direct Cache Access is not supported on IOAT ver.3.0 multiple-IOH platforms. This patch blocks registering of dca providers when multiple IOH detected with IOAT ver.3.0. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	netpoll: Disable IRQ around RCU dereference in netpoll_rx	Herbert Xu	2010-09-18	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We cannot use rcu_dereference_bh safely in netpoll_rx as we may be called with IRQs disabled. We could however simply disable IRQs as that too causes BH to be disabled and is safe in either case. Thanks to John Linville for discovering this bug and providing a patch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	sctp: Do not reset the packet during sctp_packet_config().	Vlad Yasevich	2010-09-18	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sctp_packet_config() is called when getting the packet ready for appending of chunks. The function should not touch the current state, since it's possible to ping-pong between two transports when sending, and that can result packet corruption followed by skb overlfow crash. Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	net/llc: storing negative error codes in unsigned short	Dan Carpenter	2010-09-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the alloc_skb() fails then we return 65431 instead of -ENOBUFS (-105). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	MAINTAINERS: move atlx discussions to netdev	Chris Snook	2010-09-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The atlx drivers are sufficiently mature that we no longer need a separate mailing list for them. Move the discussion to netdev, so we can decommission atl1-devel, which is now mostly spam. Signed-off-by: Chris Snook <chris.snook@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory	Dan Rosenberg	2010-09-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed formatting (tabs and line breaks). The CHELSIO_GET_QSET_NUM device ioctl allows unprivileged users to read 4 bytes of uninitialized stack memory, because the "addr" member of the ch_reg struct declared on the stack in cxgb_extension_ioctl() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	drivers/net/eql.c: prevent reading uninitialized stack memory	Dan Rosenberg	2010-09-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed formatting (tabs and line breaks). The EQL_GETMASTRCFG device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "master_name" member of the master_config_t struct declared on the stack in eql_g_master_cfg() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	drivers/net/usb/hso.c: prevent reading uninitialized memory	Dan Rosenberg	2010-09-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed formatting (tabs and line breaks). The TIOCGICOUNT device ioctl allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack in hso_get_count() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	xfrm: dont assume rcu_read_lock in xfrm_output_one()	Eric Dumazet	2010-09-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ip_local_out() is called with rcu_read_lock() held from ip_queue_xmit() but not from other call sites. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	r8169: Handle rxfifo errors on 8168 chips	Matthew Garrett	2010-09-16	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Thinkpad X100e seems to have some odd behaviour when the display is powered off - the onboard r8169 starts generating rxfifo overflow errors. The root cause of this has not yet been identified and may well be a hardware design bug on the platform, but r8169 should be more resiliant to this. This patch enables the rxfifo interrupt on 8168 devices and removes the MAC version check in the interrupt handler, and the machine no longer crashes when under network load while the screen turns off. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	3c59x: Remove atomic context inside vortex_{set\|get}_wol	Denis Kirjanov	2010-09-15	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to use spinlocks in vortex_{set\|get}_wol. This also fixes a bug: [ 254.214993] 3c59x 0000:00:0d.0: PME# enabled [ 254.215021] BUG: sleeping function called from invalid context at kernel/mutex.c:94 [ 254.215030] in_atomic(): 0, irqs_disabled(): 1, pid: 4875, name: ethtool [ 254.215042] Pid: 4875, comm: ethtool Tainted: G W 2.6.36-rc3+ #7 [ 254.215049] Call Trace: [ 254.215050] [] __might_sleep+0xb1/0xb6 [ 254.215050] [] mutex_lock+0x17/0x30 [ 254.215050] [] acpi_enable_wakeup_device_power+0x2b/0xb1 [ 254.215050] [] acpi_pm_device_sleep_wake+0x42/0x7f [ 254.215050] [] acpi_pci_sleep_wake+0x5d/0x63 [ 254.215050] [] platform_pci_sleep_wake+0x1d/0x20 [ 254.215050] [] __pci_enable_wake+0x90/0xd0 [ 254.215050] [] acpi_set_WOL+0x8e/0xf5 [3c59x] [ 254.215050] [] vortex_set_wol+0x4e/0x5e [3c59x] [ 254.215050] [] dev_ethtool+0x1cf/0xb61 [ 254.215050] [] ? debug_mutex_free_waiter+0x45/0x4a [ 254.215050] [] ? __mutex_lock_common+0x204/0x20e [ 254.215050] [] ? __mutex_lock_slowpath+0x12/0x15 [ 254.215050] [] ? mutex_lock+0x23/0x30 [ 254.215050] [] dev_ioctl+0x42c/0x533 [ 254.215050] [] ? _cond_resched+0x8/0x1c [ 254.215050] [] ? lock_page+0x1c/0x30 [ 254.215050] [] ? page_address+0x15/0x7c [ 254.215050] [] ? filemap_fault+0x187/0x2c4 [ 254.215050] [] sock_ioctl+0x1d4/0x1e0 [ 254.215050] [] ? sock_ioctl+0x0/0x1e0 [ 254.215050] [] vfs_ioctl+0x19/0x33 [ 254.215050] [] do_vfs_ioctl+0x424/0x46f [ 254.215050] [] ? selinux_file_ioctl+0x3c/0x40 [ 254.215050] [] sys_ioctl+0x40/0x5a [ 254.215050] [] sysenter_do_call+0x12/0x22 vortex_set_wol protected with a spinlock, but nested acpi_set_WOL acquires a mutex inside atomic context. Ethtool operations are already serialized by RTNL mutex, so it is safe to drop the locks. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	tcp: Prevent overzealous packetization by SWS logic.	Alexey Kuznetsov	2010-09-15	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised window, the SWS logic will packetize to half the MSS unnecessarily. This causes problems with some embedded devices. However for large MSS devices we do want to half-MSS packetize otherwise we never get enough packets into the pipe for things like fast retransmit and recovery to work. Be careful also to handle the case where MSS > window, otherwise we'll never send until the probe timer. Reported-by: ツ Leandro Melo de Sales <leandroal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS	David S. Miller	2010-09-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	You cannot invoke __smp_call_function_single() unless the architecture sets this symbol. Reported-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \| \|	phylib: fix PAL state machine restart on resume	Simon Guinot	2010-09-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On resume, before starting the PAL state machine, check if the adjust_link() method is well supplied. If not, this would lead to a NULL pointer dereference in the phy_state_machine() function. This scenario can happen if the Ethernet driver call manually the PHY functions instead of using the PAL state machine. The mv643xx_eth driver is a such example. Signed-off-by: Simon Guinot <sguinot@lacie.com> Signed-off-by: David S. Miller <davem@davemloft.net>