summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* [SPARC64]: Fix typo in pgprot_noncached().David S. Miller2006-07-281-1/+1
| | | | | | The sun4v code sequence was or'ing in the sun4u pte bits by mistake. Signed-off-by: David S. Miller <davem@davemloft.net>
* [SPARC64]: Fix quad-float multiply emulation.David S. Miller2006-07-281-1/+1
| | | | | | | | | | Something is wrong with the 3-multiply (vs. 4-multiply) optimized version of _FP_MUL_MEAT_2_*(), so just use the slower version which actually computes correct values. Noticed by Rene Rebe Signed-off-by: David S. Miller <davem@davemloft.net>
* [PATCH] fix compile regression for a few scsi driversChristoph Hellwig2006-07-262-12/+6
| | | | | | | | | | | | | | | | | | | | | This fixes three drivers to compile again after my patch that removes the data_cmnd member from struct scsi_cmnd. The fas216 change is trivial, it should have been using ->cmnd all the time. NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing something odd, it should only have looked at ->cmnd before not the saved copy that is kept for the error handlers sake. Note that it really should deal with the sync setting themselves but use the generic domain validation code that get this right - but that's for later let's push this simple compile fix for now. And sorry for the late fix for this, I have been busy with OLS and associated activities last week. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2006-07-263-9/+8
|\ | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SCSI] esp: Fix build. [SPARC]: Fix SA_STATIC_ALLOC value. [SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.
| * [SCSI] esp: Fix build.David S. Miller2006-07-251-8/+4
| | | | | | | | | | | | | | | | The data_cmd[] member got deleted, so do not use it any more. Scsi commands do not have their ->cmd[] overwritten temporary to probe for status after an error before retrying. Signed-off-by: David S. Miller <davem@davemloft.net>
| * [SPARC]: Fix SA_STATIC_ALLOC value.David S. Miller2006-07-251-1/+1
| | | | | | | | | | | | | | It alises IRQF_SHARED which causes all kinds of problems. Signed-off-by: David S. Miller <davem@davemloft.net>
| * [SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.David S. Miller2006-07-251-0/+3
| | | | | | | | | | | | | | That way we'll have at least some debugging info even if the stack dump explodes. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2006-07-2617-51/+167
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). [IPV4] ipmr: ip multicast route bug fix. [TG3]: Update version and reldate [TG3]: Handle tg3_init_rings() failures [TG3]: Add tg3_restart_hw() [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags. [IPV6]: Clean skb cb on IPv6 input. [NETFILTER]: Demote xt_sctp to EXPERIMENTAL [NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule [NETFILTER]: xt_pkttype: fix mismatches on locally generated packets [NETFILTER]: SNMP NAT: fix byteorder confusion [NETFILTER]: conntrack: fix SYSCTL=n compile [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject [NETFILTER]: H.323 helper: fix possible NULL-ptr dereference
| * | [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().Tetsuo Handa2006-07-262-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Steven's TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [IPV4] ipmr: ip multicast route bug fix.Alexey Kuznetsov2006-07-261-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IP multicast route code was reusing an skb which causes use after free and double free. From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Note, it is real skb_clone(), not alloc_skb(). Equeued skb contains the whole half-prepared netlink message plus room for the rest. It could be also skb_copy(), if we want to be puristic about mangling cloned data, but original copy is really not going to be used. Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [TG3]: Update version and reldateMichael Chan2006-07-261-2/+2
| | | | | | | | | | | | | | | | | | | | | Update version to 3.63. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [TG3]: Handle tg3_init_rings() failuresMichael Chan2006-07-261-5/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle dev_alloc_skb() failures when initializing the RX rings. Without proper handling, the driver will crash when using a partial ring. Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and providing the initial patch. Howie Xu <howie@vmware.com> also reported the same issue. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [TG3]: Add tg3_restart_hw()Michael Chan2006-07-261-22/+58
| | | | | | | | | | | | | | | | | | | | | | | | Add tg3_restart_hw() to handle failures when re-initializing the device. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.Guillaume Chazarain2006-07-251-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [IPV6]: Clean skb cb on IPv6 input.Guillaume Chazarain2006-07-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Clear the accumulated junk in IP6CB when starting to handle an IPV6 packet. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: Demote xt_sctp to EXPERIMENTALPatrick McHardy2006-07-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | After the recent problems with all the SCTP stuff it seems reasonable to mark this as experimental. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: bridge netfilter: add deferred output hooks to ↵Patrick McHardy2006-07-254-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | feature-removal-schedule Add bridge netfilter deferred output hooks to feature-removal-schedule and disable them by default. Until their removal they will be activated by the physdev match when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: xt_pkttype: fix mismatches on locally generated packetsPhil Oester2006-07-251-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Locally generated broadcast and multicast packets have pkttype set to PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This causes the pkttype match to fail to match packets of either type. The below patch remedies this by using the daddr as a hint as to broadcast|multicast. While not pretty, this seems like the only way to solve the problem short of just noting this as a limitation of the match. This resolves netfilter bugzilla #484 Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: SNMP NAT: fix byteorder confusionPatrick McHardy2006-07-251-2/+2
| | | | | | | | | | | | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: conntrack: fix SYSCTL=n compileAdrian Bunk2006-07-252-4/+4
| | | | | | | | | | | | | | | | | | Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinjectPatrick McHardy2006-07-251-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts can happen when userspace is buggy. Reinject the packet in case of NF_STOP, drop on unknown verdicts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: H.323 helper: fix possible NULL-ptr dereferencePatrick McHardy2006-07-251-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | An RCF message containing a timeout results in a NULL-ptr dereference if no RRQ has been seen before. Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu> and Isil Dillig <isil@stanford.edu>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizareArjan van de Ven2006-07-265-29/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch below moves the cpu hotplugging higher up in the cpufreq layering; this is needed to avoid recursive taking of the cpu hotplug lock and to otherwise detangle the mess. The new rules are: 1. you must do lock_cpu_hotplug() around the following functions: __cpufreq_driver_target __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only) __cpufreq_set_policy 2. governer methods (.governer) must NOT take the lock_cpu_hotplug() lock in any way; they are called with the lock taken already 3. if your governer spawns a thread that does things, like calling __cpufreq_driver_target, your thread must honor rule #1. 4. the policy lock and other cpufreq internal locks nest within the lock_cpu_hotplug() lock. I'm not entirely happy about how the __cpufreq_governor rule ended up (conditional locking rule depending on the argument) but basically all callers pass this as a constant so it's not too horrible. The patch also removes the cpufreq_governor() function since during the locking audit it turned out to be entirely unused (so no need to fix it) The patch works on my testbox, but it could use more testing (otoh... it can't be much worse than the current code) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecsJens Axboe2006-07-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The CIC_SEEKY() test really wants to use the minimum of either: - 2 msecs (not jiffies) - or, the pending slice time So code it like that. Signed-off-by: Jens Axboe <axboe@suse.de>
* | [PATCH] blktrace: fix read-ahead bitMilton Miller2006-07-251-1/+1
| | | | | | | | | | | | It should be toggling the same bit on and off, fix it up. Signed-off-by: Jens Axboe <axboe@suse.de>
* | [PATCH] cciss: fix stall with softirq handling and CFQJens Axboe2006-07-251-41/+45
|/ | | | | | | | | We need to postpone the queue startup until after the softirq handler has actually finished some requests, otherwise we could be racing with cciss_softirq_done() and not actually restart the queue handling. Signed-off-by: Jens Axboe <axboe@suse.de>
* [NET]: Correct dev_alloc_skb kerneldocChristoph Hellwig2006-07-251-2/+2
| | | | | | | | dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers use it for the latter anyway, but that's a different story) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKBChristoph Hellwig2006-07-251-4/+0
| | | | | | | | | | skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow architectures to reimplement __dev_alloc_skb. It's not set on any architecture and now that we have an architecture-overrideable NET_SKB_PAD there is not point at all to have one either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [VLAN]: Fix link state propagationStefan Rompf2006-07-241-5/+3
| | | | | | | | | | When the queue of the underlying device is stopped at initialization time or the device is marked "not present", the state will be propagated to the vlan device and never change. Based on an analysis by Patrick McHardy. Signed-off-by: Stefan Rompf <stefan@loplof.de> ACKed-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6] xfrm6_tunnel: Delete debugging code.David S. Miller2006-07-241-130/+10
| | | | | | | | | | | | | | It doesn't compile, and it's dubious in several regards: 1) is enabled by non-Kconfig controlled CONFIG_* value (noted by Randy Dunlap) 2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use 3) the debugging messages print object pointer addresses which have no meaning without context So let's just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>
* [Bluetooth] Enable SCO support for Broadcom HID proxy dongleMarcel Holtmann2006-07-241-1/+1
| | | | | | | | | The Broadcom dongles with HID proxy support actually support SCO over HCI if the SCO buffer size values are corrected. So instead of disabling the SCO support, mark this dongle with the quirk for the Bluetooth core to correct the wrong buffer size values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* [Bluetooth] Add quirk for another broken RTX Telecom based dongleMarcel Holtmann2006-07-241-1/+2
| | | | | | | | This patch disables the ISOC transfers for another broken RTX Telecom based USB dongle. Starting the USB ISOC transfers only ends in a burst of error messages for invalid SCO packets on connection handle 0. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* [Bluetooth] Correct SCO buffer size for Belkin devicesMarcel Holtmann2006-07-241-1/+2
| | | | | | | | The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip from Broadcom and their SCO buffer size values are wrong. The Bluetooth core should correct these values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* [Bluetooth] Correct SCO buffer size for another Broadcom chipMarcel Holtmann2006-07-241-2/+15
| | | | | | | | The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver has to set a quirk to correct the SCO buffer size values. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* [Bluetooth] Correct RFCOMM channel MTU for broken implementationsMarcel Holtmann2006-07-241-2/+17
| | | | | | | | | | | Some Bluetooth RFCOMM implementations try to negotiate a bigger channel MTU than we can support for a particular session. The maximum MTU for a RFCOMM session is limited through the L2CAP layer. So if the other side proposes a channel MTU that is bigger than the underlying L2CAP MTU, we should reduce it to the L2CAP MTU of the session minus five bytes for the RFCOMM headers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* [PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.Guillaume Chazarain2006-07-241-12/+6
| | | | | | | | | | | | | | | In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so, less than USEC_PER_SEC too) then tv_res will be smaller than tv. The affectation "(tv_res).tv_usec = __delta;" is wrong. The fix is to revert to the original code before 4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' in 'while'. [Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of "while (__delta > USEC_PER_SEC){ ... }"] Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP]: Fix default sequence window sizeIan McDonald2006-07-244-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When using the default sequence window size (100) I got the following in my logs: Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC... Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC... I went to alter the default sysctl and it didn't take for new sockets. Below patch fixes this. I think the default is too low but it is what the DCCP spec specifies. As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec to 3.5. This is still far too slow but it is a step in the right direction. Compile tested only for IPv6 but not particularly complex change. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* IB/mthca: Initialize max_cmds before debug code prints itRoland Dreier2006-07-241-2/+3
| | | | | | | | Read the max_cmds value from the response to the QUERY_FW command before printing out the value, so that the real value goes into the debug output. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipoib: Fix packet loss after hardware address updateMichael S. Tsirkin2006-07-242-0/+24
| | | | | | | | | | | The neighbour ha field may get updated without destroying the neighbour. In this case, the ha field gets out of sync with the address handle stored in ipoib_neigh->ah, with the result that the ah field would point to an incorrect path, resulting in all packets being lost. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipoib: Fix oops with ipoib_debug_mcast setOr Gerlitz2006-07-241-4/+4
| | | | | | | Need to set mcast->ah before debug code dereferences it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: Validate MADs for spec complianceSean Hefty2006-07-243-21/+95
| | | | | | | | | | | | | | Validate MADs sent by userspace clients for spec compliance with C13-18.1.1 (prevent duplicate requests and responses sent on the same port). Without this, RMPP transactions get aborted because of duplicate packets. This patch is similar to that provided by Jack Morgenstein. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: ipath_skip_sge() can break if num_sge > 1Ralph Campbell2006-07-241-4/+0
| | | | | | | | | | | ipath_skip_sge() doesn't exactly duplicate the side effects of ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge. This could result in the sg_list being accessed out of bounds. Since ipath_skip_sge() is almost always called with num_sge == 1, the original "optimization" is almost never used. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Fix ib_ipath driver to work with SRPRalph Campbell2006-07-242-0/+16
| | | | | | | | | | | I am still working on a proposal to remove the phys_to_virt() calls in the ib_ipath driver. In the mean time, this patch allows SRP to work by fixing the R_Key check and conversion from IB address to kernel virtual address. It also returns the correct page size for FMRs. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Fix a data corruptionRalph Campbell2006-07-241-40/+36
| | | | | | | | | This patch fixes a problem where certain error packets are passed to the InfiniBand layer for processing even though the packet actually was received with an error. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix SRQ limit event range checkDotan Barak2006-07-241-1/+2
| | | | | | | | | Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot be set beyond srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/uverbs: Fix lockdep warningsRoland Dreier2006-07-241-11/+21
| | | | | | | | | Lockdep warns because uverbs is trying to take uobj->mutex when it already holds that lock. This is because there are really multiple types of uobjs even though all of their locks are initialized in common code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/uverbs: Fix unlocking in error pathsMichael S. Tsirkin2006-07-241-2/+8
| | | | | | | | | ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the PD's read lock in their error paths, which lead to deadlock when destroying the PD. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lockPaul Jackson2006-07-231-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset callback_mutex lock. It only happens on cpu_exclusive cpusets, due to the dynamic sched domain code trying to take the cpu hotplug lock inside the cpuset callback_mutex lock. This bug has apparently been here for several months, but didn't get hit until the right customer load on a large system. This fix appears right from inspection, but it will take a few more days running it on that customers workload to be confident we nailed it. We don't have any other reproducible test case. The cpu_hotplug_lock() tends to cover large runs of code. The other places that hold both that lock and the cpuset callback mutex lock always nest the cpuset lock inside the hotplug lock. This place tries to do the reverse, risking an ABBA deadlock. This is in the cpuset_rmdir() code, where we: * take the callback_mutex lock * mark the cpuset CS_REMOVED * call update_cpu_domains for cpu_exclusive cpusets * in that call, take the cpu_hotplug lock if the cpuset is marked for removal. Thanks to Jack Steiner for identifying this deadlock. The fix is to tear down the dynamic sched domain before we grab the cpuset callback_mutex lock. This way, the two locks are serialized, with the hotplug lock taken and released before trying for the cpuset lock. I suspect that this bug was introduced when I changed the cpuset locking from one lock to two. The dynamic sched domain dependency on cpu_exclusive cpusets and its hotplug hooks were added to this code earlier, when cpusets had only a single lock. It may well have been fine then. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* cpu hotplug: simplify and hopefully fix lockingLinus Torvalds2006-07-232-47/+34
| | | | | | | | | | | | | | | | | The CPU hotplug locking was quite messy, with a recursive lock to handle the fact that both the actual up/down sequence wanted to protect itself from being re-entered, but the callbacks that it called also tended to want to protect themselves from CPU events. This splits the lock into two (one to serialize the whole hotplug sequence, the other to protect against the CPU present bitmaps changing). The latter still allows recursive usage because some subsystems (ondemand policy for cpufreq at least) had already gotten too used to the lax locking, but the locking mistakes are hopefully now less fundamental, and we now warn about recursive lock usage when we see it, in the hope that it can be fixed. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [cpufreq] ondemand: make shutdown sequence more robustLinus Torvalds2006-07-231-6/+10
| | | | | | | | | | | Shutting down the ondemand policy was fraught with potential problems, causing issues for SMP suspend (which wants to hot- unplug) all but the last CPU. This should fix at least the worst problems (divide-by-zero and infinite wait for the workqueue to shut down). Signed-off-by: Linus Torvalds <torvalds@osdl.org>