linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	xfs: refactor xlog_recover_commit_trans	Christoph Hellwig	2010-12-16	1	-64/+53
\| \| \| \| \| \| \| \| \|	Merge the call to xlog_recover_reorder_trans and the loop over the recovery items from xlog_recover_do_trans into xlog_recover_commit_trans, and keep the switch statement over the log item types as a separate helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: use struct list_head for the buf cancel table	Christoph Hellwig	2010-12-16	3	-111/+65
\| \| \| \| \|	Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: remove leftovers of old buffer log items in recovery code	Christoph Hellwig	2010-12-16	1	-115/+40
\| \| \| \| \| \| \| \| \| \|	XFS used to support different types of buffer log items long time ago. Remove the switch statements checking the log item type in various buffer recovery helpers that were left over from those days and the rather useless xlog_recover_do_buffer_pass2 wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: fix exporting with left over 64-bit inodes	Samuel Kvasnica	2010-12-16	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We now support mounting and using filesystems with 64-bit inodes even when not mounted with the inode64 option (which now only controls if we allocate new inodes in that space or not). Make sure we always use large NFS file handles when exporting a filesystem that may contain 64-bit inodes. Note that this only affects newly generated file handles, any outstanding 32-bit file handle is still accepted. [hch: the comment and commit log are mine, the rest is from a patch snipplet from Samuel] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: log timestamp changes to the source inode in rename	Christoph Hellwig	2010-12-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we don't mark VFS inodes dirty anymore for internal timestamp changes, but rely on the transaction subsystem to push them out, we need to explicitly log the source inode in rename after updating it's timestamps to make sure the changes actually get forced out by sync/fsync or an AIL push. We already account for the fourth inode in the log reservation, as a rename of directories needs to update the nlink field, so just adding the xfs_trans_log_inode call is enough. This fixes the xfsqa 065 regression introduced by: "xfs: don't use vfs writeback for pure metadata modifications" Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
*	xfs: only run xfs_error_test if error injection is active	Dave Chinner	2010-12-01	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent tests writing lots of small files showed the flusher thread being CPU bound and taking a long time to do allocations on a debug kernel. perf showed this as the prime reason: samples pcnt function DSO _______ _____ ___________________________ _________________ 224648.00 36.8% xfs_error_test [kernel.kallsyms] 86045.00 14.1% xfs_btree_check_sblock [kernel.kallsyms] 39778.00 6.5% prandom32 [kernel.kallsyms] 37436.00 6.1% xfs_btree_increment [kernel.kallsyms] 29278.00 4.8% xfs_btree_get_rec [kernel.kallsyms] 27717.00 4.5% random32 [kernel.kallsyms] Walking btree blocks during allocation checking them requires each block (a cache hit, so no I/O) call xfs_error_test(), which then does a random32() call as the first operation. IOWs, ~50% of the CPU is being consumed just testing whether we need to inject an error, even though error injection is not active. Kill this overhead when error injection is not active by adding a global counter of active error traps and only calling into xfs_error_test when fault injection is active. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: avoid moving stale inodes in the AIL	Dave Chinner	2010-12-01	1	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an inode has been marked stale because the cluster is being freed, we don't want to (re-)insert this inode into the AIL. There is a race condition where the cluster buffer may be unpinned before the inode is inserted into the AIL during transaction committed processing. If the buffer is unpinned before the inode item has been committed and inserted, then it is possible for the buffer to be released and hence processthe stale inode callbacks before the inode is inserted into the AIL. In this case, we then insert a clean, stale inode into the AIL which will never get removed by an IO completion. It will, however, get reclaimed and that triggers an assert in xfs_inode_free() complaining about freeing an inode still in the AIL. This race can be avoided by not moving stale inodes forward in the AIL during transaction commit completion processing. This closes the race condition by ensuring we never insert clean stale inodes into the AIL. It is safe to do this because a dirty stale inode, by definition, must already be in the AIL. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: delayed alloc blocks beyond EOF are valid after writeback	Dave Chinner	2010-12-01	2	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is an assumption in the parts of XFS that flushing a dirty file will make all the delayed allocation blocks disappear from an inode. That is, that after calling xfs_flush_pages() then ip->i_delayed_blks will be zero. This is an invalid assumption as we may have specualtive preallocation beyond EOF and they are recorded in ip->i_delayed_blks. A flush of the dirty pages of an inode will not change the state of these blocks beyond EOF, so a non-zero deeelalloc block count after a flush is valid. The bmap code has an invalid ASSERT() that needs to be removed, and the swapext code has a bug in that while it swaps the data forks around, it fails to swap the i_delayed_blks counter associated with the fork and hence can get the block accounting wrong. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: push stale, pinned buffers on trylock failures	Dave Chinner	2010-12-01	1	-19/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As reported by Nick Piggin, XFS is suffering from long pauses under highly concurrent workloads when hosted on ramdisks. The problem is that an inode buffer is stuck in the pinned state in memory and as a result either the inode buffer or one of the inodes within the buffer is stopping the tail of the log from being moved forward. The system remains in this state until a periodic log force issued by xfssyncd causes the buffer to be unpinned. The main problem is that these are stale buffers, and are hence held locked until the transaction/checkpoint that marked them state has been committed to disk. When the filesystem gets into this state, only the xfssyncd can cause the async transactions to be committed to disk and hence unpin the inode buffer. This problem was encountered when scaling the busy extent list, but only the blocking lock interface was fixed to solve the problem. Extend the same fix to the buffer trylock operations - if we fail to lock a pinned, stale buffer, then force the log immediately so that when the next attempt to lock it comes around, it will have been unpinned. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	xfs: fix failed write truncation handling.	Dave Chinner	2010-12-01	3	-54/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the move to the new truncate sequence we call xfs_setattr to truncate down excessively instanciated blocks. As shown by the testcase in kernel.org BZ #22452 that doesn't work too well. Due to the confusion of the internal inode size, and the VFS inode i_size it zeroes data that it shouldn't. But full blown truncate seems like overkill here. We only instanciate delayed allocations in the write path, and given that we never released the iolock we can't have converted them to real allocations yet either. The only nasty case is pre-existing preallocation which we need to skip. We already do this for page discard during writeback, so make the delayed allocation block punching a generic function and call it from the failed write path as well as xfs_aops_discard_page. The callers are responsible for ensuring that partial blocks are not truncated away, and that they hold the ilock. Based on a fix originally from Christoph Hellwig. This version used filesystem blocks as the range unit. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
*	Linux 2.6.37-rc4v2.6.37-rc4	Linus Torvalds	2010-11-30	1	-1/+1
\|
*	Merge branch 'merge' of ↵	Linus Torvalds	2010-11-30	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Use call_rcu_sched() for pagetables
\| *	powerpc: Use call_rcu_sched() for pagetables	Peter Zijlstra	2010-11-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PowerPC relies on IRQ-disable to guard against RCU quiecent states, use the appropriate RCU call version. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* \|	Revert "debug_locks: set oops_in_progress if we will log messages."	Dave Airlie	2010-11-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit e0fdace10e75dac67d906213b780ff1b1a4cc360. On-list discussion seems to suggest that the robustness fixes for printk make this unnecessary and DaveM has also agreed in person at Kernel Summit and on list. The main problem with this code is once we hit a lockdep splat we always keep oops_in_progress set, the console layer uses oops_in_progress with KMS to decide when it should be showing the oops and not showing X, so it causes problems around suspend/resume time when a userspace resume can cause a console switch away from X, only if oops_in_progress is set (which is what we want if an oops actually is in progress, but not because we had a lockdep splat 2 days prior). Cc: David S Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-11-29	1	-0/+24
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: tpm: Autodetect itpm devices
\| * \|	tpm: Autodetect itpm devices	Matthew Garrett	2010-11-29	1	-0/+24
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some Lenovos have TPMs that require a quirk to function correctly. This can be autodetected by checking whether the device has a _HID of INTC0102. This is an invalid PNPid, and as such is discarded by the pnp layer - however it's still present in the ACPI code, so we can pull it out that way. This means that the quirk won't be automatically applied on non-ACPI systems, but without ACPI we don't have any way to identify the chip anyway so I don't think that's a great concern. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Andy Isaacson <adi@hexapodia.org> Signed-off-by: James Morris <jmorris@namei.org>
* \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2010-11-29	30	-203/+280
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) af_unix: limit recursion level pch_gbe driver: The wrong of initializer entry pch_gbe dreiver: chang author ucc_geth: fix ucc halt problem in half duplex mode inet: Fix __inet_inherit_port() to correctly increment bsockets and num_owners ehea: Add some info messages and fix an issue hso: fix disable_net NET: wan/x25_asy, move lapb_unregister to x25_asy_close_tty cxgb4vf: fix setting unicast/multicast addresses ... net, ppp: Report correct error code if unit allocation failed DECnet: don't leak uninitialized stack byte au1000_eth: fix invalid address accessing the MAC enable register dccp: fix error in updating the GAR tcp: restrict net.ipv4.tcp_adv_win_scale (#20312) netns: Don't leak others' openreq-s in proc Net: ceph: Makefile: Remove unnessary code vhost/net: fix rcu check usage econet: fix CVE-2010-3848 econet: fix CVE-2010-3850 econet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849 ...
\| * \|	af_unix: limit recursion level	Eric Dumazet	2010-11-29	3	-6/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Its easy to eat all kernel memory and trigger NMI watchdog, using an exploit program that queues unix sockets on top of others. lkml ref : http://lkml.org/lkml/2010/11/25/8 This mechanism is used in applications, one choice we have is to have a recursion limit. Other limits might be needed as well (if we queue other types of files), since the passfd mechanism is currently limited by socket receive queue sizes only. Add a recursion_level to unix socket, allowing up to 4 levels. Each time we send an unix socket through sendfd mechanism, we copy its recursion level (plus one) to receiver. This recursion level is cleared when socket receive queue is emptied. Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	pch_gbe driver: The wrong of initializer entry	Toshiharu Okada	2010-11-29	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The wrong of initializer entry was modified. Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com> Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	pch_gbe dreiver: chang author	Toshiharu Okada	2010-11-29	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This driver's AUTHOR was changed to "Toshiharu Okada" from "Masayuki Ohtake". I update the Kconfig, renamed "Topcliff" to "EG20T". Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	ucc_geth: fix ucc halt problem in half duplex mode	Yang Li	2010-11-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 58933c64(ucc_geth: Fix the wrong the Rx/Tx FIFO size), the UCC_GETH_UTFTT_INIT is set to 512 based on the recommendation of the QE Reference Manual. But that will sometimes cause tx halt while working in half duplex mode. According to errata draft QE_GENERAL-A003(High Tx Virtual FIFO threshold size can cause UCC to halt), setting UTFTT less than [(UTFS x (M - 8)/M) - 128] will prevent this from happening (M is the minimum buffer size). The patch changes UTFTT back to 256. Signed-off-by: Li Yang <leoli@freescale.com> Cc: Jean-Denis Boyer <jdboyer@media5corp.com> Cc: Andreas Schmitz <Andreas.Schmitz@riedel.net> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	inet: Fix __inet_inherit_port() to correctly increment bsockets and num_owners	Nagendra Tomar	2010-11-29	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	inet sockets corresponding to passive connections are added to the bind hash using ___inet_inherit_port(). These sockets are later removed from the bind hash using __inet_put_port(). These two functions are not exactly symmetrical. __inet_put_port() decrements hashinfo->bsockets and tb->num_owners, whereas ___inet_inherit_port() does not increment them. This results in both of these going to -ve values. This patch fixes this by calling inet_bind_hash() from ___inet_inherit_port(), which does the right thing. 'bsockets' and 'num_owners' were introduced by commit a9d8f9110d7e953c (inet: Allowing more than 64k connections and heavily optimize bind(0)) Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	ehea: Add some info messages and fix an issue	Breno Leitao	2010-11-29	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds some debug information about ehea not being able to allocate enough spaces. Also it correctly updates the amount of available skb. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	hso: fix disable_net	Filip Aben	2010-11-28	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The HSO driver incorrectly creates a serial device instead of a net device when disable_net is set. It shouldn't create anything for the network interface. Signed-off-by: Filip Aben <f.aben@option.com> Reported-by: Piotr Isajew <pki@ex.com.pl> Reported-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	NET: wan/x25_asy, move lapb_unregister to x25_asy_close_tty	Jiri Slaby	2010-11-28	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We register lapb when tty is created, but unregister it only when the device is UP. So move the lapb_unregister to x25_asy_close_tty after the device is down. The old behaviour causes ldisc switching to fail each second attempt, because we noted for us that the device is unused, so we use it the second time, but labp layer still have it registered, so it fails obviously. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Sergey Lapin <slapin@ossfans.org> Cc: Andrew Hendry <andrew.hendry@gmail.com> Tested-by: Sergey Lapin <slapin@ossfans.org> Tested-by: Mikhail Ulyanov <ulyanov.mikhail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	cxgb4vf: fix setting unicast/multicast addresses ...	Casey Leedom	2010-11-28	2	-63/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were truncating the number of unicast and multicast MAC addresses supported. Additionally, we were incorrectly computing the MAC Address hash (a "1 << N" where we needed a "1ULL << N"). Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net, ppp: Report correct error code if unit allocation failed	Cyrill Gorcunov	2010-11-28	1	-21/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocating unit from ird might return several error codes not only -EAGAIN, so it should not be changed and returned precisely. Same time unit release procedure should be invoked only if device is unregistering. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	DECnet: don't leak uninitialized stack byte	Dan Rosenberg	2010-11-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A single uninitialized padding byte is leaked to userspace. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	au1000_eth: fix invalid address accessing the MAC enable register	Wolfgang Grandegger	2010-11-28	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"aup->enable" holds already the address pointing to the MAC enable register. The bug was introduced by commit d0e7cb: "au1000-eth: remove volatiles, switch to I/O accessors". CC: Florian Fainelli <florian@openwrt.org> Signed-off-by: Wolfgang Grandegger <wg@denx.de> Acked-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	dccp: fix error in updating the GAR	Gerrit Renker	2010-11-28	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a bug in updating the Greatest Acknowledgment number Received (GAR): the current implementation does not track the greatest received value - lower values in the range AWL..AWH (RFC 4340, 7.5.1) erase higher ones. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'vhost-net' of ↵	David S. Miller	2010-11-28	1	-2/+3
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
\| \| * \|	vhost/net: fix rcu check usage	Michael S. Tsirkin	2010-11-25	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Incorrect rcu check was used as rcu isn't done under mutex here. Force check to 1 for now, to stop it from complaining. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
\| * \| \|	tcp: restrict net.ipv4.tcp_adv_win_scale (#20312)	Alexey Dobriyan	2010-11-28	2	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tcp_win_from_space() does the following: if (sysctl_tcp_adv_win_scale <= 0) return space >> (-sysctl_tcp_adv_win_scale); else return space - (space >> sysctl_tcp_adv_win_scale); "space" is int. As per C99 6.5.7 (3) shifting int for 32 or more bits is undefined behaviour. Indeed, if sysctl_tcp_adv_win_scale is exactly 32, space >> 32 equals space and function returns 0. Which means we busyloop in tcp_fixup_rcvbuf(). Restrict net.ipv4.tcp_adv_win_scale to [-31, 31]. Fix https://bugzilla.kernel.org/show_bug.cgi?id=20312 Steps to reproduce: echo 32 >/proc/sys/net/ipv4/tcp_adv_win_scale wget www.kernel.org [softlockup] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	netns: Don't leak others' openreq-s in proc	Pavel Emelyanov	2010-11-28	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The /proc/net/tcp leaks openreq sockets from other namespaces. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	Net: ceph: Makefile: Remove unnessary code	Tracey Dent	2010-11-28	1	-22/+0
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the if and else conditional because the code is in mainline and there is no need in it being there. Signed-off-by: Tracey Dent <tdent48227@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	econet: fix CVE-2010-3848	Phil Blundell	2010-11-24	1	-31/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't declare variable sized array of iovecs on the stack since this could cause stack overflow if msg->msgiovlen is large. Instead, coalesce the user-supplied data into a new buffer and use a single iovec for it. Signed-off-by: Phil Blundell <philb@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	econet: fix CVE-2010-3850	Phil Blundell	2010-11-24	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing check for capable(CAP_NET_ADMIN) in SIOCSIFADDR operation. Signed-off-by: Phil Blundell <philb@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	econet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849	Phil Blundell	2010-11-24	1	-18/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Later parts of econet_sendmsg() rely on saddr != NULL, so return early with EINVAL if NULL was passed otherwise an oops may occur. Signed-off-by: Phil Blundell <philb@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tcp: Make TCP_MAXSEG minimum more correct.	David S. Miller	2010-11-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use TCP_MIN_MSS instead of constant 64. Reported-by: Min Zhang <mzhang@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	isdn: icn: Fix stack corruption bug.	Steven Rostedt	2010-11-24	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Running randconfig with ktest.pl I hit this bug: [ 16.101158] ICN-ISDN-driver Rev 1.65.6.8 mem=0x000d0000 [ 16.106376] icn: (line0) ICN-2B, port 0x320 added [ 16.111064] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c1642880 [ 16.111066] [ 16.121214] Pid: 1, comm: swapper Not tainted 2.6.37-rc2-test-00124-g6656b3f #8 [ 16.128499] Call Trace: [ 16.130942] [<c0f51662>] ? printk+0x1d/0x23 [ 16.135200] [<c0f5153f>] panic+0x5c/0x162 [ 16.139286] [<c0d62a9a>] ? icn_addcard+0x6d/0xbe [ 16.143975] [<c0445783>] print_tainted+0x0/0x8c [ 16.148582] [<c1642880>] ? icn_init+0xd8/0xdf [ 16.153012] [<c1642880>] icn_init+0xd8/0xdf [ 16.157271] [<c04012e5>] do_one_initcall+0x8c/0x143 [ 16.162222] [<c16427a8>] ? icn_init+0x0/0xdf [ 16.166566] [<c15f1a05>] kernel_init+0x13f/0x1da [ 16.171256] [<c15f18c6>] ? kernel_init+0x0/0x1da [ 16.175945] [<c0403bfe>] kernel_thread_helper+0x6/0x10 [ 16.181181] panic occurred, switching back to text console Looking into it I found that the stack was corrupted by the assignment of the Rev #. The variable rev is given 10 bytes, and in this output the characters that were copied was: " 1.65.6.8 $". Which was 11 characters plus the null ending character for a total of 12 bytes, thus corrupting the stack. This patch ups the variable size to 20 bytes as well as changes the strcpy to strncpy. I also added a check to make sure '$' is found. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'master' of ↵	David S. Miller	2010-11-24	5	-2/+5
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
\| \| * \|	wireless: b43: fix error path in SDIO	Guennadi Liakhovetski	2010-11-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix unbalanced call to sdio_release_host() on the error path. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| \| * \|	carl9170: fix virtual interface setup crash	Christian Lamparter	2010-11-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a faulty bound check which caused a crash when too many virtual interface were brought up. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<f8125f67>] carl9170_op_add_interface+0x1d7/0x2c0 [carl9170] *pde = 00000000 Oops: 0002 [#1] PREEMPT Modules linked in: carl9170 [...] Pid: 4720, comm: wpa_supplicant Not tainted 2.6.37-rc2-wl+ EIP: 0060:[<f8125f67>] EFLAGS: 00210206 CPU: 0 EIP is at carl9170_op_add_interface+0x1d7/0x2c0 [carl9170] EAX: 00000000 ... Process wpa_supplicant Stack: f4f88f34 fffffff4 .. Call Trace: [<f8f4e666>] ? ieee80211_do_open+0x406/0x5c0 [mac80211] [...] Code: <89> 42 04 ... EIP: [<f8125f67>] carl9170_op_add_interface+0x1d7/0x2c0 [carl9170] CR2: 0000000000000004 Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| \| * \|	ssb: b43-pci-bridge: Add new vendor for BCM4318	Daniel Klaffenbach	2010-11-22	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new vendor for Broadcom 4318. Signed-off-by: Daniel Klaffenbach <danielklaffenbach@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| \| * \|	ath9k: fix timeout on stopping rx dma	Felix Fietkau	2010-11-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems that using ath9k_hw_stoppcurecv to stop rx dma is not enough. When it's time to stop DMA, the PCU is still busy, so the rx enable bit never clears. Using ath9k_hw_abortpcurecv helps with getting rx stopped much faster, with this change, I cannot reproduce the rx stop related WARN_ON anymore. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| * \| \|	af_unix: limit unix_tot_inflight	Eric Dumazet	2010-11-24	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Vegard Nossum found a unix socket OOM was possible, posting an exploit program. My analysis is we can eat all LOWMEM memory before unix_gc() being called from unix_release_sock(). Moreover, the thread blocked in unix_gc() can consume huge amount of time to perform cleanup because of huge working set. One way to handle this is to have a sensible limit on unix_tot_inflight, tested from wait_for_unix_gc() and to force a call to unix_gc() if this limit is hit. This solves the OOM and also reduce overall latencies, and should not slowdown normal workloads. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \| \|	Merge branch 'omap-fixes-for-linus' of ↵	Linus Torvalds	2010-11-29	3	-1/+23
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: OMAP2+: PM/serial: hold console semaphore while OMAP UARTs are disabled OMAP: UART: don't resume UARTs that are not enabled.
\| * \| \| \|	OMAP2+: PM/serial: hold console semaphore while OMAP UARTs are disabled	Paul Walmsley	2010-11-25	3	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The console semaphore must be held while the OMAP UART devices are disabled, lest a console write cause an ARM abort (and a kernel crash) when the underlying console device is inaccessible. These crashes only occur when the console is on one of the OMAP internal serial ports. While this problem has been latent in the PM idle loop for some time, the crash was not triggerable with an unmodified kernel until commit 6f251e9db1093c187addc309b5f2f7fe3efd2995 ("OMAP: UART: omap_device conversions, remove implicit 8520 assumptions"). After this patch, a console write often occurs after the console UART has been disabled in the idle loop, crashing the system. Several users have encountered this bug: http://www.mail-archive.com/linux-omap@vger.kernel.org/msg38396.html http://www.mail-archive.com/linux-omap@vger.kernel.org/msg36602.html The same commit also introduced new code that disabled the UARTs during init, in omap_serial_init_port(). The kernel will also crash in this code when earlyconsole and extra debugging is enabled: http://www.mail-archive.com/linux-omap@vger.kernel.org/msg36411.html The minimal fix for the -rc series is to hold the console semaphore while the OMAP UARTs are disabled. This is a somewhat overbroad fix, since the console may not be located on an OMAP UART, as is the case with the GPMC UART on Zoom3. While it is technically possible to determine which devices the console or earlyconsole is actually running on, it is not a trivial problem to solve, and the code to do so is not really appropriate for the -rc series. The right long-term fix is to ensure that no code outside of the OMAP serial driver can disable an OMAP UART. As I understand it, code to implement this is under development by TI. This patch is a collaboration between Paul Walmsley <paul@pwsan.com> and Tony Lindgren <tony@atomide.com>. Thanks to Ming Lei <tom.leiming@gmail.com> and Pramod <pramod.gurav@ti.com> for their feedback on earlier versions of this patch. Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Kevin Hilman <khilman@deeprootsystems.com> Cc: Ming Lei <tom.leiming@gmail.com> Cc: Pramod <pramod.gurav@ti.com> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Jean Pihet <jean.pihet@newoldbits.com> Cc: Govindraj.R <govindraj.raja@ti.com>
\| * \| \| \|	OMAP: UART: don't resume UARTs that are not enabled.	Kevin Hilman	2010-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add additional check to omap_uart_resume_idle() so that only enabled (specifically, idle-enabled) UARTs are allowed to resume. This matches the existing check in prepare idle. Without this patch, the system will hang if a board is configured to register only some uarts instead of all of them and PM is enabled. Cc: Govindraj R. <govindraj.raja@ti.com> Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com> [tony@atomide.com: updated description] Signed-off-by: Tony Lindgren <tony@atomide.com>
* \| \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable	Linus Torvalds	2010-11-29	15	-114/+572
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits) Btrfs: don't use migrate page without CONFIG_MIGRATION Btrfs: deal with DIO bios that span more than one ordered extent Btrfs: setup blank root and fs_info for mount time Btrfs: fix fiemap Btrfs - fix race between btrfs_get_sb() and umount Btrfs: update inode ctime when using links Btrfs: make sure new inode size is ok in fallocate Btrfs: fix typo in fallocate to make it honor actual size Btrfs: avoid NULL pointer deref in try_release_extent_buffer Btrfs: make btrfs_add_nondir take parent inode as an argument Btrfs: hold i_mutex when calling btrfs_log_dentry_safe Btrfs: use dget_parent where we can UPDATED Btrfs: fix more ESTALE problems with NFS Btrfs: handle NFS lookups properly btrfs: make 1-bit signed fileds unsigned btrfs: Show device attr correctly for symlinks btrfs: Set file size correctly in file clone btrfs: Check if dest_offset is block-size aligned before cloning file Btrfs: handle the space_cache option properly btrfs: Fix early enospc because 'unused' calculated with wrong sign. ...