summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* fsnotify: clone existing eventsEric Paris2010-07-282-4/+28
| | | | | | | | | | | fsnotify_clone_event will take an event, clone it, and return the cloned event to the caller. Since events may be in use by multiple fsnotify groups simultaneously certain event entries (such as the mask) cannot be changed after the event was created. Since fanotify would like to merge events happening on the same file it needs a new clean event to work with so it can change any fields it wishes. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: per group notification queue merge typesEric Paris2010-07-284-58/+79
| | | | | | | | | | | inotify only wishes to merge a new event with the last event on the notification fifo. fanotify is willing to merge any events including by means of bitwise OR masks of multiple events together. This patch moves the inotify event merging logic out of the generic fsnotify notification.c and into the inotify code. This allows each use of fsnotify to provide their own merge functionality. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: send struct file when sending events to parents when possibleEric Paris2010-07-283-24/+33
| | | | | | | | | fanotify needs a path in order to open an fd to the object which changed. Currently notifications to inode's parents are done using only the inode. For some parental notification we have the entire file, send that so fanotify can use it. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: pass a file instead of an inode to open, read, and writeEric Paris2010-07-286-18/+20
| | | | | | | | | | fanotify, the upcoming notification system actually needs a struct path so it can do opens in the context of listeners, and it needs a file so it can get f_flags from the original process. Close was the only operation that already was passing a struct file to the notification hook. This patch passes a file for access, modify, and open as well as they are easily available to these hooks. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: include data in should_send callsEric Paris2010-07-286-6/+7
| | | | | | | | | fanotify is going to need to look at file->private_data to know if an event should be sent or not. This passes the data (which might be a file, dentry, inode, or none) to the should_send function calls so fanotify can get that information when available Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: provide the data type to should_send_eventEric Paris2010-07-286-6/+11
| | | | | | | | | | fanotify is only interested in event types which contain enough information to open the original file in the context of the fanotify listener. Since fanotify may not want to send events if that data isn't present we pass the data type to the should_send_event function call so fanotify can express its lack of interest. Signed-off-by: Eric Paris <eparis@redhat.com>
* inotify: do not spam console without limitEric Paris2010-07-281-7/+4
| | | | | | | | | inotify was supposed to have a dmesg printk ratelimitor which would cause inotify to only emit one message per boot. The static bool was never set so it kept firing messages. This patch correctly limits warnings in multiple places. Signed-off-by: Eric Paris <eparis@redhat.com>
* inotify: remove inotify in kernel interfaceEric Paris2010-07-2810-1130/+4
| | | | | | nothing uses inotify in the kernel, drop it! Signed-off-by: Eric Paris <eparis@redhat.com>
* inotify: do not reuse watch descriptorsEric Paris2010-07-281-7/+6
| | | | | | | | | | Prior to 2.6.31 inotify would not reuse watch descriptors until all of them had been used at least once. After the rewrite inotify would reuse watch descriptors. The selinux utility 'restorecond' was found to have problems when watch descriptors were reused. This patch reverts to the pre inotify rewrite behavior to not reuse watch descriptors. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: use kmem_cache_zalloc to simplify event initializationEric Paris2010-07-281-12/+1
| | | | | | | | | fsnotify event initialization is done entry by entry with almost everything set to either 0 or NULL. Use kmem_cache_zalloc and only initialize things that need non-zero initialization. Also means we don't have to change initialization entries based on the config options. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: kzalloc fsnotify groupsEric Paris2010-07-281-1/+1
| | | | | | | Use kzalloc for fsnotify_groups so that none of the fields can leak any information accidentally. Signed-off-by: Eric Paris <eparis@redhat.com>
* inotify: use container_of instead of castingEric Paris2010-07-281-1/+3
| | | | | | | | inotify_free_mark casts directly from an fsnotify_mark_entry to an inotify_inode_mark_entry. This works, but should use container_of instead for future proofing. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: use fsnotify_create_event to allocate the q_overflow eventEric Paris2010-07-281-4/+7
| | | | | | | | | | Currently fsnotify defines a static fsnotify event which is sent when a group overflows its allotted queue length. This patch just allocates that event from the event cache rather than defining it statically. There is no known reason that the current implementation is wrong, but this makes sure the event is initialized and created like any other. Signed-off-by: Eric Paris <eparis@redhat.com>
* Audit: audit watch init should not be before fsnotify initEric Paris2010-07-281-1/+1
| | | | | | | | | | | | Audit watch init and fsnotify init both use subsys_initcall() but since the audit watch code is linked in before the fsnotify code the audit watch code would be using the fsnotify srcu struct before it was initialized. This patch fixes that problem by moving audit watch init to device_initcall() so it happens after fsnotify is ready. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Paris <eparis@redhat.com> Tested-by : Sachin Sant <sachinp@in.ibm.com>
* Audit: split audit watch KconfigEric Paris2010-07-283-4/+21
| | | | | | | | Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select FSNOTIFY. This splits the spagetti like mixing of audit_watch and audit_filter code so they can be configured seperately. Signed-off-by: Eric Paris <eparis@redhat.com>
* Audit: audit watches depend on fsnotifyEric Paris2010-07-281-2/+2
| | | | | | | | CONFIG_AUDIT builds audit_watches which depend on fsnotify. Make CONFIG_AUDIT select fsnotify. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: reimplement audit_trees using fsnotify rather than inotifyEric Paris2010-07-284-109/+136
| | | | | | | Simply switch audit_trees from using inotify to using fsnotify for it's inode pinning and disappearing act information. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: allow addition of duplicate fsnotify marksEric Paris2010-07-285-7/+9
| | | | | | | | | This patch allows a task to add a second fsnotify mark to an inode for the same group. This mark will be added to the end of the inode's list and this will never be found by the stand fsnotify_find_mark() function. This is useful if a user wants to add a new mark before removing the old one. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: duplicate fsnotify_mark_entry data between 2 marksEric Paris2010-07-282-1/+11
| | | | | | | Simple copy fsnotify information from one mark to another in preparation for the second mark to replace the first. Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: do not get and put just to free a watchEric Paris2010-07-283-31/+5
| | | | | | | | | | deleting audit watch rules is not currently done under audit_filter_mutex. It was done this way because we could not hold the mutex during inotify manipulation. Since we are using fsnotify we don't need to do the extra get/put pair nor do we need the private list on which to store the parents while they are about to be freed. Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: redo audit watch locking and refcnt in light of fsnotifyEric Paris2010-07-281-40/+5
| | | | | | | | fsnotify can handle mutexes to be held across all fsnotify operations since it deals strickly in spinlocks. This can simplify and reduce some of the audit_filter_mutex taking and dropping. Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: convert audit watches to use fsnotify instead of inotifyEric Paris2010-07-282-61/+152
| | | | | | | | Audit currently uses inotify to pin inodes in core and to detect when watched inodes are deleted or unmounted. This patch uses fsnotify instead of inotify. Signed-off-by: Eric Paris <eparis@redhat.com>
* Audit: clean up the audit_watch splitEric Paris2010-07-285-67/+60
| | | | | | | | No real changes, just cleanup to the audit_watch split patch which we done with minimal code changes for easy review. Now fix interfaces to make things work better. Signed-off-by: Eric Paris <eparis@redhat.com>
* inotify: simplify the inotify idr handlingEric Paris2010-07-281-55/+139
| | | | | | | | This patch moves all of the idr editing operations into their own idr functions. It makes it easier to prove locking correctness and to to understand the code flow. Signed-off-by: Eric Paris <eparis@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-07-271-1/+1
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: Pass the correct end of buffer to p9stat_read
| * 9p: Pass the correct end of buffer to p9stat_readLatchesar Ionkov2010-07-271-1/+1
| | | | | | | | | | | | | | Pass the correct end of the buffer to p9stat_read. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* | gpio: fix spurious printk when freeing a gpioJon Povey2010-07-271-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing a gpio that has not been exported, gpio_unexport() prints a debug message when it should just fall through silently. Example spurious message: gpio_unexport: gpio0 status -22 Signed-off-by: Jon Povey <jon.povey@racelogic.co.uk> Cc: David Brownell <david-b@pacbell.net> Acked-by: Uwe Kleine-K?nig <u.kleine-koenig@pengutronix.de> Cc: Gregory Bean <gbean@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | edac: mpc85xx: fix coldplug/hotplug module autoloadingAnton Vorontsov2010-07-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The MPC85xx EDAC driver is missing module device aliases, so the driver won't load automatically on boot. This patch fixes the issue by adding proper MODULE_DEVICE_TABLE() macros. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | drivers/rtc/rtc-rx8581.c: fix setdatetimeRudolf Marek2010-07-271-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the logic while writing new date/time to the chip. The driver incorrectly wrote back register values to different registers and even with wrong mask. The patch adds clearing of the VLF register, which should be cleared if all date/time values are set. Signed-off-by: Rudolf Marek <rudolf.marek@sysgo.com> Acked-by: Wan ZongShun <mcuos.com@gmail.com> Cc: Martyn Welch <martyn.welch@gefanuc.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dynamic debug: move ddebug_remove_module() down into free_module()Jason Baron2010-07-271-1/+3
|/ | | | | | | | | | | | | | | | | | | The command echo "file ec.c +p" >/sys/kernel/debug/dynamic_debug/control causes an oops. Move the call to ddebug_remove_module() down into free_module(). In this way it should be called from all error paths. Currently, we are missing the remove if the module init routine fails. Signed-off-by: Jason Baron <jbaron@redhat.com> Reported-by: Thomas Renninger <trenn@suse.de> Tested-by: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> [2.6.32+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'urgent' of ↵Linus Torvalds2010-07-271-3/+3
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perf * 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perf: perf, powerpc: Use perf_sample_data_init() for the FSL code
| * perf, powerpc: Use perf_sample_data_init() for the FSL codePeter Zijlstra2010-07-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should use perf_sample_data_init() to initialize struct perf_sample_data. As explained in the description of commit dc1d628a ("perf: Provide generic perf_sample_data initialization"), it is possible for userspace to get the kernel to dereference data.raw, so if it is not initialized, that means that unprivileged userspace can possibly oops the kernel. Using perf_sample_data_init makes sure it gets initialized to NULL. This conversion should have been included in commit dc1d628a, but it got missed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | Merge git://git.infradead.org/users/cbou/battery-2.6.35Linus Torvalds2010-07-271-15/+14
|\ \ | | | | | | | | | | | | * git://git.infradead.org/users/cbou/battery-2.6.35: ds2782_battery: Rename get_current to fix build failure / name conflict
| * | ds2782_battery: Rename get_current to fix build failure / name conflictPeter Huewe2010-06-141-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the name of get_current function pointer to get_battery_current to resolve a name conflict with the get_current macro defined in current.h. This conflict resulted in a build-failure[1] for the sh4 arch allyesconfig: drivers/power/ds2782_battery.c:216:48: error: macro "get_current" passed 2 arguments, but takes just This patch fixes the issue. To be consistent the other function pointers (_voltage,_capacity) were renamed too. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Acked-by: Ryan Mallon <ryan@bluewatersys.com> Acked-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2010-07-2719-36/+151
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: s2io: fixing DBG_PRINT() macro ath9k: fix dma direction for map/unmap in ath_rx_tasklet net: dev_forward_skb should call nf_reset net sched: fix race in mirred device removal tun: avoid BUG, dump packet on GSO errors bonding: set device in RLB ARP packet handler wimax/i2400m: Add PID & VID for Intel WiMAX 6250 ipv6: Don't add routes to ipv6 disabled interfaces. net: Fix skb_copy_expand() handling of ->csum_start net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c macvtap: Limit packet queue length ixgbe/igb: catch invalid VF settings bnx2x: Advance a module version bnx2x: Protect statistics ramrod and sequence number bnx2x: Protect a SM state change wireless: use netif_rx_ni in ieee80211_send_layer2_update
| * | s2io: fixing DBG_PRINT() macroBreno Leitao2010-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch 9e39f7c5b311a306977c5471f9e2ce4c456aa038 changed the DBG_PRINT() macro and the if clause was wrongly changed. It means that currently all the DBG_PRINT are being printed, flooding the kernel log buffer with things like: s2io: eth6: Next block at: c0000000b9c90000 s2io: eth6: In Neterion Tx routine Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Acked-by: Sreenivasa Honnur <Sreenivasa.Honnur@neterion.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2010-07-262-3/+3
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * | ath9k: fix dma direction for map/unmap in ath_rx_taskletMing Lei2010-07-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For edma, we should use DMA_BIDIRECTIONAL, or else use DMA_FROM_DEVICE. This is found to address "BUG at arch/x86/mm/physaddr.c:5" as described here: http://lkml.org/lkml/2010/7/14/21 Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | wireless: use netif_rx_ni in ieee80211_send_layer2_updateJohn W. Linville2010-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These synthetic frames are all triggered from userland requests in process context. https://bugzilla.kernel.org/show_bug.cgi?id=16412 Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | | net: dev_forward_skb should call nf_resetBen Greear2010-07-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With conn-track zones and probably with different network namespaces, the netfilter logic needs to be re-calculated on packet receive. If the netfilter logic is not reset, it will not be recalculated properly. This patch adds the nf_reset logic to dev_forward_skb. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net sched: fix race in mirred device removalstephen hemminger2010-07-252-3/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes hang when target device of mirred packet classifier action is removed. If a mirror or redirection action is configured to cause packets to go to another device, the classifier holds a ref count, but was assuming the adminstrator cleaned up all redirections before removing. The fix is to add a notifier and cleanup during unregister. The new list is implicitly protected by RTNL mutex because it is held during filter add/delete as well as notifier. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'wimax-2.6.35.y' of ↵David S. Miller2010-07-252-0/+3
| |\ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax
| | * | | wimax/i2400m: Add PID & VID for Intel WiMAX 6250Alexey Shvetsov2010-07-222-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This version of intel wimax device was found in my IBM ThinkPad x201 Signed-off-by: Alexey Shvetsov <alexxy@gentoo.org>
| * | | | tun: avoid BUG, dump packet on GSO errorsMichael S. Tsirkin2010-07-251-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are still some LRO cards that cause GSO errors in tun, and BUG on this is an unfriendly way to tell the admin to disable LRO. Further, experience shows we might have more GSO bugs lurking. See https://bugzilla.kernel.org/show_bug.cgi?id=16413 as a recent example. dumping a packet will make it easier to figure it out. Replace BUG with warning+dump+drop the packet to make GSO errors in tun less critical and easier to debug. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Alex Unigovsky <unik@compot.ru> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | bonding: set device in RLB ARP packet handlerGreg Edwards2010-07-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After: commit 6146b1a4da98377e4abddc91ba5856bef8f23f1e Author: Jay Vosburgh <fubar@us.ibm.com> Date: Tue Nov 4 17:51:15 2008 -0800 bonding: Fix ALB mode to balance traffic on VLANs the dev field in the RLB ARP packet handler was set to NULL to wildcard and accommodate balancing VLANs on top of bonds. This has the side-effect of the packet handler being called against other, non RLB-enabled bonds, and a kernel oops results when it tries to dereference rx_hashtbl in rlb_update_entry_from_arp(), which won't be set for those bonds, e.g. active-backup. With the __netif_receive_skb() changes from: commit 1f3c8804acba841b5573b953f5560d2683d2db0d Author: Andy Gospodarek <andy@greyhouse.net> Date: Mon Dec 14 10:48:58 2009 +0000 bonding: allow arp_ip_targets on separate vlans to use arp validation frames received on VLANs correctly make their way to the bond's handler, so we no longer need to wildcard the device. The oops can be reproduced by: modprobe bonding echo active-backup > /sys/class/net/bond0/bonding/mode echo 100 > /sys/class/net/bond0/bonding/miimon ifconfig bond0 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx echo +eth0 > /sys/class/net/bond0/bonding/slaves echo +eth1 > /sys/class/net/bond0/bonding/slaves echo +bond1 > /sys/class/net/bonding_masters echo balance-alb > /sys/class/net/bond1/bonding/mode echo 100 > /sys/class/net/bond1/bonding/miimon ifconfig bond1 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx echo +eth2 > /sys/class/net/bond1/bonding/slaves echo +eth3 > /sys/class/net/bond1/bonding/slaves Pass some traffic on bond0. Boom. [ Tested, behaves as advertised. I do not believe a test of the bonding mode is necessary, as there is no race between the packet handler and the bonding mode changing (the mode can only change when the device is closed). Also updated the log message to include the reproduction and full commit ids. -J ] Signed-off-by: Greg Edwards <greg.edwards@hp.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ipv6: Don't add routes to ipv6 disabled interfaces.Brian Haley2010-07-221-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the interface has IPv6 disabled, don't add a multicast or link-local route since we won't be adding a link-local address. Reported-by: Mahesh Kelkar <maheshkelkar@gmail.com> Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Fix skb_copy_expand() handling of ->csum_startDavid S. Miller2010-07-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It should only be adjusted if ip_summed == CHECKSUM_PARTIAL. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.cAndrea Shepard2010-07-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make pskb_expand_head() check ip_summed to make sure csum_start is really csum_start and not csum before adjusting it. This fixes a bug I encountered using a Sun Quad-Fast Ethernet card and VLANs. On my configuration, the sunhme driver produces skbs with differing amounts of headroom on receive depending on the packet size. See line 2030 of drivers/net/sunhme.c; packets smaller than RX_COPY_THRESHOLD have 52 bytes of headroom but packets larger than that cutoff have only 20 bytes. When these packets reach the VLAN driver, vlan_check_reorder_header() calls skb_cow(), which, if the packet has less than NET_SKB_PAD (== 32) bytes of headroom, uses pskb_expand_head() to make more. Then, pskb_expand_head() needs to adjust a lot of offsets into the skb, including csum_start. Since csum_start is a union with csum, if the packet has a valid csum value this will corrupt it, which was the effect I observed. The sunhme hardware computes receive checksums, so the skbs would be created by the driver with ip_summed == CHECKSUM_COMPLETE and a valid csum field, and then pskb_expand_head() would corrupt the csum field, leading to an "hw csum error" message later on, for example in icmp_rcv() for pings larger than the sunhme RX_COPY_THRESHOLD. On the basis of the comment at the beginning of include/linux/skbuff.h, I believe that the csum_start skb field is only meaningful if ip_csummed is CSUM_PARTIAL, so this patch makes pskb_expand_head() adjust it only in that case to avoid corrupting a valid csum value. Please see my more in-depth disucssion of tracking down this bug for more details if you like: http://puellavulnerata.livejournal.com/112186.html http://puellavulnerata.livejournal.com/112567.html http://puellavulnerata.livejournal.com/112891.html http://puellavulnerata.livejournal.com/113096.html http://puellavulnerata.livejournal.com/113591.html I am not subscribed to this list, so please CC me on replies. Signed-off-by: Andrea Shepard <andrea@persephoneslair.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | macvtap: Limit packet queue lengthHerbert Xu2010-07-223-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark Wagner reported OOM symptoms when sending UDP traffic over a macvtap link to a kvm receiver. This appears to be caused by the fact that macvtap packet queues are unlimited in length. This means that if the receiver can't keep up with the rate of flow, then we will hit OOM. Of course it gets worse if the OOM killer then decides to kill the receiver. This patch imposes a cap on the packet queue length, in the same way as the tuntap driver, using the device TX queue length. Please note that macvtap currently has no way of giving congestion notification, that means the software device TX queue cannot be used and packets will always be dropped once the macvtap driver queue fills up. This shouldn't be a great problem for the scenario where macvtap is used to feed a kvm receiver, as the traffic is most likely external in origin so congestion notification can't be applied anyway. Of course, if anybody decides to complain about guest-to-guest UDP packet loss down the track, then we may have to revisit this. Incidentally, this patch also fixes a real memory leak when macvtap_get_queue fails. Chris Wright noticed that for this patch to work, we need a non-zero TX queue length. This patch includes his work to change the default macvtap TX queue length to 500. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ixgbe/igb: catch invalid VF settingsAndy Gospodarek2010-07-212-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some ixgbe cards put an invalid VF device ID in the PCIe SR-IOV capability. The ixgbe driver is only valid for PFs or non SR-IOV hardware. It seems that the same problem could occur on igb hardware as well, so if we discover we are trying to initialize a VF in ixbge_probe or igb_probe, print an error and exit. Based on a patch for ixgbe from Chris Wright <chrisw@sous-sol.org>. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Cc: Chris Wright <chrisw@sous-sol.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>