summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* [JFFS2][XATTR] XATTR support on JFFS2 (version. 5)KaiGai Kohei2006-05-1329-15/+2800
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This attached patches provide xattr support including POSIX-ACL and SELinux support on JFFS2 (version.5). There are some significant differences from previous version posted at last December. The biggest change is addition of EBS(Erase Block Summary) support. Currently, both kernel and usermode utility (sumtool) can recognize xattr nodes which have JFFS2_NODETYPE_XATTR/_XREF nodetype. In addition, some bugs are fixed. - A potential race condition was fixed. - Unexpected fail when updating a xattr by same name/value pair was fixed. - A bug when removing xattr name/value pair was fixed. The fundamental structures (such as using two new nodetypes and exclusion mechanism by rwsem) are unchanged. But most of implementation were reviewed and updated if necessary. Espacially, we had to change several internal implementations related to load_xattr_datum() to avoid a potential race condition. [1/2] xattr_on_jffs2.kernel.version-5.patch [2/2] xattr_on_jffs2.utils.version-5.patch Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Trivial typo fixes in Kconfig files (MTD).Egry Gábor2006-05-123-4/+4
| | | | | Signed-off-by: Egry Gábor <gaboregry@t-online.hu> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* OneNAND: fix block command typoKyungmin Park2006-05-121-1/+1
| | | | | | We need to check block cmd only instead with comparing with cmd Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* OneNAND: One-Time Programmable (OTP) supportKyungmin Park2006-05-125-4/+335
| | | | | | | | | | | | | | One Block of the NAND Flash Array memory is reserved as a One-Time Programmable Block memory area. Also, 1st Block of NAND Flash Array can be used as OTP. The OTP block can be read, programmed and locked using the same operations as any other NAND Flash Array memory block. OTP block cannot be erased. OTP block is fully-guaranteed to be a valid block. Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* OneNAND: Handle erase correctly in Double Density Package (DDP)Kyungmin Park2006-05-121-0/+6
| | | | | | | There's erase bug in DDP. We need to add DDP select in erase Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* OneNAND: Write oob area with aligned size, mtd->oobsizeKyungmin Park2006-05-121-2/+5
| | | | | | | | There's some problem with write oob in serveral platform. So we write oob with oobsize aligned (16bytes) instead of 3 bytes (from {2, 3}) Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* OneNAND: Add write_oob verify functionKyungmin Park2006-05-121-4/+43
| | | | Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com>
* OneNand: Fix free byte positions.Jarkko Lavinen2006-05-121-1/+2
| | | | | | | | Some free byte positions at onenand_oob_64 were wrong. This was also reported by Christian Lehne. 3 byte slots are at 2+16*i and 2 byte slots at 14+16*i. Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com>
* OneNAND: handle byte access on BufferRAMKyungmin Park2006-05-122-0/+41
| | | | Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* OneNAND: Add touch_softlock_watchdog()Kyungmin Park2006-05-121-0/+1
| | | | Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
* [JFFS2] Remove number of pointer dereferences in fs/jffs2/summary.cJesper Juhl2006-05-121-19/+19
| | | | | | | | | | | | | | | Reduce the nr. of pointer dereferences in fs/jffs2/summary.c Benefits: - micro speed optimization due to fewer pointer derefs - generated code is slightly smaller - better readability (The first two sound like a compiler problem but I'll go with the third. dwmw2). Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [MTD] Fix invalid default value of CONFIG_MTD_PCMCIA_ANONYMOUS in KconfigJean-Luc Leger2006-05-121-1/+0
| | | | | | | | | | Default values for boolean and tristate options can only be 'y', 'm' or 'n'. This patch removes wrong default for MTD_PCMCIA_ANONYMOUS. Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [JFFS2] Remove obsolete histo.hDomen Puncer2006-05-121-3/+0
| | | | | | | | | | | This file hasn't actually been used since the very early days of JFFS2 when Arjan was playing with compression methods. It can go now. Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [MTD] Fix capitalisation in export of old doc2001.c initfuncDavid Woodhouse2006-05-121-1/+1
| | | | | | | | | | | | | Oops. Stupid StudlyCaps. Again. This driver is doubly-deprecated because is was subsumed into doc2000.c and _also_ we want people to start using the new NAND wrapper for these devices anyway. But ISTR there was still one person using it because something didn't work for them. Must chase that up and then I can kill this. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [MTD] Basic NAND driver for AMD/NatSemi CS5535/CS5536 Geode companion chipDavid Woodhouse2006-05-113-0/+298
| | | | | | | This lacks hardware ECC support and a few optimisations we're going to want fairly soon, but it works well enough to mount and use JFFS2. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [MTD] Fix capitalisation in export of DiskOnChip Millennium initfuncDavid Woodhouse2006-05-101-1/+1
| | | | | | Stupid StudlyCaps. Who did that? Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Export cfi_cmdset_0020 and cfi_cmdset_0002 with EXPORT_SYMBOL_GPLDavid Woodhouse2006-05-082-1/+2
| | | | Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Finally remove the obnoxious inter_module_xxx()David Woodhouse2006-05-084-197/+0
| | | | | | | This was already a bad plan when I argued against adding it in the first place. Good riddance. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Remove use of inter_module_crap in NOR flash chip drivers.David Woodhouse2006-05-087-93/+20
| | | | Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Fix non-modular case for DiskOnChip probeDavid Woodhouse2006-05-081-4/+31
| | | | Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Remove inter_module_xxx() from DiskOnChip drivers.David Woodhouse2006-05-086-64/+18
| | | | | | Finally putting it back how it was before Keith got at it -- yay :) Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [MTD] Convert physmap to platform driverLennert Buytenhek2006-05-073-83/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | After dwmw2 let me know it ought to be done, I rewrote the physmap map driver to be a platform driver. I know zilch about the driver model, so I probably botched it in some way, but I've done some tests on an ixp23xx board which uses physmap, and it all seems to work. In order to not break existing physmap users, I've added some compat code that will instantiate a platform device iff CONFIG_MTD_PHYSMAP_LEN is defined and != 0. Also, I've changed the default value for CONFIG_MTD_PHYSMAP_LEN to zero, so that people who inadvertently compile in physmap (or new, platform-style, users of physmap) don't get burned. This works pretty well -- the new physmap driver is a drop-in replacement for the old one, and works on said ixp23xx board without any code changes needed. (This should hold as long as users don't touch 'physmap_map' directly.) Once all physmap users have been converted to instantiate their own platform devices, the compat code can go. (Or we decide that we can change all the in-tree users at the same time, and never merge the compat code.) Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [JFFS2] Fix race in setting file attributesDmitry Bazhenov2006-05-051-1/+6
| | | | | | | | | | | It seems like there is a potential race in the function jffs2_do_setattr() in the case when attributes of a symlink are updated. The symlink metadata is read without having f->sem locked. The following patch should fix the race. Signed-off-by: Dmitry Bazhenov <atrey@emcraft.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* Merge branch 'master' of ↵David Woodhouse2006-05-03892-12904/+21258
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
| * [NETFILTER] SCTP conntrack: fix infinite loopPatrick McHardy2006-05-032-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | fix infinite loop in the SCTP-netfilter code: check SCTP chunk size to guarantee progress of for_each_sctp_chunk(). (all other uses of for_each_sctp_chunk() are preceded by do_basic_checks(), so this fix should be complete.) Based on patch from Ingo Molnar <mingo@elte.hu> CVE-2006-1527 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * forcedeth: fix multi irq issuesAyaz Abdulla2006-05-021-84/+228
| | | | | | | | | | | | | | | | | | | | | | This patch fixes the issues with multiple irqs. I am resending based on feedback. I decoupled the dma mask for consistent memory and fixed leak with multiple irq in error path. Thanks to Manfred for catching the spin lock problem. Signed-Off-By: Ayaz Abdulla <aabdulla@nvidia.com>
| * [PATCH] via-rhine: zero pad short packets on Rhine I ethernet cardsCraig Brind2006-05-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes Rhine I cards disclosing fragments of previously transmitted frames in new transmissions. Before transmission, any socket buffer (skb) shorter than the ethernet minimum length of 60 bytes was zero-padded. On Rhine I cards the data can later be copied into an aligned transmission buffer without copying this padding. This resulted in the transmission of the frame with the extra bytes beyond the provided content leaking the previous contents of this buffer on to the network. Now zero-padding is repeated in the local aligned buffer if one is used. Following a suggestion from the via-rhine maintainer, no attempt is made here to avoid the duplicated effort of padding the skb if it is known that an aligned buffer will definitely be used. This is to make the change "obviously correct" and allow it to be applied to a stable kernel if necessary. There is no change to the flow of control and the changes are only to the Rhine I code path. The patch has run on an in-service Rhine-I host without incident. Frames shorter than 60 bytes are now correctly zero-padded when captured on a separate host. I see no unusual stats reported by ifconfig, and no unusual log messages. Signed-off-by: Craig Brind <craigbrind@gmail.com> Signed-off-by: Roger Luethi <rl@hellgate.ch> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * [PATCH] mv643xx_eth: provide sysfs class device symlinkOlaf Hering2006-05-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Sat, Mar 11, Olaf Hering wrote: > Why is the /sys/class/net/eth0/device symlink not created for the > mv643xx_eth driver? Does this work for other platform device drivers? > Seems to work for the ps2 keyboard at least. The SET_NETDEV_DEV has to be done before a call to register_netdev. With the new patch below, the device symlink for the platform device was created. Unfortunately, after the 4 ls commands, the network connection died. No idea if the box crashed or if something else broke, lost remote access. Provide sysfs 'device' in /class/net/ethN Also, set module owner field, like pcnet32 driver does. Signed-off-by: Olaf Hering <olh@suse.de> Acked-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * [PATCH] vmsplice: restrict stealing a little moreJens Axboe2006-05-023-4/+5
| | | | | | | | | | | | | | Apply the same rules as the anon pipe pages, only allow stealing if no one else is using the page. Signed-off-by: Jens Axboe <axboe@suse.de>
| * [PATCH] splice: fix page LRU accountingJens Axboe2006-05-022-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | Currently we rely on the PIPE_BUF_FLAG_LRU flag being set correctly to know whether we need to fiddle with page LRU state after stealing it, however for some origins we just don't know if the page is on the LRU list or not. So remove PIPE_BUF_FLAG_LRU and do this check/add manually in pipe_to_file() instead. Signed-off-by: Jens Axboe <axboe@suse.de>
| * [PATCH] vmsplice: fix badly placed end paranthesisJens Axboe2006-05-021-1/+1
| | | | | | | | | | | | | | | | We need to use the minium of {len, PAGE_SIZE-off}, not {len, PAGE_SIZE}-off. The latter doesn't make any sense, and could cause us to attempt negative length transfers... Signed-off-by: Jens Axboe <axboe@suse.de>
| * Merge branch 'audit.b10' of ↵Linus Torvalds2006-05-0233-275/+1142
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] Audit Filter Performance [PATCH] Rework of IPC auditing [PATCH] More user space subject labels [PATCH] Reworked patch for labels on user space messages [PATCH] change lspp ipc auditing [PATCH] audit inode patch [PATCH] support for context based audit filtering, part 2 [PATCH] support for context based audit filtering [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit() [PATCH] drop task argument of audit_syscall_{entry,exit} [PATCH] drop gfp_mask in audit_log_exit() [PATCH] move call of audit_free() into do_exit() [PATCH] sockaddr patch [PATCH] deal with deadlocks in audit_free()
| | * [PATCH] Audit Filter PerformanceSteve Grubb2006-05-011-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing the watch performance, I noticed that selinux_task_ctxid() was creeping into the results more than it should. Investigation showed that the function call was being called whether it was needed or not. The below patch fixes this. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] Rework of IPC auditingSteve Grubb2006-05-016-11/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) The audit_ipc_perms() function has been split into two different functions: - audit_ipc_obj() - audit_ipc_set_perm() There's a key shift here... The audit_ipc_obj() collects the uid, gid, mode, and SElinux context label of the current ipc object. This audit_ipc_obj() hook is now found in several places. Most notably, it is hooked in ipcperms(), which is called in various places around the ipc code permforming a MAC check. Additionally there are several places where *checkid() is used to validate that an operation is being performed on a valid object while not necessarily having a nearby ipcperms() call. In these locations, audit_ipc_obj() is called to ensure that the information is captured by the audit system. The audit_set_new_perm() function is called any time the permissions on the ipc object changes. In this case, the NEW permissions are recorded (and note that an audit_ipc_obj() call exists just a few lines before each instance). 2) Support for an AUDIT_IPC_SET_PERM audit message type. This allows for separate auxiliary audit records for normal operations on an IPC object and permissions changes. Note that the same struct audit_aux_data_ipcctl is used and populated, however there are separate audit_log_format statements based on the type of the message. Finally, the AUDIT_IPC block of code in audit_free_aux() was extended to handle aux messages of this new type. No more mem leaks I hope ;-) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] More user space subject labelsSteve Grubb2006-05-014-40/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi, The patch below builds upon the patch sent earlier and adds subject label to all audit events generated via the netlink interface. It also cleans up a few other minor things. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] Reworked patch for labels on user space messagesSteve Grubb2006-05-015-3/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The below patch should be applied after the inode and ipc sid patches. This patch is a reworking of Tim's patch that has been updated to match the inode and ipc patches since its similar. [updated: > Stephen Smalley also wanted to change a variable from isec to tsec in the > user sid patch. ] Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] change lspp ipc auditingSteve Grubb2006-05-016-77/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi, The patch below converts IPC auditing to collect sid's and convert to context string only if it needs to output an audit record. This patch depends on the inode audit change patch already being applied. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] audit inode patchSteve Grubb2006-05-013-37/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we were gathering the context instead of the sid. Now in this patch, we gather just the sid and convert to context only if an audit event is being output. This patch brings the performance hit from 146% down to 23% Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] support for context based audit filtering, part 2Darrel Goeddel2006-05-014-27/+256
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides the ability to filter audit messages based on the elements of the process' SELinux context (user, role, type, mls sensitivity, and mls clearance). It uses the new interfaces from selinux to opaquely store information related to the selinux context and to filter based on that information. It also uses the callback mechanism provided by selinux to refresh the information when a new policy is loaded. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] support for context based audit filteringDarrel Goeddel2006-05-018-10/+419
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following patch provides selinux interfaces that will allow the audit system to perform filtering based on the process context (user, role, type, sensitivity, and clearance). These interfaces will allow the selinux module to perform efficient matches based on lower level selinux constructs, rather than relying on context retrievals and string comparisons within the audit module. It also allows for dominance checks on the mls portion of the contexts that are impossible with only string comparisons. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] no need to wank with task_lock() and pinning task down in ↵Al Viro2006-05-011-9/+1
| | | | | | | | | | | | | | | | | | audit_syscall_exit() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] drop task argument of audit_syscall_{entry,exit}Al Viro2006-05-0111-33/+27
| | | | | | | | | | | | | | | | | | ... it's always current, and that's a good thing - allows simpler locking. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] drop gfp_mask in audit_log_exit()Al Viro2006-05-011-30/+32
| | | | | | | | | | | | | | | | | | | | | now we can do that - all callers are process-synchronous and do not hold any locks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] move call of audit_free() into do_exit()Al Viro2006-05-013-10/+4
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] sockaddr patchSteve Grubb2006-05-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Thursday 23 March 2006 09:08, John D. Ramsdell wrote: > I noticed that a socketcall(bind) and socketcall(connect) event contain a > record of type=SOCKADDR, but I cannot see one for a system call event > associated with socketcall(accept). Recording the sockaddr of an accepted > socket is important for cross platform information flow analys Thanks for pointing this out. The following patch should address this. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * [PATCH] deal with deadlocks in audit_free()Al Viro2006-05-011-10/+10
| | | | | | | | | | | | | | | | | | | | | Don't assume that audit_log_exit() et.al. are called for the context of current; pass task explictly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | [NETFILTER] x_tables: fix compat related crash on non-x86Patrick McHardy2006-05-022-19/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When iptables userspace adds an ipt_standard_target, it calculates the size of the entire entry as: sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target)) ipt_standard_target looks like this: struct xt_standard_target { struct xt_entry_target target; int verdict; }; xt_entry_target contains a pointer, so when compiled for 64 bit the structure gets an extra 4 byte of padding at the end. On 32 bit architectures where iptables aligns to 8 byte it will also have 4 byte padding at the end because it is only 36 bytes large. The compat_ipt_standard_fn in the kernel adjusts the offsets by sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target), which will always result in 4, even if the structure from userspace was already padded to a multiple of 8. On x86 this works out by accident because userspace only aligns to 4, on all other architectures this is broken and causes incorrect adjustments to the size and following offsets. Thanks to Linus for lots of debugging help and testing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds2006-05-024-126/+255
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] vmsplice: allow user to pass in gift pages [PATCH] pipe: enable atomic copying of pipe data to/from user space [PATCH] splice: call handle_ra_miss() on failure to lookup page [PATCH] Add ->splice_read/splice_write to def_blk_fops [PATCH] pipe: introduce ->pin() buffer operation [PATCH] splice: fix bugs in pipe_to_file() [PATCH] splice: fix bugs with stealing regular pipe pages
| | * | [PATCH] vmsplice: allow user to pass in gift pagesJens Axboe2006-05-012-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If SPLICE_F_GIFT is set, the user is basically giving this pages away to the kernel. That means we can steal them for eg page cache uses instead of copying it. The data must be properly page aligned and also a multiple of the page size in length. Signed-off-by: Jens Axboe <axboe@suse.de>
| | * | [PATCH] pipe: enable atomic copying of pipe data to/from user spaceJens Axboe2006-05-013-30/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pipe ->map() method uses kmap() to virtually map the pages, which is both slow and has known scalability issues on SMP. This patch enables atomic copying of pipe pages, by pre-faulting data and using kmap_atomic() instead. lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here are results from that on a UP machine with highmem (1.5GiB of RAM), running first a UP kernel, SMP kernel, and SMP kernel patched. Vanilla-UP: Pipe bandwidth: 1622.28 MB/sec Pipe bandwidth: 1610.59 MB/sec Pipe bandwidth: 1608.30 MB/sec Pipe latency: 7.3275 microseconds Pipe latency: 7.2995 microseconds Pipe latency: 7.3097 microseconds Vanilla-SMP: Pipe bandwidth: 1382.19 MB/sec Pipe bandwidth: 1317.27 MB/sec Pipe bandwidth: 1355.61 MB/sec Pipe latency: 9.6402 microseconds Pipe latency: 9.6696 microseconds Pipe latency: 9.6153 microseconds Patched-SMP: Pipe bandwidth: 1578.70 MB/sec Pipe bandwidth: 1579.95 MB/sec Pipe bandwidth: 1578.63 MB/sec Pipe latency: 9.1654 microseconds Pipe latency: 9.2266 microseconds Pipe latency: 9.1527 microseconds Signed-off-by: Jens Axboe <axboe@suse.de>