linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	genirq / PM: describe IRQF_COND_SUSPEND	Mark Rutland	2015-03-06	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	With certain restrictions it is possible for a wakeup device to share an IRQ with an IRQF_NO_SUSPEND user, and the warnings introduced by commit cab303be91dc47942bc25de33dc1140123540800 are spurious. The new IRQF_COND_SUSPEND flag allows drivers to tell the core when these restrictions are met, allowing spurious warnings to be silenced. This patch documents how IRQF_COND_SUSPEND is expected to be used, updating some of the text now made invalid by its addition. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	tty: serial: atmel: rework interrupt and wakeup handling	Boris BREZILLON	2015-03-06	1	-4/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRQ line connected to the DBGU UART is often shared with a timer device which request the IRQ with IRQF_NO_SUSPEND. Since the UART driver is correctly disabling IRQs when entering suspend we can safely request the IRQ with IRQF_COND_SUSPEND so that irq core will not complain about mixing IRQF_NO_SUSPEND and !IRQF_NO_SUSPEND. Rework the interrupt handler to wake the system up when an interrupt happens on the DEBUG_UART while the system is suspended. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND	Boris BREZILLON	2015-03-06	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The watchdog interrupt (only used when activating software watchdog) shouldn't be suspended when entering suspend mode, because it is shared with a timer device (which request the line with IRQF_NO_SUSPEND) and once the watchdog "Mode Register" has been written, it cannot be changed (which means we cannot disable the watchdog interrupt when entering suspend). Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	clk: at91: implement suspend/resume for the PMC irqchip	Boris BREZILLON	2015-03-04	2	-1/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The irq line used by the PMC block is shared with several peripherals including the init timer which is registering its handler with IRQF_NO_SUSPEND. Implement the appropriate suspend/resume callback for the PMC irqchip, and inform irq core that PMC irq handler can be safely called while the system is suspended by setting IRQF_COND_SUSPEND. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	rtc: at91rm9200: rework wakeup and interrupt handling	Boris BREZILLON	2015-03-04	1	-14/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRQ line used by the RTC device is usually shared with the system timer (PIT) on at91 platforms. Since timers are registering their handlers with IRQF_NO_SUSPEND, we should expect being called in suspended state, and properly wake the system up when this is the case. Set IRQF_COND_SUSPEND flag when registering the IRQ handler to inform irq core that it can safely be called while the system is suspended. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	rtc: at91sam9: rework wakeup and interrupt handling	Boris BREZILLON	2015-03-04	1	-12/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRQ line used by the RTC device is usually shared with the system timer (PIT) on at91 platforms. Since timers are registering their handlers with IRQF_NO_SUSPEND, we should expect being called in suspended state, and properly wake the system up when this is the case. Set IRQF_COND_SUSPEND flag when registering the IRQ handler to inform irq core that it can safely be called while the system is suspended. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	PM / wakeup: export pm_system_wakeup symbol	Boris BREZILLON	2015-03-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Export pm_system_wakeup function to allow irq handlers to deal with system wakeup. This is needed for shared IRQ lines where one of the handler is registered with IRQF_NO_SUSPEND, while the other ones want to configure it as a wakeup source. In this specific case, irq core does not handle the wakeup process and leave the decision to each irq handler. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	genirq / PM: Add flag for shared NO_SUSPEND interrupt lines	Rafael J. Wysocki	2015-03-04	4	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It currently is required that all users of NO_SUSPEND interrupt lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the WARN_ON_ONCE() in irq_pm_install_action() will trigger. That is done to warn about situations in which unprepared interrupt handlers may be run unnecessarily for suspended devices and may attempt to access those devices by mistake. However, it may cause drivers that have no technical reasons for using IRQF_NO_SUSPEND to set that flag just because they happen to share the interrupt line with something like a timer. Moreover, the generic handling of wakeup interrupts introduced by commit 9ce7a25849e8 (genirq: Simplify wakeup mechanism) only works for IRQs without any NO_SUSPEND users, so the drivers of wakeup devices needing to use shared NO_SUSPEND interrupt lines for signaling system wakeup generally have to detect wakeup in their interrupt handlers. Thus if they happen to share an interrupt line with a NO_SUSPEND user, they also need to request that their interrupt handlers be run after suspend_device_irqs(). In both cases the reason for using IRQF_NO_SUSPEND is not because the driver in question has a genuine need to run its interrupt handler after suspend_device_irqs(), but because it happens to share the line with some other NO_SUSPEND user. Otherwise, the driver would do without IRQF_NO_SUSPEND just fine. To make it possible to specify that condition explicitly, introduce a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND, that, when set, will indicate to the IRQ core that the interrupt user is generally fine with suspending the IRQ, but it also can tolerate handler invocations after suspend_device_irqs() and, in particular, it is capable of detecting system wakeup and triggering it as appropriate from its interrupt handler. That will allow us to work around a problem with a shared timer interrupt line on at91 platforms. Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2 Link: http://marc.info/?t=142252775300011&r=1&w=2 Link: https://lkml.org/lkml/2014/12/15/552 Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com>
*	genirq / PM: better describe IRQF_NO_SUSPEND semantics	Mark Rutland	2015-02-26	2	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRQF_NO_SUSPEND flag is intended to be used for interrupts required to be enabled during the suspend-resume cycle. This mostly consists of IPIs and timer interrupts, potentially including chained irqchip interrupts if these are necessary to handle timers or IPIs. If an interrupt does not fall into one of the aforementioned categories, requesting it with IRQF_NO_SUSPEND is likely incorrect. Using IRQF_NO_SUSPEND does not guarantee that the interrupt can wake the system from a suspended state. For an interrupt to be able to trigger a wakeup, it may be necessary to program various components of the system. In these cases it is necessary to use {enable,disabled}_irq_wake. Unfortunately, several drivers assume that IRQF_NO_SUSPEND ensures that an IRQ can wake up the system, and the documentation can be read ambiguously w.r.t. this property. This patch updates the documentation regarding IRQF_NO_SUSPEND to make this caveat explicit, hopefully making future misuse rarer. Cleanup of existing misuse will occur as part of later patch series. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	Linux 4.0-rc1v4.0-rc1	Linus Torvalds	2015-02-23	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	.. after extensive statistical analysis of my G+ polling, I've come to the inescapable conclusion that internet polls are bad. Big surprise. But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38% margin, in a poll that people weren't even supposed to participate in. Who can argue with solid numbers like that? 5,796 votes from people who can't even follow the most basic directions? In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%, but with a total of 29,110 votes right now. Now, arguably, that vote spread is only about 3,200 votes, which is less than the almost six thousand votes that the "please ignore" poll got, so it could be considered noise. But hey, I asked, so I'll honor the votes.
*	Merge tag 'ext4_for_linus' of ↵	Linus Torvalds	2015-02-23	5	-56/+108
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Ext4 bug fixes. We also reserved code points for encryption and read-only images (for which the implementation is mostly just the reserved code point for a read-only feature :-)" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix indirect punch hole corruption ext4: ignore journal checksum on remount; don't fail ext4: remove duplicate remount check for JOURNAL_CHECKSUM change ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize ext4: support read-only images ext4: change to use setup_timer() instead of init_timer() ext4: reserve codepoints used by the ext4 encryption feature jbd2: complain about descriptor block checksum errors
\| *	ext4: fix indirect punch hole corruption	Omar Sandoval	2015-02-15	1	-34/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4f579ae7de56 (ext4: fix punch hole on files with indirect mapping) rewrote FALLOC_FL_PUNCH_HOLE for ext4 files with indirect mapping. However, there are bugs in several corner cases. This fixes 5 distinct bugs: 1. When there is at least one entire level of indirection between the start and end of the punch range and the end of the punch range is the first block of its level, we can't return early; we have to free the intervening levels. 2. When the end is at a higher level of indirection than the start and ext4_find_shared returns a top branch for the end, we still need to free the rest of the shared branch it returns; we can't decrement partial2. 3. When a punch happens within one level of indirection, we need to converge on an indirect block that contains the start and end. However, because the branches returned from ext4_find_shared do not necessarily start at the same level (e.g., the partial2 chain will be shallower if the last block occurs at the beginning of an indirect group), the walk of the two chains can end up "missing" each other and freeing a bunch of extra blocks in the process. This mismatch can be handled by first making sure that the chains are at the same level, then walking them together until they converge. 4. When the punch happens within one level of indirection and ext4_find_shared returns a top branch for the start, we must free it, but only if the end does not occur within that branch. 5. When the punch happens within one level of indirection and ext4_find_shared returns a top branch for the end, then we shouldn't free the block referenced by the end of the returned chain (this mirrors the different levels case). Signed-off-by: Omar Sandoval <osandov@osandov.com>
\| *	ext4: ignore journal checksum on remount; don't fail	Eric Sandeen	2015-02-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of v3.18, ext4 started rejecting a remount which changes the journal_checksum option. Prior to that, it was simply ignored; the problem here is that if someone has this in their fstab for the root fs, now the box fails to boot properly, because remount of root with the new options will fail, and the box proceeds with a readonly root. I think it is a little nicer behavior to accept the option, but warn that it's being ignored, rather than failing the mount, but that might be a subjective matter... Reported-by: Cónräd <conradsand.arma@gmail.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| *	ext4: remove duplicate remount check for JOURNAL_CHECKSUM change	Eric Sandeen	2015-02-13	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rejection of, changing journal_checksum during remount. One suffices. While we're at it, remove old comment about the "check" option which has been deprecated for some time now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| *	ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize	Xiaoguang Wang	2015-02-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 90a8020 and d6320cb, Jan Kara has fixed this issue partially. This mmap data corruption still exists in nodelalloc mode, fix this. Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
\| *	ext4: support read-only images	Darrick J. Wong	2015-02-13	2	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a rocompat feature, "readonly" to mark a FS image as read-only. The feature prevents the kernel and e2fsprogs from changing the image; the flag can be toggled by tune2fs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| *	ext4: change to use setup_timer() instead of init_timer()	Jan Mrazek	2015-01-26	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jan Mrazek <email@honzamrazek.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| *	ext4: reserve codepoints used by the ext4 encryption feature	Theodore Ts'o	2015-01-19	1	-4/+13
\| \| \| \| \| \| \| \|	Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| *	jbd2: complain about descriptor block checksum errors	Darrick J. Wong	2015-01-19	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should complain in dmesg when journal recovery fails on account of the descriptor block being corrupt, so that the diagnostic data can be recovered. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* \|	Merge branch 'for-linus-2' of ↵	Linus Torvalds	2015-02-23	70	-758/+907
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted stuff from this cycle. The big ones here are multilayer overlayfs from Miklos and beginning of sorting ->d_inode accesses out from David" * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits) autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation procfs: fix race between symlink removals and traversals debugfs: leave freeing a symlink body until inode eviction Documentation/filesystems/Locking: ->get_sb() is long gone trylock_super(): replacement for grab_super_passive() fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) SELinux: Use d_is_positive() rather than testing dentry->d_inode Smack: Use d_is_positive() rather than testing dentry->d_inode TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR() Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb VFS: Split DCACHE_FILE_TYPE into regular and special types VFS: Add a fallthrough flag for marking virtual dentries VFS: Add a whiteout dentry type VFS: Introduce inode-getting helpers for layered/unioned fs environments Infiniband: Fix potential NULL d_inode dereference posix_acl: fix reference leaks in posix_acl_create autofs4: Wrong format for printing dentry ...
\| * \|	autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation	Al Viro	2015-02-22	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	X-Coverup: just ask spender Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	procfs: fix race between symlink removals and traversals	Al Viro	2015-02-22	3	-12/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	use_pde()/unuse_pde() in ->follow_link()/->put_link() resp. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	debugfs: leave freeing a symlink body until inode eviction	Al Viro	2015-02-22	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it is, we have debugfs_remove() racing with symlink traversals. Supply ->evict_inode() and do freeing there - inode will remain pinned until we are done with the symlink body. And rip the idiocy with checking if dentry is positive right after we'd verified debugfs_positive(), which is a stronger check... Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Documentation/filesystems/Locking: ->get_sb() is long gone	Al Viro	2015-02-22	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	trylock_super(): replacement for grab_super_passive()	Konstantin Khlebnikov	2015-02-22	3	-26/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I've noticed significant locking contention in memory reclaimer around sb_lock inside grab_super_passive(). Grab_super_passive() is called from two places: in icache/dcache shrinkers (function super_cache_scan) and from writeback (function __writeback_inodes_wb). Both are required for progress in memory allocator. Grab_super_passive() acquires sb_lock to increment sb->s_count and check sb->s_instances. It seems sb->s_umount locked for read is enough here: super-block deactivation always runs under sb->s_umount locked for write. Protecting super-block itself isn't a problem: in super_cache_scan() sb is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers are still active. Inside writeback super-block comes from inode from bdi writeback list under wb->list_lock. This patch removes locking sb_lock and checks s_instances under s_umount: generic_shutdown_super() unlinks it under sb->s_umount locked for write. New variant is called trylock_super() and since it only locks semaphore, callers must call up_read(&sb->s_umount) instead of drop_super(sb) when they're done. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions	David Howells	2015-02-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup() rather than d_is_dir() when checking a dir watch and give an error on fake directories. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions	David Howells	2015-02-22	4	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack thereof) in cachefiles: (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as it doesn't want to deal with automounts in its cache. (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in cachefiles. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)	David Howells	2015-02-22	34	-71/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really is negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' \|') \|\| die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") \|\| die $coccifile; print($fd "$_\n") \|\| die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 \|\| die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	SELinux: Use d_is_positive() rather than testing dentry->d_inode	David Howells	2015-02-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid of direct references to d_inode outside of the VFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Smack: Use d_is_positive() rather than testing dentry->d_inode	David Howells	2015-02-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of direct references to d_inode outside of the VFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()	David Howells	2015-02-22	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use d_is_dir() rather than d_inode and S_ISDIR(). Note that this will include fake directories such as automount triggers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode	David Howells	2015-02-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing dentry->d_inode as the dentry may cover another layer that has an inode when the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb	David Howells	2015-02-22	2	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and should avoid file_inode() also since it is really dealing with the path. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	VFS: Split DCACHE_FILE_TYPE into regular and special types	David Howells	2015-02-22	2	-8/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and socket files). d_is_reg() and d_is_special() are added to detect these subtypes and d_is_file() is left as the union of the two. This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to use d_is_reg(dentry) instead. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	VFS: Add a fallthrough flag for marking virtual dentries	David Howells	2015-02-22	2	-1/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is a virtual dentry that covers another one in a lower layer that should be used instead. This may be recorded on medium if directory integration is stored there. The flag can be set with d_set_fallthru() and tested with d_is_fallthru(). Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	VFS: Add a whiteout dentry type	David Howells	2015-02-22	1	-6/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function. A d_is_miss() accessor is also added for ordinary cache misses and d_is_negative() is modified to indicate either an ordinary miss or an enforced miss (whiteout). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	VFS: Introduce inode-getting helpers for layered/unioned fs environments	David Howells	2015-02-22	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce some function for getting the inode (and also the dentry) in an environment where layered/unioned filesystems are in operation. The problem is that we have places where we need both the union dentry and the lower source or workspace inode or dentry available, but we can only have a handle on one of them. Therefore we need to derive the handle to the other from that. The idea is to introduce an extra field in struct dentry that allows the union dentry to refer to and pin the lower dentry. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \|	Merge branch 'overlayfs-next' of ↵	Al Viro	2015-02-20	7	-319/+517
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into for-next
\| \| * \|	ovl: discard independent cursor in readdir()	hujianyang	2015-01-09	1	-24/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the ovl_dir_cache is stable during a directory reading, the cursor of struct ovl_dir_file don't need to be an independent entry in the list of a merged directory. This patch changes cursor to a pointer which points to the entry in the ovl_dir_cache. After this, we don't need to check is_cursor either. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: document lower layer ordering	Miklos Szeredi	2015-01-08	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reported-by: Fabian Sturm <fabian.sturm@aduu.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: Prevent rw remount when it should be ro mount	Seunghun Lee	2015-01-08	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overlayfs should be mounted read-only when upper-fs is read-only or nonexistent. But now it can be remounted read-write and this can cause kernel panic. So we should prevent read-write remount when the above situation happens. Signed-off-by: Seunghun Lee <waydi1@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: Fix opaque regression in ovl_lookup	hujianyang	2015-01-08	1	-10/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current multi-layer support overlayfs has a regression in .lookup(). If there is a directory in upperdir and a regular file has same name in lowerdir in a merged directory, lower file is hidden and upper directory is set to opaque in former case. But it is changed in present code. In lowerdir lookup path, if a found inode is not directory, the type checking of previous inode is missing. This inode will be copied to the lowerstack of ovl_entry directly. That will lead to several wrong conditions, for example, the reading of the directory in upperdir may return an error like: ls: reading directory .: Not a directory This patch makes the lowerdir lookup path check the opaque for non-directory file too. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: Fix kernel panic while mounting overlayfs	hujianyang	2015-01-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function ovl_fill_super() in recently multi-layer support version will incorrectly return 0 at error handling path and then cause kernel panic. This failure can be reproduced by mounting a overlayfs with upperdir and workdir in different mounts. And also, If the memory allocation of lower_mnt fail, this function may return an zero either. This patch fix this problem by setting err to proper error number before jumping to error handling path. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: add testsuite to docs	Miklos Szeredi	2014-12-13	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: Use macros to present ovl_xattr	hujianyang	2014-12-13	4	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds two macros: OVL_XATTR_PRE_NAME and OVL_XATTR_PRE_LEN to present ovl_xattr name prefix and its length. Also, a new macro OVL_XATTR_OPAQUE is introduced to replace old ovl_opaque_xattr. Fix the length of "trusted.overlay." to 16. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: Cleanup redundant blank lines	hujianyang	2014-12-13	3	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes redundant blanks lines in overlayfs. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: support multiple lower layers	Miklos Szeredi	2014-12-13	2	-27/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow "lowerdir=" option to contain multiple lower directories separated by a colon (e.g. "lowerdir=/bin:/usr/bin"). Colon characters in filenames can be escaped with a backslash. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: make upperdir optional	Miklos Szeredi	2014-12-13	1	-36/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make "upperdir=" mount option optional. If "upperdir=" is not given, then the "workdir=" option is also optional (and ignored if given). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: improve mount helpers	Miklos Szeredi	2014-12-13	1	-52/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move common checks into ovl_mount_dir() helper. Create helper for looking up lower directories. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
\| \| * \|	ovl: mount: change order of initialization	Miklos Szeredi	2014-12-13	1	-38/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move allocation of root entry above to where it's needed. Move initializations related to upperdir and workdir near each other. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>