linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	lsm: copy comm before calling audit_log to avoid race in string printing	Richard Guy Briggs	2015-04-15	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When task->comm is passed directly to audit_log_untrustedstring() without getting a copy or using the task_lock, there is a race that could happen that would output a NULL (\0) in the middle of the output string that would effectively truncate the rest of the report text after the comm= field in the audit log message, losing fields. Using get_task_comm() to get a copy while acquiring the task_lock to prevent this and to prevent the result from being a mixture of old and new values of comm would incur potentially unacceptable overhead, considering that the value can be influenced by userspace and therefore untrusted anyways. Copy the value before passing it to audit_log_untrustedstring() ensures that a local copy is used to calculate the length and subsequently printed. Even if this value contains a mix of old and new values, it will only calculate and copy up to the first NULL, preventing the rest of the audit log message being truncated. Use a second local copy of comm to avoid a race between the first and second calls to audit_log_untrustedstring() with comm. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
*	Merge branch 'tomoyo-cleanup' of ↵	James Morris	2015-04-13	4	-45/+15
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild into next
\| *	tomoyo: Do not generate empty policy files	Michal Marek	2015-04-07	3	-29/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Makefile automatically generates the tomoyo policy files, which are not removed by make clean (because they could have been provided by the user). Instead of generating the missing files, use /dev/null if a given file is not provided. Store the default exception_policy in exception_policy.conf.default. Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Marek <mmarek@suse.cz>
\| *	tomoyo: Use if_changed when generating builtin-policy.h	Michal Marek	2015-04-07	1	-18/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Combine the generation of builtin-policy.h into a single command and use if_changed, so that the file is regenerated each time the command changes. The next patch will make use of this. Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Marek <mmarek@suse.cz>
\| *	tomoyo: Use bin2c to generate builtin-policy.h	Michal Marek	2015-04-07	2	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify the Makefile by using a readily available tool instead of a custom sed script. The downside is that builtin-policy.h becomes unreadable for humans, but it is only a generated file. Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Marek <mmarek@suse.cz>
* \|	selinux: increase avtab max buckets	Stephen Smalley	2015-04-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we can safely increase the avtab max buckets without triggering high order allocations and have a hash function that will make better use of the larger number of buckets, increase the max buckets to 2^16. Original: 101421 entries and 2048/2048 buckets used, longest chain length 374 With new hash function: 101421 entries and 2048/2048 buckets used, longest chain length 81 With increased max buckets: 101421 entries and 31078/32768 buckets used, longest chain length 12 Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* \|	selinux: Use a better hash function for avtab	John Brooks	2015-04-07	2	-5/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function, based on murmurhash3, has much better distribution than the original. Using the current default of 2048 buckets, there are many fewer collisions: Before: 101421 entries and 2048/2048 buckets used, longest chain length 374 After: 101421 entries and 2048/2048 buckets used, longest chain length 81 The difference becomes much more significant when buckets are increased. A naive attempt to expand the current function to larger outputs doesn't yield any significant improvement; so this function is a prerequisite for increasing the bucket size. sds: Adapted from the original patches for libsepol to the kernel. Signed-off-by: John Brooks <john.brooks@jolla.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* \|	selinux: convert avtab hash table to flex_array	Stephen Smalley	2015-04-07	2	-13/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we shrank the avtab max hash buckets to avoid high order memory allocations, but this causes avtab lookups to degenerate to very long linear searches for the Fedora policy. Convert to using a flex_array instead so that we can increase the buckets without such limitations. This change does not alter the max hash buckets; that is left to a separate follow-on change. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
* \|	selinux: reconcile security_netlbl_secattr_to_sid() and mls_import_netlbl_cat()	Paul Moore	2015-04-07	2	-12/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the NetLabel secattr MLS category import logic into mls_import_netlbl_cat() where it belongs, and use the mls_import_netlbl_cat() function in security_netlbl_secattr_to_sid(). Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Paul Moore <pmoore@redhat.com>
* \|	selinux: remove unnecessary pointer reassignment	Jeff Vander Stoep	2015-04-07	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit f01e1af445fa ("selinux: don't pass in NULL avd to avc_has_perm_noaudit") made this pointer reassignment unnecessary. Avd should continue to reference the stack-based copy. Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: tweaked subject line] Signed-off-by: Paul Moore <pmoore@redhat.com>
* \|	Merge branch 'smack-for-4.1' of git://github.com/cschaufler/smack-next into next	James Morris	2015-04-02	5	-69/+307
\|\ \
\| * \|	Smack: Updates for Smack documentation	Casey Schaufler	2015-03-31	1	-50/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Document the Smack bringup features. Update the proper location for mounting smackfs from /smack to /sys/fs/smackfs. Fix some spelling errors. Suggest the use of the load2 interface instead of the load interface. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
\| * \|	smack: Fix gcc warning from unused smack_syslog_lock mutex in smackfs.c	Paul Gortmaker	2015-03-23	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit 00f84f3f2e9d088f06722f4351d67f5f577abe22 ("Smack: Make the syslog control configurable") this mutex was added, but the rest of the final commit never actually made use of it, resulting in: In file included from include/linux/mutex.h:29:0, from include/linux/notifier.h:13, from include/linux/memory_hotplug.h:6, from include/linux/mmzone.h:821, from include/linux/gfp.h:5, from include/linux/slab.h:14, from include/linux/security.h:27, from security/smack/smackfs.c:21: security/smack/smackfs.c:63:21: warning: ‘smack_syslog_lock’ defined but not used [-Wunused-variable] static DEFINE_MUTEX(smack_syslog_lock); ^ A git grep shows no other instances/references to smack_syslog_lock. Delete it, assuming that the mutex addition was just a leftover from an earlier work in progress version of the change. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
\| * \|	Smack: Allow an unconfined label in bringup mode	Casey Schaufler	2015-03-23	4	-17/+182
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I have vehemently opposed adding a "permissive" mode to Smack for the simple reasons that it would be subject to massive abuse and that developers refuse to turn it off come product release. I still believe that this is true, and still refuse to add a general "permissive mode". So don't ask again. Bumjin Im suggested an approach that addresses most of the concerns, and I have implemented it here. I still believe that we'd be better off without this sort of thing, but it looks like this minimizes the abuse potential. Firstly, you have to configure Smack Bringup Mode. That allows for "release" software to be ammune from abuse. Second, only one label gets to be "permissive" at a time. You can use it for debugging, but that's about it. A label written to smackfs/unconfined is treated specially. If either the subject or object label of an access check matches the "unconfined" label, and the access would not have been allowed otherwise an audit record and a console message are generated. The audit record "request" string is marked with either "(US)" or "(UO)", to indicate that the request was granted because of an unconfined label. The fact that an inode was accessed by an unconfined label is remembered, and subsequent accesses to that "impure" object are noted in the log. The impurity is not stored in the filesystem, so a file mislabled as a side effect of using an unconfined label may still cause concern after a reboot. So, it's there, it's dangerous, but so many application developers seem incapable of living without it I have given in. I've tried to make it as safe as I can, but in the end it's still a chain saw. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
\| * \|	Smack: getting the Smack security context of keys	José Bollo	2015-03-23	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this commit, the LSM Smack implements the LSM side part of the system call keyctl with the action code KEYCTL_GET_SECURITY. It is now possible to get the context of, for example, the user session key using the command "keyctl security @s". The original patch has been modified for merge. Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
\| * \|	Smack: Assign smack_known_web as default smk_in label for kernel thread's socket	Marcin Lis	2015-03-23	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change fixes the bug associated with sockets owned by kernel threads. These sockets, created usually by network devices' drivers tasks, received smk_in label from the task that created them - the "floor" label in the most cases. The result was that they were not able to receive data packets because of missing smack rules. The main reason of the access deny is that the socket smk_in label is placed as the object during smk check, kernel thread's capabilities are omitted. Signed-off-by: Marcin Lis <m.lis@samsung.com>
* \| \|	tpm/st33zp24/spi: Add missing device table for spi phy.	Christophe Ricard	2015-03-27	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MODULE_DEVICE_TABLE is missing in spi phy in case CONFIG_OF is not set. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/st33zp24: Add proper wait for ordinal duration in case of irq mode	Christophe Ricard	2015-03-27	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case the driver is configured to use irq, we are not waiting the answer for a duration period to see the DATA_AVAIL status bit to raise but at maximum timeout_c. This may result in critical failure as we will not wait long enough for the command completion. Reviewed-by: Jason Gunthorpe <jason.gunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Fixes: bf38b8710892 ("tpm/tpm_i2c_stm_st33: Split tpm_i2c_tpm_st33 in 2 layers (core + phy)") Reviewed-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/tpm_infineon: Use struct dev_pm_ops for power management	Peter Huewe	2015-03-18	1	-25/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the tpm_infineon driver define its PM callbacks through a struct dev_pm_ops object rather than by using legacy PM hooks in struct pnp_driver. This allows the driver to use tpm_pm_suspend() as its suspend callback directly, so we can remove the duplicated savestate code. Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	MAINTAINERS: Add Jason as designated reviewer for TPM	Peter Huewe	2015-03-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jason does an excellent job reviewing the TPM stuff, so we add him to the designated reviewer list (with his consent :) Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm: Update KConfig text to include TPM2.0 FIFO chips	Peter Huewe	2015-03-18	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I got a lot of requests lately about whether the new TPM2.0 support includes the FIFO interface for TPM2.0 as well. The FIFO interface is handled by tpm_tis since FIFO=TIS (more or less). -> Update the helptext and headline Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/st33zp24/dts/st33zp24-spi: Add dts documentation for st33zp24 spi phy	Christophe Ricard	2015-03-18	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Jason Gunthorpe <jason.gunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/st33zp24/spi: Add st33zp24 spi phy	Christophe Ricard	2015-03-18	5	-2/+408
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	st33zp24 TIS 1.2 support also SPI. It is using a proprietary protocol to transport TIS data. Acked-by: Jarkko Sakkinen <jarkko.sakknen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jason.gunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/tpm_i2c_stm_st33: Split tpm_i2c_tpm_st33 in 2 layers (core + phy)	Christophe Ricard	2015-03-18	9	-942/+1036
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tpm_i2c_stm_st33 is a TIS 1.2 TPM with a core interface which can be used by different phy such as i2c or spi. The core part is called st33zp24 which is also the main part reference. include/linux/platform_data/tpm_stm_st33.h is renamed consequently. The driver is also split into an i2c phy in charge of sending/receiving data as well as managing platform data or dts configuration. Acked-by: Jarkko Sakkinen <jarkko.sakknen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jason.gunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/tpm_i2c_stm_st33: Replace access to io_lpcpd from struct ↵	Christophe Ricard	2015-03-18	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	st33zp24_platform_data to tpm_stm_dev io_lpcpd is accessible from struct tpm_stm_dev. struct st33zp24_platform_data is only valid when using static platform configuration data, not when using dts. Reviewed-by: Jason Gunthorpe <jason.gunthorpe@obsidianresearch.com> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm: fix: sanitized code paths in tpm_chip_register()	Jarkko Sakkinen	2015-03-18	1	-24/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I started to work with PPI interface so that it would be available under character device sysfs directory and realized that chip registeration was still too messy. In TPM 1.x in some rare scenarios (errors that almost never occur) wrong order in deinitialization steps was taken in teardown. I reproduced these scenarios by manually inserting error codes in the place of the corresponding function calls. The key problem is that the teardown is messy with two separate code paths (this was inherited when moving code from tpm-interface.c). Moved TPM 1.x specific register/unregister functionality to own helper functions and added single code path for teardown in tpm_chip_register(). Now the code paths have been fixed and it should be easier to review later on this part of the code. Cc: <stable@vger.kernel.org> Fixes: 7a1d7e6dd76a ("tpm: TPM 2.0 baseline support") Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Scot Doyle <lkml14@scotdoyle.com> Reviewed-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm: fix call order in tpm-chip.c	Jarkko Sakkinen	2015-03-05	1	-20/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- tpm_dev_add_device(): cdev_add() must be done before uevent is propagated in order to avoid races. - tpm_chip_register(): tpm_dev_add_device() must be done as the last step before exposing device to the user space in order to avoid races. In addition clarified description in tpm_chip_register(). Fixes: 313d21eeab92 ("tpm: device class for tpm") Fixes: afb5abc262e9 ("tpm: two-phase chip management functions") Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \| \|	tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send	jmlatten@linux.vnet.ibm.com	2015-03-05	2	-8/+8
\|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When IMA and VTPM are both enabled in kernel config, kernel hangs during bootup on LE OS. Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send and tpm_ibmtpm_recv getting called. A trace showed that tpm_ibmtpm_recv was hanging. Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send was sending CRQ message that probably did not make much sense to phype because of Endianness. The fix below sends correctly converted CRQ for LE. This was not caught before because it seems IMA is not enabled by default in kernel config and IMA exercises this particular code path in vtpm. Tested with IMA and VTPM enabled in kernel config and VTPM enabled on both a BE OS and a LE OS ppc64 lpar. This exercised CRQ and TPM command code paths in vtpm. Patch is against Peter's tpmdd tree on github which included Vicky's previous vtpm le patches. Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> # eb71f8a5e33f: "Added Little Endian support to vtpm module" Cc: <stable@vger.kernel.org> Reviewed-by: Ashley Lai <ashley@ahsleylai.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
* \|	Merge tag 'yama-4.0' of ↵	James Morris	2015-03-03	2	-10/+5
\|\ \ \| \|/ \|/\| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next
\| *	security/yama: Remove unnecessary selects from Kconfig.	Stephen Smalley	2015-02-28	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yama selects SECURITYFS and SECURITY_PATH, but requires neither. Remove them. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Kees Cook <keescook@chromium.org>
\| *	Yama: do not modify global sysctl table entry	Kees Cook	2015-02-28	1	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the sysctl table is constified, we won't be able to directly modify it. Instead, use a table copy that carries any needed changes. Suggested-by: PaX Team <pageexec@freemail.hu> Signed-off-by: Kees Cook <keescook@chromium.org>
* \|	Linux 4.0-rc1v4.0-rc1	Linus Torvalds	2015-02-23	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	.. after extensive statistical analysis of my G+ polling, I've come to the inescapable conclusion that internet polls are bad. Big surprise. But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38% margin, in a poll that people weren't even supposed to participate in. Who can argue with solid numbers like that? 5,796 votes from people who can't even follow the most basic directions? In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%, but with a total of 29,110 votes right now. Now, arguably, that vote spread is only about 3,200 votes, which is less than the almost six thousand votes that the "please ignore" poll got, so it could be considered noise. But hey, I asked, so I'll honor the votes.
* \|	Merge tag 'ext4_for_linus' of ↵	Linus Torvalds	2015-02-23	5	-56/+108
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Ext4 bug fixes. We also reserved code points for encryption and read-only images (for which the implementation is mostly just the reserved code point for a read-only feature :-)" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix indirect punch hole corruption ext4: ignore journal checksum on remount; don't fail ext4: remove duplicate remount check for JOURNAL_CHECKSUM change ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize ext4: support read-only images ext4: change to use setup_timer() instead of init_timer() ext4: reserve codepoints used by the ext4 encryption feature jbd2: complain about descriptor block checksum errors
\| * \|	ext4: fix indirect punch hole corruption	Omar Sandoval	2015-02-15	1	-34/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4f579ae7de56 (ext4: fix punch hole on files with indirect mapping) rewrote FALLOC_FL_PUNCH_HOLE for ext4 files with indirect mapping. However, there are bugs in several corner cases. This fixes 5 distinct bugs: 1. When there is at least one entire level of indirection between the start and end of the punch range and the end of the punch range is the first block of its level, we can't return early; we have to free the intervening levels. 2. When the end is at a higher level of indirection than the start and ext4_find_shared returns a top branch for the end, we still need to free the rest of the shared branch it returns; we can't decrement partial2. 3. When a punch happens within one level of indirection, we need to converge on an indirect block that contains the start and end. However, because the branches returned from ext4_find_shared do not necessarily start at the same level (e.g., the partial2 chain will be shallower if the last block occurs at the beginning of an indirect group), the walk of the two chains can end up "missing" each other and freeing a bunch of extra blocks in the process. This mismatch can be handled by first making sure that the chains are at the same level, then walking them together until they converge. 4. When the punch happens within one level of indirection and ext4_find_shared returns a top branch for the start, we must free it, but only if the end does not occur within that branch. 5. When the punch happens within one level of indirection and ext4_find_shared returns a top branch for the end, then we shouldn't free the block referenced by the end of the returned chain (this mirrors the different levels case). Signed-off-by: Omar Sandoval <osandov@osandov.com>
\| * \|	ext4: ignore journal checksum on remount; don't fail	Eric Sandeen	2015-02-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of v3.18, ext4 started rejecting a remount which changes the journal_checksum option. Prior to that, it was simply ignored; the problem here is that if someone has this in their fstab for the root fs, now the box fails to boot properly, because remount of root with the new options will fail, and the box proceeds with a readonly root. I think it is a little nicer behavior to accept the option, but warn that it's being ignored, rather than failing the mount, but that might be a subjective matter... Reported-by: Cónräd <conradsand.arma@gmail.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: remove duplicate remount check for JOURNAL_CHECKSUM change	Eric Sandeen	2015-02-13	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rejection of, changing journal_checksum during remount. One suffices. While we're at it, remove old comment about the "check" option which has been deprecated for some time now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize	Xiaoguang Wang	2015-02-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 90a8020 and d6320cb, Jan Kara has fixed this issue partially. This mmap data corruption still exists in nodelalloc mode, fix this. Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
\| * \|	ext4: support read-only images	Darrick J. Wong	2015-02-13	2	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a rocompat feature, "readonly" to mark a FS image as read-only. The feature prevents the kernel and e2fsprogs from changing the image; the flag can be toggled by tune2fs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: change to use setup_timer() instead of init_timer()	Jan Mrazek	2015-01-26	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jan Mrazek <email@honzamrazek.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: reserve codepoints used by the ext4 encryption feature	Theodore Ts'o	2015-01-19	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	jbd2: complain about descriptor block checksum errors	Darrick J. Wong	2015-01-19	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should complain in dmesg when journal recovery fails on account of the descriptor block being corrupt, so that the diagnostic data can be recovered. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* \| \|	Merge branch 'for-linus-2' of ↵	Linus Torvalds	2015-02-23	70	-758/+907
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted stuff from this cycle. The big ones here are multilayer overlayfs from Miklos and beginning of sorting ->d_inode accesses out from David" * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits) autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation procfs: fix race between symlink removals and traversals debugfs: leave freeing a symlink body until inode eviction Documentation/filesystems/Locking: ->get_sb() is long gone trylock_super(): replacement for grab_super_passive() fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) SELinux: Use d_is_positive() rather than testing dentry->d_inode Smack: Use d_is_positive() rather than testing dentry->d_inode TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR() Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb VFS: Split DCACHE_FILE_TYPE into regular and special types VFS: Add a fallthrough flag for marking virtual dentries VFS: Add a whiteout dentry type VFS: Introduce inode-getting helpers for layered/unioned fs environments Infiniband: Fix potential NULL d_inode dereference posix_acl: fix reference leaks in posix_acl_create autofs4: Wrong format for printing dentry ...
\| * \| \|	autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation	Al Viro	2015-02-22	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	X-Coverup: just ask spender Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	procfs: fix race between symlink removals and traversals	Al Viro	2015-02-22	3	-12/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	use_pde()/unuse_pde() in ->follow_link()/->put_link() resp. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	debugfs: leave freeing a symlink body until inode eviction	Al Viro	2015-02-22	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it is, we have debugfs_remove() racing with symlink traversals. Supply ->evict_inode() and do freeing there - inode will remain pinned until we are done with the symlink body. And rip the idiocy with checking if dentry is positive right after we'd verified debugfs_positive(), which is a stronger check... Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	Documentation/filesystems/Locking: ->get_sb() is long gone	Al Viro	2015-02-22	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	trylock_super(): replacement for grab_super_passive()	Konstantin Khlebnikov	2015-02-22	3	-26/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I've noticed significant locking contention in memory reclaimer around sb_lock inside grab_super_passive(). Grab_super_passive() is called from two places: in icache/dcache shrinkers (function super_cache_scan) and from writeback (function __writeback_inodes_wb). Both are required for progress in memory allocator. Grab_super_passive() acquires sb_lock to increment sb->s_count and check sb->s_instances. It seems sb->s_umount locked for read is enough here: super-block deactivation always runs under sb->s_umount locked for write. Protecting super-block itself isn't a problem: in super_cache_scan() sb is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers are still active. Inside writeback super-block comes from inode from bdi writeback list under wb->list_lock. This patch removes locking sb_lock and checks s_instances under s_umount: generic_shutdown_super() unlinks it under sb->s_umount locked for write. New variant is called trylock_super() and since it only locks semaphore, callers must call up_read(&sb->s_umount) instead of drop_super(sb) when they're done. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions	David Howells	2015-02-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup() rather than d_is_dir() when checking a dir watch and give an error on fake directories. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions	David Howells	2015-02-22	4	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack thereof) in cachefiles: (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as it doesn't want to deal with automounts in its cache. (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in cachefiles. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| * \| \|	VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)	David Howells	2015-02-22	34	-71/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really is negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' \|') \|\| die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") \|\| die $coccifile; print($fd "$_\n") \|\| die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 \|\| die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>