summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gfs2: Modify struct gfs2_quota_change_host to use struct kqidEric W. Biederman2013-02-131-3/+5
| | | | | Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* gfs2: Introduce qd2indexEric W. Biederman2013-02-131-2/+8
| | | | | | | | | Both qd_alloc and qd2offset perform the exact same computation to get an index from a gfs2_quota_data. Make life a little simpler and factor out this index computation. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* gfs2: Report quotas in the caller's user namespace.Eric W. Biederman2013-02-131-1/+1
| | | | | | | | | | When a quota is queried return the uid or the gid in the mapped into the caller's user namespace. In addition perform the munged version of the mapping so that instead of -1 a value that does not map is reported as the overflowuid or the overflowgid. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* gfs2: Split NO_QUOTA_CHANGE inot NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGEEric W. Biederman2013-02-137-14/+15
| | | | | | | | Split NO_QUOTA_CHANGE into NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGE so the constants may be well typed. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* gfs2: Remove improper checks in gfs2_set_dqblk.Eric W. Biederman2013-02-131-6/+0
| | | | | | | | | | | | | | | | | | | In set_dqblk it is an error to look at fdq->d_id or fdq->d_flags. Userspace quota applications do not set these fields when calling quotactl(Q_XSETQLIM,...), and the kernel does not set those fields when quota_setquota calls set_dqblk. gfs2 never looks at fdq->d_id or fdq->d_flags after checking to see if they match the id and type supplied to set_dqblk. No other linux filesystem in set_dqblk looks at either fdq->d_id or fdq->d_flags. Therefore remove these bogus checks from gfs2 and allow normal quota setting applications to work. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: Enable building with user namespaces enabledEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | Now that ocfs2 has been converted to store uids and gids in kuid_t and kgid_t and all of the conversions have been added to the appropriate places it is safe to allow building and using ocfs2 with user namespace support enabled. Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: Compare kuids and kgids using uid_eq and gid_eqEric W. Biederman2013-02-132-5/+5
| | | | | | Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: For tracing report the uid and gid values in the initial user namespaceEric W. Biederman2013-02-131-1/+2
| | | | | | Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: Convert uid and gids between in core and on disk inodesEric W. Biederman2013-02-132-8/+8
| | | | | | Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: convert between kuids and kgids and DLM locksEric W. Biederman2013-02-131-4/+4
| | | | | | | | | | Convert between uid and gids stored in the on the wire format of dlm locks aka struct ocfs2_meta_lvb and kuids and kgids stored in inode->i_uid and inode->i_gid. Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ocfs2: Handle kuids and kgids in acl/xattr conversions.Eric W. Biederman2013-02-131-2/+29
| | | | | | | | | Explicitly deal with the different kinds of acls because they need different conversions. Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* coda: Allow coda to be built when user namespace support is enabledEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | Now that the coda kernel to userspace has been modified to convert between kuids and kgids and uids and gids, and all internal coda structures have be modified to store uids and gids as kuids and kgids it is safe to allow code to be built with user namespace support enabled. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* coda: Cache permisions in struct coda_inode_info in a kuid_t.Eric W. Biederman2013-02-133-4/+4
| | | | | | | | | - Change c_uid in struct coda_indoe_info from a vuid_t to a kuid_t. - Initialize c_uid to GLOBAL_ROOT_UID instead of 0. - Use uid_eq to compare cached kuids. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* coda: Restrict coda messages to the initial user namespaceEric W. Biederman2013-02-134-8/+11
| | | | | | | | | | | | | | | | | | | | | | Remove the slight chance that uids and gids in coda messages will be interpreted in the wrong user namespace. - Only allow processes in the initial user namespace to open the coda character device to communicate with coda filesystems. - Explicitly convert the uids in the coda header into the initial user namespace. - In coda_vattr_to_attr make kuids and kgids from the initial user namespace uids and gids in struct coda_vattr that just came from userspace. - In coda_iattr_to_vattr convert kuids and kgids into uids and gids in the intial user namespace and store them in struct coda_vattr for sending to coda userspace programs. Nothing needs to be changed with mounts as coda does not support being mounted in anything other than the initial user namespace. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* coda: Restrict coda messages to the initial pid namespaceEric W. Biederman2013-02-133-2/+10
| | | | | | | | | | | | | | Remove the slight chance that pids in coda messages will be interpreted in the wrong pid namespace. - Explicitly send all pids in coda messages in the initial pid namespace. - Only allow mounts from processes in the initial pid namespace. - Only allow processes in the initial pid namespace to open the coda character device to communicate with coda. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* afs: Support interacting with multiple user namespacesEric W. Biederman2013-02-134-10/+15
| | | | | | | | | | | | | | | | | | | | | | Modify struct afs_file_status to store owner as a kuid_t and group as a kgid_t. In xdr_decode_AFSFetchStatus as owner is now a kuid_t and group is now a kgid_t don't use the EXTRACT macro. Instead perform the work of the extract macro explicitly. Read the value with ntohl and convert it to the appropriate type with make_kuid or make_kgid. Test if the value is different from what is stored in status and update changed. Update the value in status. In xdr_encode_AFS_StoreStatus call from_kuid or from_kgid as we are computing the on the wire encoding. Initialize uids with GLOBAL_ROOT_UID instead of 0. Initialize gids with GLOBAL_ROOT_GID instead of 0. Cc: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* afs: Only allow mounting afs in the intial network namespaceEric W. Biederman2013-02-131-0/+6
| | | | | | | | rxrpc sockets only work in the initial network namespace so it isn't possible to support afs in any other network namespace. Cc: David Howells <dhowells@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* afs: Remove unused structure afs_store_statusEric W. Biederman2013-02-121-7/+0
| | | | | | | | | | | | | While looking for kuid_t and kgid_t conversions I found this structure that has never been used since it was added to the kernel in 2007. The obvious for this structure to be used is in xdr_encode_AFS_StoreStatus and that function uses a small handful of local variables instead. So remove the unnecessary structure to prevent confusion. Cc: David Howells <dhowells@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* 9p: Allow building 9p with user namespaces enabled.Eric W. Biederman2013-02-121-4/+0
| | | | | | | | | | | Now that the uid_t -> kuid_t, gid_t -> kgid_t conversion has been completed in 9p allow 9p to be built when user namespaces are enabled. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* 9p: Modify v9fs_get_fsgid_for_create to return a kgidEric W. Biederman2013-02-121-5/+5
| | | | | | | | | | | Modify v9fs_get_fsgid_for_create to return a kgid and modify all of the variables that hold the result of v9fs_get_fsgid_for_create to be of type kgid_t. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* 9p: Modify struct v9fs_session_info to use a kuids and kgidsEric W. Biederman2013-02-122-10/+30
| | | | | | | | | | | | | Change struct v9fs_session_info and the code that popluates it to use kuids and kgids. When parsing the 9p mount options convert the dfltuid, dflutgid, and the session uid from the current user namespace into kuids and kgids. Modify V9FS_DEFUID and V9FS_DEFGUID to be kuid and kgid values. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* 9p: Modify struct 9p_fid to use a kuid_t not a uid_tEric W. Biederman2013-02-123-10/+11
| | | | | | | | | | Change struct 9p_fid and it's associated functions to use kuid_t's instead of uid_t. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* 9p: Modify the stat structures to use kuid_t and kgid_tEric W. Biederman2013-02-124-21/+30
| | | | | | | | | | | | | | | | 9p has thre strucrtures that can encode inode stat information. Modify all of those structures to contain kuid_t and kgid_t values. Modify he wire encoders and decoders of those structures to use 'u' and 'g' instead of 'd' in the format string where uids and gids are present. This results in all kuid and kgid conversion to and from on the wire values being performed by the same code in protocol.c where the client is known at the time of the conversion. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* 9p: Transmit kuid and kgid valuesEric W. Biederman2013-02-123-18/+19
| | | | | | | | | | | | | | Modify the p9_client_rpc format specifiers of every function that directly transmits a uid or a gid from 'd' to 'u' or 'g' as appropriate. Modify those same functions to take kuid_t and kgid_t parameters instead of uid_t and gid_t parameters. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* 9p: Add 'u' and 'g' format specifies for kuids and kgidsEric W. Biederman2013-02-121-0/+36
| | | | | | | | | | This allows concentrating all of the conversion to and from kuids and kgids into the format needed by the 9p protocol into one location. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Enable building when user namespaces are enabled.Eric W. Biederman2013-02-121-1/+0
| | | | | | | | | | | Now that conversions happen from kuids and kgids when generating ceph messages and conversion happen to kuids and kgids after receiving celph messages, and all intermediate data structures store uids and gids as type kuid_t and kgid_t it is safe to enable ceph with user namespace support enabled. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Convert kuids and kgids before printing them.Eric W. Biederman2013-02-122-4/+8
| | | | | | | | Before printing kuid and kgids values convert them into the initial user namespace. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Convert struct ceph_mds_request to use kuid_t and kgid_tEric W. Biederman2013-02-122-4/+4
| | | | | | | | | | Hold the uid and gid for a pending ceph mds request using the types kuid_t and kgid_t. When a request message is finally created convert the kuid_t and kgid_t values into uids and gids in the initial user namespace. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Translate inode uid and gid attributes to/from kuids and kgids.Eric W. Biederman2013-02-121-6/+8
| | | | | | | | | | | | - In fill_inode() transate uids and gids in the initial user namespace into kuids and kgids stored in inode->i_uid and inode->i_gid. - In ceph_setattr() if they have changed convert inode->i_uid and inode->i_gid into initial user namespace uids and gids for transmission. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Translate between uid and gids in cap messages and kuids and kgidsEric W. Biederman2013-02-122-9/+9
| | | | | | | | | | | | | | | | | | | | | - Make the uid and gid arguments of send_cap_msg() used to compose ceph_mds_caps messages of type kuid_t and kgid_t. - Pass inode->i_uid and inode->i_gid in __send_cap to send_cap_msg() through variables of type kuid_t and kgid_t. - Modify struct ceph_cap_snap to store uids and gids in types kuid_t and kgid_t. This allows capturing inode->i_uid and inode->i_gid in ceph_queue_cap_snap() without loss and pssing them to __ceph_flush_snaps() where they are removed from struct ceph_cap_snap and passed to send_cap_msg(). - In handle_cap_grant translate uid and gids in the initial user namespace stored in struct ceph_mds_cap into kuids and kgids before setting inode->i_uid and inode->i_gid. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* ceph: Only allow mounts in the initial network namespaceEric W. Biederman2013-02-121-0/+5
| | | | | | | | | | | Today ceph opens tcp sockets from a delayed work callback. Delayed work happens from kernel threads which are always in the initial network namespace. Therefore fail early if someone attempts to mount a ceph filesystem from something other than the initial network namespace. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Allow the unprivileged users to mount mqueue fsGao feng2013-01-281-0/+1
| | | | | | | | | | | | | | This patch allow the unprivileged user to mount mqueuefs in user ns. If two userns share the same ipcns,the files in mqueue fs should be seen in both these two userns. If the userns has its own ipcns,it has its own mqueue fs too. ipcns has already done this job well. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* userns: Allow the userns root to mount tmpfs.Eric W. Biederman2013-01-271-0/+2
| | | | | | | | | | | | | | There is no backing store to tmpfs and file creation rules are the same as for any other filesystem so it is semantically safe to allow unprivileged users to mount it. ramfs is safe for the same reasons so allow either flavor of tmpfs to be mounted by a user namespace root user. The memory control group successfully limits how much memory tmpfs can consume on any system that cares about a user namespace root using tmpfs to exhaust memory the memory control group can be deployed. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Allow the userns root to mount ramfs.Eric W. Biederman2013-01-271-0/+1
| | | | | | | | | | | | | | There is no backing store to ramfs and file creation rules are the same as for any other filesystem so it is semantically safe to allow unprivileged users to mount it. The memory control group successfully limits how much memory ramfs can consume on any system that cares about a user namespace root using ramfs to exhaust memory the memory control group can be deployed. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Allow the userns root to mount of devptsEric W. Biederman2013-01-271-0/+18
| | | | | | | | | | | | | | | | - The context in which devpts is mounted has no effect on the creation of ptys as the /dev/ptmx interface has been used by unprivileged users for many years. - Only support unprivileged mounts in combination with the newinstance option to ensure that mounting of /dev/pts in a user namespace will not allow the options of an existing mount of devpts to be modified. - Create /dev/pts/ptmx as the root user in the user namespace that mounts devpts so that it's permissions to be changed. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Recommend use of memory control groups.Eric W. Biederman2013-01-272-0/+21
| | | | | | | | | | In the help text describing user namespaces recommend use of memory control groups. In many cases memory control groups are the only mechanism there is to limit how much memory a user who can create user namespaces can use. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* userns: Allow any uid or gid mappings that don't overlap.Eric W. Biederman2013-01-271-6/+39
| | | | | | | | | | | | | | When I initially wrote the code for /proc/<pid>/uid_map. I was lazy and avoided duplicate mappings by the simple expedient of ensuring the first number in a new extent was greater than any number in the previous extent. Unfortunately that precludes a number of valid mappings, and someone noticed and complained. So use a simple check to ensure that ranges in the mapping extents don't overlap. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Avoid recursion in put_user_nsEric W. Biederman2013-01-273-16/+15
| | | | | | | | | | | | | | | | | | | | | | | | | When freeing a deeply nested user namespace free_user_ns calls put_user_ns on it's parent which may in turn call free_user_ns again. When -fno-optimize-sibling-calls is passed to gcc one stack frame per user namespace is left on the stack, potentially overflowing the kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls so we can't count on gcc to optimize this code. Remove struct kref and use a plain atomic_t. Making the code more flexible and easier to comprehend. Make the loop in free_user_ns explict to guarantee that the stack does not overflow with CONFIG_FRAME_POINTER enabled. I have tested this fix with a simple program that uses unshare to create a deeply nested user namespace structure and then calls exit. With 1000 nesteuser namespaces before this change running my test program causes the kernel to die a horrible death. With 10,000,000 nested user namespaces after this change my test program runs to completion and causes no harm. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Pointed-out-by: Vasily Kulikov <segoon@openwall.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* userns: Allow unprivileged rebootLi Zefan2012-12-271-2/+3
| | | | | | | | | In a container with its own pid namespace and user namespace, rebooting the system won't reboot the host, but terminate all the processes in it and thus have the container shutdown, so it's safe. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* f2fs: Don't assign e_id in f2fs_acl_from_diskEric W. Biederman2012-12-261-1/+0
| | | | | | | | | | | | | | | | | | | | | With user namespaces enabled building f2fs fails with: CC fs/f2fs/acl.o fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’: fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’ make[2]: *** [fs/f2fs/acl.o] Error 1 make[2]: Target `__build' not remade because of errors. e_id is a backwards compatibility field only used for file systems that haven't been converted to use kuids and kgids. When the posix acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is unnecessary. Remove the assignment so f2fs will build with user namespaces enabled. Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Acked-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* proc: Allow proc_free_inum to be called from any contextEric W. Biederman2012-12-261-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing the pid namespace code I hit this nasty warning. [ 176.262617] ------------[ cut here ]------------ [ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0() [ 176.265145] Hardware name: Bochs [ 176.265677] Modules linked in: [ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18 [ 176.266564] Call Trace: [ 176.266564] [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0 [ 176.266564] [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20 [ 176.266564] [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0 [ 176.266564] [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20 [ 176.266564] [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50 [ 176.266564] [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80 [ 176.266564] [<ffffffff8111d195>] put_pid_ns+0x35/0x50 [ 176.266564] [<ffffffff810c608a>] put_pid+0x4a/0x60 [ 176.266564] [<ffffffff8146b177>] tty_ioctl+0x717/0xc10 [ 176.266564] [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90 [ 176.266564] [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10 [ 176.266564] [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70 [ 176.266564] [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550 [ 176.266564] [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60 [ 176.266564] [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80 [ 176.266564] [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0 [ 176.266564] [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0 [ 176.266564] [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50 [ 176.266564] [<ffffffff81939199>] system_call_fastpath+0x16/0x1b [ 176.266564] ---[ end trace 387af88219ad6143 ]--- It turns out that spin_unlock_bh(proc_inum_lock) is not safe when put_pid is called with another spinlock held and irqs disabled. For now take the easy path and use spin_lock_irqsave(proc_inum_lock) in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* pidns: Stop pid allocation when init diesEric W. Biederman2012-12-264-4/+20
| | | | | | | | | | | | | | | | | | | | | | Oleg pointed out that in a pid namespace the sequence. - pid 1 becomes a zombie - setns(thepidns), fork,... - reaping pid 1. - The injected processes exiting. Can lead to processes attempting access their child reaper and instead following a stale pointer. That waitpid for init can return before all of the processes in the pid namespace have exited is also unfortunate. Avoid these problems by disabling the allocation of new pids in a pid namespace when init dies, instead of when the last process in a pid namespace is reaped. Pointed-out-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* pidns: Outlaw thread creation after unshare(CLONE_NEWPID)Eric W. Biederman2012-12-251-0/+8
| | | | | | | | | | | | | | | | | | The sequence: unshare(CLONE_NEWPID) clone(CLONE_THREAD|CLONE_SIGHAND|CLONE_VM) Creates a new process in the new pid namespace without setting pid_ns->child_reaper. After forking this results in a NULL pointer dereference. Avoid this and other nonsense scenarios that can show up after creating a new pid namespace with unshare by adding a new check in copy_prodcess. Pointed-out-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* Linux 3.8-rc1v3.8-rc1Linus Torvalds2012-12-221-2/+2
|
* Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds2012-12-2219-445/+738
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull watchdog updates from Wim Van Sebroeck: "This includes some fixes and code improvements (like clk_prepare_enable and clk_disable_unprepare), conversion from the omap_wdt and twl4030_wdt drivers to the watchdog framework, addition of the SB8x0 chipset support and the DA9055 Watchdog driver and some OF support for the davinci_wdt driver." * git://www.linux-watchdog.org/linux-watchdog: (22 commits) watchdog: mei: avoid oops in watchdog unregister code path watchdog: Orion: Fix possible null-deference in orion_wdt_probe watchdog: sp5100_tco: Add SB8x0 chipset support watchdog: davinci_wdt: add OF support watchdog: da9052: Fix invalid free of devm_ allocated data watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER watchdog: remove depends on CONFIG_EXPERIMENTAL watchdog: Convert dev_printk(KERN_<LEVEL> to dev_<level>( watchdog: DA9055 Watchdog driver watchdog: omap_wdt: eliminate goto watchdog: omap_wdt: delete redundant platform_set_drvdata() calls watchdog: omap_wdt: convert to devm_ functions watchdog: omap_wdt: convert to new watchdog core watchdog: WatchDog Timer Driver Core: fix comment watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepare watchdog: imx2_wdt: Select the driver via ARCH_MXC watchdog: cpu5wdt.c: add missing del_timer call watchdog: hpwdt.c: Increase version string watchdog: Convert twl4030_wdt to watchdog core davinci_wdt: preparation for switch to common clock framework ...
| * watchdog: mei: avoid oops in watchdog unregister code pathTomas Winkler2012-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit c7d3df3 "mei: use internal watchdog device registration tracking" will crash the kernel on shutdown path on systems where ME watchdog is not present. Since the watchdog was never initialized in such case the WDOG_UNREGISTERED bit is never set and the system crashes on access to uninitialized variables down the path. To solve the issue we query for NULL on watchdog driver driver_data to check whether the device is registered. This is handled in the driver and doesn't depend on watchdog core internals. Cc: Borislav Petkov <bp@alien8.de> Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: Orion: Fix possible null-deference in orion_wdt_probeJason Gunthorpe2012-12-191-0/+2
| | | | | | | | | | | | | | | | If the DT does not include a regs parameter then the null res would be dereferenced. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: sp5100_tco: Add SB8x0 chipset supportTakahisa Tanaka2012-12-192-61/+306
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current sp5100_tco driver only supports SP5100/SB7x0 chipset, doesn't support SB8x0 chipset, because current sp5100_tco driver doesn't know that the offset address for watchdog timer was changed from SB8x0 chipset. The offset address of SP5100 and SB7x0 chipsets are as follows, quote from the AMD SB700/710/750 Register Reference Guide (Page 164) and the AMD SP5100 Register Reference Guide (Page 166). WatchDogTimerControl 69h WatchDogTimerBase0 6Ch WatchDogTimerBase1 6Dh WatchDogTimerBase2 6Eh WatchDogTimerBase3 6Fh In contrast, the offset address of SB8x0 chipset is as follows, quote from AMD SB800-Series Southbridges Register Reference Guide (Page 147). WatchDogTimerEn 48h WatchDogTimerConfig 4Ch So, In the case of SB8x0 chipset, sp5100_tco reads meaningless MMIO address (for example, 0xbafe00) from wrong offset address, and the following message is logged. SP5100 TCO timer: mmio address 0xbafe00 already in use With this patch, sp5100_tco driver supports SB8x0 chipset, and can avoid iomem resource conflict. The processing of this patch is as follows. Step 1) Attempt to get the watchdog base address from indirect I/O (0xCD6 and 0xCD7). - Go to the step 7 if obtained address hasn't conflicted with other resource. But, currently, the address (0xfec000f0) conflicts with the IOAPIC MMIO address, and the following message is logged. SP5100 TCO timer: mmio address 0xfec000f0 already in use 0xfec000f0 is recommended by AMD BIOS Developer's Guide. So, go to the next step. Step 2) Attempt to get the SBResource_MMIO base address from AcpiMmioEN (for SB8x0, PM_Reg:24h) or SBResource_MMIO (SP5100/SB7x0, PCI_Reg:9Ch) register. - Go to the step 7 if these register has enabled by BIOS, and obtained address hasn't conflicted with other resource. - If above condition isn't true, go to the next step. Step 3) Attempt to get the free MMIO address from allocate_resource(). - Go to the step 7 if these register has enabled by BIOS, and obtained address hasn't conflicted with other resource. - Driver initialization has failed if obtained address has conflicted with other resource, and no 'force_addr' parameter is specified. Step 4) Use the specified address If 'force_addr' parameter is specified. - allocate_resource() function may fail, when the PCI bridge device occupies iomem resource from 0xf0000000 to 0xffffffff. To handle such a case, I added 'force_addr' parameter to sp5100_tco driver. With 'force_addr' parameter, sp5100_tco driver directly can assign MMIO address for watchdog timer from free iomem region. Note that It's dangerous to specify wrong address in the 'force_addr' parameter. Example of force_addr parameter use # cat /proc/iomem ...snip... fec00000-fec003ff : IOAPIC 0 <--- free MMIO region fec10000-fec1001f : pnp 00:0b fec20000-fec203ff : IOAPIC 1 ...snip... # cat /etc/modprobe.d/sp5100_tco.conf options sp5100_tco force_addr=0xfec00800 # modprobe sp5100_tco # cat /proc/iomem ...snip... fec00000-fec003ff : IOAPIC 0 fec00800-fec00807 : SP5100 TCO <--- watchdog timer MMIO address fec10000-fec1001f : pnp 00:0b fec20000-fec203ff : IOAPIC 1 ...snip... # - Driver initialization has failed if specified address has conflicted with other resource. Step 5) Disable the watchdog timer - To rewrite the watchdog timer register of the chipset, absolutely guarantee that the watchdog timer is disabled. Step 6) Re-program the watchdog timer MMIO address to chipset. - Re-program the obtained MMIO address in Step 3 or Step 4 to chipset via indirect I/O (0xCD6 and 0xCD7). Step 7) Enable and setup the watchdog timer This patch has worked fine on my test environment (ASUS M4A89GTD-PRO/USB3 and DL165G7). therefore I believe that it's no problem to re-program the MMIO address for watchdog timer to chipset during disabled watchdog. However, I'm not sure about it, because I don't know much about chipset programming. So, any comments will be welcome. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43176 Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl> Tested-by: Paul Menzel <paulepanter@users.sourceforge.net> Signed-off-by: Takahisa Tanaka <mc74hc00@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: davinci_wdt: add OF supportMurali Karicheri2012-12-192-0/+19
| | | | | | | | | | | | | | | | This adds OF support for davinci_wdt driver. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: da9052: Fix invalid free of devm_ allocated dataTushar Behera2012-12-191-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | It is not required to free devm_ allocated data. Since kref_put needs a valid release function, da9052_wdt_release_resources() is not deleted. Fixes following warning. drivers/watchdog/da9052_wdt.c:59:1-6: WARNING: invalid free of devm_ allocated data Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>