From ba0e3427b03c3d1550239779eca5c1c5a53a2152 Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Sat, 2 Mar 2013 19:14:03 -0800 Subject: userns: Stop oopsing in key_change_session_keyring Dave Jones writes: > Just hit this on Linus' current tree. > > [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 > [ 89.623111] IP: [] commit_creds+0x250/0x2f0 > [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0 > [ 89.624901] Oops: 0000 [#1] PREEMPT SMP > [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii > [ 89.637846] CPU 2 > [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H > [ 89.639850] RIP: 0010:[] [] commit_creds+0x250/0x2f0 > [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207 > [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000 > [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600 > [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000 > [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600 > [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000 > [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000 > [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0 > [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490) > [ 89.654128] Stack: > [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78 > [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000 > [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58 > [ 89.658399] Call Trace: > [ 89.658822] [] key_change_session_keyring+0xfb/0x140 > [ 89.659845] [] task_work_run+0xa5/0xd0 > [ 89.660698] [] do_notify_resume+0x71/0xb0 > [ 89.661581] [] int_signal+0x12/0x17 > [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b > [ 89.667778] RIP [] commit_creds+0x250/0x2f0 > [ 89.668733] RSP > [ 89.669301] CR2: 00000000000000c8 > > My fastest trinity induced oops yet! > > > Appears to be.. > > if ((set_ns == subset_ns->parent) && > 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx > > from the inlined cred_cap_issubset By historical accident we have been reading trying to set new->user_ns from new->user_ns. Which is totally silly as new->user_ns is NULL (as is every other field in new except session_keyring at that point). The intent is clearly to copy all of the fields from old to new so copy old->user_ns into into new->user_ns. Cc: stable@vger.kernel.org Reported-by: Dave Jones Tested-by: Dave Jones Acked-by: Serge Hallyn Signed-off-by: "Eric W. Biederman" --- security/keys/process_keys.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'security') diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index 58dfe0890947..a571fad91010 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -839,7 +839,7 @@ void key_change_session_keyring(struct callback_head *twork) new-> sgid = old-> sgid; new->fsgid = old->fsgid; new->user = get_uid(old->user); - new->user_ns = get_user_ns(new->user_ns); + new->user_ns = get_user_ns(old->user_ns); new->group_info = get_group_info(old->group_info); new->securebits = old->securebits; -- cgit v1.2.3 From 0da9dfdd2cd9889201bc6f6f43580c99165cd087 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 12 Mar 2013 16:44:31 +1100 Subject: keys: fix race with concurrent install_user_keyrings() This fixes CVE-2013-1792. There is a race in install_user_keyrings() that can cause a NULL pointer dereference when called concurrently for the same user if the uid and uid-session keyrings are not yet created. It might be possible for an unprivileged user to trigger this by calling keyctl() from userspace in parallel immediately after logging in. Assume that we have two threads both executing lookup_user_key(), both looking for KEY_SPEC_USER_SESSION_KEYRING. THREAD A THREAD B =============================== =============================== ==>call install_user_keyrings(); if (!cred->user->session_keyring) ==>call install_user_keyrings() ... user->uid_keyring = uid_keyring; if (user->uid_keyring) return 0; <== key = cred->user->session_keyring [== NULL] user->session_keyring = session_keyring; atomic_inc(&key->usage); [oops] At the point thread A dereferences cred->user->session_keyring, thread B hasn't updated user->session_keyring yet, but thread A assumes it is populated because install_user_keyrings() returned ok. The race window is really small but can be exploited if, for example, thread B is interrupted or preempted after initializing uid_keyring, but before doing setting session_keyring. This couldn't be reproduced on a stock kernel. However, after placing systemtap probe on 'user->session_keyring = session_keyring;' that introduced some delay, the kernel could be crashed reliably. Fix this by checking both pointers before deciding whether to return. Alternatively, the test could be done away with entirely as it is checked inside the mutex - but since the mutex is global, that may not be the best way. Signed-off-by: David Howells Reported-by: Mateusz Guzik Cc: Signed-off-by: Andrew Morton Signed-off-by: James Morris --- security/keys/process_keys.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'security') diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index a571fad91010..42defae1e161 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -57,7 +57,7 @@ int install_user_keyrings(void) kenter("%p{%u}", user, uid); - if (user->uid_keyring) { + if (user->uid_keyring && user->session_keyring) { kleave(" = 0 [exist]"); return 0; } -- cgit v1.2.3 From 8aec0f5d4137532de14e6554fd5dd201ff3a3c49 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Mon, 25 Feb 2013 10:20:36 -0500 Subject: Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to compat_process_vm_rw() shows that the compatibility code requires an explicit "access_ok()" check before calling compat_rw_copy_check_uvector(). The same difference seems to appear when we compare fs/read_write.c:do_readv_writev() to fs/compat.c:compat_do_readv_writev(). This subtle difference between the compat and non-compat requirements should probably be debated, as it seems to be error-prone. In fact, there are two others sites that use this function in the Linux kernel, and they both seem to get it wrong: Now shifting our attention to fs/aio.c, we see that aio_setup_iocb() also ends up calling compat_rw_copy_check_uvector() through aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to be missing. Same situation for security/keys/compat.c:compat_keyctl_instantiate_key_iov(). I propose that we add the access_ok() check directly into compat_rw_copy_check_uvector(), so callers don't have to worry about it, and it therefore makes the compat call code similar to its non-compat counterpart. Place the access_ok() check in the same location where copy_from_user() can trigger a -EFAULT error in the non-compat code, so the ABI behaviors are alike on both compat and non-compat. While we are here, fix compat_do_readv_writev() so it checks for compat_rw_copy_check_uvector() negative return values. And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error handling. Acked-by: Linus Torvalds Acked-by: Al Viro Signed-off-by: Mathieu Desnoyers Signed-off-by: Linus Torvalds --- fs/compat.c | 15 +++++++-------- mm/process_vm_access.c | 8 -------- security/keys/compat.c | 4 ++-- 3 files changed, 9 insertions(+), 18 deletions(-) (limited to 'security') diff --git a/fs/compat.c b/fs/compat.c index fe40fde29111..d487985dd0ea 100644 --- a/fs/compat.c +++ b/fs/compat.c @@ -558,6 +558,10 @@ ssize_t compat_rw_copy_check_uvector(int type, } *ret_pointer = iov; + ret = -EFAULT; + if (!access_ok(VERIFY_READ, uvector, nr_segs*sizeof(*uvector))) + goto out; + /* * Single unix specification: * We should -EINVAL if an element length is not >= 0 and fitting an @@ -1080,17 +1084,12 @@ static ssize_t compat_do_readv_writev(int type, struct file *file, if (!file->f_op) goto out; - ret = -EFAULT; - if (!access_ok(VERIFY_READ, uvector, nr_segs*sizeof(*uvector))) - goto out; - - tot_len = compat_rw_copy_check_uvector(type, uvector, nr_segs, + ret = compat_rw_copy_check_uvector(type, uvector, nr_segs, UIO_FASTIOV, iovstack, &iov); - if (tot_len == 0) { - ret = 0; + if (ret <= 0) goto out; - } + tot_len = ret; ret = rw_verify_area(type, file, pos, tot_len); if (ret < 0) goto out; diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c index 926b46649749..fd26d0433509 100644 --- a/mm/process_vm_access.c +++ b/mm/process_vm_access.c @@ -429,12 +429,6 @@ compat_process_vm_rw(compat_pid_t pid, if (flags != 0) return -EINVAL; - if (!access_ok(VERIFY_READ, lvec, liovcnt * sizeof(*lvec))) - goto out; - - if (!access_ok(VERIFY_READ, rvec, riovcnt * sizeof(*rvec))) - goto out; - if (vm_write) rc = compat_rw_copy_check_uvector(WRITE, lvec, liovcnt, UIO_FASTIOV, iovstack_l, @@ -459,8 +453,6 @@ free_iovecs: kfree(iov_r); if (iov_l != iovstack_l) kfree(iov_l); - -out: return rc; } diff --git a/security/keys/compat.c b/security/keys/compat.c index 1c261763f479..d65fa7fa29ba 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -40,12 +40,12 @@ static long compat_keyctl_instantiate_key_iov( ARRAY_SIZE(iovstack), iovstack, &iov); if (ret < 0) - return ret; + goto err; if (ret == 0) goto no_payload_free; ret = keyctl_instantiate_key_common(id, iov, ioc, ret, ringid); - +err: if (iov != iovstack) kfree(iov); return ret; -- cgit v1.2.3 From 4502403dcf8f5c76abd4dbab8726c8e4ecb5cd34 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Sat, 16 Mar 2013 12:48:11 +0300 Subject: selinux: use GFP_ATOMIC under spin_lock The call tree here is: sk_clone_lock() <- takes bh_lock_sock(newsk); xfrm_sk_clone_policy() __xfrm_sk_clone_policy() clone_policy() <- uses GFP_ATOMIC for allocations security_xfrm_policy_clone() security_ops->xfrm_policy_clone_security() selinux_xfrm_policy_clone() Signed-off-by: Dan Carpenter Cc: stable@kernel.org Signed-off-by: James Morris --- security/selinux/xfrm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'security') diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c index 48665ecd1197..8ab295154517 100644 --- a/security/selinux/xfrm.c +++ b/security/selinux/xfrm.c @@ -310,7 +310,7 @@ int selinux_xfrm_policy_clone(struct xfrm_sec_ctx *old_ctx, if (old_ctx) { new_ctx = kmalloc(sizeof(*old_ctx) + old_ctx->ctx_len, - GFP_KERNEL); + GFP_ATOMIC); if (!new_ctx) return -ENOMEM; -- cgit v1.2.3 From eddc0a3abff273842a94784d2d022bbc36dc9015 Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Thu, 21 Mar 2013 02:30:41 -0700 Subject: yama: Better permission check for ptraceme Change the permission check for yama_ptrace_ptracee to the standard ptrace permission check, testing if the traceer has CAP_SYS_PTRACE in the tracees user namespace. Reviewed-by: Kees Cook Signed-off-by: "Eric W. Biederman" --- security/yama/yama_lsm.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) (limited to 'security') diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c index 23414b93771f..13c88fbcf037 100644 --- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -347,10 +347,8 @@ int yama_ptrace_traceme(struct task_struct *parent) /* Only disallow PTRACE_TRACEME on more aggressive settings. */ switch (ptrace_scope) { case YAMA_SCOPE_CAPABILITY: - rcu_read_lock(); - if (!ns_capable(__task_cred(parent)->user_ns, CAP_SYS_PTRACE)) + if (!has_ns_capability(parent, current_user_ns(), CAP_SYS_PTRACE)) rc = -EPERM; - rcu_read_unlock(); break; case YAMA_SCOPE_NO_ATTACH: rc = -EPERM; -- cgit v1.2.3