summaryrefslogtreecommitdiffstats
path: root/mm
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-04-28 21:55:10 +0200
committerLinus Torvalds <torvalds@linux-foundation.org>2023-05-03 19:37:22 +0200
commit6014bc27561f2cc63e0acc18adbc4ed810834e32 (patch)
treea96499264af22da3c61569f1b8df39ccca8435d9 /mm
parentMerge tag 'pinctrl-v6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/l... (diff)
downloadlinux-6014bc27561f2cc63e0acc18adbc4ed810834e32.tar.xz
linux-6014bc27561f2cc63e0acc18adbc4ed810834e32.zip
x86-64: make access_ok() independent of LAM
The linear address masking (LAM) code made access_ok() more complicated, in that it now needs to untag the address in order to verify the access range. See commit 74c228d20a51 ("x86/uaccess: Provide untagged_addr() and remove tags before address check"). We were able to avoid that overhead in the get_user/put_user code paths by simply using the sign bit for the address check, and depending on the GP fault if the address was non-canonical, which made it all independent of LAM. And we can do the same thing for access_ok(): simply check that the user pointer range has the high bit clear. No need to bother with any address bit masking. In fact, we can go a bit further, and just check the starting address for known small accesses ranges: any accesses that overflow will still be in the non-canonical area and will still GP fault. To still make syzkaller catch any potentially unchecked user addresses, we'll continue to warn about GP faults that are caused by accesses in the non-canonical range. But we'll limit that to purely "high bit set and past the one-page 'slop' area". We could probably just do that "check only starting address" for any arbitrary range size: realistically all kernel accesses to user space will be done starting at the low address. But let's leave that kind of optimization for later. As it is, this already allows us to generate simpler code and not worry about any tag bits in the address. The one thing to look out for is the GUP address check: instead of actually copying data in the virtual address range (and thus bad addresses being caught by the GP fault), GUP will look up the page tables manually. As a result, the page table limits need to be checked, and that was previously implicitly done by the access_ok(). With the relaxed access_ok() check, we need to just do an explicit check for TASK_SIZE_MAX in the GUP code instead. The GUP code already needs to do the tag bit unmasking anyway, so there this is all very straightforward, and there are no LAM issues. Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm')
-rw-r--r--mm/gup.c2
1 files changed, 2 insertions, 0 deletions
diff --git a/mm/gup.c b/mm/gup.c
index ff689c88a357..bbe416236593 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2970,6 +2970,8 @@ static int internal_get_user_pages_fast(unsigned long start,
len = nr_pages << PAGE_SHIFT;
if (check_add_overflow(start, len, &end))
return 0;
+ if (end > TASK_SIZE_MAX)
+ return -EFAULT;
if (unlikely(!access_ok((void __user *)start, len)))
return -EFAULT;