summaryrefslogtreecommitdiffstats
path: root/drivers/android (follow)
Commit message (Collapse)AuthorAgeFilesLines
* binder: modify the comment for binder_proc_unlockBa Jing2024-09-111-1/+1
| | | | | | | | | | | Modify the comment for binder_proc_unlock() to clearly indicate which spinlock it releases and to better match the acquire comment block in binder_proc_lock(). Signed-off-by: Ba Jing <bajing@cmss.chinamobile.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240902052330.3115-1-bajing@cmss.chinamobile.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge 6.11-rc7 into char-misc-nextGreg Kroah-Hartman2024-09-091-0/+1
|\ | | | | | | | | | | We need the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix UAF caused by offsets overwriteCarlos Llamas2024-09-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Binder objects are processed and copied individually into the target buffer during transactions. Any raw data in-between these objects is copied as well. However, this raw data copy lacks an out-of-bounds check. If the raw data exceeds the data section size then the copy overwrites the offsets section. This eventually triggers an error that attempts to unwind the processed objects. However, at this point the offsets used to index these objects are now corrupted. Unwinding with corrupted offsets can result in decrements of arbitrary nodes and lead to their premature release. Other users of such nodes are left with a dangling pointer triggering a use-after-free. This issue is made evident by the following KASAN report (trimmed): ================================================================== BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xe4/0x19c Write of size 4 at addr ffff47fc91598f04 by task binder-util/743 CPU: 9 UID: 0 PID: 743 Comm: binder-util Not tainted 6.11.0-rc4 #1 Hardware name: linux,dummy-virt (DT) Call trace: _raw_spin_lock+0xe4/0x19c binder_free_buf+0x128/0x434 binder_thread_write+0x8a4/0x3260 binder_ioctl+0x18f0/0x258c [...] Allocated by task 743: __kmalloc_cache_noprof+0x110/0x270 binder_new_node+0x50/0x700 binder_transaction+0x413c/0x6da8 binder_thread_write+0x978/0x3260 binder_ioctl+0x18f0/0x258c [...] Freed by task 745: kfree+0xbc/0x208 binder_thread_read+0x1c5c/0x37d4 binder_ioctl+0x16d8/0x258c [...] ================================================================== To avoid this issue, let's check that the raw data copy is within the boundaries of the data section. Fixes: 6d98eb95b450 ("binder: avoid potential data leakage when copying txn") Cc: Todd Kjos <tkjos@google.com> Cc: stable@vger.kernel.org Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240822182353.2129600-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | binder: fix typo in commentRuffalo Lavoisier2024-09-031-1/+1
| | | | | | | | | | | | | | | | | | Correct spelling on 'currently' in comment Signed-off-by: Ruffalo Lavoisier <RuffaloLavoisier@gmail.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240902130732.46698-1-RuffaloLavoisier@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge 6.11-rc3 into char-misc-nextGreg Kroah-Hartman2024-08-123-25/+14
|\| | | | | | | | | | | We need the char/misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder_alloc: Fix sleeping function called from invalid contextMukesh Ojha2024-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 36c55ce8703c ("binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues") introduced schedule while atomic issue. [ 2689.152635][ T4275] BUG: sleeping function called from invalid context at mm/vmalloc.c:2847 [ 2689.161291][ T4275] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 4275, name: kworker/1:140 [ 2689.170708][ T4275] preempt_count: 1, expected: 0 [ 2689.175572][ T4275] RCU nest depth: 0, expected: 0 [ 2689.180521][ T4275] INFO: lockdep is turned off. [ 2689.180523][ T4275] Preemption disabled at: [ 2689.180525][ T4275] [<ffffffe031f2a2dc>] binder_alloc_deferred_release+0x2c/0x388 .. .. [ 2689.213419][ T4275] __might_resched+0x174/0x178 [ 2689.213423][ T4275] __might_sleep+0x48/0x7c [ 2689.213426][ T4275] vfree+0x4c/0x15c [ 2689.213430][ T4275] kvfree+0x24/0x44 [ 2689.213433][ T4275] binder_alloc_deferred_release+0x2c0/0x388 [ 2689.213436][ T4275] binder_proc_dec_tmpref+0x15c/0x2a8 [ 2689.213440][ T4275] binder_deferred_func+0xa8/0x8ec [ 2689.213442][ T4275] process_one_work+0x254/0x59c [ 2689.213447][ T4275] worker_thread+0x274/0x3ec [ 2689.213450][ T4275] kthread+0x110/0x134 [ 2689.213453][ T4275] ret_from_fork+0x10/0x20 Fix it by moving the place of kvfree outside of spinlock context. Fixes: 36c55ce8703c ("binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues") Acked-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> Link: https://lore.kernel.org/r/20240725062510.2856662-1-quic_mojha@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix descriptor lookup for context managerCarlos Llamas2024-07-312-24/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 15d9da3f818c ("binder: use bitmap for faster descriptor lookup"), it was incorrectly assumed that references to the context manager node should always get descriptor zero assigned to them. However, if the context manager dies and a new process takes its place, then assigning descriptor zero to the new context manager might lead to collisions, as there could still be references to the older node. This issue was reported by syzbot with the following trace: kernel BUG at drivers/android/binder.c:1173! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 447 Comm: binder-util Not tainted 6.10.0-rc6-00348-g31643d84b8c3 #10 Hardware name: linux,dummy-virt (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : binder_inc_ref_for_node+0x500/0x544 lr : binder_inc_ref_for_node+0x1e4/0x544 sp : ffff80008112b940 x29: ffff80008112b940 x28: ffff0e0e40310780 x27: 0000000000000000 x26: 0000000000000001 x25: ffff0e0e40310738 x24: ffff0e0e4089ba34 x23: ffff0e0e40310b00 x22: ffff80008112bb50 x21: ffffaf7b8f246970 x20: ffffaf7b8f773f08 x19: ffff0e0e4089b800 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 000000002de4aa60 x14: 0000000000000000 x13: 2de4acf000000000 x12: 0000000000000020 x11: 0000000000000018 x10: 0000000000000020 x9 : ffffaf7b90601000 x8 : ffff0e0e48739140 x7 : 0000000000000000 x6 : 000000000000003f x5 : ffff0e0e40310b28 x4 : 0000000000000000 x3 : ffff0e0e40310720 x2 : ffff0e0e40310728 x1 : 0000000000000000 x0 : ffff0e0e40310710 Call trace: binder_inc_ref_for_node+0x500/0x544 binder_transaction+0xf68/0x2620 binder_thread_write+0x5bc/0x139c binder_ioctl+0xef4/0x10c8 [...] This patch adds back the previous behavior of assigning the next non-zero descriptor if references to previous context managers still exist. It amends both strategies, the newer dbitmap code and also the legacy slow_desc_lookup_olocked(), by allowing them to start looking for available descriptors at a given offset. Fixes: 15d9da3f818c ("binder: use bitmap for faster descriptor lookup") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+3dae065ca76952a67257@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000c1c0a0061d1e6979@google.com/ Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240722150512.4192473-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | binder: frozen notification binder_features flagYu-Ting Tseng2024-07-311-0/+8
| | | | | | | | | | | | | | | | | | | | Add a flag to binder_features to indicate that the freeze notification feature is available. Signed-off-by: Yu-Ting Tseng <yutingtseng@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240709070047.4055369-6-yutingtseng@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | binder: frozen notificationYu-Ting Tseng2024-07-312-4/+301
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Frozen processes present a significant challenge in binder transactions. When a process is frozen, it cannot, by design, accept and/or respond to binder transactions. As a result, the sender needs to adjust its behavior, such as postponing transactions until the peer process unfreezes. However, there is currently no way to subscribe to these state change events, making it impossible to implement frozen-aware behaviors efficiently. Introduce a binder API for subscribing to frozen state change events. This allows programs to react to changes in peer process state, mitigating issues related to binder transactions sent to frozen processes. Implementation details: For a given binder_ref, the state of frozen notification can be one of the followings: 1. Userspace doesn't want a notification. binder_ref->freeze is null. 2. Userspace wants a notification but none is in flight. list_empty(&binder_ref->freeze->work.entry) = true 3. A notification is in flight and waiting to be read by userspace. binder_ref_freeze.sent is false. 4. A notification was read by userspace and kernel is waiting for an ack. binder_ref_freeze.sent is true. When a notification is in flight, new state change events are coalesced into the existing binder_ref_freeze struct. If userspace hasn't picked up the notification yet, the driver simply rewrites the state. Otherwise, the notification is flagged as requiring a resend, which will be performed once userspace acks the original notification that's inflight. See https://r.android.com/3070045 for how userspace is going to use this feature. Signed-off-by: Yu-Ting Tseng <yutingtseng@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240709070047.4055369-4-yutingtseng@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder: fix hang of unregistered readersCarlos Llamas2024-07-121-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of binder_available_for_proc_work_ilocked() in commit 1b77e9dcc3da ("ANDROID: binder: remove proc waitqueue") a binder thread can only "wait_for_proc_work" after its thread->looper has been marked as BINDER_LOOPER_STATE_{ENTERED|REGISTERED}. This means an unregistered reader risks waiting indefinitely for work since it never gets added to the proc->waiting_threads. If there are no further references to its waitqueue either the task will hang. The same applies to readers using the (e)poll interface. I couldn't find the rationale behind this restriction. So this patch restores the previous behavior of allowing unregistered threads to "wait_for_proc_work". Note that an error message for this scenario, which had previously become unreachable, is now re-enabled. Fixes: 1b77e9dcc3da ("ANDROID: binder: remove proc waitqueue") Cc: stable@vger.kernel.org Cc: Martijn Coenen <maco@google.com> Cc: Arve Hjønnevåg <arve@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240711201452.2017543-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issuesLei Liu2024-07-031-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In binder_alloc, there is a frequent need for order3 memory allocation, especially on small-memory mobile devices, which can lead to OOM and cause foreground applications to be killed, resulting in flashbacks. We use kvcalloc to allocate memory, which can reduce system OOM occurrences, as well as decrease the time and probability of failure for order3 memory allocations. Additionally, It has little impact on the throughput of the binder. (as verified by Google's binder_benchmark testing tool). We have conducted multiple tests on an 8GB memory phone, kvcalloc has little performance degradation and resolves frequent OOM issues, Below is a partial excerpt of the test data. throughput(TH_PUT) = (size * Iterations)/Time kcalloc->kvcalloc: Sample with kcalloc(): adb shell stop/ kcalloc /8+256G --------------------------------------------------------------------- Benchmark Time CPU Iterations TH-PUT TH-PUTCPU (ns) (ns) (GB/s) (GB/s) --------------------------------------------------------------------- BM_sendVec_binder4 39126 18550 38894 3.976282 8.38684 BM_sendVec_binder8 38924 18542 37786 7.766108 16.3028 BM_sendVec_binder16 38328 18228 36700 15.32039 32.2141 BM_sendVec_binder32 38154 18215 38240 32.07213 67.1798 BM_sendVec_binder64 39093 18809 36142 59.16885 122.977 BM_sendVec_binder128 40169 19188 36461 116.1843 243.2253 BM_sendVec_binder256 40695 19559 35951 226.1569 470.5484 BM_sendVec_binder512 41446 20211 34259 423.2159 867.8743 BM_sendVec_binder1024 44040 22939 28904 672.0639 1290.278 BM_sendVec_binder2048 47817 25821 26595 1139.063 2109.393 BM_sendVec_binder4096 54749 30905 22742 1701.423 3014.115 BM_sendVec_binder8192 68316 42017 16684 2000.634 3252.858 BM_sendVec_binder16384 95435 64081 10961 1881.752 2802.469 BM_sendVec_binder32768 148232 107504 6510 1439.093 1984.295 BM_sendVec_binder65536 326499 229874 3178 637.8991 906.0329 NORAML TEST SUM 10355.79 17188.15 stressapptest eat 2G SUM 10088.39 16625.97 Sample with kvcalloc(): adb shell stop/ kvcalloc /8+256G ---------------------------------------------------------------------- Benchmark Time CPU Iterations TH-PUT TH-PUTCPU (ns) (ns) (GB/s) (GB/s) ---------------------------------------------------------------------- BM_sendVec_binder4 39673 18832 36598 3.689965 7.773577 BM_sendVec_binder8 39869 18969 37188 7.462038 15.68369 BM_sendVec_binder16 39774 18896 36627 14.73405 31.01355 BM_sendVec_binder32 40225 19125 36995 29.43045 61.90013 BM_sendVec_binder64 40549 19529 35148 55.47544 115.1862 BM_sendVec_binder128 41580 19892 35384 108.9262 227.6871 BM_sendVec_binder256 41584 20059 34060 209.6806 434.6857 BM_sendVec_binder512 42829 20899 32493 388.4381 796.0389 BM_sendVec_binder1024 45037 23360 29251 665.0759 1282.236 BM_sendVec_binder2048 47853 25761 27091 1159.433 2153.735 BM_sendVec_binder4096 55574 31745 22405 1651.328 2890.877 BM_sendVec_binder8192 70706 43693 16400 1900.105 3074.836 BM_sendVec_binder16384 96161 64362 10793 1838.921 2747.468 BM_sendVec_binder32768 147875 107292 6296 1395.147 1922.858 BM_sendVec_binder65536 330324 232296 3053 605.7126 861.3209 NORAML TEST SUM 10033.56 16623.35 stressapptest eat 2G SUM 9958.43 16497.55 Signed-off-by: Lei Liu <liulei.rjpt@vivo.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240619113841.3362-1-liulei.rjpt@vivo.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder: use bitmap for faster descriptor lookupCarlos Llamas2024-07-033-14/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating new binder references, the driver assigns a descriptor id that is shared with userspace. Regrettably, the driver needs to keep the descriptors small enough to accommodate userspace potentially using them as Vector indexes. Currently, the driver performs a linear search on the rb-tree of references to find the smallest available descriptor id. This approach, however, scales poorly as the number of references grows. This patch introduces the usage of bitmaps to boost the performance of descriptor assignments. This optimization results in notable performance gains, particularly in processes with a large number of references. The following benchmark with 100,000 references showcases the difference in latency between the dbitmap implementation and the legacy approach: [ 587.145098] get_ref_desc_olocked: 15us (dbitmap on) [ 602.788623] get_ref_desc_olocked: 47343us (dbitmap off) Note the bitmap size is dynamically adjusted in line with the number of references, ensuring efficient memory usage. In cases where growing the bitmap is not possible, the driver falls back to the slow legacy method. A previous attempt to solve this issue was proposed in [1]. However, such method involved adding new ioctls which isn't great, plus older userspace code would not have benefited from the optimizations either. Link: https://lore.kernel.org/all/20240417191418.1341988-1-cmllamas@google.com/ [1] Cc: Tim Murray <timmurray@google.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Martijn Coenen <maco@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: John Stultz <jstultz@google.com> Cc: Steven Moreland <smoreland@google.com> Suggested-by: Nick Chen <chenjia3@oppo.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240612042535.1556708-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder: fix max_thread type inconsistencyCarlos Llamas2024-05-042-2/+2
| | | | | | | | | | | | | | | The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from size_t to __u32 in order to avoid incompatibility issues between 32 and 64-bit kernels. However, the internal types used to copy from user and store the value were never updated. Use u32 to fix the inconsistency. Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration") Reported-by: Arve Hjønnevåg <arve@android.com> Cc: stable@vger.kernel.org Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240421173750.3117808-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* binder: check offset alignment in binder_get_object()Carlos Llamas2024-04-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | Commit 6d98eb95b450 ("binder: avoid potential data leakage when copying txn") introduced changes to how binder objects are copied. In doing so, it unintentionally removed an offset alignment check done through calls to binder_alloc_copy_from_buffer() -> check_buffer(). These calls were replaced in binder_get_object() with copy_from_user(), so now an explicit offset alignment check is needed here. This avoids later complications when unwinding the objects gets harder. It is worth noting this check existed prior to commit 7a67a39320df ("binder: add function to copy binder object from buffer"), likely removed due to redundancy at the time. Fixes: 6d98eb95b450 ("binder: avoid potential data leakage when copying txn") Cc: stable@vger.kernel.org Signed-off-by: Carlos Llamas <cmllamas@google.com> Acked-by: Todd Kjos <tkjos@google.com> Link: https://lore.kernel.org/r/20240330190115.1877819-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'char-misc-6.9-rc1' of ↵Linus Torvalds2024-03-211-2/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver subsystem updates from Greg KH: "Here is the big set of char/misc and a number of other driver subsystem updates for 6.9-rc1. Included in here are: - IIO driver updates, loads of new ones and evolution of existing ones - coresight driver updates - const cleanups for many driver subsystems - speakup driver additions - platform remove callback void cleanups - mei driver updates - mhi driver updates - cdx driver updates for MSI interrupt handling - nvmem driver updates - other smaller driver updates and cleanups, full details in the shortlog All of these have been in linux-next for a long time with no reported issue, other than a build warning for the speakup driver" The build warning hits clang and is a gcc (and C23) extension, and is fixed up in the merge. Link: https://lore.kernel.org/all/20240321134831.GA2762840@dev-arch.thelio-3990X/ * tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (279 commits) binder: remove redundant variable page_addr uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion uio_pruss: UIO_MEM_DMA_COHERENT conversion cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT uio: introduce UIO_MEM_DMA_COHERENT type cdx: add MSI support for CDX bus pps: use cflags-y instead of EXTRA_CFLAGS speakup: Add /dev/synthu device speakup: Fix 8bit characters from direct synth parport: sunbpp: Convert to platform remove callback returning void parport: amiga: Convert to platform remove callback returning void char: xillybus: Convert to platform remove callback returning void vmw_balloon: change maintainership MAINTAINERS: change the maintainer for hpilo driver char: xilinx_hwicap: Fix NULL vs IS_ERR() bug hpet: remove hpets::hp_clocksource platform: goldfish: move the separate 'default' propery for CONFIG_GOLDFISH char: xilinx_hwicap: drop casting to void in dev_set_drvdata greybus: move is_gb_* functions out of greybus.h greybus: Remove usage of the deprecated ida_simple_xx() API ...
| * binder: remove redundant variable page_addrColin Ian King2024-03-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Variable page_addr is being assigned a value that is never read. The variable is redundant and can be removed. Cleans up clang scan build warning: warning: Value stored to 'page_addr' is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@intel.com> Fixes: 162c79731448 ("binder: avoid user addresses in debug logs") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312060851.cudv98wG-lkp@intel.com/ Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240307221505.101431-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | binder: use of hlist_count_nodes()Pierre Gondois2024-02-231-3/+1
|/ | | | | | | | | | | | | | | | | | | | | | | Make use of the newly added hlist_count_nodes(). Link: https://lkml.kernel.org/r/20240104164937.424320-3-pierre.gondois@arm.com Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Carlos Llamas <cmllamas@google.com> Acked-by: Marco Elver <elver@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Martijn Coenen <maco@android.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* binder: signal epoll threads of self-workCarlos Llamas2024-01-311-0/+10
| | | | | | | | | | | | | | | | | | | | | | In (e)poll mode, threads often depend on I/O events to determine when data is ready for consumption. Within binder, a thread may initiate a command via BINDER_WRITE_READ without a read buffer and then make use of epoll_wait() or similar to consume any responses afterwards. It is then crucial that epoll threads are signaled via wakeup when they queue their own work. Otherwise, they risk waiting indefinitely for an event leaving their work unhandled. What is worse, subsequent commands won't trigger a wakeup either as the thread has pending work. Fixes: 457b9a6f09f0 ("Staging: android: add binder driver") Cc: Arve Hjønnevåg <arve@android.com> Cc: Martijn Coenen <maco@android.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Steven Moreland <smoreland@google.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240131215347.1808751-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'char-misc-6.8-rc1' of ↵Linus Torvalds2024-01-186-475/+494
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver updates from Greg KH: "Here is the big set of char/misc and other driver subsystem changes for 6.8-rc1. Other than lots of binder driver changes (as you can see by the merge conflicts) included in here are: - lots of iio driver updates and additions - spmi driver updates - eeprom driver updates - firmware driver updates - ocxl driver updates - mhi driver updates - w1 driver updates - nvmem driver updates - coresight driver updates - platform driver remove callback api changes - tags.sh script updates - bus_type constant marking cleanups - lots of other small driver updates All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (341 commits) android: removed duplicate linux/errno uio: Fix use-after-free in uio_open drivers: soc: xilinx: add check for platform firmware: xilinx: Export function to use in other module scripts/tags.sh: remove find_sources scripts/tags.sh: use -n to test archinclude scripts/tags.sh: add local annotation scripts/tags.sh: use more portable -path instead of -wholename scripts/tags.sh: Update comment (addition of gtags) firmware: zynqmp: Convert to platform remove callback returning void firmware: turris-mox-rwtm: Convert to platform remove callback returning void firmware: stratix10-svc: Convert to platform remove callback returning void firmware: stratix10-rsu: Convert to platform remove callback returning void firmware: raspberrypi: Convert to platform remove callback returning void firmware: qemu_fw_cfg: Convert to platform remove callback returning void firmware: mtk-adsp-ipc: Convert to platform remove callback returning void firmware: imx-dsp: Convert to platform remove callback returning void firmware: coreboot_table: Convert to platform remove callback returning void firmware: arm_scpi: Convert to platform remove callback returning void firmware: arm_scmi: Convert to platform remove callback returning void ...
| * android: removed duplicate linux/errnoTanzir Hasan2024-01-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | There are two linux/errno.h inclusions in this file. The second one has been removed and the file builds correctly. Fixes: 54ffdab82080 ("android: binder: binderfs.c: removed asm-generic/errno-base.h") Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Tanzir Hasan <tanzirh@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240104-removeduperror-v1-1-d170d4b3675a@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * android: binder: binderfs.c: removed asm-generic/errno-base.hTanzir Hasan2024-01-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | asm-generic/errno-base.h can be replaced by linux/errno.h and the file will still build correctly. It is an asm-generic file which should be avoided if possible. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tanzir Hasan <tanzirh@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231226-binderfs-v1-1-66829e92b523@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * android: binder: fix a kernel-doc enum warningRandy Dunlap2023-12-061-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add kernel-doc notation for @LOOP_END to prevent a kernel-doc warning. binder_alloc_selftest.c:76: warning: Enum value 'LOOP_END' not described in enum 'buf_end_align_type' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arve Hjønnevåg <arve@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: Martijn Coenen <maco@android.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Christian Brauner <christian@brauner.io> Cc: Carlos Llamas <cmllamas@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231205225324.32362-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: switch alloc->mutex to spinlock_tCarlos Llamas2023-12-052-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The alloc->mutex is a highly contended lock that causes performance issues on Android devices. When a low-priority task is given this lock and it sleeps, it becomes difficult for the task to wake up and complete its work. This delays other tasks that are also waiting on the mutex. The problem gets worse when there is memory pressure in the system, because this increases the contention on the alloc->mutex while the shrinker reclaims binder pages. Switching to a spinlock helps to keep the waiters running and avoids the overhead of waking up tasks. This significantly improves the transaction latency when the problematic scenario occurs. The performance impact of this patchset was measured by stress-testing the binder alloc contention. In this test, several clients of different priorities send thousands of transactions of different sizes to a single server. In parallel, pages get reclaimed using the shinker's debugfs. The test was run on a Pixel 8, Pixel 6 and qemu machine. The results were similar on all three devices: after: | sched | prio | average | max | min | |--------+------+---------+-----------+---------| | fifo | 99 | 0.135ms | 1.197ms | 0.022ms | | fifo | 01 | 0.136ms | 5.232ms | 0.018ms | | other | -20 | 0.180ms | 7.403ms | 0.019ms | | other | 19 | 0.241ms | 58.094ms | 0.018ms | before: | sched | prio | average | max | min | |--------+------+---------+-----------+---------| | fifo | 99 | 0.350ms | 248.730ms | 0.020ms | | fifo | 01 | 0.357ms | 248.817ms | 0.024ms | | other | -20 | 0.399ms | 249.906ms | 0.020ms | | other | 19 | 0.477ms | 297.756ms | 0.022ms | The key metrics above are the average and max latencies (wall time). These improvements should roughly translate to p95-p99 latencies on real workloads. The response time is up to 200x faster in these scenarios and there is no penalty in the regular path. Note that it is only possible to convert this lock after a series of changes made by previous patches. These mainly include refactoring the sections that might_sleep() and changing the locking order with the mmap_lock amongst others. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: reverse locking order in shrinker callbackCarlos Llamas2023-12-051-24/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The locking order currently requires the alloc->mutex to be acquired first followed by the mmap lock. However, the alloc->mutex is converted into a spinlock in subsequent commits so the order needs to be reversed to avoid nesting the sleeping mmap lock under the spinlock. The shrinker's callback binder_alloc_free_page() is the only place that needs to be reordered since other functions have been refactored and no longer nest these locks. Some minor cosmetic changes are also included in this patch. Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-28-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: avoid user addresses in debug logsCarlos Llamas2023-12-052-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | Prefer logging vma offsets instead of addresses or simply drop the debug log altogether if not useful. Note this covers the instances affected by the switch to store addresses as unsigned long. However, there are other sections in the driver that could do the same. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-27-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: refactor binder_delete_free_buffer()Carlos Llamas2023-12-051-33/+11
| | | | | | | | | | | | | | | | | | | | Skip the freelist call immediately as needed, instead of continuing the pointless checks. Also, drop the debug logs that we don't really need. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-26-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: collapse print_binder_buffer() into callerCarlos Llamas2023-12-051-13/+9
| | | | | | | | | | | | | | | | | | | | | | | | The code in print_binder_buffer() is quite small so it can be collapsed into its single caller binder_alloc_print_allocated(). No functional change in this patch. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-25-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: document the final page calculationCarlos Llamas2023-12-051-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to determine the page range for binder_lru_freelist_del() is quite obscure. It leverages the buffer_size calculated before doing an oversized buffer split. This is used to figure out if the last page is being shared with another active buffer. If so, the page gets trimmed out of the range as it has been previously removed from the freelist. This would be equivalent to getting the start page of the next in-use buffer explicitly. However, the code for this is much larger as we can see in binder_free_buf_locked() routine. Instead, lets settle on documenting the tricky step and using better names for now. I believe an ideal solution would be to count the binder_page->users to determine when a page should be added or removed from the freelist. However, this is a much bigger change than what I'm willing to risk at this time. Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-24-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: rename lru shrinker utilitiesCarlos Llamas2023-12-053-25/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the page allocation step is done separately we should rename the binder_free_page_range() and binder_allocate_page_range() functions to provide a more accurate description of what they do. Lets borrow the freelist concept used in other parts of the kernel for this. No functional change here. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-23-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: make oversized buffer code more readableCarlos Llamas2023-12-051-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | The sections in binder_alloc_new_buf_locked() dealing with oversized buffers are scattered which makes them difficult to read. Instead, consolidate this code into a single block to improve readability. No functional change here. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-22-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: remove redundant debug logCarlos Llamas2023-12-051-3/+0
| | | | | | | | | | | | | | | | | | | | The debug information in this statement is already logged earlier in the same function. We can get rid of this duplicate log. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-21-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: perform page installation outside of locksCarlos Llamas2023-12-051-28/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split out the insertion of pages to be outside of the alloc->mutex in a separate binder_install_buffer_pages() routine. Since this is no longer serialized, we must look at the full range of pages used by the buffers. The installation is protected with mmap_sem in write mode since multiple tasks might race to install the same page. Besides avoiding unnecessary nested locking this helps in preparation of switching the alloc->mutex into a spinlock_t in subsequent patches. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-20-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: initialize lru pages in mmap callbackCarlos Llamas2023-12-051-5/+7
| | | | | | | | | | | | | | | | | | | | | | Rather than repeatedly initializing some of the binder_lru_page members during binder_alloc_new_buf(), perform this initialization just once in binder_alloc_mmap_handler(), after the pages have been created. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-19-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: malloc new_buffer outside of locksCarlos Llamas2023-12-051-21/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | Preallocate new_buffer before acquiring the alloc->mutex and hand it down to binder_alloc_new_buf_locked(). The new buffer will be used in the vast majority of requests (measured at 98.2% in field data). The buffer is discarded otherwise. This change is required in preparation for transitioning alloc->mutex into a spinlock in subsequent commits. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-18-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: refactor page range allocationCarlos Llamas2023-12-051-60/+47
| | | | | | | | | | | | | | | | | | | | | | Instead of looping through the page range twice to first determine if the mmap lock is required, simply do it per-page as needed. Split out all this logic into a separate binder_install_single_page() function. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-17-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: relocate binder_alloc_clear_buf()Carlos Llamas2023-12-051-63/+61
| | | | | | | | | | | | | | | | | | | | | | | | Move this function up along with binder_alloc_get_page() so that their prototypes aren't necessary. No functional change in this patch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-16-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: relocate low space calculationCarlos Llamas2023-12-051-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the low async space calculation to debug_low_async_space_locked(). This logic not only fits better here but also offloads some of the many tasks currently done in binder_alloc_new_buf_locked(). No functional change in this patch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-15-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: separate the no-space debugging logicCarlos Llamas2023-12-051-31/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the no-space debugging logic into a separate function. Lets also mark this branch as unlikely in binder_alloc_new_buf_locked() as most requests will fit without issue. Also add a few cosmetic changes and suggestions from checkpatch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-14-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: remove pid param in binder_alloc_new_buf()Carlos Llamas2023-12-054-17/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | Binder attributes the buffer allocation to the current->tgid everytime. There is no need to pass this as a parameter so drop it. Also add a few touchups to follow the coding guidelines. No functional changes are introduced in this patch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-13-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: do unlocked work in binder_alloc_new_buf()Carlos Llamas2023-12-051-39/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extract non-critical sections from binder_alloc_new_buf_locked() that don't require holding the alloc->mutex. While we are here, consolidate the checks for size overflow and zero-sized padding into a separate sanitized_size() helper function. Also add a few touchups to follow the coding guidelines. Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-12-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: split up binder_update_page_range()Carlos Llamas2023-12-051-39/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The binder_update_page_range() function performs both allocation and freeing of binder pages. However, these two operations are unrelated and have no common logic. In fact, when a free operation is requested, the allocation logic is skipped entirely. This behavior makes the error path unnecessarily complex. To improve readability of the code, this patch splits the allocation and freeing operations into separate functions. No functional changes are introduced by this patch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-11-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: keep vma addresses type as unsigned longCarlos Llamas2023-12-055-69/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vma addresses in binder are currently stored as void __user *. This requires casting back and forth between the mm/ api which uses unsigned long. Since we also do internal arithmetic on these addresses we end up having to cast them _again_ to an integer type. Lets stop all the unnecessary casting which kills code readability and store the virtual addresses as the native unsigned long from mm/. Note that this approach is preferred over uintptr_t as Linus explains in [1]. Opportunistically add a few cosmetic touchups. Link: https://lore.kernel.org/all/CAHk-=wj2OHy-5e+srG1fy+ZU00TmZ1NFp6kFLbVLMXHe7A1d-g@mail.gmail.com/ [1] Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-10-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: remove extern from function prototypesCarlos Llamas2023-12-051-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | The kernel coding style does not require 'extern' in function prototypes in .h files, so remove them from drivers/android/binder_alloc.h as they are not needed. No functional changes in this patch. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-9-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix comment on binder_alloc_new_buf() return valueCarlos Llamas2023-12-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the comments of binder_alloc_new_buf() to reflect that the return value of the function is now ERR_PTR(-errno) on failure. No functional changes in this patch. Cc: stable@vger.kernel.org Fixes: 57ada2fb2250 ("binder: add log information for binder transaction failures") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-8-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix trivial typo of binder_free_buf_locked()Carlos Llamas2023-12-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Fix minor misspelling of the function in the comment section. No functional changes in this patch. Cc: stable@vger.kernel.org Fixes: 0f966cba95c7 ("binder: add flag to clear buffer on txn complete") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-7-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix unused alloc->free_async_spaceCarlos Llamas2023-12-051-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each transaction is associated with a 'struct binder_buffer' that stores the metadata about its buffer area. Since commit 74310e06be4d ("android: binder: Move buffer out of area shared with user space") this struct is no longer embedded within the buffer itself but is instead allocated on the heap to prevent userspace access to this driver-exclusive info. Unfortunately, the space of this struct is still being accounted for in the total buffer size calculation, specifically for async transactions. This results in an additional 104 bytes added to every async buffer request, and this area is never used. This wasted space can be substantial. If we consider the maximum mmap buffer space of SZ_4M, the driver will reserve half of it for async transactions, or 0x200000. This area should, in theory, accommodate up to 262,144 buffers of the minimum 8-byte size. However, after adding the extra 'sizeof(struct binder_buffer)', the total number of buffers drops to only 18,724, which is a sad 7.14% of the actual capacity. This patch fixes the buffer size calculation to enable the utilization of the entire async buffer space. This is expected to reduce the number of -ENOSPC errors that are seen on the field. Fixes: 74310e06be4d ("android: binder: Move buffer out of area shared with user space") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-6-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix async space check for 0-sized buffersCarlos Llamas2023-12-051-3/+4
| | | | | | | | | | | | | | | | | | | | | | Move the padding of 0-sized buffers to an earlier stage to account for this round up during the alloc->free_async_space check. Fixes: 74310e06be4d ("android: binder: Move buffer out of area shared with user space") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-5-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix race between mmput() and do_exit()Carlos Llamas2023-12-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Task A calls binder_update_page_range() to allocate and insert pages on a remote address space from Task B. For this, Task A pins the remote mm via mmget_not_zero() first. This can race with Task B do_exit() and the final mmput() refcount decrement will come from Task A. Task A | Task B ------------------+------------------ mmget_not_zero() | | do_exit() | exit_mm() | mmput() mmput() | exit_mmap() | remove_vma() | fput() | In this case, the work of ____fput() from Task B is queued up in Task A as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup work gets executed. However, Task A instead sleep, waiting for a reply from Task B that never comes (it's dead). This means the binder_deferred_release() is blocked until an unrelated binder event forces Task A to go back to userspace. All the associated death notifications will also be delayed until then. In order to fix this use mmput_async() that will schedule the work in the corresponding mm->async_put_work WQ instead of Task A. Fixes: 457b9a6f09f0 ("Staging: android: add binder driver") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-4-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: fix use-after-free in shinker's callbackCarlos Llamas2023-12-051-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mmap read lock is used during the shrinker's callback, which means that using alloc->vma pointer isn't safe as it can race with munmap(). As of commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") the mmap lock is downgraded after the vma has been isolated. I was able to reproduce this issue by manually adding some delays and triggering page reclaiming through the shrinker's debug sysfs. The following KASAN report confirms the UAF: ================================================================== BUG: KASAN: slab-use-after-free in zap_page_range_single+0x470/0x4b8 Read of size 8 at addr ffff356ed50e50f0 by task bash/478 CPU: 1 PID: 478 Comm: bash Not tainted 6.6.0-rc5-00055-g1c8b86a3799f-dirty #70 Hardware name: linux,dummy-virt (DT) Call trace: zap_page_range_single+0x470/0x4b8 binder_alloc_free_page+0x608/0xadc __list_lru_walk_one+0x130/0x3b0 list_lru_walk_node+0xc4/0x22c binder_shrink_scan+0x108/0x1dc shrinker_debugfs_scan_write+0x2b4/0x500 full_proxy_write+0xd4/0x140 vfs_write+0x1ac/0x758 ksys_write+0xf0/0x1dc __arm64_sys_write+0x6c/0x9c Allocated by task 492: kmem_cache_alloc+0x130/0x368 vm_area_alloc+0x2c/0x190 mmap_region+0x258/0x18bc do_mmap+0x694/0xa60 vm_mmap_pgoff+0x170/0x29c ksys_mmap_pgoff+0x290/0x3a0 __arm64_sys_mmap+0xcc/0x144 Freed by task 491: kmem_cache_free+0x17c/0x3c8 vm_area_free_rcu_cb+0x74/0x98 rcu_core+0xa38/0x26d4 rcu_core_si+0x10/0x1c __do_softirq+0x2fc/0xd24 Last potentially related work creation: __call_rcu_common.constprop.0+0x6c/0xba0 call_rcu+0x10/0x1c vm_area_free+0x18/0x24 remove_vma+0xe4/0x118 do_vmi_align_munmap.isra.0+0x718/0xb5c do_vmi_munmap+0xdc/0x1fc __vm_munmap+0x10c/0x278 __arm64_sys_munmap+0x58/0x7c Fix this issue by performing instead a vma_lookup() which will fail to find the vma that was isolated before the mmap lock downgrade. Note that this option has better performance than upgrading to a mmap write lock which would increase contention. Plus, mmap_write_trylock() has been recently removed anyway. Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Cc: stable@vger.kernel.org Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-3-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * binder: use EPOLLERR from eventpoll.hCarlos Llamas2023-12-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use EPOLLERR instead of POLLERR to make sure it is cast to the correct __poll_t type. This fixes the following sparse issue: drivers/android/binder.c:5030:24: warning: incorrect type in return expression (different base types) drivers/android/binder.c:5030:24: expected restricted __poll_t drivers/android/binder.c:5030:24: got int Fixes: f88982679f54 ("binder: check for binder_thread allocation failure in binder_poll()") Cc: stable@vger.kernel.org Cc: Eric Biggers <ebiggers@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-2-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>