summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mm/page_alloc.c: memory hotplug: free pages as higher orderArun KS2019-03-066-25/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing pages are done with higher order, time spent on coalescing pages by buddy allocator can be reduced. With section size of 256MB, hot add latency of a single section shows improvement from 50-60 ms to less than 1 ms, hence improving the hot add latency by 60 times. Modify external providers of online callback to align with the change. [arunks@codeaurora.org: v11] Link: http://lkml.kernel.org/r/1547792588-18032-1-git-send-email-arunks@codeaurora.org [akpm@linux-foundation.org: remove unused local, per Arun] [akpm@linux-foundation.org: avoid return of void-returning __free_pages_core(), per Oscar] [akpm@linux-foundation.org: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch] [arunks@codeaurora.org: v8] Link: http://lkml.kernel.org/r/1547032395-24582-1-git-send-email-arunks@codeaurora.org [arunks@codeaurora.org: v9] Link: http://lkml.kernel.org/r/1547098543-26452-1-git-send-email-arunks@codeaurora.org Link: http://lkml.kernel.org/r/1538727006-5727-1-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mathieu Malaterre <malat@debian.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Srivatsa Vaddagiri <vatsa@codeaurora.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/slub.c: remove an unused addr argumentQian Cai2019-03-061-3/+2
| | | | | | | | | | | | | | | | "addr" function argument is not used in alloc_consistency_checks() at all, so remove it. Link: http://lkml.kernel.org/r/20190211123214.35592-1-cai@lca.pw Fixes: becfda68abca ("slub: convert SLAB_DEBUG_FREE to SLAB_CONSISTENCY_CHECKS") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* include/linux/slub_def.h: comment fixesTobin C. Harding2019-03-061-6/+6
| | | | | | | | | | | | | | | | | | Capitialize comment string, use C89 comment style, correct grammar/punctuation in comments. Link: http://lkml.kernel.org/r/20190204005713.9463-2-tobin@kernel.org Link: http://lkml.kernel.org/r/20190204005713.9463-3-tobin@kernel.org Link: http://lkml.kernel.org/r/20190204005713.9463-4-tobin@kernel.org Signed-off-by: Tobin C. Harding <tobin@kernel.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/slab.c: kmemleak no scan alien cachesQian Cai2019-03-061-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kmemleak throws endless warnings during boot due to in __alloc_alien_cache(), alc = kmalloc_node(memsize, gfp, node); init_arraycache(&alc->ac, entries, batch); kmemleak_no_scan(ac); Kmemleak does not track the array cache (alc->ac) but the alien cache (alc) instead, so let it track the latter by lifting kmemleak_no_scan() out of init_arraycache(). There is another place that calls init_arraycache(), but alloc_kmem_cache_cpus() uses the percpu allocation where will never be considered as a leak. kmemleak: Found object by alias at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 lookup_object+0x84/0xac find_and_get_object+0x84/0xe4 kmemleak_no_scan+0x74/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 kmemleak: Object 0xffff8007b9aa7e00 (size 256): kmemleak: comm "swapper/0", pid 1, jiffies 4294697137 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmemleak_alloc+0x84/0xb8 kmem_cache_alloc_node_trace+0x31c/0x3a0 __kmalloc_node+0x58/0x78 setup_kmem_cache_node+0x26c/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 kmemleak_no_scan+0x90/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 Link: http://lkml.kernel.org/r/20190129184518.39808-1-cai@lca.pw Fixes: 1fe00d50a9e8 ("slab: factor out initialization of array cache") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/slub.c: freelist is ensured to be NULL when new_slab() failsPeng Wang2019-03-061-2/+1
| | | | | | | | | | | | | | | | | | | new_slab_objects() will return immediately if freelist is not NULL. if (freelist) return freelist; One more assignment operation could be avoided. Link: http://lkml.kernel.org/r/20181229062512.30469-1-rocking@whu.edu.cn Signed-off-by: Peng Wang <rocking@whu.edu.cn> Reviewed-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/file.c: initialize init_files.resize_waitShuriyc Chu2019-03-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Taken from https://bugzilla.kernel.org/show_bug.cgi?id=200647) 'get_unused_fd_flags' in kthread cause kernel crash. It works fine on 4.1, but causes crash after get 64 fds. It also cause crash on ubuntu1404/1604/1804, centos7.5, and the crash messages are almost the same. The crash message on centos7.5 shows below: start fd 61 start fd 62 start fd 63 BUG: unable to handle kernel NULL pointer dereference at (null) IP: __wake_up_common+0x2e/0x90 PGD 0 Oops: 0000 [#1] SMP Modules linked in: test(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 joydev ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi virtio_console virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ata_piix serio_raw libata virtio_pci virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 1820 Comm: test_fd Kdump: loaded Tainted: G OE ------------ 3.10.0-862.3.3.el7.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 task: ffff8e92b9431fa0 ti: ffff8e94247a0000 task.ti: ffff8e94247a0000 RIP: 0010:__wake_up_common+0x2e/0x90 RSP: 0018:ffff8e94247a2d18 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffffffff9d09daa0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff9d09daa0 RBP: ffff8e94247a2d50 R08: 0000000000000000 R09: ffff8e92b95dfda8 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9d09daa8 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8e9434e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000017c686000 CR4: 00000000000207e0 Call Trace: __wake_up+0x39/0x50 expand_files+0x131/0x250 __alloc_fd+0x47/0x170 get_unused_fd_flags+0x30/0x40 test_fd+0x12a/0x1c0 [test] kthread+0xd1/0xe0 ret_from_fork_nospec_begin+0x21/0x21 Code: 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 49 83 c4 08 53 48 83 ec 10 48 8b 47 08 89 55 cc 4c 89 45 d0 <48> 8b 08 49 39 c4 48 8d 78 e8 4c 8d 69 e8 75 08 eb 3b 4c 89 ef RIP __wake_up_common+0x2e/0x90 RSP <ffff8e94247a2d18> CR2: 0000000000000000 This issue exists since CentOS 7.5 3.10.0-862 and CentOS 7.4 (3.10.0-693.21.1 ) is ok. Root cause: the item 'resize_wait' is not initialized before being used. Reported-by: Richard Zhang <zhang.zijian@h3c.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/inode.c: inode_set_flags(): replace opencoded set_mask_bits()Vineet Gupta2019-03-061-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | It seems that commits 5f16f3225b0624 and 00a1a053ebe5, both with same commitlog ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()") introduced the set_mask_bits API, but somehow missed not using it in ext4 in the end. Also, set_mask_bits() is used in fs quite a bit and we can possibly come up with a generic llsc based implementation (w/o the cmpxchg loop) Link: http://lkml.kernel.org/r/1548275584-18096-3-git-send-email-vgupta@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: Use zero-sized array and struct_size() in kzalloc()Gustavo A. R. Silva2019-03-061-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the code to use a zero-sized array instead of a pointer in structure ocfs2_slot_info and use struct_size() in kzalloc(). Notice that one of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void *entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20190108191903.GA22056@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: fix the application IO timeout when fstrim is runningGang He2019-03-065-63/+106
| | | | | | | | | | | | | | | | | | | | | | | The user reported this problem, the upper application IO was timeout when fstrim was running on this ocfs2 partition. the application monitoring resource agent considered that this application did not work, then this node was fenced by the cluster brain (e.g. pacemaker). The root cause is that fstrim thread always holds main_bm meta-file related locks until all the cluster groups are trimmed. This patch will make fstrim thread release main_bm meta-file related locks when each cluster group is trimmed, this will let the current application IO has a chance to claim the clusters from main_bm meta-file. Link: http://lkml.kernel.org/r/20190111090014.31645-1-ghe@suse.com Signed-off-by: Gang He <ghe@suse.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: fix a panic problem caused by o2cb_ctlJia Guo2019-03-061-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the process of creating a node, it will cause NULL pointer dereference in kernel if o2cb_ctl failed in the interval (mkdir, o2cb_set_node_attribute(node_num)] in function o2cb_add_node. The node num is initialized to 0 in function o2nm_node_group_make_item, o2nm_node_group_drop_item will mistake the node number 0 for a valid node number when we delete the node before the node number is set correctly. If the local node number of the current host happens to be 0, cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while o2hb_thread still running. The panic stack is generated as follows: o2hb_thread \-o2hb_do_disk_heartbeat \-o2hb_check_own_slot |-slot = &reg->hr_slots[o2nm_this_node()]; //o2nm_this_node() return O2NM_INVALID_NODE_NUM We need to check whether the node number is set when we delete the node. Link: http://lkml.kernel.org/r/133d8045-72cc-863e-8eae-5013f9f6bc51@huawei.com Signed-off-by: Jia Guo <guojia12@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Acked-by: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sh: remove nargs from __SYSCALLFiroz Khan2019-03-062-3/+3
| | | | | | | | | | | | | | | | | | | | | | | The __SYSCALL macro's arguments are system call number, system call entry name and number of arguments for the system call. Argument- nargs in __SYSCALL(nr, entry, nargs) is neither calculated nor used anywhere. So it would be better to keep the implementation as __SYSCALL(nr, entry). This unifies the implementation with some other architectures too. Link: http://lkml.kernel.org/r/1546443445-21075-2-git-send-email-firoz.khan@linaro.org Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* scripts/decode_stacktrace.sh: handle RIP address with segmentKonstantin Khlebnikov2019-03-061-1/+8
| | | | | | | | | | | | | | | decode line: RIP: 0010:khugepaged+0x2a2/0x2280 into RIP: 0010:khugepaged (mm/khugepaged.c:1885) Link: http://lkml.kernel.org/r/154660071227.52726.15645307951282727605.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: fix coccinelle warnings in kasan_p*_tableAndrey Konovalov2019-03-061-3/+3
| | | | | | | | | | | | | | kasan_p4d_table(), kasan_pmd_table() and kasan_pud_table() are declared as returning bool, but return 0 instead of false, which produces a coccinelle warning. Fix it. Link: http://lkml.kernel.org/r/1fa6fadf644859e8a6a8ecce258444b49be8c7ee.1551716733.git.andreyknvl@google.com Fixes: 0207df4fa1a8 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: kbuild test robot <lkp@intel.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: fix kasan_check_read/write definitionsArnd Bergmann2019-03-062-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building little-endian allmodconfig kernels on arm64 started failing with the generated atomic.h implementation, since we now try to call kasan helpers from the EFI stub: aarch64-linux-gnu-ld: drivers/firmware/efi/libstub/arm-stub.stub.o: in function `atomic_set': include/generated/atomic-instrumented.h:44: undefined reference to `__efistub_kasan_check_write' I suspect that we get similar problems in other files that explicitly disable KASAN for some reason but call atomic_t based helper functions. We can fix this by checking the predefined __SANITIZE_ADDRESS__ macro that the compiler sets instead of checking CONFIG_KASAN, but this in turn requires a small hack in mm/kasan/common.c so we do see the extern declaration there instead of the inline function. Link: http://lkml.kernel.org/r/20181211133453.2835077-1-arnd@arndb.de Fixes: b1864b828644 ("locking/atomics: build atomic headers as required") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* page_poison: play nicely with KASANQian Cai2019-03-062-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN does not play well with the page poisoning (CONFIG_PAGE_POISONING). It triggers false positives in the allocation path: BUG: KASAN: use-after-free in memchr_inv+0x2ea/0x330 Read of size 8 at addr ffff88881f800000 by task swapper/0 CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc1+ #54 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 __asan_report_load8_noabort+0x19/0x20 memchr_inv+0x2ea/0x330 kernel_poison_pages+0x103/0x3d5 get_page_from_freelist+0x15e7/0x4d90 because KASAN has not yet unpoisoned the shadow page for allocation before it checks memchr_inv() but only found a stale poison pattern. Also, false positives in free path, BUG: KASAN: slab-out-of-bounds in kernel_poison_pages+0x29e/0x3d5 Write of size 4096 at addr ffff8888112cc000 by task swapper/0/1 CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc1+ #55 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 check_memory_region+0x22d/0x250 memset+0x28/0x40 kernel_poison_pages+0x29e/0x3d5 __free_pages_ok+0x75f/0x13e0 due to KASAN adds poisoned redzones around slab objects, but the page poisoning needs to poison the whole page. Link: http://lkml.kernel.org/r/20190114233405.67843-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: remove use after scope bugs detection.Andrey Ryabinin2019-03-069-73/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use after scope bugs detector seems to be almost entirely useless for the linux kernel. It exists over two years, but I've seen only one valid bug so far [1]. And the bug was fixed before it has been reported. There were some other use-after-scope reports, but they were false-positives due to different reasons like incompatibility with structleak plugin. This feature significantly increases stack usage, especially with GCC < 9 version, and causes a 32K stack overflow. It probably adds performance penalty too. Given all that, let's remove use-after-scope detector entirely. While preparing this patch I've noticed that we mistakenly enable use-after-scope detection for clang compiler regardless of CONFIG_KASAN_EXTRA setting. This is also fixed now. [1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com> Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Cc: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: hwpoison: fix thp split handing in soft_offline_in_use_page()zhongjiang2019-03-061-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519! soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Naoya had fixed a similar issue in c3901e722b29 ("mm: hwpoison: fix thp split handling in memory_failure()"). But he missed fixing soft offline. Link: http://lkml.kernel.org/r/1551452476-24000-1-git-send-email-zhongjiang@huawei.com Fixes: 61f5d698cc97 ("mm: re-enable THP") Signed-off-by: zhongjiang <zhongjiang@huawei.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds2019-03-05119-4615/+1610
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull MIPS updates from Paul Burton: - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB (GINVT) instructions, allowing for more efficient TLB maintenance when running on a CPU such as the I6500 that supports these. - Enable huge page support for MIPS64r6. - Optimize post-DMA cache sync by removing that code entirely for kernel configurations in which we know it won't be needed. - The number of pages allocated for interrupt stacks is now calculated correctly, where before we would wastefully allocate too much memory in some configurations. - The ath79 platform migrates to devicetree. - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board. - The ingenic/jz4740 platform gains support for appended devicetrees. - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see cleanups as do various pieces of core architecture code. * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits) MIPS: lantiq: Remove separate GPHY Firmware loader MIPS: ingenic: Add support for appended devicetree MIPS: SGI-IP27: rework HUB interrupts MIPS: SGI-IP27: do boot CPU init later MIPS: SGI-IP27: do xtalk scanning later MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output MIPS: SGI-IP27: clean up bridge access and header files MIPS: SGI-IP27: get rid of volatile and hubreg_t MIPS: irq: Allocate accurate order pages for irq stack MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys() MIPS: eBPF: Remove REG_32BIT_ZERO_EX MIPS: eBPF: Always return sign extended 32b values MIPS: CM: Fix indentation MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support MIPS: OCTEON: program rx/tx-delay always from DT MIPS: OCTEON: delete board-specific link status MIPS: OCTEON: don't lie about interface type of CN3005 board MIPS: OCTEON: warn if deprecated link status is being used MIPS: OCTEON: add fixed-link nodes to in-kernel device tree MIPS: Delete unused flush_cache_sigtramp() ...
| * MIPS: lantiq: Remove separate GPHY Firmware loaderHauke Mehrtens2019-02-256-284/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The separate GPHY Firmware loader driver is not used any more, the GPHY firmware is now loaded by the GSWIP switch driver which also makes use of the GPHY. Remove the old unused GPHY firmware loader driver. The GPHY firmware is useless without an Ethernet and switch driver, it should not harm if loading this does not work for system using an old device tree. I am not aware of any vendor separating the device tree from the kernel binary, it should be ok to remove this. The code and the functionality form this separate GPHY firmware loader was added to the gswip driver in commit 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Cc: john@phrozen.org Cc: netdev@vger.kernel.org
| * MIPS: ingenic: Add support for appended devicetreePaul Cercueil2019-02-222-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | Add support for booting the kernel from an externally-appended devicetree, if no devicetree was built-in. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: rework HUB interruptsThomas Bogendoerfer2019-02-1911-531/+260
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit rearranges the HUB interrupt code by using MIPS_IRQ_CPU interrupt handling code and modern Linux IRQ framework features to get rid of global arrays. It also adds support for irq affinity setting. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: do boot CPU init laterThomas Bogendoerfer2019-02-194-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | To make use of per_cpu variables in interrupt code per_cpu_init() must be done after setup_per_cpu_areas(). This is achieved by calling it in smp_prepare_boot_cpu() via a new smp_ops method. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: do xtalk scanning laterThomas Bogendoerfer2019-02-192-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | Move xtalk scanning to a later boot stage to be able using things like kmalloc and friends. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix outputThomas Bogendoerfer2019-02-192-45/+45
| | | | | | | | | | | | | | | | | | | | | | | | Topology and NMI output needs pr_cont() to look the way it was in the old days of printk. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: clean up bridge access and header filesThomas Bogendoerfer2019-02-195-192/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduced bridge_read/bridge_write/bridge_set/bridge_clr for accessing bridge register and get rid of volatile declarations. Also removed all typedefs from arch/mips/include/asm/pci/bridge.h and cleaned up language in arch/mips/pci/ops-bridge.c Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: SGI-IP27: get rid of volatile and hubreg_tThomas Bogendoerfer2019-02-198-75/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace hub register access with __raw_readq/__raw_writeq and get rid of hubreg_t completely. Also remove no longer (probably never used) used defines Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org
| * MIPS: irq: Allocate accurate order pages for irq stackLiu Xiang2019-02-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The irq_pages is the number of pages for irq stack, but not the order which is needed by __get_free_pages(). We can use get_order() to calculate the accurate order. Signed-off-by: Liu Xiang <liu.xiang6@zte.com.cn> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: fe8bd18ffea5 ("MIPS: Introduce irq_stack") Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # v4.11+
| * MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()Paul Burton2019-02-191-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e36863a550da ("MIPS: HIGHMEM DMA on noncoherent MIPS32 processors") introduced code which: 1) Calculates an offset within a page, by ANDing an address with ~PAGE_MASK. 2) Checks whether that offset is >= PAGE_SIZE. This check can never evaluate true, making the code it guards unreachable. smatch spots bogus arithmetic resulting from the impossible condition, resulting in the following warning: arch/mips/mm/dma-noncoherent.c:125 dma_sync_phys() warn: mask and shift to zero Fix this by removing the impossible to satisfy condition & the unreachable code it guards. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com>
| * MIPS: eBPF: Remove REG_32BIT_ZERO_EXPaul Burton2019-02-191-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | REG_32BIT_ZERO_EX and REG_64BIT are always handled in exactly the same way, and reg_val_propagate_range() never actually sets any register to type REG_32BIT_ZERO_EX. Remove the redundant & unused REG_32BIT_ZERO_EX. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Jiong Wang <jiong.wang@netronome.com>
| * MIPS: eBPF: Always return sign extended 32b valuesPaul Burton2019-02-191-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function prototype used to call JITed eBPF code (ie. the type of the struct bpf_prog bpf_func field) returns an unsigned int. The MIPS n64 ABI that MIPS64 kernels target defines that 32 bit integers should always be sign extended when passed in registers as either arguments or return values. This means that when returning any value which may not already be sign extended (ie. of type REG_64BIT or REG_32BIT_ZERO_EX) we need to perform that sign extension in order to comply with the n64 ABI. Without this we see strange looking test failures from test_bpf.ko, such as: test_bpf: #65 ALU64_MOV_X: dst = 4294967295 jited:1 ret -1 != -1 FAIL (1 times) Although the return value printed matches the expected value, this is only because printf is only examining the least significant 32 bits of the 64 bit register value we returned. The register holding the expected value is sign extended whilst the v0 register was set to a zero extended value by our JITed code, so when compared by a conditional branch instruction the values are not equal. We already handle this when the return value register is of type REG_32BIT_ZERO_EX, so simply extend this to also cover REG_64BIT. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.") Cc: stable@vger.kernel.org # v4.13+ Cc: linux-mips@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Jiong Wang <jiong.wang@netronome.com>
| * MIPS: CM: Fix indentationPaul Burton2019-02-151-2/+2
| | | | | | | | | | | | | | | | mips_cm_error_report() contains a function call that's incorrectly indented a level further than it ought to be. Remove a tab from the start of both affected lines. Signed-off-by: Paul Burton <paul.burton@mips.com>
| * MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S supportRafał Miłecki2019-02-112-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | 1) Fix reset button support which is active *high* 2) Specify LEDs colors Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: linux-mips@linux-mips.org
| * MIPS: OCTEON: program rx/tx-delay always from DTAaro Koskinen2019-02-086-52/+52
| | | | | | | | | | | | | | | | Program rx/tx-delay always from DT. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: OCTEON: delete board-specific link statusAaro Koskinen2019-02-081-42/+1
| | | | | | | | | | | | | | | | | | Delete board-specific link status. This info should now come from the DT only. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: OCTEON: don't lie about interface type of CN3005 boardAaro Koskinen2019-02-081-16/+0
| | | | | | | | | | | | | | | | | | The fixed-link node in the DT should now take care of the link status, so this hack can be deleted. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: OCTEON: warn if deprecated link status is being usedAaro Koskinen2019-02-082-0/+6
| | | | | | | | | | | | | | | | Warn if deprecated link status is being used. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: OCTEON: add fixed-link nodes to in-kernel device treeAaro Koskinen2019-02-082-0/+32
| | | | | | | | | | | | | | | | | | | | | | Currently OCTEON ethernet falls back to phyless operation on boards where we have no known PHY address or a fixed-link node. Add fixed-link support for boards that need it, so we can clean up the platform code and ethernet driver from some legacy code. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: Delete unused flush_cache_sigtramp()Paul Burton2019-02-076-183/+0
| | | | | | | | | | | | | | | | | | Commit adcc81f148d7 ("MIPS: math-emu: Write-protect delay slot emulation pages") left flush_cache_sigtramp() unused. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: linux-mips@vger.kernel.org
| * MIPS: Fix set_pte() for Netlogic XLR using cmpxchg64()Paul Burton2019-02-062-6/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 46011e6ea392 ("MIPS: Make set_pte() SMP safe.") introduced an open-coded version of cmpxchg() within set_pte(), that always operated on a value the size of an unsigned long. That is, it used ll/sc instructions when CONFIG_32BIT=y or lld/scd instructions when CONFIG_64BIT=y. This was broken for configurations in which pte_t is larger than an unsigned long (with the exception of XPA configurations which have a different implementation of set_pte()), because we no longer update the whole PTE. Indeed commit 46011e6ea392 ("MIPS: Make set_pte() SMP safe.") notes: > The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not* > handled. In practice this affects Netlogic XLR/XLS systems including nlm_xlr_defconfig. Commit 82f4f66ddf11 ("MIPS: Remove open-coded cmpxchg() in set_pte()") then replaced this open-coded version of cmpxchg() with an actual call to cmpxchg(). Unfortunately the configurations mentioned above then fail to build because cmpxchg() can only operate on values 32 bits or smaller in size, resulting in: arch/mips/include/asm/cmpxchg.h:166:11: error: call to '__cmpxchg_called_with_bad_pointer' declared with attribute error: Bad argument size for cmpxchg One option that would fix the build failure & restore the previous behaviour would be to cast the pte pointer to a pointer to unsigned long, so that cmpxchg() would operate on just 32 bits of the PTE as it has been since commit 46011e6ea392 ("MIPS: Make set_pte() SMP safe."). That feels like an ugly hack though, and the behaviour of set_pte() is likely a little broken. Instead we take advantage of the fact that the affected configurations already know at compile time that the CPU will support 64 bits (ie. have hardcoded cpu_has_64bits in cpu-feature-overrides.h) in order to allow cmpxchg64() to be used in these configurations. set_pte() then makes use of cmpxchg64() when necessary. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 46011e6ea392 ("MIPS: Make set_pte() SMP safe.") Fixes: 82f4f66ddf11 ("MIPS: Remove open-coded cmpxchg() in set_pte()")
| * MIPS: Export mm switching functions used by KVMPaul Burton2019-02-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | KVM makes use of check_switch_mmu_context(), check_mmu_context() & get_new_mmu_context() which are no longer static inline functions in a header. As such they need to be exported for KVM to successfully build as a module, which was previously overlooked. Add the missing exports. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 4ebea49ce233 ("MIPS: mm: Un-inline get_new_mmu_context") Fixes: 42d5b846574f ("MIPS: mm: Unify ASID version checks")
| * MIPS: Loongson32: Remove DMA & NAND devices from ls1b/board.cPaul Burton2019-02-041-28/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7b3415f581c7 ("MIPS: Loongson32: Remove unused platform devices") removed the definitions of platform devices which have no in tree drivers from common Loongson32 code, but missed their removal from Loongson 1B board code in arch/mips/loongson32/ls1b/board.c. This causes build failures due to the missing declarations of ls1x_dma_pdev, ls1x_nand_pdev & their associated *_set_platdata functions. Remove the dead code from arch/mips/loongson32/ls1b/board.c to fix the build. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 7b3415f581c7 ("MIPS: Loongson32: Remove unused platform devices")
| * MIPS: Loongson32: Fix config brokenness; select SYS_SUPPORTS_32BIT_KERNELPaul Burton2019-02-042-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a96d68ba3b41 ("MIPS: Loongson32: clarify we don't support MIPS16 and merge configs") attempted to reduce duplication in Kconfig by consolidating some selects common to Loongson 1B & 1C CPUs under CPU_LOONGSON1. Unfortunately it clearly wasn't tested because by removing SYS_SUPPORTS_32BIT_KERNEL it prevented 32BIT from being enabled leading to all sorts of strange build errors from a kernel configured to build as neither 32 nor 64 bit. Both loongson1b_defconfig & loongson1c_defconfig failed to build due to this problem. Revert the cleanup portions of commit a96d68ba3b41 ("MIPS: Loongson32: clarify we don't support MIPS16 and merge configs"), keeping only its removal of the selection of SYS_SUPPORTS_MIPS16. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: a96d68ba3b41 ("MIPS: Loongson32: clarify we don't support MIPS16 and merge configs")
| * MIPS: Don't select ARCH_HAS_SYNC_DMA_FOR_CPU when DMA is coherentPaul Burton2019-02-041-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f263f2a2c682 ("MIPS: Compile post DMA flush only when needed") pushed the selection of ARCH_HAS_SYNC_DMA_FOR_CPU down to various SYS_HAS_CPU_* Kconfig entries corresponding to CPUs for which cpu_needs_post_dma_flush() might return true, but unfortunately missed the fact that some of these CPUs can be used in configurations with DMA_NONCOHERENT=n. When this is the case the kernel build does not include our definition of arch_sync_dma_for_cpu() from arch/mips/mm/dma-noncoherent.c and the build fails with a link error. One example of this problem is ip27_defconfig: kernel/dma/direct.o: In function `dma_direct_sync_single_for_cpu': direct.c:(.text+0x6c): undefined reference to `arch_sync_dma_for_cpu' kernel/dma/direct.o: In function `dma_direct_sync_sg_for_cpu': direct.c:(.text+0x1f0): undefined reference to `arch_sync_dma_for_cpu' kernel/dma/direct.o: In function `dma_direct_alloc': direct.c:(.text+0xc20): undefined reference to `arch_dma_alloc' kernel/dma/direct.o: In function `dma_direct_free': direct.c:(.text+0xc3c): undefined reference to `arch_dma_free' make[1]: *** [Makefile:1021: vmlinux] Error 1 make: *** [Makefile:152: sub-make] Error 2 Fix this by selecting ARCH_HAS_SYNC_DMA_FOR_CPU only when DMA_NONCOHERENT is also selected. The SYS_HAS_CPU_BMIPS5000 case is left as-is because systems with that CPU always select DMA_NONCOHERENT anyway. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: f263f2a2c682 ("MIPS: Compile post DMA flush only when needed")
| * MIPS: Enable hugepage support for MIPS64r6Paul Burton2019-02-041-0/+1
| | | | | | | | | | | | | | | | | | | | Our hugepage support already exists for MIPS64 CPUs, and is already enabled for older architecture revisions. There's nothing MIPSr6 specific involved, and our hugepage support already works fine for MIPS64r6 CPUs such as the I6500, so allow it to be selected in Kconfig. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: Remove open-coded cmpxchg() in set_pte()Paul Burton2019-02-041-43/+2
| | | | | | | | | | | | | | | | | | set_pte() contains an open coded version of cmpxchg() - it atomically replaces the buddy pte's value if it is currently zero. Simplify the code considerably by just using cmpxchg() instead of reinventing it. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: MemoryMapID (MMID) SupportPaul Burton2019-02-0415-31/+509
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce support for using MemoryMapIDs (MMIDs) as an alternative to Address Space IDs (ASIDs). The major difference between the two is that MMIDs are global - ie. an MMID uniquely identifies an address space across all coherent CPUs. In contrast ASIDs are non-global per-CPU IDs, wherein each address space is allocated a separate ASID for each CPU upon which it is used. This global namespace allows a new GINVT instruction be used to globally invalidate TLB entries associated with a particular MMID across all coherent CPUs in the system, removing the need for IPIs to invalidate entries with separate ASIDs on each CPU. The allocation scheme used here is largely borrowed from arm64 (see arch/arm64/mm/context.c). In essence we maintain a bitmap to track available MMIDs, and MMIDs in active use at the time of a rollover to a new MMID version are preserved in the new version. The allocation scheme requires efficient 64 bit atomics in order to perform reasonably, so this support depends upon CONFIG_GENERIC_ATOMIC64=n (ie. currently it will only be included in MIPS64 kernels). The first, and currently only, available CPU with support for MMIDs is the MIPS I6500. This CPU supports 16 bit MMIDs, and so for now we cap our MMIDs to 16 bits wide in order to prevent the bitmap growing to absurd sizes if any future CPU does implement 32 bit MMIDs as the architecture manuals suggest is recommended. When MMIDs are in use we also make use of GINVT instruction which is available due to the global nature of MMIDs. By executing a sequence of GINVT & SYNC 0x14 instructions we can avoid the overhead of an IPI to each remote CPU in many cases. One complication is that GINVT will invalidate wired entries (in all cases apart from type 0, which targets the entire TLB). In order to avoid GINVT invalidating any wired TLB entries we set up, we make sure to create those entries using a reserved MMID (0) that we never associate with any address space. Also of note is that KVM will require further work in order to support MMIDs & GINVT, since KVM is involved in allocating IDs for guests & in configuring the MMU. That work is not part of this patch, so for now when MMIDs are in use KVM is disabled. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: Add GINVT instruction helpersPaul Burton2019-02-044-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a family of ginvt_* functions making it easy to emit a GINVT instruction to globally invalidate TLB entries. We make use of the _ASM_MACRO infrastructure to support emitting the instructions even if the assembler isn't new enough to support them natively. An associated STYPE_GINV definition & sync_ginv() function are added to emit a sync instruction of type 0x14, which operates as a completion barrier for these new GINVT (and GINVI) instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: mm: Add set_cpu_context() for ASID assignmentsPaul Burton2019-02-045-13/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we gain MMID support we'll be storing MMIDs as atomic64_t values and accessing them via atomic64_* functions. This necessitates that we don't use cpu_context() as the left hand side of an assignment, ie. as a modifiable lvalue. In preparation for this introduce a new set_cpu_context() function & replace all assignments with cpu_context() on their left hand side with an equivalent call to set_cpu_context(). To enforce that cpu_context() should not be used for assignments, we rewrite it as a static inline function. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: mm: Unify ASID version checksPaul Burton2019-02-044-26/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new check_mmu_context() function to check an mm's ASID version & get a new one if it's outdated, and a check_switch_mmu_context() function which additionally sets up the new ASID & page directory. Simplify switch_mm() & various get_new_mmu_context() callsites in MIPS KVM by making use of the new functions, which will help reduce the amount of code that requires modification to gain MMID support. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org
| * MIPS: mm: Un-inline get_new_mmu_contextPaul Burton2019-02-043-19/+21
| | | | | | | | | | | | | | | | | | In preparation for adding MMID support to get_new_mmu_context() which will increase the size of the function somewhat, move it from asm/mmu_context.h into a C file. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org