summaryrefslogtreecommitdiffstats
path: root/drivers/misc/vmw_balloon.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* vmw_balloon: dynamically allocate the vmw-balloon shrinkerQi Zheng2023-10-041-26/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for implementing lockless slab shrink, use new APIs to dynamically allocate the vmw-balloon shrinker, so that it can be freed asynchronously via RCU. Then it doesn't need to wait for RCU read-side critical section when releasing the struct vmballoon. And we can simply exit vmballoon_init() when registering the shrinker fails. So the shrinker_registered indication is redundant, just remove it. Link: https://lkml.kernel.org/r/20230911094444.68966-28-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Nadav Amit <namit@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* misc: vmw_balloon: fix memory leak with using debugfs_lookup()Greg Kroah-Hartman2023-02-081-1/+1
| | | | | | | | | | | | | When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once. Cc: Nadav Amit <namit@vmware.com> Cc: VMware PV-Drivers Reviewers <pv-drivers@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230202141100.2291188-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'mm-stable-2022-08-03' of ↵Linus Torvalds2022-08-061-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Most of the MM queue. A few things are still pending. Liam's maple tree rework didn't make it. This has resulted in a few other minor patch series being held over for next time. Multi-gen LRU still isn't merged as we were waiting for mapletree to stabilize. The current plan is to merge MGLRU into -mm soon and to later reintroduce mapletree, with a view to hopefully getting both into 6.1-rc1. Summary: - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe Lin, Yang Shi, Anshuman Khandual and Mike Rapoport - Some kmemleak fixes from Patrick Wang and Waiman Long - DAMON updates from SeongJae Park - memcg debug/visibility work from Roman Gushchin - vmalloc speedup from Uladzislau Rezki - more folio conversion work from Matthew Wilcox - enhancements for coherent device memory mapping from Alex Sierra - addition of shared pages tracking and CoW support for fsdax, from Shiyang Ruan - hugetlb optimizations from Mike Kravetz - Mel Gorman has contributed some pagealloc changes to improve latency and realtime behaviour. - mprotect soft-dirty checking has been improved by Peter Xu - Many other singleton patches all over the place" [ XFS merge from hell as per Darrick Wong in https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ] * tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits) tools/testing/selftests/vm/hmm-tests.c: fix build mm: Kconfig: fix typo mm: memory-failure: convert to pr_fmt() mm: use is_zone_movable_page() helper hugetlbfs: fix inaccurate comment in hugetlbfs_statfs() hugetlbfs: cleanup some comments in inode.c hugetlbfs: remove unneeded header file hugetlbfs: remove unneeded hugetlbfs_ops forward declaration hugetlbfs: use helper macro SZ_1{K,M} mm: cleanup is_highmem() mm/hmm: add a test for cross device private faults selftests: add soft-dirty into run_vmtests.sh selftests: soft-dirty: add test for mprotect mm/mprotect: fix soft-dirty check in can_change_pte_writable() mm: memcontrol: fix potential oom_lock recursion deadlock mm/gup.c: fix formatting in check_and_migrate_movable_page() xfs: fail dax mount if reflink is enabled on a partition mm/memcontrol.c: remove the redundant updating of stats_flush_threshold userfaultfd: don't fail on unrecognized features hugetlb_cgroup: fix wrong hugetlb cgroup numa stat ...
| * mm: shrinkers: provide shrinkers with namesRoman Gushchin2022-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | mm: Convert all PageMovable users to movable_operationsMatthew Wilcox (Oracle)2022-08-021-58/+3
|/ | | | | | | | | These drivers are rather uncomfortably hammered into the address_space_operations hole. They aren't filesystems and don't behave like filesystems. They just need their own movable_operations structure, which we can point to directly from page->mapping. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
* vmw_balloon: Print errors on reset only onceNadav Amit2022-04-241-2/+2
| | | | | | | | | The VMware balloon might be reset multiple times during execution. Print errors only once to avoid filling the log unnecessarily. Signed-off-by: Nadav Amit <namit@vmware.com> Link: https://lore.kernel.org/r/20220322170052.6351-1-namit@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* drivers: vmw_balloon: remove dentry pointer for debugfsGreg Kroah-Hartman2021-03-101-8/+3
| | | | | | | | | | | | | | | There is no need to keep the dentry pointer around for the created debugfs file, as it is only needed when removing it from the system. When it is to be removed, ask debugfs itself for the pointer, to save on storage and make things a bit simpler. Cc: Nadav Amit <namit@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-kernel@vger.kernel.org Acked-by: Nadav Amit <namit@vmware.com> Link: https://lore.kernel.org/r/20210216151209.3954129-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: Explicitly include linux/io.h for virt_to_phys()Sean Christopherson2019-12-101-0/+1
| | | | | | | | | | | | | Through a labyrinthian sequence of includes, usage of virt_to_phys() is dependent on the include of asm/io.h in x86's asm/realmode.h, which is included in x86's asm/acpi.h and thus by linux/acpi.h. Explicitly include linux/io.h to break the dependency on realmode.h so that a future patch can remove the realmode.h include from acpi.h without breaking the build. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Link: https://lkml.kernel.org/r/20191126165417.22423-9-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* vmw_balloon: Fix offline page marking with compactionNadav Amit2019-08-281-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compaction code already marks pages as offline when it enqueues pages in the ballooned page list, and removes the mapping when the pages are removed from the list. VMware balloon also updates the flags, instead of letting the balloon-compaction logic handle it, which causes the assertion VM_BUG_ON_PAGE(!PageOffline(page)) to fire, when __ClearPageOffline is called the second time. This causes the following crash. [ 487.104520] kernel BUG at include/linux/page-flags.h:749! [ 487.106364] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [ 487.107681] CPU: 7 PID: 1106 Comm: kworker/7:3 Not tainted 5.3.0-rc5balloon #227 [ 487.109196] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 [ 487.111452] Workqueue: events_freezable vmballoon_work [vmw_balloon] [ 487.112779] RIP: 0010:vmballoon_release_page_list+0xaa/0x100 [vmw_balloon] [ 487.114200] Code: fe 48 c1 e7 06 4c 01 c7 8b 47 30 41 89 c1 41 81 e1 00 01 00 f0 41 81 f9 00 00 00 f0 74 d3 48 c7 c6 08 a1 a1 c0 e8 06 0d e7 ea <0f> 0b 44 89 f6 4c 89 c7 e8 49 9c e9 ea 49 8d 75 08 49 8b 45 08 4d [ 487.118033] RSP: 0018:ffffb82f012bbc98 EFLAGS: 00010246 [ 487.119135] RAX: 0000000000000037 RBX: 0000000000000001 RCX: 0000000000000006 [ 487.120601] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9a85b6bd7620 [ 487.122071] RBP: ffffb82f012bbcc0 R08: 0000000000000001 R09: 0000000000000000 [ 487.123536] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb82f012bbd00 [ 487.125002] R13: ffffe97f4598d9c0 R14: 0000000000000000 R15: ffffb82f012bbd34 [ 487.126463] FS: 0000000000000000(0000) GS:ffff9a85b6bc0000(0000) knlGS:0000000000000000 [ 487.128110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 487.129316] CR2: 00007ffe6e413ea0 CR3: 0000000230b18001 CR4: 00000000003606e0 [ 487.130812] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 487.132283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 487.133749] Call Trace: [ 487.134333] vmballoon_deflate+0x22c/0x390 [vmw_balloon] [ 487.135468] vmballoon_work+0x6e7/0x913 [vmw_balloon] [ 487.136711] ? process_one_work+0x21a/0x5e0 [ 487.138581] process_one_work+0x298/0x5e0 [ 487.139926] ? vmballoon_migratepage+0x310/0x310 [vmw_balloon] [ 487.141610] ? process_one_work+0x298/0x5e0 [ 487.143053] worker_thread+0x41/0x400 [ 487.144389] kthread+0x12b/0x150 [ 487.145582] ? process_one_work+0x5e0/0x5e0 [ 487.146937] ? kthread_create_on_node+0x60/0x60 [ 487.148637] ret_from_fork+0x3a/0x50 Fix it by updating the PageOffline indication only when a 2MB page is enqueued and dequeued. The 4KB pages will be handled correctly by the balloon compaction logic. Fixes: 83a8afa72e9c ("vmw_balloon: Compaction support") Cc: David Hildenbrand <david@redhat.com> Reported-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Link: https://lore.kernel.org/r/20190820160121.452-1-namit@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge branch 'work.mount0' of ↵Linus Torvalds2019-07-191-12/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...
* \ Merge tag 'driver-core-5.3-rc1' of ↵Linus Torvalds2019-07-121-16/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...
| * | vmw_balloon: no need to check return value of debugfs_create functionsGreg Kroah-Hartman2019-06-121-16/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Julien Freche <jfreche@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-kernel@vger.kernel.org Acked-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: Split refused pagesNadav Amit2019-05-241-11/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hypervisor might refuse to inflate pages. While the balloon driver handles this scenario correctly, a refusal to inflate a 2MB pages might cause the same page to be allocated again later just for its inflation to be refused again. This wastes energy and time. To avoid this situation, split the 2MB page to 4KB pages, and then try to inflate each one individually. Most of the 4KB pages out of the 2MB should be inflated successfully, and the balloon is likely to prevent the scenario of repeated refused inflation. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: Add memory shrinkerNadav Amit2019-05-241-2/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a shrinker to the VMware balloon to prevent out-of-memory events. We reuse the deflate logic for this matter. Deadlocks should not happen, as no memory allocation is performed while the locks of the communication (batch/page) and page-list are taken. In the unlikely event in which the configuration semaphore is taken for write we bail out and fail gracefully (causing processes to be killed). Once the shrinker is called, inflation is postponed for few seconds. The timeout is updated without any lock, but this should not cause any races, as it is written and read atomically. This feature is disabled by default, since it might cause performance degradation. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: Compaction supportNadav Amit2019-05-241-38/+263
|/ | | | | | | | | | | | | | | | | | | Add support for compaction for VMware balloon. Since unlike the virtio balloon, we also support huge-pages, which are not going through compaction, we keep these pages in vmballoon and handle this list separately. We use the same lock to protect both lists, as this lock is not supposed to be contended. Doing so also eliminates the need for the page_size lists. We update the accounting as needed to reflect inflation, deflation and migration to be reflected in vmstat. Since VMware balloon now provides statistics for inflation, deflation and migration in vmstat, select MEMORY_BALLOON in Kconfig. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'char-misc-5.1-rc1' of ↵Linus Torvalds2019-03-061-6/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver patch pull request for 5.1-rc1. The largest thing by far is the new habanalabs driver for their AI accelerator chip. For now it is in the drivers/misc directory but will probably move to a new directory soon along with other drivers of this type. Other than that, just the usual set of individual driver updates and fixes. There's an "odd" merge in here from the DRM tree that they asked me to do as the MEI driver is starting to interact with the i915 driver, and it needed some coordination. All of those patches have been properly acked by the relevant subsystem maintainers. All of these have been in linux-next with no reported issues, most for quite some time" * tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits) habanalabs: adjust Kconfig to fix build errors habanalabs: use %px instead of %p in error print habanalabs: use do_div for 64-bit divisions intel_th: gth: Fix an off-by-one in output unassigning habanalabs: fix little-endian<->cpu conversion warnings habanalabs: use NULL to initialize array of pointers habanalabs: fix little-endian<->cpu conversion warnings habanalabs: soft-reset device if context-switch fails habanalabs: print pointer using %p habanalabs: fix memory leak with CBs with unaligned size habanalabs: return correct error code on MMU mapping failure habanalabs: add comments in uapi/misc/habanalabs.h habanalabs: extend QMAN0 job timeout habanalabs: set DMA0 completion to SOB 1007 habanalabs: fix validation of WREG32 to DMA completion habanalabs: fix mmu cache registers init habanalabs: disable CPU access on timeouts habanalabs: add MMU DRAM default page mapping habanalabs: Dissociate RAZWI info from event types misc/habanalabs: adjust Kconfig to fix build errors ...
| * vmw_balloon: release lock on error in vmballoon_reset()Dan Carpenter2019-02-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | We added some locking to this function but forgot to drop the lock on these two error paths. This bug would lead to an immediate deadlock. Fixes: c7b3690fb152 ("vmw_balloon: stats rework") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@vger.kernel.org Reviewed-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vmw_balloon: support 64-bit memory limitXavier Deguillard2019-02-081-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the balloon driver would fail to run if memory is greater than 16TB of vRAM. Previous patches have already converted the balloon target and size to 64-bit, so all that is left to do add is to avoid asserting memory is smaller than 16TB if the hypervisor supports 64-bits target. The driver advertises a new capability VMW_BALLOON_64_BITS_TARGET. Hypervisors that support 16TB of memory or more will report that this capability is enabled. Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vmw_balloon: remove the version numberNadav Amit2019-02-081-1/+0
| | | | | | | | | | | | | | Following the new kernel policy. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: mark inflated pages PG_offlineDavid Hildenbrand2019-03-061-0/+32
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. [david@redhat.com: use vmballoon_page_in_frames more widely] Link: http://lkml.kernel.org/r/20181122100627.5189-7-david@redhat.com Link: http://lkml.kernel.org/r/20181119101616.8901-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Nadav Amit <namit@vmware.com> Cc: Xavier Deguillard <xdeguillard@vmware.com> Cc: Nadav Amit <namit@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julien Freche <jfreche@vmware.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Hansen <chansen3@cisco.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kairui Song <kasong@redhat.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <len.brown@intel.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Pankaj gupta <pagupta@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'char-misc-4.21-rc1' of ↵Linus Torvalds2018-12-291-12/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of char and misc driver patches for 4.21-rc1. Lots of different types of driver things in here, as this tree seems to be the "collection of various driver subsystems not big enough to have their own git tree" lately. Anyway, some highlights of the changes in here: - binderfs: is it a rule that all driver subsystems will eventually grow to have their own filesystem? Binder now has one to handle the use of it in containerized systems. This was discussed at the Plumbers conference a few months ago and knocked into mergable shape very fast by Christian Brauner. Who also has signed up to be another binder maintainer, showing a distinct lack of good judgement :) - binder updates and fixes - mei driver updates - fpga driver updates and additions - thunderbolt driver updates - soundwire driver updates - extcon driver updates - nvmem driver updates - hyper-v driver updates - coresight driver updates - pvpanic driver additions and reworking for more device support - lp driver updates. Yes really, it's _finally_ moved to the proper parallal port driver model, something I never thought I would see happen. Good stuff. - other tiny driver updates and fixes. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (116 commits) MAINTAINERS: add another Android binder maintainer intel_th: msu: Fix an off-by-one in attribute store stm class: Add a reference to the SyS-T document stm class: Fix a module refcount leak in policy creation error path char: lp: use new parport device model char: lp: properly count the lp devices char: lp: use first unused lp number while registering char: lp: detach the device when parallel port is removed char: lp: introduce list to save port number bus: qcom: remove duplicated include from qcom-ebi2.c VMCI: Use memdup_user() rather than duplicating its implementation char/rtc: Use of_node_name_eq for node name comparisons misc: mic: fix a DMA pool free failure ptp: fix an IS_ERR() vs NULL check genwqe: Fix size check binder: implement binderfs binder: fix use-after-free due to ksys_close() during fdget() bus: fsl-mc: remove duplicated include files bus: fsl-mc: explicitly define the fsl_mc_command endianness misc: ti-st: make array read_ver_cmd static, shrinks object size ...
| * misc: remove GENWQE_DEBUGFS_RO()Yangtao Li2018-12-061-12/+1
| | | | | | | | | | | | | | | | | | We already have the DEFINE_SHOW_ATTRIBUTE.There is no need to define such a macro,so remove GENWQE_DEBUGFS_RO.Also use DEFINE_SHOW_ATTRIBUTE to simplify some code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | mm: convert totalram_pages and totalhigh_pages variables to atomicArun KS2018-12-281-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | totalram_pages and totalhigh_pages are made static inline function. Main motivation was that managed_page_count_lock handling was complicating things. It was discussed in length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes better to remove the lock and convert variables to atomic, with preventing poteintial store-to-read tearing as a bonus. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Suggested-by: Michal Hocko <mhocko@suse.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vmw_balloon: add reset statNadav Amit2018-09-251-2/+5
| | | | | | | | | It is useful to expose how many times the balloon resets. If it happens more than very rarely - this is an indication for a problem. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: general style cleanupNadav Amit2018-09-251-16/+23
| | | | | | | | | Change all the remaining return values to int to avoid mistakes. Reduce indentation when possible. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: rework the inflate and deflate loopsNadav Amit2018-09-251-288/+541
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for supporting compaction and OOM notification, this patch reworks the inflate/deflate loops. The main idea is to separate the allocation, communication with the hypervisor, and the handling of errors from each other. Doing will allow us to perform concurrent inflation and deflation, excluding the actual communication with the hypervisor. To do so, we need to get rid of the remaining global state that is kept in the balloon struct, specifically the refuse_list. When the VM communicates with the hypervisor, it does not free or put back pages to the balloon list and instead only moves the pages whose status indicated failure into a refuse_list on the stack. Once the operation completes, the inflation or deflation functions handle the list appropriately. As we do that, we can consolidate the communication with the hypervisor for both the lock and unlock operations into a single function. We also reuse the deflation function for popping the balloon. As a preparation for preventing races, we hold a spinlock when the communication actually takes place, and use atomic operations for updating the balloon size. The balloon page list is still racy and will be handled in the next patch. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: stats reworkNadav Amit2018-09-251-103/+281
| | | | | | | | | | | | | | | | | | | | | | To allow the balloon statistics to be updated concurrently, we change the statistics to be held per core and aggregate it when needed. To avoid the memory overhead of keeping the statistics per core, and since it is likely not used by most users, we start updating the statistics only after the first use. A read-write semaphore is used to protect the statistics initialization and avoid races. This semaphore is (and will) be used to protect configuration changes during reset. While we are at it, address some other issues: change the statistics update to inline functions instead of define; use ulong for saving the statistics; and clean the statistics printouts. Note that this patch changes the format of the outputs. If there are any automatic tools that use the statistics, they might fail. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: simplify vmballoon_send_get_target()Nadav Amit2018-09-251-21/+14
| | | | | | | | | | | | | | As we want to leave as little as possible on the global balloon structure, to avoid possible future races, we want to get rid sysinfo. We can actually get the total_ram directly, and simplify the logic of vmballoon_send_get_target() a little. While we are doing that, let's return int and avoid mistakes due to bool/int conversions. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: refactor change size from vmballoon_workNadav Amit2018-09-251-23/+52
| | | | | | | | | | The required change in the balloon size is currently computed in vmballoon_work(), vmballoon_inflate() and vmballoon_deflate(). Refactor it to simplify the next patches. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: rename VMW_BALLOON_2M_SHIFT to VMW_BALLOON_2M_ORDERNadav Amit2018-09-251-4/+4
| | | | | | | | | | | | The name of the macro'd VMW_BALLOON_2M_SHIFT is misleading. The value reflects 2M huge-page order. Unfortunately, we cannot use HPAGE_PMD_ORDER, since it is not defined when transparent huge-pages are off, so we need to define our own one. Rename it to VMW_BALLOON_2M_ORDER. No functional change. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: treat all refused pages equallyNadav Amit2018-09-251-23/+29
| | | | | | | | | | | | | | | | | | | | | | | | | Currently, when the hypervisor rejects a page during lock operation, the VM treats pages differently according to the error-code: in certain cases the page is immediately freed, and in others it is put on a rejection list and only freed later. The behavior does not make too much sense. If the page is freed immediately it is very likely to be used again in the next batch of allocations, and be rejected again. In addition, for support of compaction and OOM notifiers, we wish to separate the logic that communicates with the hypervisor (as well as analyzes the status of each page) from the logic that allocates or free pages. Treat all errors the same way, queuing the pages on the refuse list. Move to the next allocation size (4k) when too many pages are refused. Free the refused pages when moving to the next size to avoid situations in which too much memory is waiting to be freed on the refused list. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: change batch/single lock abstractionsNadav Amit2018-09-251-193/+165
| | | | | | | | | | | | | | | | | | | | | | | | | | | The current abstractions for batch vs single operations seem suboptimal and complicate the implementation of additional features (OOM, compaction). The immediate problem of the current abstractions is that they cause differences in how operations are handled when batching is on or off. For example, the refused_alloc counter is not updated when batching is on. These discrepancies are caused by code redundancies. Instead, this patch presents three type of operations, according to whether batching is on or off: (1) add page, (2) communication with the hypervisor and (3) retrieving the status of a page. To avoid the overhead of virtual functions, and since we do not expect additional interfaces for communication with the hypervisor, we use static keys instead of virtual functions. Finally, while we are at it, change vmballoon_init_batching() to return int instead of bool, to be consistent in the return type and avoid potential coding errors. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: remove sleeping allocationsNadav Amit2018-09-251-49/+18
| | | | | | | | | | | | | | | | | Splitting the allocations between sleeping and non-sleeping made some sort of sense as long as rate-limiting was enabled. Now that it is removed, we need to decide - either we want sleeping allocations or not. Since no other Linux balloon driver (hv, Xen, virtio) uses sleeping allocations, use the same approach. We do distinguish, however, between 2MB allocations and 4kB allocations and prevent reclamation on 2MB. In both cases, we avoid using emergency low-memory pools, as it may cause undesired effects. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: simplifying batch accessNadav Amit2018-09-251-51/+30
| | | | | | | | | The use of accessors for batch entries complicates the code and makes it less readable. Remove it an instead use bit-fields. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: merge send_lock and send_unlock pathNadav Amit2018-09-251-41/+21
| | | | | | | | | The lock and unlock code paths are very similar, so avoid the duplicate code by merging them together. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: unify commands tracing and statsNadav Amit2018-09-251-75/+41
| | | | | | | | | | Now that we have a single point, unify the tracing and collecting the statistics for commands and their failure. While it might somewhat reduce the control over debugging, it cleans the code a lot. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: handle commands in a single function.Nadav Amit2018-09-251-107/+116
| | | | | | | | | | | | | | By inlining the hypercall interface, we can unify several operations into one central point in the code: - Updating the target. - Updating when a reset is needed. - Update statistics (which will be done later in the patch-set). - Print debug-messages (although they cannot be enabled as selectively). Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge 4.18-rc5 into char-misc-nextGreg Kroah-Hartman2018-07-161-2/+2
|\ | | | | | | | | | | We want the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vmw_balloon: fix inflation with batchingNadav Amit2018-07-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Embarrassingly, the recent fix introduced worse problem than it solved, causing the balloon not to inflate. The VM informed the hypervisor that the pages for lock/unlock are sitting in the wrong address, as it used the page that is used the uninitialized page variable. Fixes: b23220fe054e9 ("vmw_balloon: fixing double free when batching mode is off") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: update copyright messageNadav Amit2018-07-031-20/+2
| | | | | | | | | | | | | | | | | | | | Removing the GPL wording and replace it with an SPDX tag. The immediate trigger for doing it now is the need to remove the list of maintainers from the source file, as the maintainer list changed. Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: remove inflation rate limitingNadav Amit2018-07-031-81/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 33d268ed0019 ("VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode."), the allocations are not increased, and therefore balloon inflation rate limiting is in practice broken. While we can restore rate limiting, in practice we see that it can result in adverse effect, as the hypervisor throttles down the VM if it does not respond well enough, or alternatively causes it to perform very poorly as the host swaps out the VM memory. Throttling the VM down can even have a cascading effect, in which the VM reclaims memory even slower and consequentially throttled down even further. We therefore remove all the rate limiting mechanisms, including the slow allocation cycles, as they are likely to do more harm than good. Fixes: 33d268ed0019 ("VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.") Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: fix VMCI use when balloon built into kernelNadav Amit2018-07-031-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when all modules, including VMCI and VMware balloon are built into the kernel, the initialization of the balloon happens before the VMCI is probed. As a result, the balloon fails to initialize the VMCI doorbell, which it uses to get asynchronous requests for balloon size changes. The problem can be seen in the logs, in the form of the following message: "vmw_balloon: failed to initialize vmci doorbell" The driver would work correctly but slightly less efficiently, probing for requests periodically. This patch changes the balloon to be initialized using late_initcall() instead of module_init() to address this issue. It does not address a situation in which VMCI is built as a module and the balloon is built into the kernel. Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: VMCI_DOORBELL_SET does not check statusNadav Amit2018-07-031-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When vmballoon_vmci_init() sets a doorbell using VMCI_DOORBELL_SET, for some reason it does not consider the status and looks at the result. However, the hypervisor does not update the result - it updates the status. This might cause VMCI doorbell not to be enabled, resulting in degraded performance. Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: do not use 2MB without batchingNadav Amit2018-07-031-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the hypervisor sets 2MB batching is on, while batching is cleared, the balloon code breaks. In this case the legacy mechanism is used with 2MB page. The VM would report a 2MB page is ballooned, and the hypervisor would only take the first 4KB. While the hypervisor should not report such settings, make the code more robust by not enabling 2MB support without batching. Fixes: 365bd7ef7ec8e ("VMware balloon: Support 2m page ballooning.") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | vmw_balloon: fix inflation of 64-bit GFNsNadav Amit2018-07-031-6/+7
|/ | | | | | | | | | | | | | When balloon batching is not supported by the hypervisor, the guest frame number (GFN) must fit in 32-bit. However, due to a bug, this check was mistakenly ignored. In practice, when total RAM is greater than 16TB, the balloon does not work currently, making this bug unlikely to happen. Fixes: ef0f8f112984 ("VMware balloon: partially inline vmballoon_reserve_page.") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vmw_balloon: fixing double free when batching mode is offGil Kupfer2018-06-021-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The balloon.page field is used for two different purposes if batching is on or off. If batching is on, the field point to the page which is used to communicate with with the hypervisor. If it is off, balloon.page points to the page that is about to be (un)locked. Unfortunately, this dual-purpose of the field introduced a bug: when the balloon is popped (e.g., when the machine is reset or the balloon driver is explicitly removed), the balloon driver frees, unconditionally, the page that is held in balloon.page. As a result, if batching is disabled, this leads to double freeing the last page that is sent to the hypervisor. The following error occurs during rmmod when kernel checkers are on, and the balloon is not empty: [ 42.307653] ------------[ cut here ]------------ [ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable] [ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi [ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5 [ 42.312803] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016 [ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000 [ 42.313290] RIP: 0010:__free_pages+0x38/0x40 [ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246 [ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006 [ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0 [ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000 [ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4 [ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200 [ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000 [ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0 [ 42.314864] Call Trace: [ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon] [ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon] [ 42.315853] SyS_delete_module+0x1e2/0x250 [ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 42.315924] RIP: 0033:0x7f3af5b0e8e7 [ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7 [ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248 [ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999 [ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130 [ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0 [ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48 [ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98 [ 42.325735] ---[ end trace 872e008e33f81508 ]--- To solve the bug, we eliminate the dual purpose of balloon.page. Fixes: f220a80f0c2e ("VMware balloon: add batching to the vmw_balloon.") Cc: stable@vger.kernel.org Reported-by: Oleksandr Natalenko <onatalen@redhat.com> Signed-off-by: Gil Kupfer <gilkup@gmail.com> Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Tested-by: Oleksandr Natalenko <oleksandr@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* x86/virt: Add enum for hypervisors to replace x86_hyperJuergen Gross2017-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The x86_hyper pointer is only used for checking whether a virtual device is supporting the hypervisor the system is running on. Use an enum for that purpose instead and drop the x86_hyper pointer. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Xavier Deguillard <xdeguillard@vmware.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: akataria@vmware.com Cc: arnd@arndb.de Cc: boris.ostrovsky@oracle.com Cc: devel@linuxdriverproject.org Cc: dmitry.torokhov@gmail.com Cc: gregkh@linuxfoundation.org Cc: haiyangz@microsoft.com Cc: kvm@vger.kernel.org Cc: kys@microsoft.com Cc: linux-graphics-maintainer@vmware.com Cc: linux-input@vger.kernel.org Cc: moltmann@vmware.com Cc: pbonzini@redhat.com Cc: pv-drivers@vmware.com Cc: rkrcmar@redhat.com Cc: sthemmin@microsoft.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIMMel Gorman2015-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | __GFP_WAIT was used to signal that the caller was in atomic context and could not sleep. Now it is possible to distinguish between true atomic context and callers that are not willing to sleep. The latter should clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing __GFP_WAIT behaves differently, there is a risk that people will clear the wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly indicate what it does -- setting it allows all reclaim activity, clearing them prevents it. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VMware balloon: Enable notification via VMCIPhilip P. Moltmann2015-10-041-9/+96
| | | | | | | | | | | | | Get notified immediately when a balloon target is set, instead of waiting for up to one second. The up-to 1 second gap could be long enough to cause swapping inside of the VM that receives the VM. Acked-by: Andy King <acking@vmware.com> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com> Tested-by: Siva Sankar Reddy B <sankars@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* VMware balloon: Treat init like resetPhilip P. Moltmann2015-10-041-35/+18
| | | | | | | | | | Unify the behavior of the first start of the balloon and a reset. Also on unload, declare that the balloon driver does not have any capabilities anymore. Acked-by: Andy King <acking@vmware.com> Signed-off-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>