summaryrefslogtreecommitdiffstats
path: root/drivers/virtio (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2020-10-235-3/+24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio updates from Michael Tsirkin: "vhost, vdpa, and virtio cleanups and fixes A very quiet cycle, no new features" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: add URL for virtio-mem vhost_vdpa: remove unnecessary spin_lock in vhost_vring_call vringh: fix __vringh_iov() when riov and wiov are different vdpa/mlx5: Setup driver only if VIRTIO_CONFIG_S_DRIVER_OK s390: virtio: PV needs VIRTIO I/O device protection virtio: let arch advertise guest's memory access restrictions vhost_vdpa: Fix duplicate included kernel.h vhost: reduce stack usage in log_used virtio-mem: Constify mem_id_table virtio_input: Constify id_table virtio-balloon: Constify id_table vdpa/mlx5: Fix failure to bring link up vdpa/mlx5: Make use of a specific 16 bit endianness API
| * virtio: let arch advertise guest's memory access restrictionsPierre Morel2020-10-212-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An architecture may restrict host access to guest memory, e.g. IBM s390 Secure Execution or AMD SEV. Provide a new Kconfig entry the architecture can select, CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, when it provides the arch_has_restricted_virtio_memory_access callback to advertise to VIRTIO common code when the architecture restricts memory access from the host. The common code can then fail the probe for any device where VIRTIO_F_ACCESS_PLATFORM is required, but not set. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/1599728030-17085-2-git-send-email-pmorel@linux.ibm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * virtio-mem: Constify mem_id_tableRikard Falkeborn2020-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | mem_id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-4-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
| * virtio_input: Constify id_tableRikard Falkeborn2020-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-3-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
| * virtio-balloon: Constify id_tableRikard Falkeborn2020-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-2-rikard.falkeborn@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
* | virtio-mem: try to merge system ram resourcesDavid Hildenbrand2020-10-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | virtio-mem adds memory in memory block granularity, to be able to remove it in the same granularity again later, and to grow slowly on demand. This, however, results in quite a lot of resources when adding a lot of memory. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources later when e.g., trying to create a kdump header). Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem on x86-64: [...] 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 148000000-14fffffff : System RAM (virtio_mem) 150000000-157ffffff : System RAM (virtio_mem) 158000000-15fffffff : System RAM (virtio_mem) 160000000-167ffffff : System RAM (virtio_mem) 168000000-16fffffff : System RAM (virtio_mem) 170000000-177ffffff : System RAM (virtio_mem) 178000000-17fffffff : System RAM (virtio_mem) 180000000-187ffffff : System RAM (virtio_mem) 188000000-18fffffff : System RAM (virtio_mem) 190000000-197ffffff : System RAM (virtio_mem) 198000000-19fffffff : System RAM (virtio_mem) 1a0000000-1a7ffffff : System RAM (virtio_mem) 1a8000000-1afffffff : System RAM (virtio_mem) 1b0000000-1b7ffffff : System RAM (virtio_mem) 1b8000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 With this patch, we get (/proc/iomem): [...] fffc0000-ffffffff : Reserved 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 Of course, with more hotplugged memory, it gets worse. When unplugging memory blocks again, try_remove_memory() (via offline_and_remove_memory()) will properly split the resource up again. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-7-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | mm/memory_hotplug: prepare passing flags to add_memory() and friendsDavid Hildenbrand2020-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richardw.yang@linux.intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2020-10-156-0/+228
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "Not a major amount of change, the i915 trees got split into display and gt trees to better facilitate higher level review, and there's a major refactoring of i915 GEM locking to use more core kernel concepts (like ww-mutexes). msm gets per-process pagetables, older AMD SI cards get DC support, nouveau got a bump in displayport support with common code extraction from i915. Outside of drm this contains a couple of patches for hexint moduleparams which you've acked, and a virtio common code tree that you should also get via it's regular path. New driver: - Cadence MHDP8546 DisplayPort bridge driver core: - cross-driver scatterlist cleanups - devm_drm conversions - remove drm_dev_init - devm_drm_dev_alloc conversion ttm: - lots of refactoring and cleanups bridges: - chained bridge support in more drivers panel: - misc new panels scheduler: - cleanup priority levels displayport: - refactor i915 code into helpers for nouveau i915: - split into display and GT trees - WW locking refactoring in GEM - execbuf2 extension mechanism - syncobj timeline support - GEN 12 HOBL display powersaving - Rocket Lake display additions - Disable FBC on Tigerlake - Tigerlake Type-C + DP improvements - Hotplug interrupt refactoring amdgpu: - Sienna Cichlid updates - Navy Flounder updates - DCE6 (SI) support for DC - Plane rotation enabled - TMZ state info ioctl - PCIe DPC recovery support - DC interrupt handling refactor - OLED panel fixes amdkfd: - add SMI events for thermal throttling - SMI interface events ioctl update - process eviction counters radeon: - move to dma_ for allocations - expose sclk via sysfs msm: - DSI support for sm8150/sm8250 - per-process GPU pagetable support - Displayport support mediatek: - move HDMI phy driver to PHY - convert mtk-dpi to bridge API - disable mt2701 tmds tegra: - bridge support exynos: - misc cleanups vc4: - dual display cleanups ast: - cleanups gma500: - conversion to GPIOd API hisilicon: - misc reworks ingenic: - clock handling and format improvements mcde: - DSI support mgag200: - desktop g200 support mxsfb: - i.MX7 + i.MX8M - alpha plane support panfrost: - devfreq support - amlogic SoC support ps8640: - EDID from eDP retrieval tidss: - AM65xx YUV workaround virtio: - virtio-gpu exported resources rcar-du: - R8A7742, R8A774E1 and R8A77961 support - YUV planar format fixes - non-visible plane handling - VSP device reference count fix - Kconfig fix to avoid displaying disabled options in .config" * tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits) drm/ingenic: Fix bad revert drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init drm/amdgpu: Remove warning for virtual_display drm/amdgpu: kfd_initialized can be static drm/amd/pm: setup APU dpm clock table in SMU HW initialization drm/amdgpu: prevent spurious warning drm/amdgpu/swsmu: fix ARC build errors drm/amd/display: Fix OPTC_DATA_FORMAT programming drm/amd/display: Don't allow pstate if no support in blank drm/panfrost: increase readl_relaxed_poll_timeout values MAINTAINERS: Update entry for st7703 driver after the rename Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached" drm/amd/display: HDMI remote sink need mode validation for Linux drm/amd/display: Change to correct unit on audio rate drm/amd/display: Avoid set zero in the requested clk drm/amdgpu: align frag_end to covered address space drm/amdgpu: fix NULL pointer dereference for Renoir drm/vmwgfx: fix regression in thp code due to ttm init refactor. drm/amdgpu/swsmu: add interrupt work handler for smu11 parts drm/amdgpu/swsmu: add interrupt work function ...
| * \ Merge branch 'virtio-shm' of ↵Maxime Ripard2020-09-162-0/+126
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into drm-misc-next Topic pull request for core virtio changes that will be required by the DRM driver. Signed-off-by: Maxime Ripard <maxime@cerno.tech> From: Gurchetan Singh <gurchetansingh@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/CAAfnVBn2BzXWFY3hhjDxd5q0P2_JWn-HdkVxgS94x9keAUZiow@mail.gmail.com
| | * | virtio: Implement get_shm_region for MMIO transportSebastien Boeuf2020-09-101-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On MMIO a new set of registers is defined for finding SHM regions. Add their definitions and use them to find the region. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| | * | virtio: Implement get_shm_region for PCI transportSebastien Boeuf2020-09-101-0/+95
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On PCI the shm regions are found using capability entries; find a region by searching for the capability. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: kbuild test robot <lkp@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * | virtio: fix build for configs without dma-bufsDavid Stevens2020-08-193-1/+12
| | | | | | | | | | | | | | | | | | | | | Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Stevens <stevensd@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20200819031011.310180-1-stevensd@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
| * | Merge v5.9-rc1 into drm-misc-nextMaxime Ripard2020-08-186-64/+57
| |\| | | | | | | | | | | | | | | | Sam needs 5.9-rc1 to have dev_err_probe in to merge some patches. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
| * | virtio: add dma-buf support for exported objectsDavid Stevens2020-08-183-1/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds a new flavor of dma-bufs that can be used by virtio drivers to share exported objects. A virtio dma-buf can be queried by virtio drivers to obtain the UUID which identifies the underlying exported object. Signed-off-by: David Stevens <stevensd@chromium.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: http://patchwork.freedesktop.org/patch/msgid/20200818071343.3461203-2-stevensd@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* | | virtio-mem: don't special-case ZONE_MOVABLEDavid Hildenbrand2020-10-141-39/+8
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and will be documented in a follow-up patch including the new virtio-mem behavior), let's allow to online partially plugged memory blocks to ZONE_MOVABLE and also consider memory blocks that were onlined to ZONE_MOVABLE when unplugging memory. While unplugged memory pages are, in general, unmovable, they can be skipped when offlining memory. virtio-mem only unplugs fairly big chunks (in the megabyte range) and rather tries to shrink the memory region than randomly choosing memory. In theory, if all other pages in the movable zone would be movable, virtio-mem would only shrink that zone and not create any kind of fragmentation. In the future, we might want to remember the zone again and use the information when (un)plugging memory. For now, let's keep it simple. Note: Support for defragmentation is planned, to deal with fragmentation after unplug due to memory chunks within memory blocks that could not get unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a longer time). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/20200816125333.7434-6-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | virtio: pci: constify ioreadX() iomem argument (as in generic implementation)Krzysztof Kozlowski2020-08-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ioreadX() helpers have inconsistent interface. On some architectures void *__iomem address argument is a pointer to const, on some not. Implementations of ioreadX() do not modify the memory under the address so they can be converted to a "const" version for const-safety and consistency among architectures. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Allen Hubbe <allenbh@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jakub Kicinski <kuba@kernel.org> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200709072837.5869-5-krzk@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2020-08-116-58/+51
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio updates from Michael Tsirkin: - IRQ bypass support for vdpa and IFC - MLX5 vdpa driver - Endianness fixes for virtio drivers - Misc other fixes * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (71 commits) vdpa/mlx5: fix up endian-ness for mtu vdpa: Fix pointer math bug in vdpasim_get_config() vdpa/mlx5: Fix pointer math in mlx5_vdpa_get_config() vdpa/mlx5: fix memory allocation failure checks vdpa/mlx5: Fix uninitialised variable in core/mr.c vdpa_sim: init iommu lock virtio_config: fix up warnings on parisc vdpa/mlx5: Add VDPA driver for supported mlx5 devices vdpa/mlx5: Add shared memory registration code vdpa/mlx5: Add support library for mlx5 VDPA implementation vdpa/mlx5: Add hardware descriptive header file vdpa: Modify get_vq_state() to return error code net/vdpa: Use struct for set/get vq state vdpa: remove hard coded virtq num vdpasim: support batch updating vhost-vdpa: support IOTLB batching hints vhost-vdpa: support get/set backend features vhost: generialize backend features setting/getting vhost-vdpa: refine ioctl pre-processing vDPA: dont change vq irq after DRIVER_OK ...
| * | virtio_pci_modern: Fix the comment of virtio_pci_find_capability()Liao Pingfang2020-08-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the comment of virtio_pci_find_capability() by adding missing comment for the last parameter: bars. Fixes: 59a5b0f7bf74 ("virtio-pci: alloc only resources actually used.") Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Link: https://lore.kernel.org/r/1596455545-43556-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | virtio_ring: Avoid loop when vq is broken in virtqueue_pollMao Wenan2020-08-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The loop may exist if vq->broken is true, virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split will return NULL, so virtnet_poll will reschedule napi to receive packet, it will lead cpu usage(si) to 100%. call trace as below: virtnet_poll virtnet_receive virtqueue_get_buf_ctx virtqueue_get_buf_ctx_packed virtqueue_get_buf_ctx_split virtqueue_napi_complete virtqueue_poll //return true virtqueue_napi_schedule //it will reschedule napi to fix this, return false if vq is broken in virtqueue_poll. Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1596354249-96204-1-git-send-email-wenan.mao@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | virtio_mem: convert to LE accessorsMichael S. Tsirkin2020-08-051-15/+15
| | | | | | | | | | | | | | | | | | Virtio mem is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | virtio_input: convert to LE accessorsMichael S. Tsirkin2020-08-051-16/+16
| | | | | | | | | | | | | | | | | | Virtio input is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | virtio_balloon: use LE config space accessesMichael S. Tsirkin2020-08-051-17/+9
| | | | | | | | | | | | | | | | | | Balloon is LE, it's cleaner to access it as such directly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | virtio_vdpa: legacy features handlingMichael S. Tsirkin2020-08-051-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | We normally expect vdpa to use the modern interface. However for consistency, let's use same APIs as vhost for legacy guests. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | virtio_balloon: fix sparse warningMichael S. Tsirkin2020-08-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | balloon uses virtio32_to_cpu instead of cpu_to_virtio32 to convert a native endian number to virtio. No practical difference but makes sparse warn. Fix it up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
| * | virtio: virtio_has_iommu_quirk -> virtio_has_dma_quirkMichael S. Tsirkin2020-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Now that the corresponding feature bit has been renamed, rename the quirk too - it's about special ways to do DMA, not necessarily about the IOMMU. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | virtio: VIRTIO_F_IOMMU_PLATFORM -> VIRTIO_F_ACCESS_PLATFORMMichael S. Tsirkin2020-08-032-2/+2
| |/ | | | | | | | | | | | | | | Rename the bit to match latest virtio spec. Add a compat macro to avoid breaking existing userspace. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
* | Merge tag 'uninit-macro-v5.9-rc1' of ↵Linus Torvalds2020-08-041-3/+3
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
| * treewide: Remove uninitialized_var() usageKees Cook2020-07-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \ xargs perl -pi -e \ 's/\buninitialized_var\(([^\)]+)\)/\1/g; s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org>
* | virtio-mem: Fix build error due to improper use 'select'Weilong Chen2020-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As noted in: https://www.kernel.org/doc/Documentation/kbuild/kconfig-language.txt "select should be used with care. select will force a symbol to a value without visiting the dependencies." Config VIRTIO_MEM should not select CONTIG_ALLOC directly. Otherwise it will cause an error: https://bugzilla.kernel.org/show_bug.cgi?id=208245 Signed-off-by: Weilong Chen <chenweilong@huawei.com> Link: https://lore.kernel.org/r/20200619080333.194753-1-chenweilong@huawei.com Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
* | virtio_balloon: fix up endian-ness for free cmd idMichael S. Tsirkin2020-07-291-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | free cmd id is read using virtio endian, spec says all fields in balloon are LE. Fix it up. Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Wei Wang <wei.w.wang@intel.com> Acked-by: David Hildenbrand <david@redhat.com>
* | virtio-balloon: Document byte ordering of poison_valAlexander Duyck2020-07-291-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The poison_val field in the virtio_balloon_config is treated as a little-endian field by the host. Since we are currently only having to deal with a single byte poison value this isn't a problem, however if the value should ever expand it would cause byte ordering issues. Document that in the code so that we know that if the value should ever expand we need to byte swap the value on big-endian architectures. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Link: https://lore.kernel.org/r/20200713203539.17140.71425.stgit@localhost.localdomain Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
* | Merge tag 'pci-v5.8-fixes-2' of ↵Linus Torvalds2020-07-251-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into master Pull PCI fixes from Bjorn Helgaas: - Reject invalid IRQ 0 command line argument for virtio_mmio because IRQ 0 now generates warnings (Bjorn Helgaas) - Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms", which broke nouveau (Bjorn Helgaas) * tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms" virtio-mmio: Reject invalid IRQ 0 command line argument
| * virtio-mmio: Reject invalid IRQ 0 command line argumentBjorn Helgaas2020-07-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "virtio_mmio.device=" command line argument allows a user to specify the size, address, and IRQ of a virtio device. Previously the only requirement for the IRQ was that it be an unsigned integer. Zero is an unsigned integer but an invalid IRQ number, and after a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is invalid"), attempts to use IRQ 0 cause warnings. If the user specifies IRQ 0, return failure instead of registering a device with IRQ 0. Fixes: a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is invalid") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
* | virtio-mem: add memory via add_memory_driver_managed()David Hildenbrand2020-06-221-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Virtio-mem managed memory is always detected and added by the virtio-mem driver, never using something like the firmware-provided memory map. This is the case after an ordinary system reboot, and has to be guaranteed after kexec. Especially, virtio-mem added memory resources can contain inaccessible parts ("unblocked memory blocks"), blindly forwarding them to a kexec kernel is dangerous, as unplugged memory will get accessed (esp. written). Let's use the new way of adding special driver-managed memory introduced in commit 7b7b27214bba ("mm/memory_hotplug: introduce add_memory_driver_managed()"). This will result in no entries in /sys/firmware/memmap ("raw firmware- provided memory map"), the memory resource will be flagged IORESOURCE_MEM_DRIVER_MANAGED (esp., kexec_file_load() will not place kexec images on this memory), and it is exposed as "System RAM (virtio_mem)" in /proc/iomem, so esp. kexec-tools can properly handle it. Example /proc/iomem before this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM 340000000-347ffffff : System RAM 348000000-34fffffff : System RAM [...] Example /proc/iomem after this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM (virtio_mem) 340000000-347ffffff : System RAM (virtio_mem) 348000000-34fffffff : System RAM (virtio_mem) [...] Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: teawater <teawaterz@linux.alibaba.com> Fixes: 5f1f79bbc9e26 ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200611093518.5737-1-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
* | virtio-mem: silence a static checker warningDan Carpenter2020-06-221-1/+1
|/ | | | | | | | | | | | Smatch complains that "rc" can be uninitialized if we hit the "break;" statement on the first iteration through the loop. I suspect that this can't happen in real life, but returning a zero literal is cleaner and silence the static checker warning. Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200610085911.GC5439@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada2020-06-131-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* virtio_mem: convert device block size into 64bitMichael S. Tsirkin2020-06-091-9/+9
| | | | | | | | | | | | | | If subblock size is large (e.g. 1G) 32 bit math involving it can overflow. Rather than try to catch all instances of that, let's tweak block size to 64 bit. It ripples through UAPI which is an ABI change, but it's not too late to make it, and it will allow supporting >4Gbyte blocks while might become necessary down the road. Fixes: 5f1f79bbc9e26 ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
* virtio-mem: drop unnecessary initializationMichael S. Tsirkin2020-06-081-1/+1
| | | | | | | | | rc is initialized to -ENIVAL but that's never used. Drop it. Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
* virtio-mem: Don't rely on implicit compiler padding for requestsDavid Hildenbrand2020-06-041-0/+3
| | | | | | | | | | | | | The compiler will add padding after the last member, make that explicit. The size of a request is always 24 bytes. The size of a response always 10 bytes. Add compile-time checks. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: teawater <teawaterz@linux.alibaba.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200515101402.16597-1-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Try to unplug the complete online memory block firstDavid Hildenbrand2020-06-041-31/+57
| | | | | | | | | | | | | | Right now, we always try to unplug single subblocks when processing an online memory block. Let's try to unplug the complete online memory block first, in case it is fully plugged and the unplug request is large enough. Fallback to single subblocks in case the memory block cannot get unplugged as a whole. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-16-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Use -ETXTBSY as error code if the device is busyDavid Hildenbrand2020-06-041-6/+10
| | | | | | | | | | Let's be able to distinguish if the device or if memory is busy. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-15-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Unplug subblocks right-to-leftDavid Hildenbrand2020-06-041-22/+16
| | | | | | | | | | | We unplug blocks right-to-left, let's also unplug subblocks within a block right-to-left. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-14-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Drop manual check for already present memoryDavid Hildenbrand2020-06-041-43/+12
| | | | | | | | | | | | | | Registering our parent resource will fail if any memory is still present (e.g., because somebody unloaded the driver and tries to reload it). No need for the manual check. Move our "unplug all" handling to after registering the resource. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-13-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Add parent resource for all added "System RAM"David Hildenbrand2020-06-041-1/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's add a parent resource, named after the virtio device (inspired by drivers/dax/kmem.c). This allows user space to identify which memory belongs to which virtio-mem device. With this change and two virtio-mem devices: :/# cat /proc/iomem 00000000-00000fff : Reserved 00001000-0009fbff : System RAM [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM 148000000-14fffffff : System RAM 150000000-157ffffff : System RAM [...] 334000000-3033ffffff : virtio1 338000000-33fffffff : System RAM 340000000-347ffffff : System RAM 348000000-34fffffff : System RAM [...] Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-12-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
* virtio-mem: Better retry handlingDavid Hildenbrand2020-06-041-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Let's start with a retry interval of 5 seconds and double the time until we reach 5 minutes, in case we keep getting errors. Reset the retry interval in case we succeeded. The two main reasons for having to retry are - The hypervisor is busy and cannot process our request - We cannot reach the desired requested_size (esp., not enough memory can get unplugged because we can't allocate any subblocks). Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-11-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Offline and remove completely unplugged memory blocksDavid Hildenbrand2020-06-041-4/+43
| | | | | | | | | | | | | | | | | | | | | | | | | Let's offline+remove memory blocks once all subblocks are unplugged. We can use the new Linux MM interface for that. As no memory is in use anymore, this shouldn't take a long time and shouldn't fail. There might be corner cases where the offlining could still fail (especially, if another notifier NACKs the offlining request). Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-10-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Allow to offline partially unplugged memory blocksDavid Hildenbrand2020-06-041-1/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Dropping the reference count of PageOffline() pages during MEM_GOING_ONLINE allows offlining code to skip them. However, we also have to clear PG_reserved, because PG_reserved pages get detected as unmovable right away. Take care of restoring the reference count when offlining is canceled. Clarify why we don't have to perform any action when unloading the driver. Also, let's add a warning if anybody is still holding a reference to unplugged pages when offlining. Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-8-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Paravirtualized memory hotunplug part 2David Hildenbrand2020-06-042-14/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We also want to unplug online memory (contained in online memory blocks and, therefore, managed by the buddy), and eventually replug it later. When requested to unplug memory, we use alloc_contig_range() to allocate subblocks in online memory blocks (so we are the owner) and send them to our hypervisor. When requested to plug memory, we can replug such memory using free_contig_range() after asking our hypervisor. We also want to mark all allocated pages PG_offline, so nobody will touch them. To differentiate pages that were never onlined when onlining the memory block from pages allocated via alloc_contig_range(), we use PageDirty(). Based on this flag, virtio_mem_fake_online() can either online the pages for the first time or use free_contig_range(). It is worth noting that there are no guarantees on how much memory can actually get unplugged again. All device memory might completely be fragmented with unmovable data, such that no subblock can get unplugged. We are not touching the ZONE_MOVABLE. If memory is onlined to the ZONE_MOVABLE, it can only get unplugged after that memory was offlined manually by user space. In normal operation, virtio-mem memory is suggested to be onlined to ZONE_NORMAL. In the future, we will try to make unplug more likely to succeed. Add a module parameter to control if online memory shall be touched. As we want to access alloc_contig_range()/free_contig_range() from kernel module context, export the symbols. Note: Whenever virtio-mem uses alloc_contig_range(), all affected pages are on the same node, in the same zone, and contain no holes. Acked-by: Michal Hocko <mhocko@suse.com> # to export contig range allocator API Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-6-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Paravirtualized memory hotunplug part 1David Hildenbrand2020-06-041-2/+114
| | | | | | | | | | | | | | | | | | | | | Unplugging subblocks of memory blocks that are offline is easy. All we have to do is watch out for concurrent onlining activity. Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-5-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Allow to specify an ACPI PXM as nidDavid Hildenbrand2020-06-041-2/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to allow to specify (similar as for a DIMM), to which node a virtio-mem device (and, therefore, its memory) belongs. Add a new virtio-mem feature flag and export pxm_to_node, so it can be used in kernel module context. Acked-by: Michal Hocko <mhocko@suse.com> # for the export Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> # for the export Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Len Brown <lenb@kernel.org> Cc: linux-acpi@vger.kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-4-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>