summaryrefslogtreecommitdiffstats
path: root/fs/btrfs (unfollow)
Commit message (Collapse)AuthorFilesLines
2018-07-29squashfs: be more careful about metadata corruptionLinus Torvalds4-5/+16
Anatoly Trosinenko reports that a corrupted squashfs image can cause a kernel oops. It turns out that squashfs can end up being confused about negative fragment lengths. The regular squashfs_read_data() does check for negative lengths, but squashfs_read_metadata() did not, and the fragment size code just blindly trusted the on-disk value. Fix both the fragment parsing and the metadata reading code. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-29ext4: fix check to prevent initializing reserved inodesTheodore Ts'o2-8/+5
Commit 8844618d8aa7: "ext4: only look at the bg_flags field if it is valid" will complain if block group zero does not have the EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct, since a freshly created file system has this flag cleared. It gets almost immediately after the file system is mounted read-write --- but the following somewhat unlikely sequence will end up triggering a false positive report of a corrupted file system: mkfs.ext4 /dev/vdc mount -o ro /dev/vdc /vdc mount -o remount,rw /dev/vdc Instead, when initializing the inode table for block group zero, test to make sure that itable_unused count is not too large, since that is the case that will result in some or all of the reserved inodes getting cleared. This fixes the failures reported by Eric Whiteney when running generic/230 and generic/231 in the the nojournal test case. Fixes: 8844618d8aa7 ("ext4: only look at the bg_flags field if it is valid") Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-07-27Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"Rafał Miłecki2-9/+0
This reverts commit 2a027b47dba6 ("MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"). Enabling ExternalSync caused a regression for BCM4718A1 (used e.g. in Netgear E3000 and ASUS RT-N16): it simply hangs during PCIe initialization. It's likely that BCM4717A1 is also affected. I didn't notice that earlier as the only BCM47XX devices with PCIe I own are: 1) BCM4706 with 2 x 14e4:4331 2) BCM4706 with 14e4:4360 and 14e4:4331 it appears that BCM4706 is unaffected. While BCM5300X-ES300-RDS.pdf seems to document that erratum and its workarounds (according to quotes provided by Tokunori) it seems not even Broadcom follows them. According to the provided info Broadcom should define CONF7_ES in their SDK's mipsinc.h and implement workaround in the si_mips_init(). Checking both didn't reveal such code. It *could* mean Broadcom also had some problems with the given workaround. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Michael Marley <michael@michaelmarley.com> Patchwork: https://patchwork.linux-mips.org/patch/20032/ URL: https://bugs.openwrt.org/index.php?do=details&task_id=1688 Cc: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org
2018-07-27block: reset bi_iter.bi_done after splitting bioGreg Edwards1-0/+1
After the bio has been updated to represent the remaining sectors, reset bi_done so bio_rewind_iter() does not rewind further than it should. This resolves a bio_integrity_process() failure on reads where the original request was split. Fixes: 63573e359d05 ("bio-integrity: Restore original iterator on verify stage") Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-27kvm, mm: account shadow page tables to kmemcgShakeel Butt1-1/+1
The size of kvm's shadow page tables corresponds to the size of the guest virtual machines on the system. Large VMs can spend a significant amount of memory as shadow page tables which can not be left as system memory overhead. So, account shadow page tables to the kmemcg. [shakeelb@google.com: replace (GFP_KERNEL|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT] Link: http://lkml.kernel.org/r/20180629140224.205849-1-shakeelb@google.com Link: http://lkml.kernel.org/r/20180627181349.149778-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Feiner <pfeiner@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27zswap: re-check zswap_is_full() after do zswap_shrink()Li Wang1-0/+9
/sys/../zswap/stored_pages keeps rising in a zswap test with "zswap.max_pool_percent=0" parameter. But it should not compress or store pages any more since there is no space in the compressed pool. Reproduce steps: 1. Boot kernel with "zswap.enabled=1" 2. Set the max_pool_percent to 0 # echo 0 > /sys/module/zswap/parameters/max_pool_percent 3. Do memory stress test to see if some pages have been compressed # stress --vm 1 --vm-bytes $mem_available"M" --timeout 60s 4. Watching the 'stored_pages' number increasing or not The root cause is: When zswap_max_pool_percent is set to 0 via kernel parameter, zswap_is_full() will always return true due to zswap_shrink(). But if the shinking is able to reclain a page successfully the code then proceeds to compressing/storing another page, so the value of stored_pages will keep changing. To solve the issue, this patch adds a zswap_is_full() check again after zswap_shrink() to make sure it's now under the max_pool_percent, and to not compress/store if we reached the limit. Link: http://lkml.kernel.org/r/20180530103936.17812-1-liwang@redhat.com Signed-off-by: Li Wang <liwang@redhat.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Huang Ying <huang.ying.caritas@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27include/linux/eventfd.h: include linux/errno.hArnd Bergmann1-0/+1
The new gasket staging driver ran into a randconfig build failure when CONFIG_EVENTFD is disabled: In file included from drivers/staging/gasket/gasket_interrupt.h:11, from drivers/staging/gasket/gasket_interrupt.c:4: include/linux/eventfd.h: In function 'eventfd_ctx_fdget': include/linux/eventfd.h:51:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration] I can't see anything wrong with including eventfd.h before err.h, so the easiest fix is to make it possible to do this by including the file where it is needed. Link: http://lkml.kernel.org/r/20180724110737.3985088-1-arnd@arndb.de Fixes: 9a69f5087ccc ("drivers/staging: Gasket driver framework + Apex driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Eric Biggers <ebiggers@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27mm: fix vma_is_anonymous() false-positivesKirill A. Shutemov5-0/+15
vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous VMA. This is unreliable as ->mmap may not set ->vm_ops. False-positive vma_is_anonymous() may lead to crashes: next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0 prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000 pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000 flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare) ------------[ cut here ]------------ kernel BUG at mm/memory.c:1422! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline] RIP: 0010:zap_pud_range mm/memory.c:1466 [inline] RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline] RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508 Call Trace: unmap_single_vma+0x1a0/0x310 mm/memory.c:1553 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644 unmap_mapping_range_vma mm/memory.c:2792 [inline] unmap_mapping_range_tree mm/memory.c:2813 [inline] unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845 unmap_mapping_range+0x48/0x60 mm/memory.c:2880 truncate_pagecache+0x54/0x90 mm/truncate.c:800 truncate_setsize+0x70/0xb0 mm/truncate.c:826 simple_setattr+0xe9/0x110 fs/libfs.c:409 notify_change+0xf13/0x10f0 fs/attr.c:335 do_truncate+0x1ac/0x2b0 fs/open.c:63 do_sys_ftruncate+0x492/0x560 fs/open.c:205 __do_sys_ftruncate fs/open.c:215 [inline] __se_sys_ftruncate fs/open.c:213 [inline] __x64_sys_ftruncate+0x59/0x80 fs/open.c:213 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reproducer: #include <stdio.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <unistd.h> #include <fcntl.h> #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) #define KCOV_ENABLE _IO('c', 100) #define KCOV_DISABLE _IO('c', 101) #define COVER_SIZE (1024<<10) #define KCOV_TRACE_PC 0 #define KCOV_TRACE_CMP 1 int main(int argc, char **argv) { int fd; unsigned long *cover; system("mount -t debugfs none /sys/kernel/debug"); fd = open("/sys/kernel/debug/kcov", O_RDWR); ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); munmap(cover, COVER_SIZE * sizeof(unsigned long)); cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long), PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); memset(cover, 0, COVER_SIZE * sizeof(unsigned long)); ftruncate(fd, 3UL << 20); return 0; } This can be fixed by assigning anonymous VMAs own vm_ops and not relying on it being NULL. If ->mmap() failed to set ->vm_ops, mmap_region() will set it to dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs. Link: http://lkml.kernel.org/r/20180724121139.62570-4-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: syzbot+3f84280d52be9b7083cc@syzkaller.appspotmail.com Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27mm: use vma_init() to initialize VMAs on stack and data segmentsKirill A. Shutemov10-7/+17
Make sure to initialize all VMAs properly, not only those which come from vm_area_cachep. Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27mm: introduce vma_init()Kirill A. Shutemov2-4/+8
Not all VMAs allocated with vm_area_alloc(). Some of them allocated on stack or in data segment. The new helper can be use to initialize VMA properly regardless where it was allocated. Link: http://lkml.kernel.org/r/20180724121139.62570-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27mm: fix exports that inadvertently make put_page() EXPORT_SYMBOL_GPLDan Williams1-2/+2
Commit e76384884344 ("mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS") added two EXPORT_SYMBOL_GPL() symbols, but these symbols are required by the inlined put_page(), thus accidentally making put_page() a GPL export only. This breaks OpenAFS (at least). Mark them EXPORT_SYMBOL() instead. Link: http://lkml.kernel.org/r/153128611970.2928.11310692420711601254.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: e76384884344 ("mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Joe Gorse <jhgorse@gmail.com> Reported-by: John Hubbard <jhubbard@nvidia.com> Tested-by: Joe Gorse <jhgorse@gmail.com> Tested-by: John Hubbard <jhubbard@nvidia.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Mark Vitale <mvitale@sinenomine.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27ipc/sem.c: prevent queue.status tearing in semopDavidlohr Bueso1-1/+1
In order for load/store tearing prevention to work, _all_ accesses to the variable in question need to be done around READ and WRITE_ONCE() macros. Ensure everyone does so for q->status variable for semtimedop(). Link: http://lkml.kernel.org/r/20180717052654.676-1-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27mm: disallow mappings that conflict for devm_memremap_pages()Dave Jiang1-1/+17
When pmem namespaces created are smaller than section size, this can cause an issue during removal and gpf was observed: general protection fault: 0000 1 SMP PTI CPU: 36 PID: 3941 Comm: ndctl Tainted: G W 4.14.28-1.el7uek.x86_64 #2 task: ffff88acda150000 task.stack: ffffc900233a4000 RIP: 0010:__put_page+0x56/0x79 Call Trace: devm_memremap_pages_release+0x155/0x23a release_nodes+0x21e/0x260 devres_release_all+0x3c/0x48 device_release_driver_internal+0x15c/0x207 device_release_driver+0x12/0x14 unbind_store+0xba/0xd8 drv_attr_store+0x27/0x31 sysfs_kf_write+0x3f/0x46 kernfs_fop_write+0x10f/0x18b __vfs_write+0x3a/0x16d vfs_write+0xb2/0x1a1 SyS_write+0x55/0xb9 do_syscall_64+0x79/0x1ae entry_SYSCALL_64_after_hwframe+0x3d/0x0 Add code to check whether we have a mapping already in the same section and prevent additional mappings from being created if that is the case. Link: http://lkml.kernel.org/r/152909478401.50143.312364396244072931.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Robert Elliott <elliott@hpe.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27kasan: only select SLUB_DEBUG with SYSFS=yArnd Bergmann1-1/+1
Building with KASAN and SLUB but without sysfs now results in a build-time error: WARNING: unmet direct dependencies detected for SLUB_DEBUG Depends on [n]: SLUB [=y] && SYSFS [=n] Selected by [y]: - KASAN [=y] && HAVE_ARCH_KASAN [=y] && (SLUB [=y] || SLAB [=n] && !DEBUG_SLAB [=n]) && SLUB [=y] mm/slub.c:4565:12: error: 'list_locations' defined but not used [-Werror=unused-function] static int list_locations(struct kmem_cache *s, char *buf, ^~~~~~~~~~~~~~ mm/slub.c:4406:13: error: 'validate_slab_cache' defined but not used [-Werror=unused-function] static long validate_slab_cache(struct kmem_cache *s) This disallows that broken configuration in Kconfig. Link: http://lkml.kernel.org/r/20180709154019.1693026-1-arnd@arndb.de Fixes: dd275caf4a0d ("kasan: depend on CONFIG_SLUB_DEBUG") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-27delayacct: fix crash in delayacct_blkio_end() after delayacct init failureTejun Heo1-1/+1
While forking, if delayacct init fails due to memory shortage, it continues expecting all delayacct users to check task->delays pointer against NULL before dereferencing it, which all of them used to do. Commit c96f5471ce7d ("delayacct: Account blkio completion on the correct task"), while updating delayacct_blkio_end() to take the target task instead of always using %current, made the function test NULL on %current->delays and then continue to operated on @p->delays. If %current succeeded init while @p didn't, it leads to the following crash. BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: __delayacct_blkio_end+0xc/0x40 PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9 RIP: 0010:__delayacct_blkio_end+0xc/0x40 Call Trace: try_to_wake_up+0x2c0/0x600 autoremove_wake_function+0xe/0x30 __wake_up_common+0x74/0x120 wake_up_page_bit+0x9c/0xe0 mpage_end_io+0x27/0x70 blk_update_request+0x78/0x2c0 scsi_end_request+0x2c/0x1e0 scsi_io_completion+0x20b/0x5f0 blk_mq_complete_request+0xa2/0x100 ata_scsi_qc_complete+0x79/0x400 ata_qc_complete_multiple+0x86/0xd0 ahci_handle_port_interrupt+0xc9/0x5c0 ahci_handle_port_intr+0x54/0xb0 ahci_single_level_irq_intr+0x3b/0x60 __handle_irq_event_percpu+0x43/0x190 handle_irq_event_percpu+0x20/0x50 handle_irq_event+0x2a/0x50 handle_edge_irq+0x80/0x1c0 handle_irq+0xaf/0x120 do_IRQ+0x41/0xc0 common_interrupt+0xf/0xf Fix it by updating delayacct_blkio_end() check @p->delays instead. Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com Fixes: c96f5471ce7d ("delayacct: Account blkio completion on the correct task") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <dsj@fb.com> Debugged-by: Dave Jones <dsj@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Josh Snyder <joshs@netflix.com> Cc: <stable@vger.kernel.org> [4.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-26block: bio_iov_iter_get_pages: pin more pages for multi-segment IOsMartin Wilck1-3/+32
bio_iov_iter_get_pages() currently only adds pages for the next non-zero segment from the iov_iter to the bio. That's suboptimal for callers, which typically try to pin as many pages as fit into the bio. This patch converts the current bio_iov_iter_get_pages() into a static helper, and introduces a new helper that allocates as many pages as 1) fit into the bio, 2) are present in the iov_iter, 3) and can be pinned by MM. Error is returned only if zero pages could be pinned. Because of 3), a zero return value doesn't necessarily mean all pages have been pinned. Callers that have to pin every page in the iov_iter must still call this function in a loop (this is currently the case). This change matters most for __blkdev_direct_IO_simple(), which calls bio_iov_iter_get_pages() only once. If it obtains less pages than requested, it returns a "short write" or "short read", and __generic_file_write_iter() falls back to buffered writes, which may lead to data corruption. Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-26blkdev: __blkdev_direct_IO_simple: fix leak in error caseMartin Wilck1-4/+5
Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-26block: bio_iov_iter_get_pages: fix size of last iovecMartin Wilck1-10/+8
If the last page of the bio is not "full", the length of the last vector slot needs to be corrected. This slot has the index (bio->bi_vcnt - 1), but only in bio->bi_io_vec. In the "bv" helper array, which is shifted by the value of bio->bi_vcnt at function invocation, the correct index is (nr_pages - 1). v2: improved readability following suggestions from Ming Lei. v3: followed a formatting suggestion from Christoph Hellwig. Fixes: 2cefe4dbaadf ("block: add bio_iov_iter_get_pages()") Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-26PCI/AER: Work around use-after-free in pcie_do_fatal_recovery()Thomas Tai1-0/+2
When an fatal error is received by a non-bridge device, the device is removed, and pci_stop_and_remove_bus_device() deallocates the device structure. The freed device structure is used by subsequent code to send uevents and print messages. Hold a reference on the device until we're finished using it. This is not an ideal fix because pcie_do_fatal_recovery() should not use the device at all after removing it, but that's too big a project for right now. Fixes: 7e9084b36740 ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> [bhelgaas: changelog, reduce get/put coverage] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-26kthread, tracing: Don't expose half-written comm when creating kthreadsSnild Dolkow1-1/+7
There is a window for racing when printing directly to task->comm, allowing other threads to see a non-terminated string. The vsnprintf function fills the buffer, counts the truncated chars, then finally writes the \0 at the end. creator other vsnprintf: fill (not terminated) count the rest trace_sched_waking(p): ... memcpy(comm, p->comm, TASK_COMM_LEN) write \0 The consequences depend on how 'other' uses the string. In our case, it was copied into the tracing system's saved cmdlines, a buffer of adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be): crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk' 0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12" ...and a strcpy out of there would cause stack corruption: [224761.522292] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffff9bf9783c78 crash-arm64> kbt | grep 'comm\|trace_print_context' #6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396) comm (char [16]) = "irq/497-pwr_even" crash-arm64> rd 0xffffffd4d0e17d14 8 ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_ ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16: ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`.. ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`.......... The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was likely needed because of this same bug. Solved by vsnprintf:ing to a local buffer, then using set_task_comm(). This way, there won't be a window where comm is not terminated. Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com Cc: stable@vger.kernel.org Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure") Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Snild Dolkow <snild@sony.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-26tracing: Quiet gcc warning about maybe unused link variableSteven Rostedt (VMware)1-2/+4
Commit 57ea2a34adf4 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") added an if statement that depends on another if statement that gcc doesn't see will initialize the "link" variable and gives the warning: "warning: 'link' may be used uninitialized in this function" It is really a false positive, but to quiet the warning, and also to make sure that it never actually is used uninitialized, initialize the "link" variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler thinks it could be used uninitialized. Cc: stable@vger.kernel.org Fixes: 57ea2a34adf4 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-26tracing: Fix possible double free in event_enable_trigger_func()Steven Rostedt (VMware)1-1/+5
There was a case that triggered a double free in event_trigger_callback() due to the called reg() function freeing the trigger_data and then it getting freed again by the error return by the caller. The solution there was to up the trigger_data ref count. Code inspection found that event_enable_trigger_func() has the same issue, but is not as easy to trigger (requires harder to trigger failures). It needs to be solved slightly different as it needs more to clean up when the reg() function fails. Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 7862ad1846e99 ("tracing: Add 'enable_event' and 'disable_event' event trigger commands") Reivewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-25drm/i915/glk: Add Quirk for GLK NUC HDMI port issues.Clint Taylor4-5/+33
On GLK NUC platforms the HDMI retiming buffer needs additional disabled time to correctly sync to a faster incoming signal. When measured on a scope the highspeed lines of the HDMI clock turn off for ~400uS during a normal resolution change. The HDMI retimer on the GLK NUC appears to require at least a full frame of quiet time before a new faster clock can be correctly sync'd. Wait 100ms due to msleep inaccuracies while waiting for a completed frame. Add a quirk to the driver for GLK boards that use ITE66317 HDMI retimers. V2: Add more devices to the quirk list V3: Delay increased to 100ms, check to confirm crtc type is HDMI. V4: crtc type check extended to include _DDI and whitespace fixes v5: Fix white spaces, remove the macro for delay. Revert the crtc type check introduced in v4. Cc: Imre Deak <imre.deak@intel.com> Cc: <stable@vger.kernel.org> # v4.14+ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105887 Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Tested-by: Daniel Scheller <d.scheller.oss@gmail.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180710200205.1478-1-radhakrishna.sripada@intel.com (cherry picked from commit 90c3e2198777aaa355b6994a31a79c636c8d4306) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-07-25tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failureArtem Savkov1-2/+11
If enable_trace_kprobe fails to enable the probe in enable_k(ret)probe it returns an error, but does not unset the tp flags it set previously. This results in a probe being considered enabled and failures like being unable to remove the probe through kprobe_events file since probes_open() expects every probe to be disabled. Link: http://lkml.kernel.org/r/20180725102826.8300-1-asavkov@redhat.com Link: http://lkml.kernel.org/r/20180725142038.4765-1-asavkov@redhat.com Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 41a7dd420c57 ("tracing/kprobes: Support ftrace_event_file base multibuffer") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>