summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915/cml: Separate U series pci id from origianl list.Lee Shawn C2019-12-123-7/+17
| | | | | | | | | | | | | | U series device need different DDI buffer setup for eDP and DP. If driver did not recognize ULT id proerply. The setting for H and S series would be used. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Cooper Chiou <cooper.chiou@intel.com> Signed-off-by: Lee Shawn C <shawn.c.lee@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210150415.10705-2-shawn.c.lee@intel.com
* drm/i915/cml: Remove unsupport PCI IDLee Shawn C2019-12-121-4/+0
| | | | | | | | | | | | | | | commit 'a7b4deeb02b9 ("drm/i915/cml: Add CML PCI IDS)' introduced new PCI ID that CML support. But some PCI IDs were removed in BSpec for CML. This patch is used to eliminate the unsed ID. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Cooper Chiou <cooper.chiou@intel.com> Signed-off-by: Lee Shawn C <shawn.c.lee@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210150415.10705-1-shawn.c.lee@intel.com
* drm/i915/gt: Mark up ips_mchdev pointer accessChris Wilson2019-12-121-1/+1
| | | | | | | | | | drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: error: incompatible types in comparison expression (different address spaces): drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: struct drm_i915_private [noderef] <asn:4> * drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: struct drm_i915_private * Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191212140459.1307617-7-chris@chris-wilson.co.uk
* drm/i915: Set fence_work.ops before dma_fence_initChris Wilson2019-12-121-2/+1
| | | | | | | | | | | | | Since dma_fence_init may call ops (because of a meaningless trace_dma_fence), we need to set the worker ops prior to that call. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Fixes: 8e458fe2ee05 ("drm/i915: Generalise the clflush dma-worker") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191212154224.1631531-1-chris@chris-wilson.co.uk
* drm/i915: Improve i915_inject_probe_error macroMichal Wajdeczko2019-12-121-1/+1
| | | | | | | | | | | | On non-debug builds we were not using i915 param and thus we may cause "unused variable" warning/error if caller was not using i915 elsewhere. Let compiler see this param. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191212121903.72524-1-michal.wajdeczko@intel.com
* drm/i915/gem: Asynchronous cmdparserChris Wilson2019-12-122-32/+99
| | | | | | | | | | | | | | | | | | | Execute the cmdparser asynchronously as part of the submission pipeline. Using our dma-fences, we can schedule execution after an asynchronous piece of work, so we move the cmdparser out from under the struct_mutex inside execbuf as run it as part of the submission pipeline. The same security rules apply, we copy the user batch before validation and userspace cannot touch the validation shadow. The only caveat is that we will do request construction before we complete cmdparsing and so we cannot know the outcome of the validation step until later -- so the execbuf ioctl does not report -EINVAL directly, but we must cancel execution of the request and flag the error on the out-fence. Closes: https://gitlab.freedesktop.org/drm/intel/issues/611 Closes: https://gitlab.freedesktop.org/drm/intel/issues/412 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-2-chris@chris-wilson.co.uk
* drm/i915/gem: Prepare gen7 cmdparser for async executionChris Wilson2019-12-124-59/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gen7 cmdparser is primarily a promotion-based system to allow access to additional registers beyond the HW validation, and allows fallback to normal execution of the user batch buffer if valid and requires chaining. In the next patch, we will do the cmdparser validation in the pipeline asynchronously and so at the point of request construction we will not know if we want to execute the privileged and validated batch, or the original user batch. The solution employed here is to execute both batches, one with raised privileges and one as normal. This is because the gen7 MI_BATCH_BUFFER_START command cannot change privilege level within a batch and must strictly use the current privilege level (or undefined behaviour kills the GPU). So in order to execute the original batch, we need a second non-priviledged batch buffer chain from the ring, i.e. we need to emit two batches for each user batch. Inside the two batches we determine which one should actually execute, we provide a conditional trampoline to call the original batch. Implementation-wise, we create a single buffer and write the shadow and the trampoline inside it at different offsets; and bind the buffer into both the kernel GGTT for the privileged execution of the shadow and into the user ppGTT for the non-privileged execution of the trampoline and original batch. One buffer, two batches and two vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-1-chris@chris-wilson.co.uk
* drm/i915/gt: Only ignore rc6 parking for PCU on byt/bswChris Wilson2019-12-122-1/+3
| | | | | | | | | | | An oversight in that we use rc6->ctl_enable to disable rc6 on gen9 and so it does not simply indicate indirect control via a PCU. Switch the rc6->ctl_enable check for a platform-based check. Fixes: 972745fd5770 ("drm/i915/gt: Disable manual rc6 for Braswell/Baytrail") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191212072737.884335-2-chris@chris-wilson.co.uk
* drm/i915: Align start for memcpy_from_wcChris Wilson2019-12-113-9/+77
| | | | | | | | | | | | The movntqda requires 16-byte alignment for the source pointer. Avoid falling back to clflush if the source pointer is misaligned by doing the doing a small uncached memcpy to fixup the alignments. v2: Turn the unaligned copy into a genuine helper Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-5-chris@chris-wilson.co.uk
* drm/i915/gem: Tidy up error handling for eb_parse()Chris Wilson2019-12-111-21/+18
| | | | | | | | | As the caller no longer uses the i915_vma result, stop returning it and just return the error code instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-4-chris@chris-wilson.co.uk
* drm/i915: Simplify error escape from cmdparserChris Wilson2019-12-111-8/+4
| | | | | | | | | | We need to flush the destination buffer, even on error, to maintain consistent cache state. Thereby removing the jump on error past the clear, and reducing the loop-escape mechanism to a mere break. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-3-chris@chris-wilson.co.uk
* drm/i915: Remove redundant parameters from intel_engine_cmd_parserChris Wilson2019-12-114-88/+82
| | | | | | | | | Declutter the calling interface by reducing the parameters to the i915_vma and associated offsets. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-2-chris@chris-wilson.co.uk
* drm/i915: Fix cmdparser drm.debugChris Wilson2019-12-111-28/+27
| | | | | | | | | The cmdparser rejection debug is not for driver development, but for the user, for which we use a plain DRM_DEBUG(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-1-chris@chris-wilson.co.uk
* drm/i915/gt: Disable manual rc6 for Braswell/BaytrailChris Wilson2019-12-111-0/+3
| | | | | | | | | | | | | | | | | The initial investigated showed that while the PCU on Braswell/Baytrail controlled RC6 itself. setting the software RC6 request made no difference. Further testing reveals though that it causes a delay in the PCU on enabling RC6. Closes: https://gitlab.freedesktop.org/drm/intel/issues/763 Fixes: 730eaeb52426 ("drm/i915/gt: Manual rc6 entry upon parking") Testcase: igt/perf/rc6-disable Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210180111.3958558-1-chris@chris-wilson.co.uk
* drm/i915/uc: Drop explicit ggtt param in some uc_fw functionsMichal Wajdeczko2019-12-111-4/+5
| | | | | | | | | | | There is no need to pass explicit ggtt since we already have a trick to get parent gt from uc_fw, we only need to use it. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191211124549.59516-4-michal.wajdeczko@intel.com
* drm/i915/uc: Drop explicit gt param in some uc_fw functionsMichal Wajdeczko2019-12-114-20/+16
| | | | | | | | | | | There is no need to pass explicit gt since we already have a trick to get parent gt from uc_fw, we only need to use it. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191211124549.59516-3-michal.wajdeczko@intel.com
* drm/i915/uc: Drop explicit i915 param in some uc_fw functionsMichal Wajdeczko2019-12-113-12/+10
| | | | | | | | | | | | There is no need to pass explicit i915 since we already have a debug trick to get parent gt from uc_fw, we only need to make this trick available on non-debug builds. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191211124549.59516-2-michal.wajdeczko@intel.com
* drm/i915: Use the i915_device name for identifying our request fencesChris Wilson2019-12-111-2/+2
| | | | | | | | | | Use the dev_name(i915) to identify the requests for debugging, so we can tell different device timelines apart. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211150204.133471-1-chris@chris-wilson.co.uk
* drm/i915: remove redundant checks for a null fb pointerColin Ian King2019-12-111-2/+2
| | | | | | | | | | | A prior check and return when pointer fb is null makes subsequent null checks on fb redundant. Remove the redundant null checks. Addresses-Coverity: ("Logically dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210142349.333171-1-colin.king@canonical.com
* drm/i915/display: remove duplicated assignment to pointer crtc_stateColin Ian King2019-12-111-1/+1
| | | | | | | | | | Pointer crtc_state is being assigned twice, one of these is redundant and can be removed. Addresses-Coverity: ("Evaluation order violation") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210144535.341977-1-colin.king@canonical.com
* drm/i915: Pass cpu transcoder to assert_pipe()Ville Syrjälä2019-12-112-27/+18
| | | | | | | | | | | | | | | | | In order to eliminate intel_pipe_to_cpu_transcoder() (and its crtc->config usage) let's pass the cpu transcoder to assert_pipe() so we don't have to do the pipe->cpu transcoder lookup on HSW+. On VLV/CHV this can get called during eDP init, which happens before crtc->config->cpu_transcoder is even populated. So currently we're always reading PIPECONF(A) there even if we're trying to check the state of some other pipe. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112163812.22075-4-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
* drm/i915: ELiminate intel_pipe_to_cpu_transcoder() from assert_fdi_tx()Ville Syrjälä2019-12-111-3/+7
| | | | | | | | | | | | | | | | Let's start to eliminate intel_pipe_to_cpu_transcoder() so that we can get rid of one more crtc->config usage (which we will want to nuke as well). In the case of assert_fdi_tx() we know that we're never dealing with the EDP transcoder so we can simply replace this with a cast. v2: Fix poor English in comment Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112163812.22075-3-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
* drm/i915/selftests: Show the i915_active on failureChris Wilson2019-12-111-0/+8
| | | | | | | | | | Print the i915_active state on selftest failure, with a hope it helps illuminate the cause of the failure. References: https://gitlab.freedesktop.org/drm/intel/issues/765 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210115502.3767070-1-chris@chris-wilson.co.uk
* drm/i915/gem: Wait on unbind barriers when invalidating userptrChris Wilson2019-12-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we are told we have to drop all references to userptr, wait for any barriers required for unbinding. <4> [2055.808787] WARNING: CPU: 3 PID: 6239 at mm/mmu_notifier.c:472 __mmu_notifier_invalidate_range_start+0x1f2/0x250 <4> [2055.808792] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel r8169 lpc_ich realtek i915 snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core pinctrl_broxton snd_pcm pinctrl_intel mei_me intel_lpss_pci mei prime_numbers [last unloaded: vgem] <4> [2055.808834] CPU: 3 PID: 6239 Comm: gem_userptr_bli Tainted: G U 5.5.0-rc1-CI-CI_DRM_7522+ #1 <4> [2055.808839] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018 <4> [2055.808847] RIP: 0010:__mmu_notifier_invalidate_range_start+0x1f2/0x250 <4> [2055.808853] Code: c2 48 c7 c7 70 17 2e 82 44 89 45 d4 48 8b 70 28 e8 ec 01 ef ff 41 f6 46 20 01 44 8b 45 d4 75 0a 41 83 f8 f5 44 89 7d d4 74 89 <0f> 0b 44 89 45 d4 eb 81 0f 0b 49 8b 46 18 49 8b 76 10 4c 89 ff 48 <4> [2055.808858] RSP: 0018:ffffc90002937d40 EFLAGS: 00010202 <4> [2055.808865] RAX: 0000000000000061 RBX: ffff8882703a33e0 RCX: 0000000000000001 <4> [2055.808870] RDX: 0000000000000000 RSI: ffff888277da8cb8 RDI: 00000000ffffffff <4> [2055.808874] RBP: ffffc90002937d70 R08: 00000000fffffff5 R09: 0000000000000000 <4> [2055.808879] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 <4> [2055.808884] R13: ffffffff822e1716 R14: ffffc90002937d80 R15: 00000000fffffff5 <4> [2055.808890] FS: 00007fda75004e40(0000) GS:ffff888277d80000(0000) knlGS:0000000000000000 <4> [2055.808895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [2055.808900] CR2: 000055ad72ec3000 CR3: 00000002697b2000 CR4: 00000000003406e0 <4> [2055.808904] Call Trace: <4> [2055.808920] unmap_vmas+0x13e/0x150 <4> [2055.808937] unmap_region+0xa3/0x100 <4> [2055.808964] __do_munmap+0x26d/0x490 <4> [2055.808980] __vm_munmap+0x66/0xc0 <4> [2055.808994] __x64_sys_munmap+0x12/0x20 <4> [2055.809001] do_syscall_64+0x4f/0x220 Closes: https://gitlab.freedesktop.org/drm/intel/issues/771 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210133719.3874455-1-chris@chris-wilson.co.uk
* drm/i915/gt: Check we are the Ironlake IPS provider before deregisteringChris Wilson2019-12-111-1/+3
| | | | | | | | | | | Check that we own the global pointer before deregistering. Reported-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210153620.3929372-1-chris@chris-wilson.co.uk
* drm/i915: Improve execbuf debugTvrtko Ursulin2019-12-111-10/+12
| | | | | | | | | | Convert i915_gem_check_execbuffer to return the error code instead of a boolean so our neat EINVAL debugging trick works within this function. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191209122314.16289-1-tvrtko.ursulin@linux.intel.com
* drm/i915: Copy across scheduler behaviour flags across submit fencesChris Wilson2019-12-112-26/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want the bonded request to have the same scheduler properties as its master so that it is placed at the same depth in the queue. For example, consider we have requests A, B and B', where B & B' are a bonded pair to run in parallel on two engines. A -> B \- B' B will run after A and so may be scheduled on an idle engine and wait on A using a semaphore. B' sees B being executed and so enters the queue on the same engine as A. As B' did not inherit the semaphore-chain from B, it may have higher precedence than A and so preempts execution. However, B' then sits on a semaphore waiting for B, who is waiting for A, who is blocked by B. Ergo B' needs to inherit the scheduler properties from B (i.e. the semaphore chain) so that it is scheduled with the same priority as B and will not be executed ahead of Bs dependencies. Furthermore, to prevent the priorities changing via the expose fence on B', we need to couple in the dependencies for PI. This requires us to relax our sanity-checks that dependencies are strictly in order. v2: Synchronise (B, B') execution on all platforms, regardless of using a scheduler, any no-op syncs should be elided. Fixes: ee1136908e9b ("drm/i915/execlists: Virtual engine bonding") Closes: https://gitlab.freedesktop.org/drm/intel/issues/464 Testcase: igt/gem_exec_balancer/bonded-chain Testcase: igt/gem_exec_balancer/bonded-semaphore Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210151332.3902215-1-chris@chris-wilson.co.uk
* drm/i915/dsb: Fix in mmio offset calculation of DSB instanceAnimesh Manna2019-12-111-1/+1
| | | | | | | | | | | | As the current usage is restricted to first DSB instance per pipe, so existing code could not catch the issue to calculate the mmio offset of different DSB instance per pipe. Corrected the offset calculation. Fixes: a6e58d9a2e04 ("drm/i915/dsb: Check DSB engine status.") Signed-off-by: Animesh Manna <animesh.manna@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191205123513.22603-1-animesh.manna@intel.com
* Merge drm/drm-next into drm-intel-next-queuedJani Nikula2019-12-119927-208221/+485252
|\ | | | | | | | | | | | | | | | | Sync up with v5.5-rc1 to get the updated lock_release() API among other things. Fix the conflict reported by Stephen Rothwell [1]. [1] http://lore.kernel.org/r/20191210093957.5120f717@canb.auug.org.au Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| * Linux 5.5-rc1v5.5-rc1Linus Torvalds2019-12-081-2/+2
| |
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2019-12-08119-628/+1025
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) More jumbo frame fixes in r8169, from Heiner Kallweit. 2) Fix bpf build in minimal configuration, from Alexei Starovoitov. 3) Use after free in slcan driver, from Jouni Hogander. 4) Flower classifier port ranges don't work properly in the HW offload case, from Yoshiki Komachi. 5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin. 6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk. 7) Fix flow dissection in dsa TX path, from Alexander Lobakin. 8) Stale syncookie timestampe fixes from Guillaume Nault. [ Did an evil merge to silence a warning introduced by this pull - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) r8169: fix rtl_hw_jumbo_disable for RTL8168evl net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add() r8169: add missing RX enabling for WoL on RTL8125 vhost/vsock: accept only packets with the right dst_cid net: phy: dp83867: fix hfs boot in rgmii mode net: ethernet: ti: cpsw: fix extra rx interrupt inet: protect against too small mtu values. gre: refetch erspan header from skb->data after pskb_may_pull() pppoe: remove redundant BUG_ON() check in pppoe_pernet tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() tcp: tighten acceptance of ACKs not matching a child socket tcp: fix rejected syncookies due to stale timestamps lpc_eth: kernel BUG on remove tcp: md5: fix potential overestimation of TCP option space net: sched: allow indirect blocks to bind to clsact in TC net: core: rename indirect block ingress cb function net-sysfs: Call dev_hold always in netdev_queue_add_kobject net: dsa: fix flow dissection on Tx path net/tls: Fix return values to avoid ENOTSUPP net: avoid an indirect call in ____sys_recvmsg() ...
| | * r8169: fix rtl_hw_jumbo_disable for RTL8168evlHeiner Kallweit2019-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In referenced fix we removed the RTL8168e-specific jumbo config for RTL8168evl in rtl_hw_jumbo_enable(). We have to do the same in rtl_hw_jumbo_disable(). v2: fix referenced commit id Fixes: 14012c9f3bb9 ("r8169: fix jumbo configuration for RTL8168evl") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()Eric Dumazet2019-12-071-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new tcf_proto_check_kind() helper to make sure user provided value is well formed. BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:606 [inline] BUG: KMSAN: uninit-value in string+0x4be/0x600 lib/vsprintf.c:668 CPU: 0 PID: 12358 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108 __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245 string_nocheck lib/vsprintf.c:606 [inline] string+0x4be/0x600 lib/vsprintf.c:668 vsnprintf+0x218f/0x3210 lib/vsprintf.c:2510 __request_module+0x2b1/0x11c0 kernel/kmod.c:143 tcf_proto_lookup_ops+0x171/0x700 net/sched/cls_api.c:139 tc_chain_tmplt_add net/sched/cls_api.c:2730 [inline] tc_ctl_chain+0x1904/0x38a0 net/sched/cls_api.c:2850 rtnetlink_rcv_msg+0x115a/0x1580 net/core/rtnetlink.c:5224 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5242 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x110f/0x1330 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311 __sys_sendmsg net/socket.c:2356 [inline] __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2363 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45a649 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f0790795c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045a649 RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000006 RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f07907966d4 R13: 00000000004c8db5 R14: 00000000004df630 R15: 00000000ffffffff Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline] kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132 kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86 slab_alloc_node mm/slub.c:2773 [inline] __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x306/0xa10 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline] netlink_sendmsg+0x783/0x1330 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311 __sys_sendmsg net/socket.c:2356 [inline] __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2363 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 6f96c3c6904c ("net_sched: fix backward compatibility for TCA_KIND") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * r8169: add missing RX enabling for WoL on RTL8125Heiner Kallweit2019-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RTL8125 also requires to enable RX for WoL. v2: add missing Fixes tag Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * vhost/vsock: accept only packets with the right dst_cidStefano Garzarella2019-12-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we receive a new packet from the guest, we check if the src_cid is correct, but we forgot to check the dst_cid. The host should accept only packets where dst_cid is equal to the host CID. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: phy: dp83867: fix hfs boot in rgmii modeGrygorii Strashko2019-12-071-48/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit ef87f7da6b28 ("net: phy: dp83867: move dt parsing to probe") causes regression on TI dra71x-evm and dra72x-evm, where DP83867 PHY is used in "rgmii-id" mode - the networking stops working. Unfortunately, it's not enough to just move DT parsing code to .probe() as it depends on phydev->interface value, which is set to correct value abter the .probe() is completed and before calling .config_init(). So, RGMII configuration can't be loaded from DT. To fix and issue - move RGMII validation code to .config_init() - parse RGMII parameters in dp83867_of_init(), but consider them as optional. Fixes: ef87f7da6b28 ("net: phy: dp83867: move dt parsing to probe") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: ethernet: ti: cpsw: fix extra rx interruptGrygorii Strashko2019-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now RX interrupt is triggered twice every time, because in cpsw_rx_interrupt() it is asked first and then disabled. So there will be pending interrupt always, when RX interrupt is enabled again in NAPI handler. Fix it by first disabling IRQ and then do ask. Fixes: 870915feabdc ("drivers: net: cpsw: remove disable_irq/enable_irq as irq can be masked from cpsw itself") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * inet: protect against too small mtu values.Eric Dumazet2019-12-075-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot was once again able to crash a host by setting a very small mtu on loopback device. Let's make inetdev_valid_mtu() available in include/net/ip.h, and use it in ip_setup_cork(), so that we protect both ip_append_page() and __ip_append_data() Also add a READ_ONCE() when the device mtu is read. Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(), even if other code paths might write over this field. Add a big comment in include/linux/netdevice.h about dev->mtu needing READ_ONCE()/WRITE_ONCE() annotations. Hopefully we will add the missing ones in followup patches. [1] refcount_t: saturated; leaking memory. WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x3e kernel/panic.c:582 report_bug+0x289/0x300 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:174 [inline] fixup_bug arch/x86/kernel/traps.c:169 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22 Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89 RSP: 0018:ffff88809689f550 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1 R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001 R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40 refcount_add include/linux/refcount.h:193 [inline] skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999 sock_wmalloc+0xf1/0x120 net/core/sock.c:2096 ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383 udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276 inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821 kernel_sendpage+0x92/0xf0 net/socket.c:3794 sock_sendpage+0x8b/0xc0 net/socket.c:936 pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458 splice_from_pipe_feed fs/splice.c:512 [inline] __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636 splice_from_pipe+0x108/0x170 fs/splice.c:671 generic_splice_sendpage+0x3c/0x50 fs/splice.c:842 do_splice_from fs/splice.c:861 [inline] direct_splice_actor+0x123/0x190 fs/splice.c:1035 splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990 do_splice_direct+0x1da/0x2a0 fs/splice.c:1078 do_sendfile+0x597/0xd00 fs/read_write.c:1464 __do_sys_sendfile64 fs/read_write.c:1525 [inline] __se_sys_sendfile64 fs/read_write.c:1511 [inline] __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x441409 Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409 RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005 RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010 R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180 R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000 Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 1470ddf7f8ce ("inet: Remove explicit write references to sk/inet in ip_append_data") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * gre: refetch erspan header from skb->data after pskb_may_pull()Cong Wang2019-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After pskb_may_pull() we should always refetch the header pointers from the skb->data in case it got reallocated. In gre_parse_header(), the erspan header is still fetched from the 'options' pointer which is fetched before pskb_may_pull(). Found this during code review of a KMSAN bug report. Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup") Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: William Tu <u9012063@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * pppoe: remove redundant BUG_ON() check in pppoe_pernetAditya Pakki2019-12-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Passing NULL to pppoe_pernet causes a crash via BUG_ON. Dereferencing net in net_generici() also has the same effect. This patch removes the redundant BUG_ON check on the same parameter. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Merge branch 'tcp-fix-handling-of-stale-syncookies-timestamps'David S. Miller2019-12-072-8/+32
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guillaume Nault says: ==================== tcp: fix handling of stale syncookies timestamps The synflood timestamps (->ts_recent_stamp and ->synq_overflow_ts) are only refreshed when the syncookie protection triggers. Therefore, their value can become very far apart from jiffies if no synflood happens for a long time. If jiffies grows too much and wraps while the synflood timestamp isn't refreshed, then time_after32() might consider the later to be in the future. This can trick tcp_synq_no_recent_overflow() into returning erroneous values and rejecting valid ACKs. Patch 1 handles the case of ACKs using legitimate syncookies. Patch 2 handles the case of stray ACKs. Patch 3 annotates lockless timestamp operations with READ_ONCE() and WRITE_ONCE(). Changes from v3: - Fix description of time_between32() (found by Eric Dumazet). - Use more accurate Fixes tag in patch 3 (suggested by Eric Dumazet). Changes from v2: - Define and use time_between32() instead of a pair of time_before32/time_after32 (suggested by Eric Dumazet). - Use 'last_overflow - HZ' as lower bound in tcp_synq_no_recent_overflow(), to accommodate for concurrent timestamp updates (found by Eric Dumazet). - Add a third patch to annotate lockless accesses to .ts_recent_stamp. Changes from v1: - Initialising timestamps at socket creation time is not enough because jiffies wraps in 24 days with HZ=1000 (Eric Dumazet). Handle stale timestamps in tcp_synq_overflow() and tcp_synq_no_recent_overflow() instead. - Rework commit description. - Add a second patch to handle the case of stray ACKs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()Guillaume Nault2019-12-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the timestamp of the last synflood. Protect them with READ_ONCE() and WRITE_ONCE() since reads and writes aren't serialised. Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was introduced by a0f82f64e269 ("syncookies: remove last_synq_overflow from struct tcp_sock"). But unprotected accesses were already there when timestamp was stored in .last_synq_overflow. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * tcp: tighten acceptance of ACKs not matching a child socketGuillaume Nault2019-12-071-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When no synflood occurs, the synflood timestamp isn't updated. Therefore it can be so old that time_after32() can consider it to be in the future. That's a problem for tcp_synq_no_recent_overflow() as it may report that a recent overflow occurred while, in fact, it's just that jiffies has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31. Spurious detection of recent overflows lead to extra syncookie verification in cookie_v[46]_check(). At that point, the verification should fail and the packet dropped. But we should have dropped the packet earlier as we didn't even send a syncookie. Let's refine tcp_synq_no_recent_overflow() to report a recent overflow only if jiffies is within the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This way, no spurious recent overflow is reported when jiffies wraps and 'last_overflow' becomes in the future from the point of view of time_after32(). However, if jiffies wraps and enters the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with 'last_overflow' being a stale synflood timestamp), then tcp_synq_no_recent_overflow() still erroneously reports an overflow. In such cases, we have to rely on syncookie verification to drop the packet. We unfortunately have no way to differentiate between a fresh and a stale syncookie timestamp. In practice, using last_overflow as lower bound is problematic. If the synflood timestamp is concurrently updated between the time we read jiffies and the moment we store the timestamp in 'last_overflow', then 'now' becomes smaller than 'last_overflow' and tcp_synq_no_recent_overflow() returns true, potentially dropping a valid syncookie. Reading jiffies after loading the timestamp could fix the problem, but that'd require a memory barrier. Let's just accommodate for potential timestamp growth instead and extend the interval using 'last_overflow - HZ' as lower bound. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * tcp: fix rejected syncookies due to stale timestampsGuillaume Nault2019-12-072-2/+16
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If no synflood happens for a long enough period of time, then the synflood timestamp isn't refreshed and jiffies can advance so much that time_after32() can't accurately compare them any more. Therefore, we can end up in a situation where time_after32(now, last_overflow + HZ) returns false, just because these two values are too far apart. In that case, the synflood timestamp isn't updated as it should be, which can trick tcp_synq_no_recent_overflow() into rejecting valid syncookies. For example, let's consider the following scenario on a system with HZ=1000: * The synflood timestamp is 0, either because that's the timestamp of the last synflood or, more commonly, because we're working with a freshly created socket. * We receive a new SYN, which triggers synflood protection. Let's say that this happens when jiffies == 2147484649 (that is, 'synflood timestamp' + HZ + 2^31 + 1). * Then tcp_synq_overflow() doesn't update the synflood timestamp, because time_after32(2147484649, 1000) returns false. With: - 2147484649: the value of jiffies, aka. 'now'. - 1000: the value of 'last_overflow' + HZ. * A bit later, we receive the ACK completing the 3WHS. But cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow() says that we're not under synflood. That's because time_after32(2147484649, 120000) returns false. With: - 2147484649: the value of jiffies, aka. 'now'. - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID. Of course, in reality jiffies would have increased a bit, but this condition will last for the next 119 seconds, which is far enough to accommodate for jiffie's growth. Fix this by updating the overflow timestamp whenever jiffies isn't within the [last_overflow, last_overflow + HZ] range. That shouldn't have any performance impact since the update still happens at most once per second. Now we're guaranteed to have fresh timestamps while under synflood, so tcp_synq_no_recent_overflow() can safely use it with time_after32() in such situations. Stale timestamps can still make tcp_synq_no_recent_overflow() return the wrong verdict when not under synflood. This will be handled in the next patch. For 64 bits architectures, the problem was introduced with the conversion of ->tw_ts_recent_stamp to 32 bits integer by commit cca9bab1b72c ("tcp: use monotonic timestamps for PAWS"). The problem has always been there on 32 bits architectures. Fixes: cca9bab1b72c ("tcp: use monotonic timestamps for PAWS") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Merge tag 'mlx5-fixes-2019-12-05' of ↵David S. Miller2019-12-0710-75/+143
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2019-12-05 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v4.19: ('net/mlx5e: Query global pause state before setting prio2buffer') For -stable v5.3 ('net/mlx5e: Fix SFF 8472 eeprom length') ('net/mlx5e: Fix translation of link mode into speed') ('net/mlx5e: Fix freeing flow with kfree() and not kvfree()') ('net/mlx5e: ethtool, Fix analysis of speed setting') ('net/mlx5e: Fix TXQ indices to be sequential') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * net/mlx5e: E-switch, Fix Ingress ACL groups in switchdev mode for prio tagParav Pandit2019-12-052-38/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In cited commit, when prio tag mode is enabled, FTE creation fails due to missing group with valid match criteria. Hence, (a) create prio tag group metadata_prio_tag_grp when prio tag is enabled with match criteria for vlan push FTE. (b) Rename metadata_grp to metadata_allmatch_grp to reflect its purpose. Also when priority tag is enabled, delete metadata settings after deleting ingress rules, which are using it. Tide up rest of the ingress config code for unnecessary labels. Fixes: 10652f39943e ("net/mlx5: Refactor ingress acl configuration") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | | * net/mlx5e: ethtool, Fix analysis of speed settingAya Levin2019-12-051-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When setting speed to 100G via ethtool (AN is set to off), only 25G*4 is configured while the user, who has an advanced HW which supports extended PTYS, expects also 50G*2 to be configured. With this patch, when extended PTYS mode is available, configure PTYS via extended fields. Fixes: 4b95840a6ced ("net/mlx5e: Fix matching of speed to PRM link modes") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | | * net/mlx5e: Fix translation of link mode into speedAya Levin2019-12-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a missing value in translation of PTYS ext_eth_proto_oper to its corresponding speed. When ext_eth_proto_oper bit 10 is set, ethtool shows unknown speed. With this fix, ethtool shows speed is 100G as expected. Fixes: a08b4ed1373d ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | | * net/mlx5e: Fix free peer_flow when refcount is 0Roi Dayan2019-12-051-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It could be neigh update flow took a refcount on peer flow so sometimes we cannot release peer flow even if parent flow is being freed now. Fixes: 5a7e5bcb663d ("net/mlx5e: Extend tc flow struct with reference counter") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | | * net/mlx5e: Fix freeing flow with kfree() and not kvfree()Roi Dayan2019-12-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flows are allocated with kzalloc() so free with kfree(). Fixes: 04de7dda7394 ("net/mlx5e: Infrastructure for duplicated offloading of TC flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>