summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'imx-drm-next-2017-07-18' of ↵Dave Airlie2017-08-221-11/+42
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.pengutronix.de/git/pza/linux into drm-next imx-drm: lock scanout transfers for consecutive bursts - Lock the IDMAC scanout channel for multiple back-to-back bursts if possible, to improve memory bandwidth utilisation. - Replace a few occurences of state->fb with the already existing local fb variable in ipu_plane_atomic_update * tag 'imx-drm-next-2017-07-18' of git://git.pengutronix.de/git/pza/linux: drm/imx: lock scanout transfers for consecutive bursts drm/imx: ipuv3-plane: use fb local variable instead of state->fb
| * drm/imx: lock scanout transfers for consecutive burstsPhilipp Zabel2017-07-171-3/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of its shallow queues and limited reordering ability, the i.MX6Q memory controller likes AXI bursts of consecutive addresses a lot. To optimize memory access performance, lock the IPU scanout channels for a number of burst accesses each, before switching to the next channel. The burst size and length of a locked burst chain are chosen not to overshoot the stride. Enabling the 8-burst channel lock on a single 1920x1080@60Hz RGBx scanout (474 MiB/s of 64-byte IPU memory read accesses) reduces the reported memory controller busy cycles from 46% to below 28% on an otherwise idle i.MX6Q. Tested-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
| * drm/imx: ipuv3-plane: use fb local variable instead of state->fbPhilipp Zabel2017-07-171-8/+7
| | | | | | | | | | | | We already have a local variable assigned to state->fb, use it. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
* | Merge tag 'drm-intel-next-2017-08-18' of ↵Dave Airlie2017-08-22121-32713/+4722
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/git/drm-intel into drm-next Final pile of features for 4.14 - New ioctl to change NOA configurations, plus prep (Lionel) - CCS (color compression) scanout support, based on the fancy new modifier additions (Ville&Ben) - Document i915 register macro style (Jani) - Many more gen10/cnl patches (Rodrigo, Pualo, ...) - More gpu reset vs. modeset duct-tape to restore the old way. - prep work for cnl: hpd_pin reorg (Rodrigo), support for more power wells (Imre), i2c pin reorg (Anusha) - drm_syncobj support (Jason Ekstrand) - forcewake vs gpu reset fix (Chris) - execbuf speedup for the no-relocs fastpath, anv/vk low-overhead ftw (Chris) - switch to idr/radixtree instead of the resizing ht for execbuf id->vma lookups (Chris) gvt: - MMIO save/restore optimization (Changbin) - Split workload scan vs. dispatch for more parallel exec (Ping) - vGPU full 48bit ppgtt support (Joonas, Tina) - vGPU hw id expose for perf (Zhenyu) Bunch of work all over to make the igt CI runs more complete/stable. Watch https://intel-gfx-ci.01.org/tree/drm-tip/shards-all.html for progress in getting this ready. Next week we're going into production mode (i.e. will send results to intel-gfx) on hsw, more platforms to come. Also, a new maintainer tram, I'm stepping out. Huge thanks to Jani for being an awesome co-maintainer the past few years, and all the best for Jani, Joonas&Rodrigo as the new maintainers! * tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel: (179 commits) drm/i915: Update DRIVER_DATE to 20170818 drm/i915/bxt: use NULL for GPIO connection ID drm/i915: Mark the GT as busy before idling the previous request drm/i915: Trivial grammar fix s/opt of/opt out of/ in comment drm/i915: Replace execbuf vma ht with an idr drm/i915: Simplify eb_lookup_vmas() drm/i915: Convert execbuf to use struct-of-array packing for critical fields drm/i915: Check context status before looking up our obj/vma drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs drm/i915: Stop touching forcewake following a gen6+ engine reset MAINTAINERS: drm/i915 has a new maintainer team drm/i915: Split pin mapping into per platform functions drm/i915/opregion: let user specify override VBT via firmware load drm/i915/cnl: Reuse skl_wm_get_hw_state on Cannonlake. drm/i915/gen10: implement gen 10 watermarks calculations drm/i915/cnl: Fix LSPCON support. drm/i915/vbt: ignore extraneous child devices for a port drm/i915/cnl: Setup PAT Index. drm/i915/edp: Allow alternate fixed mode for eDP if available. drm/i915: Add support for drm syncobjs ...
| * | drm/i915: Update DRIVER_DATE to 20170818Daniel Vetter2017-08-181-2/+2
| | | | | | | | | | | | Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
| * | drm/i915/bxt: use NULL for GPIO connection IDAndy Shevchenko2017-08-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 213e08ad60ba ("drm/i915/bxt: add bxt dsi gpio element support") enables GPIO support for Broxton based platforms. While using that API we might get into troubles in the future, because we can't rely on label name in the driver since vendor firmware might provide any GPIO pin there, e.g. "reset", and even mark it in _DSD (in which case the request will fail). To avoid inconsistency and potential issues we have two options: a) generate GPIO ACPI mapping table and supply it via acpi_dev_add_driver_gpios(), or b) just pass NULL as connection ID. The b) approach is much simpler and would work since the driver relies on GPIO indices only. Moreover, the _CRS fallback mechanism, when requesting GPIO, has been made stricter, and supplying non-NULL connection ID when neither _DSD, nor GPIO ACPI mapping is present, is making request fail. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101921 Fixes: f10e4bf6632b ("gpio: acpi: Even more tighten up ACPI GPIO lookups") Cc: Mika Kahola <mika.kahola@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Tested-by: Mika Kahola <mika.kahola@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170817105541.63914-1-andriy.shevchenko@linux.intel.com
| * | drm/i915: Mark the GT as busy before idling the previous requestChris Wilson2017-08-181-43/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a synchronous setup, we may retire the last request before we complete allocating the next request. As the last request is retired, we queue a timer to mark the device as idle, and promptly have to execute ad cancel that timer once we complete allocating the request and need to keep the device awake. If we rearrange the mark_busy() to occur before we retire the previous request, we can skip this ping-pong. v2: Joonas pointed out that unreserve_seqno() was now doing more than doing seqno handling and should be renamed to reflect its wider purpose. That also highlighted the new asymmetry with reserve_seqno(), so fixup that and rename both to [un]reserve_engine(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170817144719.10968-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
| * | drm/i915: Trivial grammar fix s/opt of/opt out of/ in commentChris Wilson2017-08-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The word out was dropped from the sentence across the line break, put it back. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
| * | drm/i915: Replace execbuf vma ht with an idrChris Wilson2017-08-1811-208/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
| * | drm/i915: Simplify eb_lookup_vmas()Chris Wilson2017-08-181-86/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the introduction of being able to perform a lockless lookup of an object (i915_gem_object_get_rcu() in fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker") we no longer need to split the object/vma lookup into 3 phases and so combine them into a much simpler single loop. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-4-chris@chris-wilson.co.uk
| * | drm/i915: Convert execbuf to use struct-of-array packing for critical fieldsChris Wilson2017-08-183-151/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When userspace is doing most of the work, avoiding relocs (using NO_RELOC) and opting out of implicit synchronisation (using ASYNC), we still spend a lot of time processing the arrays in execbuf, even though we now should have nothing to do most of the time. One issue that becomes readily apparent in profiling anv is that iterating over the large execobj[] is unfriendly to the loop prefetchers of the CPU and it much prefers iterating over a pair of arrays rather than one big array. v2: Clear vma[] on construction to handle errors during vma lookup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-3-chris@chris-wilson.co.uk
| * | drm/i915: Check context status before looking up our obj/vmaChris Wilson2017-08-181-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we keep the context around across the slow lookup where we may drop the struct_mutex, we should double check that the context is still valid upon reacquisition. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-2-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
| * | drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcsChris Wilson2017-08-187-15/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MI_STORE_DWORD_IMM just doesn't work on the video decode engine under Sandybridge, so refrain from using it. Then switch the selftests over to using the now common test prior to using MI_STORE_DWORD_IMM. Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+ Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
| * | drm/i915: Stop touching forcewake following a gen6+ engine resetChris Wilson2017-08-181-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Forcewake is not affected by the engine reset on gen6+. Indeed the reason why we added intel_uncore_forcewake_reset() to gen6_reset_engines() was to keep the bookkeeping intact because the reset did not touch the forcewake bit (yet we cancelled the forcewake consumers)! This was done in commit 521198a2e7095: Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Fri Aug 23 16:52:30 2013 +0300 drm/i915: sanitize forcewake registers on reset In reset we try to restore the forcewake state to pre reset state, using forcewake_count. The reset doesn't seem to clear the forcewake bits so we get warn on forcewake ack register not clearing. That futzing of the forcewake bookkeeping was dropped in commit 0294ae7b44bb ("drm/i915: Consolidate forcewake resetting to a single function"), but it did not make the realisation that the remaining intel_uncore_forcewake_reset() was redundant. The new danger with using intel_uncore_forcewake_reset() with per-engine resets is that the driver and hw are still in an active state as we perform the reset. We may be using the forcewake to read protected registers elsewhere and those results may be clobbered by the concurrent dropping of forcewake. Reported-by: Michel Thierry <michel.thierry@intel.com> Fixes: 142bc7d99bcf ("drm/i915: Modify error handler for per engine hang recovery") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170817173229.20324-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
| * | MAINTAINERS: drm/i915 has a new maintainer teamDaniel Vetter2017-08-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a bunch of reasons[1] I've decided to step down as maintainer and let some other folks enjoy the reputation and hang out in the spotlight. Jani is going to stick around with his expertise in kms and having done the fixes flow for a long time now. Joonas will join and bring in his knowledge on all things GEM. Rodrigo has been less visible because he's been doing tons of work taking care of the internal branch, and it'd be good to have more continuity between these two worlds also on the maintainer side. 1: They all boil down to: This is going to happen sooner or later anyway, we have a great team, with the process improvements over the last few years things work rather well, now is as good as any time to do this. With that change I'll have more time for other aspects of the stack development than maintainership. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Dave Airlie <airlied@gmail.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170815160101.1683-1-daniel.vetter@ffwll.ch
| * | drm/i915: Split pin mapping into per platform functionsAnusha Srivatsa2017-08-171-22/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup the code. Map the pins in accordance to individual platforms rather than according to ports. Create separate functions for platforms. v2: - Add missing condition for CoffeeLake. Make platform specific functions static. Add function i915_ddc_pin_mapping(). v3: - Rename functions to x_port_to_ddc_pin() which directly indicates the purpose. Correct default return values on CNP and BXT. Rename i915_port_to_ to g4x_port_to since that was the first platform to run this. Correct code style. (Paulo) Sugested-by Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Clinton Taylor <clinton.a.taylor@intel.com> Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1502927114-24012-1-git-send-email-anusha.srivatsa@intel.com
| * | drm/i915/opregion: let user specify override VBT via firmware loadJani Nikula2017-08-174-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes it would be most enlightening to debug systems by replacing the VBT to be used. For example, in the referenced bug the BIOS provides different VBT depending on the boot mode (UEFI vs. legacy). It would be interesting to try the failing boot mode with the VBT from the working boot, and see if that makes a difference. Add a module parameter to load the VBT using the firmware loader, not unlike the EDID firmware mechanism. As a starting point for experimenting, one can pick up the BIOS provided VBT from /sys/kernel/debug/dri/0/i915_opregion/i915_vbt. v2: clarify firmware load return value check (Bob) v3: kfree the loaded firmware blob References: https://bugs.freedesktop.org/show_bug.cgi?id=97822#c83 Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170817115209.25912-1-jani.nikula@intel.com
| * | drm/i915/cnl: Reuse skl_wm_get_hw_state on Cannonlake.Rodrigo Vivi2017-08-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise it reuses the ilk that has a completely different wm. Cc: Mahesh Kumar <mahesh1.kumar@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170809205248.11917-6-rodrigo.vivi@intel.com
| * | drm/i915/gen10: implement gen 10 watermarks calculationsPaulo Zanoni2017-08-161-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They're slightly different than the gen 9 calculations. v2: Remove TODO comment. Code matches recent spec. v3: Rebase on top of latest skl code using new fp16.16 and fixing a logic issue. Auto rebase bot has apparently made some bad decisions that changed the logic of the code. (Noticed by Manesh, updated by Rodrigo). Cc: Mahesh Kumar <mahesh1.kumar@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170811233825.32083-1-rodrigo.vivi@intel.com
| * | drm/i915/cnl: Fix LSPCON support.Rodrigo Vivi2017-08-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When LSPCON support was extended to CNL one part was missed on lspcon_init. So, instead of adding check per platform on lspcon_init let's use HAS_LSPCON that is already there for that purpose. Fixes: ff15947e0f02 ("drm/i915/cnl: LSPCON support is gen9+") Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816030403.11368-1-rodrigo.vivi@intel.com
| * | drm/i915/vbt: ignore extraneous child devices for a portJani Nikula2017-08-161-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ever since we've parsed VBT child devices, starting from 6acab15a7b0d ("drm/i915: use the HDMI DDI buffer translations from VBT"), we've ignored the child device information if more than one child device references the same port. The rationale for this seems lost in time. Since commit 311a20949f04 ("drm/i915: don't init DP or HDMI when not supported by DDI port") we started using this information more to skip HDMI/DP init if the port wasn't there per VBT child devices. However, at the same time it added port defaults without further explanation. Thus, if the child device info was skipped due to multiple child devices referencing the same port, the device info would be retrieved from the somewhat arbitrary defaults. Finally, when commit bb1d132935c2 ("drm/i915/vbt: split out defaults that are set when there is no VBT") stopped initializing the defaults whenever VBT is present, thus trusting the VBT more, we stopped initializing ports which were referenced by more than one child device. Apparently at least Asus UX305UA, UX305U, and UX306U laptops have VBT child device blocks which cause this behaviour. Arguably they were shipped with a broken VBT. Relax the rules for multiple references to the same port, and use the first child device info to reference a port. Retain the logic to debug log about this, though. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101745 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196233 Fixes: bb1d132935c2 ("drm/i915/vbt: split out defaults that are set when there is no VBT") Tested-by: Oliver Weißbarth <mail@oweissbarth.de> Reported-by: Oliver Weißbarth <mail@oweissbarth.de> Reported-by: Didier G <didierg-divers@orange.fr> Reported-by: Giles Anderson <agander@gmail.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: <stable@vger.kernel.org> # v4.12+ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170811113907.6716-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| * | drm/i915/cnl: Setup PAT Index.Rodrigo Vivi2017-08-162-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Different from previous platforms, on CNL+ there's separated registers for separated indexes. v2: Remove comments regarding uncertainty around the table. v3: Remove extra line (by Ben) Cc: Clint Taylor <clinton.a.taylor@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170815232539.3562-1-rodrigo.vivi@intel.com
| * | drm/i915/edp: Allow alternate fixed mode for eDP if available.Jim Bride2017-08-166-8/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some fixed resolution panels actually support more than one mode, with the only thing different being the refresh rate. Having this alternate mode available to us is desirable, because it allows us to test PSR on panels whose setup time at the preferred mode is too long. With this patch we allow the use of the alternate mode if it's available and it was specifically requested. v2 and v3: Rebase v4: * Fix up some leaky mode stuff (Chris) * Rebase v5: * Fix a NULL pointer derefrence (David Weinehall) v6: * Whitespace / spelling / checkpatch clean-up; no functional change. (David) * Rebase Cc: David Weinehall <david.weinehall@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com> Signed-off-by: Jim Bride <jim.bride@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1502308133-26892-1-git-send-email-jim.bride@linux.intel.com
| * | drm/i915: Add support for drm syncobjsJason Ekstrand2017-08-153-8/+178
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds support for waiting on or signaling DRM syncobjs as part of execbuf. It does so by hijacking the currently unused cliprects pointer to instead point to an array of i915_gem_exec_fence structs which containe a DRM syncobj and a flags parameter which specifies whether to wait on it or to signal it. This implementation theoretically allows for both flags to be set in which case it waits on the dma_fence that was in the syncobj and then immediately replaces it with the dma_fence from the current execbuf. v2: - Rebase on new syncobj API v3: - Pull everything out into helpers - Do all allocation in gem_execbuffer2 - Pack the flags in the bottom 2 bits of the drm_syncobj* v4: - Prevent a potential race on syncobj->fence Testcase: igt/gem_exec_fence/syncobj* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/1499289202-25441-1-git-send-email-jason.ekstrand@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170815145733.4562-1-chris@chris-wilson.co.uk
| * | drm/i915: Handle full s64 precision for wait-ioctlChris Wilson2017-08-152-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wait-ioctl is optionally supplied a timeout with nanosecond precision in a s64 field. We use nsecs_to_jiffies64() to convert that into the jiffies consumed by the scheduler, but internally nsecs_to_jiffies64() does not guard against overflow (as it's purpose is for use by the scheduler and not drivers!). So we must guard against the overflow ourselves, and in the process note that we may then return much earlier than the timeout selected by the user, so don't report ETIME unless we do hit the timeout. (Woe betold us though if the user waits for a year (32bit) and the request is still not complete!) v2: Refine overflow detection (to not include an overffow itself) Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20170811105731.9482-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
| * | drm/i915: Split obj->cache_coherent to track r/wChris Wilson2017-08-1511-32/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Another month, another story in the cache coherency saga. This time, we come to the realisation that i915_gem_object_is_coherent() has been reporting whether we can read from the target without requiring a cache invalidate; but we were using it in places for testing whether we could write into the object without requiring a cache flush. So split the tracking into two, one to decide before reads, one after writes. See commit e27ab73d17ef ("drm/i915: Mark CPU cache as dirty on every transition for CPU writes") for the previous entry in this saga. v2: Be verbose v3: Remove unused function (i915_gem_object_is_coherent) v4: Fix inverted coherency check prior to execbuf (from v2) v5: Add comment for nasty code where we are optimising on gcc's behalf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555 Testcase: igt/kms_mmap_write_crc Testcase: igt/kms_pwrite_crc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
| * | drm/i915/hsw+: Add support for multiple power well regsImre Deak2017-08-154-33/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Future platforms increase the number of power wells which require additional control registers. A convenient way to select the correct register is to use the high bits of the power well ID as index. This patch only prepares for this, while upcoming platform enabling patches will add the actual new power well IDs and corresponding power well control registers. Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Animesh Manna <animesh.manna@intel.com> Cc: Rakshmi Bhatia <rakshmi.bhatia@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Reviewed-by: Rakshmi Bhatia <rakshmi.bhatia@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170814151530.24154-2-imre.deak@intel.com
| * | drm/i915: Work around GCC anonymous union initialization bugImre Deak2017-08-151-24/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC 4.4 can't cope with anonymous union initializers which seems to be a bug in that version (see the Reference) and is fixed since GCC version 4.6. A workaround which is also used elsewhere in the kernel for the same purpose is to wrap the initialization in curly braces, so do the same here. Fixes: b5565a2efc12 ("drm/i915/bxt, glk: Give a proper name to the power well struct phy field") Reference: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676 Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170814151530.24154-1-imre.deak@intel.com
| * | Merge tag 'gvt-next-2017-08-15' of https://github.com/01org/gvt-linux into ↵Daniel Vetter2017-08-1518-133/+297
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drm-intel-next-queued gvt-next-2017-08-15 gvt update for 4.14 - MMIO save/restore optimization (Changbin) - Split workload scan vs. dispatch for more parallel exec (Ping) - vGPU full 48bit ppgtt support (Joonas, Tina) - vGPU hw id expose for perf (Zhenyu) - other misc cleanup and fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20170815023940.skhjfcsyrao7axqi@zhen-hp.sh.intel.com
| | * | drm/i915/gvt: Fix guest i915 full ppgtt blocking issueTina Zhang2017-08-152-17/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guest i915 full ppgtt functionality was blocking by an issue, which would lead to gpu hardware hang. Guest i915 driver may update the ppgtt table just before this workload is going to be submitted to the hardware by device model. This case wasn't handled well by device model before, due to the small time window between removing old ppgtt entry and adding the new one. Errors occur when the workload is executed by hardware during that small time window. This patch is to remove this time window by adding the new ppgtt entry first and then remove the old one. Changes in v2: - Move VGT_CAPS_FULL_PPGTT introduction to patch 2/4. (Joonas) Changes since v2: - Divide the whole patch set into two separate patch series, with one patch in i915 side to check guest i915 full ppgtt capability and enable it when this capability is supported by the device model, and the other one in gvt side which fixs the blocking issue and enables the device model to provide the capability to guest. And this patch focuses on gvt side. (Joonas) - Change the title from "reorder the shadow ppgtt update process by adding entry first" to "Fix guest i915 full ppgtt blocking issue". (Tina) Changes since v3: - Rebase to the latest branch. Changes since v4: - Tested by Tina Zhang. Changes since v5: - Rebase to the latest branch. v6: - Update full 48bit ppgtt definition Cc: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915: Enable guest i915 full ppgtt functionalityTina Zhang2017-08-155-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable the guest i915 full ppgtt functionality when host can provide this capability. vgt_caps is introduced to guest i915 driver to get the vgpu capabilities from the device model. VGT_CPAS_FULL_PPGTT is one of the capabilities type to let guest i915 dirver know that the guest i915 full ppgtt is supported by device model. Notice that the minor version of pvinfo isn't bumped because of this vgt_caps introduction, due to older guest would be broken by simply increasing the pvinfo version. Although the pvinfo minor version doesn't increase, the compatibility won't be blocked. The compatibility is ensured by checking the value of caps field in pvinfo. Zero means no full ppgtt support and BIT(2) means this feature is provided. Changes since v1: - Use u32 instead of uint32_t (Joonas) - Move VGT_CAPS_FULL_PPGTT introduction to this patch and use #define instead of enum (Joonas) - Rewrite the vgpu full ppgtt capability checking logic. (Joonas) - Some coding style refine. (Joonas) Changes since v2: - Divide the whole patch set into two separate patch series, with one patch in i915 side to check guest i915 full ppgtt capability and enable it when this capability is supported by the device model, and the other one in gvt side which fixs the blocking issue and enables the device model to provide the capability to guest. And this patch focuses on guest i915 side. (Joonas) - Change the title from "introduce vgt_caps to pvinfo" to "Enable guest i915 full ppgtt functionality". (Tina) Change since v3: - Add some comments about pvinfo caps and version. (Joonas) Change since v4: - Tested by Tina Zhang. Change since v5: - Add limitation about supporting 32bit full ppgtt. Change since v6: - Change the fallback to 48bit full ppgtt if i915.ppgtt_enable=2. (Zhenyu) Change in v9: - Remove the fixme comment due to no plan for 32bit full ppgtt support. (Zhenyu) - Reorder the patch-set to fix compiling issue with git-bisect. (Zhenyu) - Add print log when forcing guest 48bit full ppgtt. (Zhenyu) v10: - Update against Joonas's has_full_ppgtt and has_full_48bit_ppgtt disconnect change. (Zhenyu) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # in v2 Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915: Disconnect 32 and 48 bit ppGTT supportJoonas Lahtinen2017-08-151-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Configurations like virtualized environments may support only 48 bit ppGTT without supporting 32 bit ppGTT. Support this by disconnecting the relationship of the two feature bits. Cc: Tina Zhang <tina.zhang@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Add shadow context descriptor updatingKechen Lu2017-08-103-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current context logic only updates the descriptor of context when it's being pinned to graphics memory space. But this cannot satisfy the requirement of shadow context. The addressing mode of the pinned shadow context descriptor may be changed according to the guest addressing mode. And this won't be updated, as the already pinned shadow context has no chance to update its descriptor. And this will lead to GPU hang issue, as shadow context is used with wrong descriptor. This patch fixes this issue by letting the pinned shadow context descriptor update its addressing mode on demand. This patch fixes GPU HANG issue which happends after changing the grub parameter i915.enable_ppgtt form 0x01 to 0x03 or vice versa and then rebooting the guest. Signed-off-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Kechen Lu <kechen.lu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: expose vGPU context hw idZhenyu Wang2017-08-101-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This exposes vGPU context hw id in mdev sysfs which is used to do vGPU based profiling. Retrieved vGPU context hw id can be set through i915 perf ioctl to set profiling for target vGPU. Cc: Jiao Pengyuan <pengyuan.jiao@intel.com> Cc: Niu Bing <bing.niu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Refine the intel_vgpu_reset_gtt reset functionChuanxiao Dong2017-08-103-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing the VGPU reset, we don't need to do the gtt/ppgtt reset. This will make the GVT to do the ppgtt shadow every time for a workload and caused really bad performance after a VGPU reset. This patch will make sure ppgtt clean only happen at device module level reset to fix this. Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Add carefully checking in GTT walker pathsChangbin Du2017-08-102-37/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When debugging the gtt code, found the intel_vgpu_gma_to_gpa() can translate any given GMA though the GMA is not valid. This because the GTT ops suppress the possible errors, which may result in an invalid PT entry is retrieved by upper caller. This patch changed the prototype of pte ops to propagate status to callers. Then we make sure the GTT walker stop as early as when a error is detected to prevent undefined behavior. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Remove duplicated MMIO entriesJian Jun Chen2017-08-101-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove duplicated MMIO entries in the tracked MMIO list. -EEXIST is returned if duplicated MMIO entries are found when new MMIO entry is added. v2: - Use WARN(1, ...) for more verbose message. (Zhenyu) Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Yulei Zhang <yulei.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: take runtime pm when do early scan and shadowZhenyu Wang2017-08-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Need to take runtime pm when do early scan/shadow of workload for request operations. Fixes: 7fa56bd159bc ("drm/i915/gvt: Audit and shadow workload during ELSP writing") Cc: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Replace duplicated code with exist functionPing Gao2017-08-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the exist function intel_gvt_ggtt_validate_range to replace these duplicated code that do the same thing. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: To check whether workload scan and shadow has mutex holdPing Gao2017-08-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function workload scan and shadow have to hold the drm.struct_mutex before called. To avoid misusing of this function, add a lockdep assert in it. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Audit and shadow workload during ELSP writingPing Gao2017-08-103-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the workload audit and shadow ahead of vGPU scheduling, that will eliminate GPU idle time and improve performance for multi-VM. The performance of Heaven running simultaneously in 3VMs has improved 20% after this patch. v2:Remove condition current->vgpu==vgpu when shadow during ELSP writing. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Factor out scan and shadow from workload dispatchPing Gao2017-08-104-35/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To perform the workload scan and shadow in ELSP writing stage for performance consideration, the workload scan and shadow stuffs should be factored out from dispatch_workload(). v2:Put context pin before i915_add_request; Refine the comments; Rename some APIs; v3:workload->status should set only when error happens. v4:i915_add_request is must to have after i915_gem_request_alloc. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Optimize ring siwtch 2x faster again by light weight mmio ↵Changbin Du2017-08-101-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | access wrapper The I915_READ/WRITE is not only a mmio read/write, it also contains debug checking and Forcewake domain lookup. This is too heavy for GVT ring switch case which access batch of mmio registers on ring switch. We can handle Forcewake manually and use the raw i915_read/write instead. The benefit from this is 2x faster mmio switch performance. Before After cycles ~550000 ~250000 v2: Use existing I915_READ_FW/I915_WRITE_FW macro. (zhenyu) Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary ↵Changbin Du2017-08-101-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSTING_READ There are lots of POSTING_READ alongside each mmio write Op. While actually this is not necessary. It just bring too much latency since PCIe read Op is very slow which is of non-posted transaction. For PCIe device, the mem transaction for strong ordering rules are: o PCIe mmio write sequence is FIFO. Posted request cannot pass previous posted request. o PCIe mmio read will not go ahead of previous write. Intel graphics doesn't support RO, so we can apply above rules. In our case, we only need one POSTING_READ at last. This can remove half of mmio read Op and then the average ring switch performance is nearly doubled. Before After cycles ~970000 ~550000 Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| | * | drm/i915/gvt: Use gvt_err to print the resource not enough errorChuanxiao Dong2017-08-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is better to use gvt_err when the gvt resource is not enough so the user can be notified from the kernel dmesg. And this kind of error message is gvt related. Suggested-by: Bing Niu <bing.niu@intel.com> Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Bing Niu <bing.niu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * | | drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.cBalasubramaniam, Hari Chand2017-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | variable 'data' may be used uninitialized in this function. thus, 'function dcs_get_backlight' will return unwanted value/fail. Thus, adding NULL initialized to 'data' variable will solve the return failure happening. v2: Change commit message to reflect upstream with proper message Fixes: 90198355b83c ("drm/i915/dsi: Add DCS control for Panel PWM") Cc: Jani Nikula <jani.nikula@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Yetunde Adebisi <yetundex.adebisi@intel.com> Cc: Deepak M <m.deepak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Balasubramaniam, Hari Chand <hari.chand.balasubramaniam@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1502762746-191826-1-git-send-email-hari.chand.balasubramaniam@intel.com
| * | | drm/i915/dp: Validate the compliance test link parametersManasi Navare2017-08-151-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Validate the compliance test link parameters when the compliance test dpcd registers are read. Also validate them in compute_config before using them since the max values might have been reduced due to link training fallback. If either the link rate or lane count is invalid, we still bail from using the test parameters since the combination would not work and instead use the fallback values. v2: * Added commit message to explain why we still bail when either of of the params is invalid (Ville Syrjala) * Add reason for validating in the comment (Jani Nikula) * Also check if index >= 0 after validating (Jani Nikula) Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1496954463-18038-2-git-send-email-manasi.d.navare@intel.com
| * | | drm/i915/dp: Generalize intel_dp_link_params function to accept arguments to ↵Manasi Navare2017-08-151-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | be validated This function now takes the link rate and lane ocunt to be validated as an argument so that this can be used for validating even the compliance test link parameters. Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Tested-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1496954463-18038-1-git-send-email-manasi.d.navare@intel.com
| * | | drm/i915: More surgically unbreak the modeset vs reset deadlockDaniel Vetter2017-08-144-10/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no reason to entirely wedge the gpu, for the minimal deadlock bugfix we only need to unbreak/decouple the atomic commit from the gpu reset. The simplest way to fix that is by replacing the unconditional fence wait a the top of commit_tail by a wait which completes either when the fences are done (normal case, or when a reset doesn't need to touch the display state). Or when the gpu reset needs to force-unblock all pending modeset states. The lesser source of deadlocks is when we try to pin a new framebuffer and run into a stall. There's a bunch of places this can happen, like eviction, changing the caching mode, acquiring a fence on older platforms. And we can't just break the depency loop and keep going, the only way would be to break out and restart. But the problem with that approach is that we must stall for the reset to complete before we grab any locks, and with the atomic infrastructure that's a bit tricky. The only place is the ioctl code, and we don't want to insert code into e.g. the BUSY ioctl. Hence for that problem just create a critical section, and if any code is in there, wedge the GPU. For the steady-state this should never be a problem. Note that in both cases TDR itself keeps working, so from a userspace pov this trickery isn't observable. Users themselvs might spot a short glitch while the rendering is catching up again, but that's still better than pre-TDR where we've thrown away all the rendering, including innocent batches. Also, this fixes the regression TDR introduced of making gpu resets deadlock-prone when we do need to touch the display. One thing I noticed is that gpu_error.flags seems to use both our own wait-queue in gpu_error.wait_queue, and the generic wait_on_bit facilities. Not entirely sure why this inconsistency exists, I just picked one style. A possible future avenue could be to insert the gpu reset in-between ongoing modeset changes, which would avoid the momentary glitch. But that's a lot more work to implement in the atomic commit machinery, and given that we only need this for pre-g4x hw, of questionable utility just for the sake of polishing gpu reset even more on those old boxes. It might be useful for other features though. v2: Rebase onto 4.13 with a s/wait_queue_t/struct wait_queue_entry/. v3: Really emabarrassing fixup, I checked the wrong bit and broke the unbreak/wakeup logic. v4: Also handle deadlocks in pin_to_display. v5: Review from Michel: - Fixup the BUILD_BUG_ON - Don't forget about the overlay Cc: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2) Cc: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-3-daniel.vetter@ffwll.ch Reviewed-by: Michel Thierry <michel.thierry@intel.com>
| * | | drm/i915: Push i915_sw_fence_wait into the nonblocking atomic commitDaniel Vetter2017-08-141-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Blocking in a worker is ok, that's what the unbound_wq is for. And it unifies the paths between the blocking and nonblocking commit, giving me just one path where I have to implement the deadlock avoidance trickery in the next patch. I first tried to implement the following patch without this rework, but force-completing i915_sw_fence creates some serious challenges around properly cleaning things up. So wasn't a feasible short-term approach. Another approach would be to simple keep track of all pending atomic commit work items and manually queue them from the reset code. With the caveat that double-queue in case we race with the i915_sw_fence must be avoided. Given all that, taking the cost of a double schedule in atomic for the short-term fix is the best approach, but can be changed in the future of course. v2: Amend commit message (Chris). v3: Add comment explaining why we do nothing in the sw_fence complete callback (Michel). Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2) Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-2-daniel.vetter@ffwll.ch