summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* drm/i915: Put back lane_count into intel_dp and add link_rate tooVille Syrjälä2015-08-264-31/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With MST there won't be a crtc assigned to the main link encoder, so trying to dig up the pipe_config from there is a recipe for an oops. Instead store the parameters (lane_count and link_rate) in the encoder, and use those values during link training etc. Since those parameters are now assigned only when the link is actually enabled, .compute_config() won't clobber them as it did before. Hardware state readout is still bonkers though as we don't transfer the link parameters from pipe_config intel_dp. We should do that during encoder sanitation. But since we don't even do a proper job of reading out the main link encoder state for MST there's littel point in worrying about this now. Fixes a regression with MST caused by: commit 90a6b7b052b1aa17fbb98b049e9c8b7f729c35a7 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Mon Jul 6 16:39:15 2015 +0300 drm/i915: Move intel_dp->lane_count into pipe_config v2: Different apporoach that should keep intel_dp_check_mst_status() somewhat less oopsy Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/bxt: don't allow cached GEM mappings on A steppingImre Deak2015-08-261-0/+9
| | | | | | | | | | | | | | | | | | Due to a coherency issue on BXT A steppings we can't guarantee a coherent view of cached (CPU snooped) GPU mappings, so fail such requests. User space is supposed to fall back to uncached mappings in this case. v2: - limit the WA to A steppings, on later stepping this HW issue is fixed v3: - return error instead of trying to work around the issue in kernel, since that could confuse user space (Chris) Testcast: igt/gem_store_dword_batches_loop/cached-mapping Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/bxt: work around HW coherency issue when accessing GPU seqnoImre Deak2015-08-262-8/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | By running igt/store_dword_loop_render on BXT we can hit a coherency problem where the seqno written at GPU command completion time is not seen by the CPU. This results in __i915_wait_request seeing the stale seqno and not completing the request (not considering the lost interrupt/GPU reset mechanism). I also verified that this isn't a case of a lost interrupt, or that the command didn't complete somehow: when the coherency issue occured I read the seqno via an uncached GTT mapping too. While the cached version of the seqno still showed the stale value the one read via the uncached mapping was the correct one. Work around this issue by clflushing the corresponding CPU cacheline following any store of the seqno and preceding any reading of it. When reading it do this only when the caller expects a coherent view. v2: - fix using the proper logical && instead of a bitwise & (Jani, Mika) - limit the workaround to A stepping, on later steppings this HW issue is fixed v3: - use a separate get_seqno/set_seqno vfunc (Chris) Testcase: igt/store_dword_loop_render Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Also call frontbuffer flip when disabling planes.Rodrigo Vivi2015-08-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | We also need to call the frontbuffer flip to trigger proper invalidations when disabling planes. Otherwise we will miss screen updates when disabling sprites or cursor. On core platforms where HW tracking also works, this issue is totally masked because HW tracking triggers PSR exit however on VLV/CHV that has only SW tracking we miss screen updates when disabling planes. It was caught with kms_psr_sink_crc sprite_plane_onoff and cursor_plane_onoff subtests running on VLV/CHV. This is probably a regression since I can also get this with the manual test case, but with so many changes on atomic modeset I couldn't track exactly when this was introduced. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Change SRM, LRM instructions to use correct lengthArun Siluvery2015-08-264-13/+13
| | | | | | | | | | | | | | | | MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects (reg, addr) pairs so use fixed length for these instructions. v2: rebase Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems terminally unhappy about i915_cmd_parser.c so that would be a separate patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* doc: drm: Fix mis-spelling of i915_guc_submission includesGraham Whaley2015-08-251-2/+2
| | | | | | | | | | | | | | | In commit d1675198e: drm/i915: Integrate GuC-based command submission the drm.tmpl include lines reference the intel_guc_submission.c but the patch adds the file i915_guc_submission.c. drm.tmpl fails to build with: docproc: .//drivers/gpu/drm/i915/intel_guc_submission.c: No such file or directory Change the file reference to the actual file. Signed-off-by: Graham Whaley <graham.whaley@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: remove excessive scaler debugging messagesJani Nikula2015-08-142-5/+0
| | | | | | | | There's so much scaler debugging messages that it makes other debugging hard. Remove them. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Debugfs interface for GuC submission statisticsDave Gordon2015-08-141-0/+76
| | | | | | | | | | | | | | | | | | | This provides a means of reading status and counts relating to GuC actions and submissions. v2: Remove surplus blank line in output [Chris Wilson] v5: Added GuC per-engine submission & seqno statistics v6: Add per-ring statistics to client, refactor client-dumper. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Alex Dai <yu.dai@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Integrate GuC-based command submissionAlex Dai2015-08-147-29/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Interrupt routing for GuC submissionDave Gordon2015-08-142-2/+60
| | | | | | | | | | | | | Turn on interrupt steering to route necessary interrupts to GuC. v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Implementation of GuC submission clientDave Gordon2015-08-144-3/+725
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A GuC client has its own doorbell and workqueue. It maintains the doorbell cache line, process description object and work queue item. A default guc_client is created for the i915 driver to use for normal-priority in-order submission. Note that the created client is not yet ready for use; doorbell allocation will fail as we haven't yet linked the GuC's context descriptor to the default contexts for each ring (see later patch). v2: Defer adding structure members until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v5: Add GuC per-engine submission & seqno statistics. Move wq locking to encompass both get_space() and add_item(). Take forcewake lock in host2guc_action() [Tom O'Rourke] v6: Fix GuC doorbell cacheline selection code (the cacheline-within-page calculation was wrong). Rename GuC priorities to make them closer to the names used in the GuC firmware source, matching what the autogenerated versions will (probably) be. Add per-ring statistics to client. Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Enable GuC firmware logAlex Dai2015-08-143-0/+76
| | | | | | | | | | | | | | | | | Allocate a GEM object to hold GuC log data. A debugfs interface (i915_guc_log_dump) is provided to print out the log content. v2: Add struct members at point of use [Chris Wilson] v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Prepare for GuC-based command submissionAlex Dai2015-08-144-1/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the first of the data structures used to communicate with the GuC (the pool of guc_context structures). We create a GuC-specific wrapper round the GEM object allocator as all GEM objects shared with the GuC must be pinned into GGTT space at an address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT addresses is not accessible to the GuC (from the GuC's point of view, it's permanently reserved for other objects such as the BootROM & SRAM). Later, we will need to allocate additional GuC-sharable objects for the submission client(s) and the GuC's debug log. v2: Remove redundant initialisation [Chris Wilson] Defer adding struct members until needed [Chris Wilson] Local functions should pass dev_priv rather than dev [Chris Wilson] v5: Invalidate GuC TLB after allocating and pinning a new object v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Expose one LRC function for GuC submission modeDave Gordon2015-08-142-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GuC submission is basically execlist submission, but with the GuC handling the actual writes to the ELSP and the resulting context switch interrupts. So to describe a context for submission via the GuC, we need one of the same functions used in execlist mode. This commit exposes one such function, changing its name to better describe what it does (it's related to logical ring contexts rather than to execlists per se). v2: Replaces previous "drm/i915: Move execlists defines from .c to .h" v3: Incorporates a change to one of the functions exposed here that was previously part of an internal patch, but which was omitted from the version recently committed to drm-intel-nightly: 7a01a0a drm/i915/lrc: Update PDPx registers with lri commands So we reinstate this change here. v4: Drop v3 change, update function parameters due to collision with 8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests v5: Don't expose execlists_update_context() after all. The current version is no longer compatible with GuC submission; trying to share the execlist version of this function results in both GuC and CPU updating TAIL in the context image, with bad results when they get out of step. The GuC submission path now has its own private version that just updates the ringbuffer start address, and not TAIL or PDPx. v6: Rebased Issue: VIZ-4884 Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Debugfs interface to read GuC load statusAlex Dai2015-08-141-0/+39
| | | | | | | | | | | | | | | | | | The new node provides access to the status of the GuC-specific loader; also the scratch registers used for communication between the i915 driver and the GuC firmware. v2: Changes to output formats per Chris Wilson's suggestions v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: GuC-specific firmware loaderAlex Dai2015-08-149-9/+650
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fetches the required firmware image from the filesystem, then loads it into the GuC's memory via a dedicated DMA engine. This patch is derived from GuC loading work originally done by Vinit Azad and Ben Widawsky. v2: Various improvements per review comments by Chris Wilson v3: Removed 'wait' parameter to intel_guc_ucode_load() as firmware prefetch is no longer supported in the common firmware loader, per Daniel Vetter's request. Firmware checker callback fn now returns errno rather than bool. v4: Squash uC-independent code into GuC-specifc loader [Daniel Vetter] Don't keep the driver working (by falling back to execlist mode) if GuC firmware loading fails [Daniel Vetter] v5: Clarify WOPCM-related #defines [Tom O'Rourke] Delete obsolete code no longer required with current h/w & f/w [Tom O'Rourke] Move the call to intel_guc_ucode_init() later, so that it can allocate GEM objects, and have it fetch the firmware; then intel_guc_ucode_load() doesn't need to fetch it later. [Daniel Vetter]. v6: Update comment describing intel_guc_ucode_load() [Tom O'Rourke] Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Kill intel_dp->{link_bw, rate_select}Ville Syrjälä2015-08-143-27/+27
| | | | | | | | | | | | | We only need the link_bw/rate_select parameters when starting link training, and they should be computed based on the currently active config, so throw them out from intel_dp and just compute on demand. Toss in an extra debug print to see rate_select in addition to link_bw, as the latter may be 0 for eDP 1.4. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Don't use link_bw to select between TP1 and TP3Ville Syrjälä2015-08-141-2/+2
| | | | | | | | | intel_dp->link_bw is going away, so consul the port_clock instead when choosing between TP1 and TP3. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Move intel_dp->lane_count into pipe_configVille Syrjälä2015-08-146-28/+61
| | | | | | | | | | | | Currently we clobber intel_dp->lane_count in compute config, which means after a rejected modeset we may no longer be able to retrain the current link. Move lane_count into pipe_config to avoid that. v2: Add missing ':' to the pipe config debug dump Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Avoid confusion between DP and TRANS_DP_CTL in DP .get_config()Ville Syrjälä2015-08-141-3/+4
| | | | | | | | | Use a separate variable for the TRANS_DP_CTL value instead of reusing 'tmp' that otherwise contains the DP port register value. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Don't pass clock to DDI PLL select functionsVille Syrjälä2015-08-141-10/+10
| | | | | | | | | | All the *_ddi_pll_select() functions get passed the port_clock and pipe config as parameters. We only need to pass the pipe config, and the functions can dig up the port_clock themselves. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Don't use link_bw for PLL setupVille Syrjälä2015-08-142-29/+26
| | | | | | | | | | | | | Use port_clock instead of link_bw when picking the PLL parameters for DP. link_bw may be zero with an eDP 1.4 sink that supports DP_LINK_RATE_SET so we shouldn't use it for anything other than feed it to the sink appropriately. v2: Fix typo in commit message (Sivakumar) Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Clean up DP/HDMI limited color range handlingVille Syrjälä2015-08-143-29/+26
| | | | | | | | | | | | | | | Currently we treat intel_{dp,hdmi}->color_range as partly user controller value (via the property) but we also change it during .compute_config() when using the "Automatic" mode. That is a bit confusing, so let's just change things so that we store the user property values in intel_dp, and only change what's stored in pipe_config during .compute_config(). There should be no functional change. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Do not check or a stalled pageflip prior to it being queuedChris Wilson2015-08-141-0/+3
| | | | | | | | | | When we queue the command or operation to change the scanout address, we mark the flip as in progress. We can use this flag to prevent us from checking for a stalled flip prior to its existence! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: clflush on pin_to_display after pwrite to UC bo in LLCVille Syrjälä2015-08-141-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently we don't clflush on pin_to_display if the bo is already UC/WT and is not in the CPU write domain. This causes problems with pwrite since pwrite doesn't change the write domain, and it avoids clflushing on UC/WT buffers on LLC platforms unless the buffer is currently being scanned out. Fix the problem by marking the cache dirty and adjusting i915_gem_object_set_cache_level() to clflush when the cache is dirty even if the cache_level doesn't change. My last attempt [1] at fixing this via write domain frobbing was shot down, but now with the cache_dirty flag we can do things in a nicer way. [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html v2: Drop the I915_CACHE_NONE/WT checks from pwrite Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422 Testcase: igt/kms_pwrite_crc Testcase: igt/gem_pwrite_snooped Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/bxt: WA for swapped HPD pins in A steppingSonika Jindal2015-08-143-2/+19
| | | | | | | | | | | | | | | | | | | WA for BXT A0/A1, where DDIB's HPD pin is swapped to DDIA, so enabling DDIA HPD pin in place of DDIB. v2: For DP, irq_port is used to determine the encoder instead of hpd_pin and removing the edp HPD logic because port A HPD is not present(Imre) v3: Rebased on top of Imre's patchset for enabling HPD on PORT A. Added hpd_pin swapping for intel_dp_init_connector, setting encoder for PORT_A as per the WA in irq_port (Imre) v4: Dont enable interrupt for edp, also reframe the description (Siva) v5: Don’t check for PORT_A in intel_ddi_init to update dig_port, instead avoid setting hpd_pin itself (Imre) Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/bxt: Add HPD support for DDIASonika Jindal2015-08-141-7/+3
| | | | | | | | Also remove redundant comments. Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Always pass dev pointer in pdp_initMichel Thierry2015-08-141-1/+1
| | | | | | | | | And fix 0-DAY kernel test infrastructure warning. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Use complete virtual address range on 32-bit platformsMichel Thierry2015-08-141-8/+0
| | | | | | | | | | | | | | | | With the offset length being taken care of in ("drm/i915/gtt: Allow >= 4GB offsets in X86_32"), the code should be finally safe in 32-bit kernels. This reverts commit 501fd70fcaebc911b6b96a7b331e6960e5af67e7 Author: Michel Thierry <michel.thierry@intel.com> Date: Fri May 29 14:15:05 2015 +0100 drm/i915: limit PPGTT size to 2GB in 32-bit platforms Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gtt: Allow >= 4GB offsets in X86_32Michel Thierry2015-08-146-26/+20
| | | | | | | | | | | | | | | | | | | | | | Similar to commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9 ("drm/i915/gtt: Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset return an unsigned long, which in only 4-bytes long in 32-bit kernels. Change return type (and other related offset variables) to u64. Since Global GTT is always limited to 4GB, this change would not be required in i915_gem_obj_ggtt_offset, but this is done for consistency. v2: Remove unnecessary offset variable in do_pin, as we already have vma->node.start (Chris). Update GGTT offset too (Tvrtko). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Dont -ETIMEDOUT on identical new and previous (count, crc).Rodrigo Vivi2015-08-141-6/+15
| | | | | | | | | | | | | | | | | By Vesa DP 1.2 spec TEST_CRC_COUNT is a "4 bit wrap counter which increments each time the TEST_CRC_x_x are updated." However if we are trying to verify the screen hasn't changed we get same (count, crc) pair twice. Without this patch we would return -ETIMEOUT in this case. So, if in 6 vblanks the pair (count, crc) hasn't changed we return it anyway instead of returning error and let test case decide if it was right or not. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Save latest known sink CRC to compensate delayed counter reset.Rodrigo Vivi2015-08-142-17/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By Vesa DP 1.2 Spec TEST_CRC_COUNT should be "reset to 0 when TEST_SINK bit 0 = 0." However for some strange reason when PSR is enabled in certain platforms this is not true. At least not immediatelly. So we face cases like this: first get_sink_crc operation: count: 0, crc: 000000000000 count: 1, crc: c101c101c101 returned expected crc: c101c101c101 secont get_sink_crc operation: count: 1, crc: c101c101c101 count: 0, crc: 000000000000 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 But also the reset to 0 should be faster resulting into: get_sink_crc operation: count: 1, crc: c101c101c101 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 So in order to know that the second one is valid one we need to compare the pair (count, crc) with latest (count, crc). If the pair changed you have your valid CRC. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Force sink crc stop before start.Rodrigo Vivi2015-08-142-3/+20
| | | | | | | | | | | | By Vesa DP spec, test counter at DP_TEST_SINK_MISC just reset to 0 when unsetting DP_TEST_SINK_START, so let's force this stop here. But let's minimize the aux transactions and just do it when we know it hasn't been properly stoped. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/userptr: Kill user_size limit checkMichel Thierry2015-08-141-4/+0
| | | | | | | | | | | | | | | | | | GTT was only 32b and its max value is 4GB. In order to allow objects bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check against max 48b range (1ULL << 48). But since the check no longer applies, just kill the limit. v2: Use the default ctx to infer the ppgtt max size (Akash). v3: Just kill the limit, it was only there for early detection of an error when used for execbuffer (Chris). Cc: Akash Goel <akash.goel@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: batch_obj vm offset must be u64Michel Thierry2015-08-141-1/+1
| | | | | | | | | | | | Otherwise it can overflow in 48-bit mode, and cause an incorrect exec_start. Before commit 5f19e2bffa63a91cd4ac1adcec648e14a44277ce ("drm/i915: Merged the many do_execbuf() parameters into a structure"), it was already an u64. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: object size needs to be u64Michel Thierry2015-08-141-2/+3
| | | | | | | | | In a 48b world, users can try to allocate buffers bigger than 4GB; in these cases it is important that size is a 64b variable. v2: Drop the warning about bind with size 0, it shouldn't happen anyway. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Add ppgtt info and debug_dumpMichel Thierry2015-08-142-8/+94
| | | | | | | | | | | | | | v2: Clean up patch after rebases. v3: gen8_dump_ppgtt for 32b and 48b PPGTT. v4: Use used_pml4es/pdpes (Akash). v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rely on used_px bits instead of null checking (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Expand error state's address width to 64bMichel Thierry2015-08-142-12/+16
| | | | | | | | | | | | | v2: For semaphore errors, object is mapped to GGTT and offset will not be > 4GB, print only lower 32-bits (Akash) v3: Print gtt_offset in groups of 32-bit (Chris) Cc: Akash Goel <akash.goel@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Initialize PDPs and PML4Michel Thierry2015-08-142-0/+39
| | | | | | | | | | | | | | | | | | | | | | | Similar to PDs, while setting up a page directory pointer, make all entries of the pdp point to the scratch pd before mapping (and make all its entries point to the scratch page); this is to be safe in case of out of bound access or proactive prefetch. Also add a scratch pdp, which the PML4 entries point to. v2: Handle scratch_pdp allocation failure correctly, and keep initialize_px functions together (Akash) v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on the added macros to initialize the pdps. v4: Rebase after final merged version of Mika's ppgtt/scratch patches (and removed commit message part related to v3). v5: Update commit message to also mention PML4 table initialization and the new scratch pdp (Akash). Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Add 4 level support in insert_entries and clear_rangeMichel Thierry2015-08-141-16/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map Level 4 (PML4), before it selects which Page Directory Pointer (PDP) it will write to. Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range. This patch was inspired by Ben's "Depend exclusively on map and unmap_vma". v2: Rebase after s/page_tables/page_table/. v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use clamp_pdp in gen8_ppgtt_insert_entries (Akash). v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to maintain symmetry with gen8_ppgtt_insert_entries (Akash). v5: Do not mix pages and bytes in insert_entries (Akash). v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at once. v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Use gen8_px_index functions, and remove unnecessary number of pages parameter in insert_pte_entries. v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of adding and extra clamp function; remove unnecessary pdp_start/pdp_len variables (Akash). v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the length (Akash). v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this commit (Akash). Reviewed-by: Akash Goel <akash.goel@intel.com> (v9) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Pass sg_iter through pte insertsMichel Thierry2015-08-141-5/+6
| | | | | | | | | | | | | | | | | | | As a step towards implementing 4 levels, while not discarding the existing pte insert functions, we need to pass the sg_iter through. The current function understands to the page directory granularity. An object's pages may span the page directory, and so using the iter directly as we write the PTEs allows the iterator to stay coherent through a VMA insert operation spanning multiple page table levels. v2: Rebase after s/page_tables/page_table/. v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series; updated commit message (s/map/insert). v4: Rebase. Reviewed-by: Akash Goel <akash.goel@intel.com> (v3) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Add 4 level switching infrastructure and lrc supportMichel Thierry2015-08-143-23/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains the base address to PML4, while the other PDP registers are ignored. In LRC, the addressing mode must be specified in every context descriptor, and the base address to PML4 is stored in the reg state. v2: PML4 update in legacy context switch is left for historic reasons, the preferred mode of operation is with lrc context based submission. v3: s/gen8_map_page_directory/gen8_setup_page_directory and s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer. Also, clflush will be needed for bxt. (Akash) v4: Squashed lrc-specific code and use a macro to set PML4 register. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. PDP update in bb_start is only for legacy 32b mode. v6: Rebase after final merged version of Mika's ppgtt/scratch patches. v7: There is no need to update the pml4 register value in execlists_update_context. (Akash) v8: Move pd and pdp setup functions to a previous patch, they do not belong here. (Akash) v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in gen8_emit_bb_start to check if emit pdps is needed. (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: implement alloc/free for 4lvlMichel Thierry2015-08-143-16/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PML4 has no special attributes, and there will always be a PML4. So simply initialize it at creation, and destroy it at the end. The code for 4lvl is able to call into the existing 3lvl page table code to handle all of the lower levels. v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the compiler happy. And define ret only in one place. Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl. v3: Use i915_dma_unmap_single instead of pci API. Fix a couple of incorrect checks when unmapping pdp and pd pages (Akash). v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list. v5: Prevent (harmless) out of range access in gen8_for_each_pml4e. v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error paths. (Akash) v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/. v8: Change location of pml4_init/fini. It will make next patches cleaner. v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while trying to reuse as much as possible for pdp alloc. pml4_init/fini replaced by setup/cleanup_px macros. v10: Rebase after Mika's merged ppgtt cleanup patch series. v11: Rebase after final merged version of Mika's ppgtt/scratch patches. v12: Fix pdpe start value in trace (Akash) v13: Define all 4lvl functions in this patch directly, instead of previous patches, add i915_page_directory_pointer_entry_alloc here, use test_bit to detect when pdp is already allocated (Akash). v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers funtion, as we do for pds and pts; move pd and pdp setup functions to this patch (Akash). v15: Added kfree(pdp) from previous patch to this (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Add PML4 structureMichel Thierry2015-08-143-16/+40
| | | | | | | | | | | | | | | | | | | | Introduces the Page Map Level 4 (PML4), ie. the new top level structure of the page tables. To facilitate testing, 48b mode will be available on Broadwell and GEN9+, when i915.enable_ppgtt = 3. v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already 32/64-bit safe (Chris). v3: Add goto free_scratch in temp 48-bit mode init code (Akash). v4: kfree the pdp until the 4lvl alloc/free patch (Akash). v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash). v6: Keep _insert_pte_entries changes outside this patch (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Add dynamic page trace eventsMichel Thierry2015-08-142-8/+22
| | | | | | | | | | | | | | | | | | | The dynamic page allocation patch series added it for GEN6, this patch adds them for GEN8. v2: Consolidate pagetable/page_directory events v3: Multiple rebases. v4: Rebase after s/page_tables/page_table/. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rebase after gen8_map_pagetable_range removal. v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash) v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Generalize PTE writing for GEN8 PPGTTMichel Thierry2015-08-141-12/+41
| | | | | | | | | | | | | | | | | | | The insert_entries function was the function used to write PTEs. For the PPGTT it was "hardcoded" to only understand two level page tables, which was the case for GEN7. We can reuse this for 4 level page tables, and remove the concept of insert_entries, which was never viable past 2 level page tables anyway, but it requires a bit of rework to make the function a bit more generic. v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v3: Rebase after final merged version of Mika's ppgtt/scratch patches. v4: Check and warn for NULL value of pdp pointer (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Abstract PDP usageMichel Thierry2015-08-141-40/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up until now, ppgtt->pdp has always been the root of our page tables. Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs. In preparation for 4 level page tables, we need to stop using ppgtt->pdp directly unless we know it's what we want. The future structure will use ppgtt->pml4 for the top level, and the pdp is just one of the entries being pointed to by a pml4e. The temporal pdp local variable will be removed once the rest of the 4-level code lands. Also, start passing the vm pointer to the alloc functions, instead of ppgtt. v2: Updated after dynamic page allocation changes. v3: Rebase after s/page_tables/page_table/. v4: Rebase after changes in "Dynamic page table allocations" patch. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rebase after final merged version of Mika's ppgtt/scratch patches. v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash) v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris) v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented, clean-up gen8_ppgtt_cleanup [pun intended] (Akash). v10: Clean-up commit message (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: "Akash Goel" <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915/gen8: Make pdp allocation more dynamicMichel Thierry2015-08-142-23/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transitional patch doesn't do much for the existing code. However, it should make upcoming patches to use the full 48b address space a bit easier. 32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512; this patch prepares the existing functions to query the right number of pdps at run-time. This also means that used_pdpes should also be allocated during ppgtt_init, as the bitmap size will depend on the ppgtt address range selected. v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp). v3: To facilitate testing, 48b mode will be available on Broadwell and GEN9+, when i915.enable_ppgtt = 3. v4: Rebase after s/page_tables/page_table/, added extra information about 4-level page table formats and use IS_ENABLED macro. v5: Check CONFIG_X86_64 instead of CONFIG_64BIT. v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and follow his nomenclature in pdp functions (there is no alloc_pdp yet). v7: Rebase after merged version of Mika's ppgtt cleanup patch series. v8: Rebase after final merged version of Mika's ppgtt/scratch patches. v9: Introduce PML4 (and 48-bit checks) until next patch (Akash). v10: Also use test_bit to detect when pd/pt are already allocated (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> [danvet: Amend commit message as suggested by Michel.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Remove unnecessary gen8_clamp_pdMichel Thierry2015-08-142-12/+1
| | | | | | | | | | | | | | | | gen8_clamp_pd clamps to the next page directory boundary, but the macro gen8_for_each_pde already has a check to stop at the page directory boundary. Furthermore, i915_pte_count also restricts to the next page table boundary. v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: "Akash Goel" <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* drm/i915: Per-DDI I_boost overrideAntti Koskipaa2015-08-144-8/+63
| | | | | | | | | | | | | An OEM may request increased I_boost beyond the recommended values by specifying an I_boost value to be applied to all swing entries for a port. These override values are specified in VBT. v2: rebase and remove unused iboost_bit variable Issue: VIZ-5676 Signed-off-by: Antti Koskipaa <antti.koskipaa@linux.intel.com> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>