summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915 (follow)
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915/execlists: Delay writing to ELSP until HW has processed the ↵Michel Thierry2017-11-202-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previous write The hardware needs some time to process the information received in the ExecList Submission Port, and expects us to not write anything more until it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or PREEMPTED CSB event. If we do not follow this, the driver could write new data into the ELSP before HW had finishing fetching the previous one, putting us in 'undefined behaviour' space. This seems to be the problem causing the spurious PREEMPTED & COMPLETE events after a COMPLETE like the one below: [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3. [] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007 [] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000 [] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE [] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE [] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006 [] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006 The ELSP writes that lead to this CSB sequence show that the HW hadn't started executing the previous execlist (the one with only ctx 0x6) by the time the new one was submitted; this is a bit more clear in the data show in the EXECLIST_STATUS register at the time of the ELSP write. [] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302 [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302 [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308 [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308 Note that having to wait for this ack does not disable lite-restores, although it may reduce their numbers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915/selftest: Make guc clients staticChris Wilson2017-11-201-1/+1
| | | | | | | | | | | | | Make the private array used for stashing test clients static, to silence sparse. References: 55bd6bd75717 ("drm/i915/selftests: Add a GuC doorbells selftest") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120132606.4254-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>
* drm/i915/perf: reuse timestamp frequency from device infoLionel Landwerlin2017-11-203-35/+9
| | | | | | | | | | | | | | | | | | Now that we have this stored in the device info, we can drop it from perf part of the driver. Note that this requires to init perf after we've computed the frequency, hence why we move i915_perf_init() from i915_driver_init_early() to after intel_device_info_runtime_init(). v2: Use div_u64 (Chris) v3: Drop u64 divs by switching to kHz (Chris/Ville) Move i915_perf_fini to i915_driver_cleanup_hw (Matthew) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
* drm/i915: Automatic i915_switch_context for legacyChris Wilson2017-11-2010-43/+14
| | | | | | | | | | | | | During request construction, after pinning the context we know whether or not we have to emit a context switch. So move this common operation from every caller into i915_gem_request_alloc() itself. v2: Always submit the request if we emitted some commands during request construction, as typically it also involves changes in global state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk
* drm/i915: Pull the unconditional GPU cache invalidation into request ↵Chris Wilson2017-11-207-37/+20
| | | | | | | | | | | | | | | | | construction As the request will, in the following patch, implicitly invoke a context-switch on construction, we should precede that with a GPU TLB invalidation. Also, even before using GGTT, we always want to invalidate the TLBs for any updates (as well as the ppgtt invalidates that are unconditionally applied by execbuf). Since we almost always require the TLB invalidate, do it unconditionally on request allocation and so we can remove it from all other paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
* drm/i915/perf: replace .reg accesses with i915_mmio_reg_offsetLionel Landwerlin2017-11-201-15/+24
| | | | | | | | | | This replaces accesses to the reg field of the i915_reg_t structure with the i915_mmio_reg_offset() inline function. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ewelina Musial <ewelina.musial@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-2-lionel.g.landwerlin@intel.com
* drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE | COMPLETE eventsChris Wilson2017-11-201-0/+3
| | | | | | | | | | | | | | | | If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can assert that if we see a COMPLETE | PREEMPTED event, then it should be impossible for it to also contain an IDLE_ACTIVE flag. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915/execlists: Reduce completed event mask to COMPLETE | PREEMPTEDChris Wilson2017-11-201-3/+3
| | | | | | | | | | | | | | | | | | | Since we get a COMPLETE event when the context switch occurs on RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs before that point, COMPLETE | PREEMPTED should cover all possible context switch completion events. We can move the ELEMENT_SWITCH info message from the COMPLETED_MASK into an assertion for when we are performing a switch to port[1]. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915/execlists: Listen to COMPLETE context event not ACTIVE_IDLEChris Wilson2017-11-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit e1fee72c2ea2e9c0c6e6743d32a6832f21337d6c Author: Oscar Mateo <oscar.mateo@intel.com> Date: Thu Jul 24 17:04:40 2014 +0100 drm/i915/bdw: Avoid non-lite-restore preemptions execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting when one context completed and it either continued onto the next (in port 1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final context-switch event, but on recent gen it appears that we now get separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE events may not be coupled to a context (since it is a general state rather than a specific context completion event). v2: Update the history, execlists did originally start out by listening to the COMPLETE event not ACTIVE_IDLE. v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE. References: bspec/12255 References: https://bugs.freedesktop.org/show_bug.cgi?id=103800 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
* drm/i915: Fix init_clock_gating for resumeVille Syrjälä2017-11-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Moving the init_clock_gating() call from intel_modeset_init_hw() to intel_modeset_gem_init() had an unintended effect of not applying some workarounds on resume. This, for example, cause some kind of corruption to appear at the top of my IVB Thinkpad X1 Carbon LVDS screen after hibernation. Fix the problem by explicitly calling init_clock_gating() from the resume path. I really hope this doesn't break something else again. At least the problems reported at https://bugs.freedesktop.org/show_bug.cgi?id=103549 didn't make a comeback, even after a hibernate cycle. v2: Reorder the init_clock_gating vs. modeset_init_hw to match the display reset path (Rodrigo) Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 6ac43272768c ("drm/i915: Move init_clock_gating() back to where it was") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116160215.25715-1-ville.syrjala@linux.intel.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Update DRIVER_DATE to 20171117Rodrigo Vivi2017-11-171-2/+2
| | | | Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915: Add a policy note for removing workaroundsChris Wilson2017-11-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | Rodrigo gave a persuasive argument for keeping workarounds: that they serve as a good guide for the bring up of the next generation. Not only do workarounds persist into the early revisions, they show where the workarounds were previously added to the code flow and sometimes the old workarounds have an explanation that give insight into their wider implications. Based on his suggestion, document the policy that we want to keep the workarounds from the current generation to guide the next. Older preproduction workarounds we still want to remove to keep the code clean. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171117102635.8689-1-chris@chris-wilson.co.uk Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
* drm/i915/selftests: Report ENOMEM clearly for an allocation failureChris Wilson2017-11-171-10/+26
| | | | | | | | | | | | | | | | | | If we can not run the drunk_hole test because we couldn't allocate the memory for the permutation array (even after we tried trimming the size), report a clear ENOMEM. Similary, if we are asked to operate on a hole too small for ourselves, make it skip quietly. v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL. v3: Fixup similar construction for lowlevel_hole v4: Use u64 >> 1 to avoid 64b div. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117101732.4335-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117162945.16390-1-chris@chris-wilson.co.uk
* Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"Radhakrishna Sripada2017-11-172-15/+0
| | | | | | | | | | | | | | This reverts commit 8f067837c4b713ce2e69be95af7b2a5eb3bd7de8. HSD says "WA withdrawn. It was causing corruption with some images. WA is not strictly necessary since this bug just causes loss of FBC compression with some sizes and images, but doesn't break anything." Fixes: 8f067837c4b7 ("drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117010825.23118-1-radhakrishna.sripada@intel.com
* drm/i915: Calculate g4x intermediate watermarks correctlyMaarten Lankhorst2017-11-171-7/+20
| | | | | | | | | | | | | | | | | | | | | The watermarks it should calculate against are the old optimal watermarks. The currently active crtc watermarks are pure fiction, and are invalid in case of a nonblocking modeset, page flip enabling/disabling planes or any other reason. When the crtc is disabled or during a modeset the intermediate watermarks don't need to be programmed separately, and could be directly assigned to the optimal watermarks. CXSR must always be disabled in the intermediate case for modesets, else we get a WARN for vblank wait timeout. Also rename crtc_state to new_crtc_state, to distinguish it from the old state. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-2-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Calculate vlv/chv intermediate watermarks correctly, v3.Maarten Lankhorst2017-11-171-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | The watermarks it should calculate against are the old optimal watermarks. The currently active crtc watermarks are pure fiction, and are invalid in case of a nonblocking modeset, page flip enabling/disabling planes or any other reason. When the crtc is disabled or during a modeset the intermediate watermarks don't need to be programmed separately, and could be directly assigned to the optimal watermarks. CXSR must always be disabled in the intermediate case for modesets, else we get a WARN for vblank wait timeout. Also rename crtc_state to new_crtc_state, to distinguish it from the old state. Changes since v1: - Use intel_atomic_get_old_crtc_state. (ville) Changes since v2: - Always unset cxsr during modeset. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115163157.14372-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Pass crtc_state to ips toggle functions, v2Maarten Lankhorst2017-11-174-17/+21
| | | | | | | | | Changes since v1: - Only pass crtc_state, not crtc. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-8-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Pass idle crtc_state to intel_dp_sink_crcMaarten Lankhorst2017-11-173-13/+30
| | | | | | | | | | | | IPS can only be enabled if the primary plane is visible, so first make sure sw state matches hw state by waiting for hw_done. After this pass crtc_state to intel_dp_sink_crc() so that can be used, instead of using legacy pointers. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110113503.16253-7-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Enable FIFO underrun reporting after initial fastset, v4.Maarten Lankhorst2017-11-171-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | The firmware may have set up the pipe correctly, but the FIFO underrun and CRC interrupts are likely not enabled. This resulted in debugfs_test.read_all_entries failing on haswell, because of a timeout when reading the crc debugfs entry. Solve this by enabling FIFO underrun reporting after the initial fastset, which lets interrupts be generated as expected. Changes since v1: - Always enable CPU FIFO underrun reporting for >GEN2, and handle GEN2 correctly. Changes since v2: - Remove unneeded HAS_DDI, simplify GEN2 case. Changes since v3: - Use intel_crtc_pch_transcoder to determine pch transcoder for underruns. (Ville) - Remove crtc->config dereference in intel_crtc_pch_transcoder. (Ville) Testcase: debugfs_test.read_all_entries Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113144043.58658-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
* drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIMChris Wilson2017-11-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM") tried to fixup the check_flush_dependency warning for hitting i915_gem_userptr_mn_invalidate_range_start from within the shrinker, but I failed to notice userptr has 2 similarly named workqueues. I marked up i915-userptr-acquire as WQ_MEM_RECLAIM whereas we only wait upon i915-userptr-release from inside the reclaim paths. [62530.869510] workqueue: PF_MEMALLOC task 7983(gem_shrink) is flushing !WQ_MEM_RECLAIM i915-userptr-release: (null) [62530.869515] ------------[ cut here ]------------ [62530.869519] WARNING: CPU: 1 PID: 7983 at kernel/workqueue.c:2434 check_flush_dependency+0x7f/0x110 [62530.869519] Modules linked in: pegasus mii ip6table_filter ip6_tables bnep iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul 8250_dw ghash_clmulni_intel snd_seq pcbc snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 iwlwifi btintel crypto_simd glue_helper cryptd bluetooth snd intel_cstate input_leds idma64 intel_rapl_perf ecdh_generic serio_raw soundcore cfg80211 wmi_bmof virt_dma intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio winbond_cir soc_button_array rc_core spidev tpm_crb intel_hid acpi_pad mac_hid sparse_keymap [62530.869546] parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid [62530.869557] CPU: 1 PID: 7983 Comm: gem_shrink Tainted: G U W L 4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1 [62530.869558] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017 [62530.869559] task: ffffa1049dbeec80 task.stack: ffffae7d05c44000 [62530.869560] RIP: 0010:check_flush_dependency+0x7f/0x110 [62530.869561] RSP: 0018:ffffae7d05c473a0 EFLAGS: 00010286 [62530.869562] RAX: 000000000000006e RBX: ffffa1049540f400 RCX: ffffffffa3e55788 [62530.869562] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000202 [62530.869563] RBP: ffffae7d05c473c0 R08: 000000000000006e R09: 000000000038bb0e [62530.869563] R10: 0000000000000000 R11: 000000000000006e R12: ffffa1049dbeec80 [62530.869564] R13: 0000000000000000 R14: 0000000000000000 R15: ffffae7d05c473e0 [62530.869565] FS: 00007f621b129880(0000) GS:ffffa1050b240000(0000) knlGS:0000000000000000 [62530.869566] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [62530.869566] CR2: 00007f6214400000 CR3: 0000000353a17003 CR4: 00000000003606e0 [62530.869567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [62530.869567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [62530.869568] Call Trace: [62530.869570] flush_workqueue+0x115/0x3d0 [62530.869573] ? wake_up_process+0x15/0x20 [62530.869596] i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915] [62530.869614] ? i915_gem_userptr_mn_invalidate_range_start+0x12f/0x160 [i915] [62530.869616] __mmu_notifier_invalidate_range_start+0x55/0x80 [62530.869618] try_to_unmap_one+0x791/0x8b0 [62530.869620] ? call_rwsem_down_read_failed+0x18/0x30 [62530.869622] rmap_walk_anon+0x10b/0x260 [62530.869624] rmap_walk+0x48/0x60 [62530.869625] try_to_unmap+0x93/0xf0 [62530.869626] ? page_remove_rmap+0x2a0/0x2a0 [62530.869627] ? page_not_mapped+0x20/0x20 [62530.869629] ? page_get_anon_vma+0x90/0x90 [62530.869630] ? invalid_mkclean_vma+0x20/0x20 [62530.869631] migrate_pages+0x946/0xaa0 [62530.869633] ? __ClearPageMovable+0x10/0x10 [62530.869635] ? isolate_freepages_block+0x3c0/0x3c0 [62530.869636] compact_zone+0x22f/0x970 [62530.869638] compact_zone_order+0xa3/0xd0 [62530.869640] try_to_compact_pages+0x1a5/0x2a0 [62530.869641] ? try_to_compact_pages+0x1a5/0x2a0 [62530.869643] __alloc_pages_direct_compact+0x50/0x110 [62530.869644] __alloc_pages_slowpath+0x4da/0xf30 [62530.869646] __alloc_pages_nodemask+0x262/0x280 [62530.869648] alloc_pages_vma+0x165/0x1e0 [62530.869649] shmem_alloc_hugepage+0xd0/0x130 [62530.869651] ? __radix_tree_insert+0x45/0x230 [62530.869652] ? __vm_enough_memory+0x29/0x130 [62530.869654] shmem_alloc_and_acct_page+0x10d/0x1e0 [62530.869655] shmem_getpage_gfp+0x426/0xc00 [62530.869657] shmem_fault+0xa0/0x1e0 [62530.869659] ? file_update_time+0x60/0x110 [62530.869660] __do_fault+0x1e/0xc0 [62530.869661] __handle_mm_fault+0xa35/0x1170 [62530.869662] handle_mm_fault+0xcc/0x1c0 [62530.869664] __do_page_fault+0x262/0x4f0 [62530.869666] do_page_fault+0x2e/0xe0 [62530.869667] page_fault+0x22/0x30 [62530.869668] RIP: 0033:0x404335 [62530.869669] RSP: 002b:00007fff7829e420 EFLAGS: 00010216 [62530.869670] RAX: 00007f6210400000 RBX: 0000000000000004 RCX: 0000000000b80000 [62530.869670] RDX: 0000000000002e01 RSI: 0000000000008000 RDI: 0000000000000004 [62530.869671] RBP: 0000000000000019 R08: 0000000000000002 R09: 0000000000000000 [62530.869671] R10: 0000000000000559 R11: 0000000000000246 R12: 0000000008000000 [62530.869672] R13: 00000000004042f0 R14: 0000000000000004 R15: 000000000000007e [62530.869673] Code: 00 8b b0 18 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 c0 06 00 00 4d 89 f0 48 c7 c7 40 c0 c8 a3 c6 05 68 c5 e8 00 01 e8 c2 68 04 00 <0f> ff 4d 85 ed 74 18 49 8b 45 20 48 8b 70 08 8b 86 00 01 00 00 [62530.869691] ---[ end trace 01e01ad0ff5781f8 ]--- Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103739 Fixes: 21cc6431e0c2 ("drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114173520.8829-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
* drm/i915: Add might_sleep() check to wait_for()Chris Wilson2017-11-171-9/+2
| | | | | | | | | | | | We should long past the time of trying to use wait_for() from inside atomic contexts, so add a might_sleep() check to prevent misuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114215655.4849-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
* drm/i915/selftests: Add a GuC doorbells selftestMichel Thierry2017-11-173-0/+372
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first test aims to check guc_init_doorbell_hw, changing the existing guc clients and doorbells state before calling it. The second test tries to create as many clients as it is currently possible (currently limited to max number of doorbells) and exercise the doorbell alloc/dealloc code. Since our usage mode require very few clients/doorbells, this code has been exercised very lightly and it's good to have a simple test for it. As reference, this test already helped identify the bug fixed by commit 7f1ea2ac3017 ("drm/i915/guc: Fix doorbell id selection"). v2: Extend number of clients; check for client allocation failure when number of doorbells is exceeded; validate client properties; reuse guc_init_doorbell_hw (Chris). v3: guc_init_doorbell_hw test added per Chris suggestion. v4: Try to explain why guc_init_doorbell_hw exist and comment some details in the subtest. v5: Remove redundant pr_info at the beginning of each subtest (Chris); rebase (s/i915_guc_client/intel_guc_client/). Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* Merge tag 'gvt-next-2017-11-16' of https://github.com/intel/gvt-linux into ↵Rodrigo Vivi2017-11-1623-1010/+1906
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drm-intel-next-queued gvt-next-2017-11-16 - CSB HWSP update support (Weinan) - GVT debug helpers, dyndbg and debugfs (Chuanxiao, Shuo) - full virtualized opregion (Xiaolin) - VM health check for sane fallback (Fred) - workload submission code refactor for future enabling (Zhi) - Updated repo URL in MAINTAINERS (Zhenyu) - other many misc fixes Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171116092007.ww5bvfx7rf36bjmn@zhen-hp.sh.intel.com
| * drm/i915/gvt: Move request alloc to dispatch_workload path onlyfred gao2017-11-161-4/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the performance is improved through the workload auditing and shadowing ahead of vGPU scheduling, however, there is the case that more requests are allocated in submit_context before the previous request is added, the timeline will hold its seqno which is later. This patch is to move the request alloc to dispatch_workload function, where is the same place as request is added. It will fix the issue of kernel BUG for (timeline->seqno != request->fence.seqno) check when add_request. Fixes: 89ea20b930cb ("drm/i915/gvt: Factor out scan and shadow from workload dispatch") Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Let each vgpu has separate opregion memoryXiong Zhang2017-11-163-78/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently every vgpu share a common gvt opregion memory, but it is freed at vgpu destroy, then the later vgpu doesn't have opregion memory once the first vgpu is destroyed. This cause guest function failure like reboot, second or later boot. This patch allocate and init virt opregion memory for each vgpu, so this memory could be freed at vgpu destroy. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Limit read hw reg to active vgpuXiong Zhang2017-11-162-4/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mmio_read_from_hw() let vgpu could read hw reg, if vgpu's workload is running on hw, things is good. Otherwise vgpu will get other vgpu's reg val, it is unsafe. This patch limit such hw access to active vgpu. If vgpu isn't running on hw, the reg read of this vgpu will get the last active val which saved at schedule_out. v2: ring timestamp is walking continuously even if the ring is idle. so read hw directly. (Zhenyu) Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * Revert "drm/i915/gvt: Refine broken PPGTT scratch"Zhenyu Wang2017-11-162-112/+101
| | | | | | | | | | | | | | | | | | | | | | This reverts commit b20d09886fd1b74cd2255d846029a049e524db14. This caused windows driver boot errors for invalid page address. Revert for now. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Emulate PCI expansion ROM base address registerChangbin Du2017-11-161-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our vGPU doesn't have a device ROM, we need follow the PCI spec to report this info to drivers. Otherwise, we would see below errors. Inspecting possible rom at 0xfe049000 (vd=8086:1912 bdf=00:10.0) qemu-system-x86_64: vfio-pci: Cannot read device rom at 00000000-0000-0000-0000-000000000001 Device option ROM contents are probably invalid (check dmesg). Skip option ROM probe with rombar=0, or load from file with romfile=No option rom signature (got 4860) I will also send a improvement patch to PCI subsystem related to PCI ROM. But no idea to omit below error, since no pattern to detect vbios shadow without touch its content. 0000:00:10.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000 Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Make gvt_vgpu_err use pr_errChangbin Du2017-11-161-2/+2
| | | | | | | | | | | | | | | | gvt_vgpu_err means something goes wrong. We need the error propagates to kernel message by default. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Don't dump partial state in cmd parserChangbin Du2017-11-161-8/+3
| | | | | | | | | | | | | | | | I have seen the cmd parser dump partial odd info. Stop that and only dump the full verbose info when debug enabled. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Reduce rcs mocs switch latencyChangbin Du2017-11-161-1/+1
| | | | | | | | | | | | | | | | Use I915_WRITE_FW instead of I915_WRITE to reduce overhead. The overall mmio switch latency lowers from ~600us to ~180us. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Add new debugfs tool mmio_diffChangbin Du2017-11-161-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new debugfs entry is used to figure out which registers of vGPU is different to host. It is a useful tool for new platform enabling and debugging. When read this entry, all the diff mmio are recognized and sorted by mmio offset. Besides, the bit positions of different value are listed in 'Diff' column. Here is a show: $ sudo cat ./mmio_diff Offset HW vGPU Diff 00002030 000025f8 00000000 3-8,10,13 00002034 012025f8 00000000 3-8,10,13,21,24 00002038 027fb000 00000000 12-13,15-22,25 0000203c 00003000 00000000 12-13 00002054 0000000a 00000040 1,3,6 00002074 012025f8 00000000 3-8,10,13,21,24 00002080 fffe6000 00000000 13-14,17-31 000020a8 fffffeff ffffffff 8 000020d4 00000004 00000000 2 .... 00145974 eb42718c 010c11b0 2-5,13-14,17-19,22,25,27,29-31 00145978 0000002f 0000002a 0,2 0014597c 0000002f 0000002a 0,2 00145980 0000002b 00000028 0-1 00145984 a5a87c9e b27d20c0 1-4,6,10-12,14,16,18,20,22-26,28 001459c0 88390000 883c0000 16,18 00146200 88350000 883a0000 16-19 Total: 72432, Diff: 901 v3: fix a typo. v2: add mmio_hw_access_pre/post(). Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Add mmio iterator intel_gvt_for_each_tracked_mmio()Changbin Du2017-11-163-15/+49
| | | | | | | | | | | | | | | | | | | | | | This patch add a function intel_gvt_for_each_tracked_mmio() to iterate each tracked mmio. The caller don't be aware of how the tracked mmios are presented internally. v2: remove snapshot_hw_mmio_registers(). Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: opregion virtualization for win guestXiaolin Zhang2017-11-161-9/+73
| | | | | | | | | | | | | | | | | | | | this is an enhanced opregion emulation for win guest support by initializing more data members including opregion header size, version and child device propertity for display port. for simplicity, redefined child_device_config structure. Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: update CSB and CSB write pointer in virtual HWSPWeinan Li2017-11-164-1/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The engine provides a mirror of the CSB and CSB write pointer in the HWSP. Read these status from virtual HWSP in VM can reduce CPU utilization while applications have much more short GPU workloads. Here we update the corresponding data in virtual HWSP as it in virtual MMIO. Before read these status from HWSP in GVT-g VM, please ensure the host support it by checking the BIT(3) of caps in PVINFO. Virtual HWSP only support GEN8+ platform, since the HWSP MMIO may change follow the platform update, please add the corresponding MMIO emulation when enable new platforms in GVT-g. v3 : Add address audit in HWSP address update. v4 : Separate this patch with enalbe virtual HWSP in VM. Use intel_gvt_render_mmio_to_ring_id() to determine ring_id by offset. v5 : Remove unnessary check about Gen8, GVT-g only support Gen8+. Signed-off-by: Weinan Li <weinan.z.li@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
| * drm/i915/gvt: Refine broken PPGTT scratchZhi Wang2017-11-162-101/+112
| | | | | | | | | | | | | | | | Refine previously broken PPGTT scratch. Scratch PTE was no correctly handled and also the handling of scratch entries in page table walk was not well organized, which brings gaps of introducing lazy shadow. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Introduce ops->set_present()Zhi Wang2017-11-162-0/+7
| | | | | | | | | | | | | | We need ops->set_present() during generating a new scratch page table entry. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Introduce page table type of current level in GTT type ↵Zhi Wang2017-11-161-1/+21
| | | | | | | | | | | | | | | | | | enumerations Need to figure out page table type of current level by GTT entry type during getting a scratch page table entry. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Fix a bug of unexpectedly clear scratch page tableZhi Wang2017-11-161-9/+0
| | | | | | | | | | | | | | During a vGPU reset, the scratch page table shouldn't be cleared, what needs to be cleared should be the scratch page. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Let the caller choose if a shadow page should be put into hash ↵Zhi Wang2017-11-162-14/+15
| | | | | | | | | | | | | | | | | | | | | | table As we want to re-use intel_vgpu_shadow_page in buidling scrach page table and we don't want to put scrach page table page into hash table, a new param is introduced to give the caller a choice to decide if a shadow page should be put into hash table. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Use I915_GTT_PAGE_SIZEZhi Wang2017-11-165-43/+45
| | | | | | | | | | | | | | As there is already an I915_GTT_PAGE_SIZE marco in i915, let GVT-g use it as well. Also this patch re-names some GTT marcos with additional prefix. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Export intel_gvt_render_mmio_to_ring_id()Zhi Wang2017-11-162-6/+17
| | | | | | | | | | | | | | Since many emulation logic needs to convert the offset of ring registers into ring id, we export it for other caller which might need it. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Factor intel_vgpu_page_trackZhi Wang2017-11-164-114/+136
| | | | | | | | | | | | | | | | As the data structure of "intel_vgpu_guest_page" will become much heavier in future, it's better to factor out the guest memory page track mechnisim as early as possible. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Refine shadow batch bufferZhi Wang2017-11-163-71/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Use standard i915 GEM object sequence to access the shadow batch buffer. 2) Manage i915 vma life cycle to solve one FIXME. v2: - Refine code structure. - Refine the usage of GEM APIs. - Add the missing lock/unlock in release_shadow_batch_buffer. Test on my SKL NuC. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Refine find_bb_size()Zhi Wang2017-11-161-16/+14
| | | | | | | | | | | | | | Returns the error code if something is wrong and the size of batch buffer is passed through the pointer. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Use BIT() to make klockwork happyZhi Wang2017-11-161-3/+3
| | | | | | | | | | | | | | Replace the plain bit usage with BIT() to make klockwork happy. Cc: Deng Hongyi <hongyi.deng@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Add basic debugfs infrastructureChangbin Du2017-11-166-4/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need debugfs entry to expose some debug information of gvt and vGPUs. The first tool will be added is mmio-diff, which help to find the difference values of host and vGPU mmio. It's useful for platform enabling. This patch just add a basic debugfs infrastructure, each vGPU has its own sub-folder. Two simple attributes are created as a template. . ├── num_tracked_mmio ├── vgpu1 | └── active └── vgpu2 └── active Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Refactor vGPU type code in kvmgt partfred gao2017-11-161-120/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | all the vGPU type related code in kvmgt will be moved into gvt.c/gvt.h files while the common vGPU type related interfaces will be called. v2: - intel_gvt_{init,cleanup}_vgpu_type_groups are initialized in gvt part. (Wang, Zhi) Signed-off-by: fred gao <fred.gao@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Move vGPU type related code into gvt filefred gao2017-11-162-0/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this patch, all the vGPU type related code will be merged into same gvt file and the common interface will be exposed to both XenGT and KvmGT. v2: - remove the useless mdev_* gvt_ops. add get_gvt_attr ops for MPT module. intel_gvt_{init,cleanup}_vgpu_type_groups are initialized in gvt part. (Wang, Zhi) - set gvt_vgpu_type_groups[i] to NULL. (Zhang,Xiong) Signed-off-by: fred gao <fred.gao@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
| * drm/i915/gvt: Move clean_workloads() into scheduler.cZhi Wang2017-11-162-40/+38
| | | | | | | | | | | | | | | | | | | | | | Move clean_workloads() into scheduler.c since it's not specific to execlist. v2: - Remove clean_workloads in intel_vgpu_select_submission_ops. (Zhenyu) Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>