linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	x86/perf: Fix a typo	Hu Haowen	2020-07-22	1	-1/+1
\| \| \| \| \| \| \| \|	The word "Zhoaxin" is incorrect and the right one is "Zhaoxin". Signed-off-by: Hu Haowen <xianfengting221@163.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200719105007.57649-1-xianfengting221@163.com
*	perf: <linux/perf_event.h>: drop a duplicated word	Randy Dunlap	2020-07-22	1	-1/+1
\| \| \| \| \| \| \| \|	Drop the repeated word "the" in a comment. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200719003027.20798-1-rdunlap@infradead.org
*	perf/x86/intel/lbr: Support XSAVES for arch LBR read	Kan Liang	2020-07-08	3	-1/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reading LBR registers in a perf NMI handler for a non-PEBS event causes a high overhead because the number of LBR registers is huge. To reduce the overhead, the XSAVES instruction should be used to replace the LBR registers' reading method. The XSAVES buffer used for LBR read has to be per-CPU because the NMI handler invoked the lbr_read(). The existing task_ctx_data buffer cannot be used which is per-task and only be allocated for the LBR call stack mode. A new lbr_xsave pointer is introduced in the cpu_hw_events as an XSAVES buffer for LBR read. The XSAVES buffer should be allocated only when LBR is used by a non-PEBS event on the CPU because the total size of the lbr_xsave is not small (~1.4KB). The XSAVES buffer is allocated when a non-PEBS event is added, but it is lazily released in x86_release_hardware() when perf releases the entire PMU hardware resource, because perf may frequently schedule the event, e.g. high context switch. The lazy release method reduces the overhead of frequently allocate/free the buffer. If the lbr_xsave fails to be allocated, roll back to normal Arch LBR lbr_read(). Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-24-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch	Kan Liang	2020-07-08	6	-10/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the LBR call stack mode, LBR information is used to reconstruct a call stack. To get the complete call stack, perf has to save/restore all LBR registers during a context switch. Due to a large number of the LBR registers, this process causes a high CPU overhead. To reduce the CPU overhead during a context switch, use the XSAVES/XRSTORS instructions. Every XSAVE area must follow a canonical format: the legacy region, an XSAVE header and the extended region. Although the LBR information is only kept in the extended region, a space for the legacy region and XSAVE header is still required. Add a new dedicated structure for LBR XSAVES support. Before enabling XSAVES support, the size of the LBR state has to be sanity checked, because: - the size of the software structure is calculated from the max number of the LBR depth, which is enumerated by the CPUID leaf for Arch LBR. The size of the LBR state is enumerated by the CPUID leaf for XSAVE support of Arch LBR. If the values from the two CPUID leaves are not consistent, it may trigger a buffer overflow. For example, a hypervisor may unconsciously set inconsistent values for the two emulated CPUID. - unlike other state components, the size of an LBR state depends on the max number of LBRs, which may vary from generation to generation. Expose the function xfeature_size() for the sanity check. The LBR XSAVES support will be disabled if the size of the LBR state enumerated by CPUID doesn't match with the size of the software structure. The XSAVE instruction requires 64-byte alignment for state buffers. A new macro is added to reflect the alignment requirement. A 64-byte aligned kmem_cache is created for architecture LBR. Currently, the structure for each state component is maintained in fpu/types.h. The structure for the new LBR state component should be maintained in the same place. Move structure lbr_entry to fpu/types.h as well for broader sharing. Add dedicated lbr_save/lbr_restore functions for LBR XSAVES support, which invokes the corresponding xstate helpers to XSAVES/XRSTORS LBR information at the context switch when the call stack mode is enabled. Since the XSAVES/XRSTORS instructions will be eventually invoked, the dedicated functions is named with '_xsaves'/'_xrstors' postfix. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-23-git-send-email-kan.liang@linux.intel.com
*	x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature	Kan Liang	2020-07-08	2	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The perf subsystem will only need to save/restore the LBR state. However, the existing helpers save all supported supervisor states to a kernel buffer, which will be unnecessary. Two helpers are introduced to only save/restore requested dynamic supervisor states. The supervisor features in XFEATURE_MASK_SUPERVISOR_SUPPORTED and XFEATURE_MASK_SUPERVISOR_UNSUPPORTED mask cannot be saved/restored using these helpers. The helpers will be used in the following patch. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-22-git-send-email-kan.liang@linux.intel.com
*	x86/fpu/xstate: Support dynamic supervisor feature for LBR	Kan Liang	2020-07-08	3	-5/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Last Branch Records (LBR) registers are used to log taken branches and other control flows. In perf with call stack mode, LBR information is used to reconstruct a call stack. To get the complete call stack, perf has to save/restore all LBR registers during a context switch. Due to the large number of the LBR registers, e.g., the current platform has 96 LBR registers, this process causes a high CPU overhead. To reduce the CPU overhead during a context switch, an LBR state component that contains all the LBR related registers is introduced in hardware. All LBR registers can be saved/restored together using one XSAVES/XRSTORS instruction. However, the kernel should not save/restore the LBR state component at each context switch, like other state components, because of the following unique features of LBR: - The LBR state component only contains valuable information when LBR is enabled in the perf subsystem, but for most of the time, LBR is disabled. - The size of the LBR state component is huge. For the current platform, it's 808 bytes. If the kernel saves/restores the LBR state at each context switch, for most of the time, it is just a waste of space and cycles. To efficiently support the LBR state component, it is desired to have: - only context-switch the LBR when the LBR feature is enabled in perf. - only allocate an LBR-specific XSAVE buffer on demand. (Besides the LBR state, a legacy region and an XSAVE header have to be included in the buffer as well. There is a total of (808+576) byte overhead for the LBR-specific XSAVE buffer. The overhead only happens when the perf is actively using LBRs. There is still a space-saving, on average, when it replaces the constant 808 bytes of overhead for every task, all the time on the systems that support architectural LBR.) - be able to use XSAVES/XRSTORS for accessing LBR at run time. However, the IA32_XSS should not be adjusted at run time. (The XCR0 \| IA32_XSS are used to determine the requested-feature bitmap (RFBM) of XSAVES.) A solution, called dynamic supervisor feature, is introduced to address this issue, which - does not allocate a buffer in each task->fpu; - does not save/restore a state component at each context switch; - sets the bit corresponding to the dynamic supervisor feature in IA32_XSS at boot time, and avoids setting it at run time. - dynamically allocates a specific buffer for a state component on demand, e.g. only allocates LBR-specific XSAVE buffer when LBR is enabled in perf. (Note: The buffer has to include the LBR state component, a legacy region and a XSAVE header space.) (Implemented in a later patch) - saves/restores a state component on demand, e.g. manually invokes the XSAVES/XRSTORS instruction to save/restore the LBR state to/from the buffer when perf is active and a call stack is required. (Implemented in a later patch) A new mask XFEATURE_MASK_DYNAMIC and a helper xfeatures_mask_dynamic() are introduced to indicate the dynamic supervisor feature. For the systems which support the Architecture LBR, LBR is the only dynamic supervisor feature for now. For the previous systems, there is no dynamic supervisor feature available. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-21-git-send-email-kan.liang@linux.intel.com
*	x86/fpu: Use proper mask to replace full instruction mask	Kan Liang	2020-07-08	2	-40/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When saving xstate to a kernel/user XSAVE area with the XSAVE family of instructions, the current code applies the 'full' instruction mask (-1), which tries to XSAVE all possible features. This method relies on hardware to trim 'all possible' down to what is enabled in the hardware. The code works well for now. However, there will be a problem, if some features are enabled in hardware, but are not suitable to be saved into all kernel XSAVE buffers, like task->fpu, due to performance consideration. One such example is the Last Branch Records (LBR) state. The LBR state only contains valuable information when LBR is explicitly enabled by the perf subsystem, and the size of an LBR state is large (808 bytes for now). To avoid both CPU overhead and space overhead at each context switch, the LBR state should not be saved into task->fpu like other state components. It should be saved/restored on demand when LBR is enabled in the perf subsystem. Current copy_xregs_to_* will trigger a buffer overflow for such cases. Three sites use the '-1' instruction mask which must be updated. Two are saving/restoring the xstate to/from a kernel-allocated XSAVE buffer and can use 'xfeatures_mask_all', which will save/restore all of the features present in a normal task FPU buffer. The last one saves the register state directly to a user buffer. It could also use 'xfeatures_mask_all'. Just as it was with the '-1' argument, any supervisor states in the mask will be filtered out by the hardware and not saved to the buffer. But, to be more explicit about what is expected to be saved, use xfeatures_mask_user() for the instruction mask. KVM includes the header file fpu/internal.h. To avoid 'undefined xfeatures_mask_all' compiling issue, move copy_fpregs_to_fpstate() to fpu/core.c and export it, because: - The xfeatures_mask_all is indirectly used via copy_fpregs_to_fpstate() by KVM. The function which is directly used by other modules should be exported. - The copy_fpregs_to_fpstate() is a function, while xfeatures_mask_all is a variable for the "internal" FPU state. It's safer to export a function than a variable, which may be implicitly changed by others. - The copy_fpregs_to_fpstate() is a big function with many checks. The removal of the inline keyword should not impact the performance. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-20-git-send-email-kan.liang@linux.intel.com
*	perf/x86: Remove task_ctx_size	Kan Liang	2020-07-08	4	-9/+1
\| \| \| \| \| \| \| \| \|	A new kmem_cache method has replaced the kzalloc() to allocate the PMU specific data. The task_ctx_size is not required anymore. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-19-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Create kmem_cache for the LBR context data	Kan Liang	2020-07-08	1	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new kmem_cache method is introduced to allocate the PMU specific data task_ctx_data, which requires the PMU specific code to create a kmem_cache. Currently, the task_ctx_data is only used by the Intel LBR call stack feature, which is introduced since Haswell. The kmem_cache should be only created for Haswell and later platforms. There is no alignment requirement for the existing platforms. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-18-git-send-email-kan.liang@linux.intel.com
*	perf/core: Use kmem_cache to allocate the PMU specific data	Kan Liang	2020-07-08	2	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the PMU specific data task_ctx_data is allocated by the function kzalloc() in the perf generic code. When there is no specific alignment requirement for the task_ctx_data, the method works well for now. However, there will be a problem once a specific alignment requirement is introduced in future features, e.g., the Architecture LBR XSAVE feature requires 64-byte alignment. If the specific alignment requirement is not fulfilled, the XSAVE family of instructions will fail to save/restore the xstate to/from the task_ctx_data. The function kzalloc() itself only guarantees a natural alignment. A new method to allocate the task_ctx_data has to be introduced, which has to meet the requirements as below: - must be a generic method can be used by different architectures, because the allocation of the task_ctx_data is implemented in the perf generic code; - must be an alignment-guarantee method (The alignment requirement is not changed after the boot); - must be able to allocate/free a buffer (smaller than a page size) dynamically; - should not cause extra CPU overhead or space overhead. Several options were considered as below: - One option is to allocate a larger buffer for task_ctx_data. E.g., ptr = kmalloc(size + alignment, GFP_KERNEL); ptr &= ~(alignment - 1); This option causes space overhead. - Another option is to allocate the task_ctx_data in the PMU specific code. To do so, several function pointers have to be added. As a result, both the generic structure and the PMU specific structure will become bigger. Besides, extra function calls are added when allocating/freeing the buffer. This option will increase both the space overhead and CPU overhead. - The third option is to use a kmem_cache to allocate a buffer for the task_ctx_data. The kmem_cache can be created with a specific alignment requirement by the PMU at boot time. A new pointer for kmem_cache has to be added in the generic struct pmu, which would be used to dynamically allocate a buffer for the task_ctx_data at run time. Although the new pointer is added to the struct pmu, the existing variable task_ctx_size is not required anymore. The size of the generic structure is kept the same. The third option which meets all the aforementioned requirements is used to replace kzalloc() for the PMU specific data allocation. A later patch will remove the kzalloc() method and the related variables. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-17-git-send-email-kan.liang@linux.intel.com
*	perf/core: Factor out functions to allocate/free the task_ctx_data	Kan Liang	2020-07-08	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	The method to allocate/free the task_ctx_data is going to be changed in the following patch. Currently, the task_ctx_data is allocated/freed in several different places. To avoid repeatedly modifying the same codes in several different places, alloc_task_ctx_data() and free_task_ctx_data() are factored out to allocate/free the task_ctx_data. The modification only needs to be applied once. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-16-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Support Architectural LBR	Kan Liang	2020-07-08	3	-11/+253
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Last Branch Records (LBR) enables recording of software path history by logging taken branches and other control flows within architectural registers now. Intel CPUs have had model-specific LBR for quite some time, but this evolves them into an architectural feature now. The main improvements of Architectural LBR implemented includes: - Linux kernel can support the LBR features without knowing the model number of the current CPU. - Architectural LBR capabilities can be enumerated by CPUID. The lbr_ctl_map is based on the CPUID Enumeration. - The possible LBR depth can be retrieved from CPUID enumeration. The max value is written to the new MSR_ARCH_LBR_DEPTH as the number of LBR entries. - A new IA32_LBR_CTL MSR is introduced to enable and configure LBRs, which replaces the IA32_DEBUGCTL[bit 0] and the LBR_SELECT MSR. - Each LBR record or entry is still comprised of three MSRs, IA32_LBR_x_FROM_IP, IA32_LBR_x_TO_IP and IA32_LBR_x_TO_IP. But they become the architectural MSRs. - Architectural LBR is stack-like now. Entry 0 is always the youngest branch, entry 1 the next youngest... The TOS MSR has been removed. The way to enable/disable Architectural LBR is similar to the previous model-specific LBR. __intel_pmu_lbr_enable/disable() can be reused, but some modifications are required, which include: - MSR_ARCH_LBR_CTL is used to enable and configure the Architectural LBR. - When checking the value of the IA32_DEBUGCTL MSR, ignoring the DEBUGCTLMSR_LBR (bit 0) for Architectural LBR, which has no meaning and always return 0. - The FREEZE_LBRS_ON_PMI has to be explicitly set/clear, because MSR_IA32_DEBUGCTLMSR is not touched in __intel_pmu_lbr_disable() for Architectural LBR. - Only MSR_ARCH_LBR_CTL is cleared in __intel_pmu_lbr_disable() for Architectural LBR. Some Architectural LBR dedicated functions are implemented to reset/read/save/restore LBR. - For reset, writing to the ARCH_LBR_DEPTH MSR clears all Arch LBR entries, which is a lot faster and can improve the context switch latency. - For read, the branch type information can be retrieved from the MSR_ARCH_LBR_INFO_*. But it's not fully compatible due to OTHER_BRANCH type. The software decoding is still required for the OTHER_BRANCH case. LBR records are stored in the age order as well. Reuse intel_pmu_store_lbr(). Check the CPUID enumeration before accessing the corresponding bits in LBR_INFO. - For save/restore, applying the fast reset (writing ARCH_LBR_DEPTH). Reading 'lbr_from' of entry 0 instead of the TOS MSR to check if the LBR registers are reset in the deep C-state. If 'the deep C-state reset' bit is not set in CPUID enumeration, ignoring the check. XSAVE support for Architectural LBR will be implemented later. The number of LBR entries cannot be hardcoded anymore, which should be retrieved from CPUID enumeration. A new structure x86_perf_task_context_arch_lbr is introduced for Architectural LBR. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-15-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Factor out intel_pmu_store_lbr	Kan Liang	2020-07-08	1	-26/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The way to store the LBR information from a PEBS LBR record can be reused in Architecture LBR, because - The LBR information is stored like a stack. Entry 0 is always the youngest branch. - The layout of the LBR INFO MSR is similar. The LBR information may be retrieved from either the LBR registers (non-PEBS event) or a buffer (PEBS event). Extend rdlbr_*() to support both methods. Explicitly check the invalid entry (0s), which can avoid unnecessary MSR access if using a non-PEBS event. For a PEBS event, the check should slightly improve the performance as well. The invalid entries are cut. The intel_pmu_lbr_filter() doesn't need to check and filter them out. Cannot share the function with current model-specific LBR read, because the direction of the LBR growth is opposite. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-14-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all()	Kan Liang	2020-07-08	2	-17/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous model-specific LBR and Architecture LBR (legacy way) use a similar method to save/restore the LBR information, which directly accesses the LBR registers. The codes which read/write a set of LBR registers can be shared between them. Factor out two functions which are used to read/write a set of LBR registers. Add lbr_info into structure x86_pmu, and use it to replace the hardcoded LBR INFO MSR, because the LBR INFO MSR address of the previous model-specific LBR is different from Architecture LBR. The MSR address should be assigned at boot time. For now, only Sky Lake and later platforms have the LBR INFO MSR. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-13-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline	Kan Liang	2020-07-08	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The {rd,wr}lbr_{to,from} wrappers are invoked in hot paths, e.g. context switch and NMI handler. They should be always inline to achieve better performance. However, the CONFIG_OPTIMIZE_INLINING allows the compiler to uninline functions marked 'inline'. Mark the {rd,wr}lbr_{to,from} wrappers as __always_inline to force inline the wrappers. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-12-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Unify the stored format of LBR information	Kan Liang	2020-07-08	4	-22/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	Current LBR information in the structure x86_perf_task_context is stored in a different format from the PEBS LBR record and Architecture LBR, which prevents the sharing of the common codes. Use the format of the PEBS LBR record as a unified format. Use a generic name lbr_entry to replace pebs_lbr_entry. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-11-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Support LBR_CTL	Kan Liang	2020-07-08	2	-3/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An IA32_LBR_CTL is introduced for Architecture LBR to enable and config LBR registers to replace the previous LBR_SELECT. All the related members in struct cpu_hw_events and struct x86_pmu have to be renamed. Some new macros are added to reflect the layout of LBR_CTL. The mapping from PERF_SAMPLE_BRANCH_* to the corresponding bits in LBR_CTL MSR is saved in lbr_ctl_map now, which is not a const value. The value relies on the CPUID enumeration. For the previous model-specific LBR, most of the bits in LBR_SELECT operate in the suppressed mode. For the bits in LBR_CTL, the polarity is inverted. For the previous model-specific LBR format 5 (LBR_FORMAT_INFO), if the NO_CYCLES and NO_FLAGS type are set, the flag LBR_NO_INFO will be set to avoid the unnecessary LBR_INFO MSR read. Although Architecture LBR also has a dedicated LBR_INFO MSR, perf doesn't need to check and set the flag LBR_NO_INFO. For Architecture LBR, XSAVES instruction will be used as the default way to read the LBR MSRs all together. The overhead which the flag tries to avoid doesn't exist anymore. Dropping the flag can save the extra check for the flag in the lbr_read() later, and make the code cleaner. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-10-git-send-email-kan.liang@linux.intel.com
*	perf/x86: Expose CPUID enumeration bits for arch LBR	Kan Liang	2020-07-08	2	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \|	The LBR capabilities of Architecture LBR are retrieved from the CPUID enumeration once at boot time. The capabilities have to be saved for future usage. Several new fields are added into structure x86_pmu to indicate the capabilities. The fields will be used in the following patches. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-9-git-send-email-kan.liang@linux.intel.com
*	x86/msr-index: Add bunch of MSRs for Arch LBR	Kan Liang	2020-07-08	1	-0/+16
\| \| \| \| \| \| \| \|	Add Arch LBR related MSRs and the new LBR INFO bits in MSR-index. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-8-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Use dynamic data structure for task_ctx	Kan Liang	2020-07-08	2	-34/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The type of task_ctx is hardcoded as struct x86_perf_task_context, which doesn't apply for Architecture LBR. For example, Architecture LBR doesn't have the TOS MSR. The number of LBR entries is variable. A new struct will be introduced for Architecture LBR. Perf has to determine the type of task_ctx at run time. The type of task_ctx pointer is changed to 'void *', which will be determined at run time. The generic LBR optimization can be shared between Architecture LBR and model-specific LBR. Both need to access the structure for the generic LBR optimization. A helper task_context_opt() is introduced to retrieve the pointer of the structure at run time. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-7-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Factor out a new struct for generic optimization	Kan Liang	2020-07-08	2	-20/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To reduce the overhead of a context switch with LBR enabled, some generic optimizations were introduced, e.g. avoiding restore LBR if no one else touched them. The generic optimizations can also be used by Architecture LBR later. Currently, the fields for the generic optimizations are part of structure x86_perf_task_context, which will be deprecated by Architecture LBR. A new structure should be introduced for the common fields of generic optimization, which can be shared between Architecture LBR and model-specific LBR. Both 'valid_lbrs' and 'tos' are also used by the generic optimizations, but they are not moved into the new structure, because Architecture LBR is stack-like. The 'valid_lbrs' which records the index of the valid LBR is not required anymore. The TOS MSR will be removed. LBR registers may be cleared in the deep Cstate. If so, the generic optimizations should not be applied. Perf has to unconditionally restore the LBR registers. A generic function is required to detect the reset due to the deep Cstate. lbr_is_reset_in_cstate() is introduced. Currently, for the model-specific LBR, the TOS MSR is used to detect the reset. There will be another method introduced for Architecture LBR later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-6-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Add the function pointers for LBR save and restore	Kan Liang	2020-07-08	3	-30/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MSRs of Architectural LBR are different from previous model-specific LBR. Perf has to implement different functions to save and restore them. The function pointers for LBR save and restore are introduced. Perf should initialize the corresponding functions at boot time. The generic optimizations, e.g. avoiding restore LBR if no one else touched them, still apply for Architectural LBRs. The related codes are not moved to model-specific functions. Current model-specific LBR functions are set as default. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-5-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Add a function pointer for LBR read	Kan Liang	2020-07-08	3	-7/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The method to read Architectural LBRs is different from previous model-specific LBR. Perf has to implement a different function. A function pointer for LBR read is introduced. Perf should initialize the corresponding function at boot time, and avoid checking lbr_format at run time. The current 64-bit LBR read function is set as default. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-4-git-send-email-kan.liang@linux.intel.com
*	perf/x86/intel/lbr: Add a function pointer for LBR reset	Kan Liang	2020-07-08	3	-17/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The method to reset Architectural LBRs is different from previous model-specific LBR. Perf has to implement a different function. A function pointer is introduced for LBR reset. The enum of LBR_FORMAT_* is also moved to perf_event.h. Perf should initialize the corresponding functions at boot time, and avoid checking lbr_format at run time. The current 64-bit LBR reset function is set as default. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1593780569-62993-3-git-send-email-kan.liang@linux.intel.com
*	x86/cpufeatures: Add Architectural LBRs feature bit	Kan Liang	2020-07-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CPUID.(EAX=07H, ECX=0):EDX[19] indicates whether an Intel CPU supports Architectural LBRs. The "X86_FEATURE_..., word 18" is already mirrored from CPUID "0x00000007:0 (EDX)". Add X86_FEATURE_ARCH_LBR under the "word 18" section. The feature will appear as "arch_lbr" in /proc/cpuinfo. The Architectural Last Branch Records (LBR) feature enables recording of software path history by logging taken branches and other control flows. The feature will be supported in the perf_events subsystem. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Link: https://lkml.kernel.org/r/1593780569-62993-2-git-send-email-kan.liang@linux.intel.com
*	Merge branch 'perf/vlbr'	Peter Zijlstra	2020-07-02	1067	-5649/+9632
\|\
\| *	perf/x86: Keep LBR records unchanged in host context for guest usage	Like Xu	2020-07-02	3	-7/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a guest wants to use the LBR registers, its hypervisor creates a guest LBR event and let host perf schedules it. The LBR records msrs are accessible to the guest when its guest LBR event is scheduled on by the perf subsystem. Before scheduling this event out, we should avoid host changes on IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise, some unexpected branch operations may interfere with guest behavior, pollute LBR records, and even cause host branches leakage. In addition, the read operation on host is also avoidable. To ensure that guest LBR records are not lost during the context switch, the guest LBR event would enable the callstack mode which could save/restore guest unread LBR records with the help of intel_pmu_lbr_sched_task() naturally. However, the guest LBR_SELECT may changes for its own use and the host LBR event doesn't save/restore it. To ensure that we doesn't lost the guest LBR_SELECT value when the guest LBR event is running, the vlbr_constraint is bound up with a new constraint flag PERF_X86_EVENT_LBR_SELECT. Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200514083054.62538-6-like.xu@linux.intel.com
\| *	perf/x86: Add constraint to create guest LBR event without hw counter	Like Xu	2020-07-02	5	-1/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The hypervisor may request the perf subsystem to schedule a time window to directly access the LBR records msrs for its own use. Normally, it would create a guest LBR event with callstack mode enabled, which is scheduled along with other ordinary LBR events on the host but in an exclusive way. To avoid wasting a counter for the guest LBR event, the perf tracks its hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it with a fake VLBR counter with the help of new vlbr_constraint. As with the BTS event, there is actually no hardware counter assigned for the guest LBR event. Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200514083054.62538-5-like.xu@linux.intel.com
\| *	perf/x86/lbr: Add interface to get LBR information	Like Xu	2020-07-02	2	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LBR records msrs are model specific. The perf subsystem has already obtained the base addresses of LBR records based on the cpu model. Therefore, an interface is added to allow callers outside the perf subsystem to obtain these LBR information. It's useful for hypervisors to emulate the LBR feature for guests with less code. Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200613080958.132489-4-like.xu@linux.intel.com
\| *	perf/x86/core: Refactor hw->idx checks and cleanup	Like Xu	2020-07-02	2	-48/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For intel_pmu_en/disable_event(), reorder the branches checks for hw->idx and make them sorted by probability: gp,fixed,bts,others. Clean up the x86_assign_hw_event() by converting multiple if-else statements to a switch statement. To skip x86_perf_event_update() and x86_perf_event_set_period(), it's generic to replace "idx == INTEL_PMC_IDX_FIXED_BTS" check with '!hwc->event_base' because that should be 0 for all non-gp/fixed cases. Wrap related bit operations into intel_set/clear_masks() and make the main path more cleaner and readable. No functional changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Original-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200613080958.132489-3-like.xu@linux.intel.com
\| *	perf/x86: Fix variable types for LBR registers	Wei Wang	2020-07-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MSR variable type can be 'unsigned int', which uses less memory than the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr won't be a negative number, so make it 'unsigned int' as well. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200613080958.132489-2-like.xu@linux.intel.com
\| *	Linux 5.8-rc3v5.8-rc3	Linus Torvalds	2020-06-29	1	-1/+1
\| \|
\| *	Merge tag 'arm-omap-fixes-5.8-1' of ↵	Linus Torvalds	2020-06-28	30	-148/+125
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM OMAP fixes from Arnd Bergmann: "The OMAP developers are particularly active at hunting down regressions, so this is a separate branch with OMAP specific fixes for v5.8: As Tony explains "The recent display subsystem (DSS) related platform data changes caused display related regressions for suspend and resume. Looks like I only tested suspend and resume before dropping the legacy platform data, and forgot to test it after dropping it. Turns out the main issue was that we no longer have platform code calling pm_runtime_suspend for DSS like we did for the legacy platform data case, and that fix is still being discussed on the dri-devel list and will get merged separately. The DSS related testing exposed a pile other other display related issues that also need fixing though": - Fix ti-sysc optional clock handling and reset status checks for devices that reset automatically in idle like DSS - Ignore ti-sysc clockactivity bit unless separately requested to avoid unexpected performance issues - Init ti-sysc framedonetv_irq to true and disable for am4 - Avoid duplicate DSS reset for legacy mode with dts data - Remove LCD timings for am4 as they cause warnings now that we're using generic panels Other OMAP changes from Tony include: - Fix omap_prm reset deassert as we still have drivers setting the pm_runtime_irq_safe() flag - Flush posted write for ti-sysc enable and disable - Fix droid4 spi related errors with spi flags - Fix am335x USB range and a typo for softreset - Fix dra7 timer nodes for clocks for IPU and DSP - Drop duplicate mailboxes after mismerge for dra7 - Prevent pocketgeagle header line signal from accidentally setting micro-SD write protection signal by removing the default mux - Fix NFSroot flakeyness after resume for duover by switching the smsc911x gpio interrupt to back to level sensitive - Fix regression for omap4 clockevent source after recent system timer changes - Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid" phy-mode - One patch to convert am3/am4 DT files to use the regular sdhci-omap driver instead of the old hsmmc driver, this was meant for the merge window but got lost in the process" * tag 'arm-omap-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits) ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode ARM: dts: Fix omap4 system timer source clocks ARM: dts: Fix duovero smsc interrupt for suspend ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect Revert "bus: ti-sysc: Increase max softreset wait" ARM: dts: am437x-epos-evm: remove lcd timings ARM: dts: am437x-gp-evm: remove lcd timings ARM: dts: am437x-sk-evm: remove lcd timings ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag ARM: dts: Fix am33xx.dtsi USB ranges length bus: ti-sysc: Increase max softreset wait ARM: OMAP2+: Fix legacy mode dss_reset bus: ti-sysc: Fix uninitialized framedonetv_irq bus: ti-sysc: Ignore clockactivity unless specified as a quirk bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit ARM: dts: omap4-droid4: Fix spi configuration and increase rate bus: ti-sysc: Flush posted write on enable and disable soc: ti: omap-prm: use atomic iopoll instead of sleeping one ...
\| \| *	Merge tag 'omap-for-v5.8/fixes-rc1-signed' of ↵	Arnd Bergmann	2020-06-28	4	-4/+3
\| \| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-fixes Few dts fixes for omaps for v5.8 Few fixes for various devices: - Prevent pocketgeagle header line signal from accidentally setting micro-SD write protection signal by removing the default mux - Fix NFSroot flakeyness after resume for duover by switching the smsc911x gpio interrupt to back to level sensitive - Fix regression for omap4 clockevent source after recent system timer changes - Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid" phy-mode * tag 'omap-for-v5.8/fixes-rc1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode ARM: dts: Fix omap4 system timer source clocks ARM: dts: Fix duovero smsc interrupt for suspend ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect Link: https://lore.kernel.org/r/pull-1592499282-121092@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
\| \| \| *	Merge branch 'omap-for-v5.8/fixes-rc1' into fixes	Tony Lindgren	2020-06-16	4	-4/+3
\| \| \| \|\
\| \| \| \| *	ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode	Drew Fustini	2020-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit cd28d1d6e52e ("net: phy: at803x: Disable phy delay for RGMII mode") the networking is broken on the BeagleBone AI which has the AR8035 PHY for Gigabit Ethernet [0]. The fix is to switch from phy-mode = "rgmii" to phy-mode = "rgmii-rxid". Note: Grygorii made a similar DT fix for other AM57xx boards with a different phy in commit 820f8a870f65 ("ARM: dts: am57xx: fix networking on boards with ksz9031 phy"). [0] https://git.io/Jf7PX Fixes: 520557d4854b ("ARM: dts: am5729: beaglebone-ai: adding device tree") Cc: Vinod Koul <vkoul@kernel.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Robert Nelson <robertcnelson@gmail.com> Signed-off-by: Drew Fustini <drew@beagleboard.org> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| \| *	ARM: dts: Fix omap4 system timer source clocks	Tony Lindgren	2020-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I accidentally flipped the system timer to use system clock instead of the 32k source clock. Fixes: 14b1925a7219 ("ARM: dts: Configure system timers for omap4") Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| \| *	ARM: dts: Fix duovero smsc interrupt for suspend	Tony Lindgren	2020-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While testing the recent suspend and resume regressions I noticed that duovero can still end up losing edge gpio interrupts on runtime suspend. This causes NFSroot easily stopping working after resume on duovero. Let's fix the issue by using gpio level interrupts for smsc as then the gpio interrupt state is seen by the gpio controller on resume. Fixes: 731b409878a3 ("ARM: dts: Configure duovero for to allow core retention during idle") Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| \| *	ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect	Drew Fustini	2020-06-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AM3358 pin mcasp0_aclkr (ZCZ ball B13) [0] is routed to P1.31 header [1] Mode 4 of this pin is mmc0_sdwp (SD Write Protect). A signal connected to P1.31 may accidentally trigger mmc0 write protection. To avoid this situation, do not put mcasp0_aclkr in mode 4 (mmc0_sdwp) by default. [0] http://www.ti.com/lit/ds/symlink/am3358.pdf [1] https://github.com/beagleboard/pocketbeagle/wiki/System-Reference-Manual#531_Expansion_Headers Fixes: 047905376a16 (ARM: dts: Add am335x-pocketbeagle) Signed-off-by: Robert Nelson <robertcnelson@gmail.com> Signed-off-by: Drew Fustini <drew@beagleboard.org> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| * \|	Merge tag 'v5.8-rc1' into fixes	Tony Lindgren	2020-06-16	14550	-317882/+861017
\| \| \| \|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Linux 5.8-rc1
\| \| * \| \|	Merge tag 'omap-for-v5.8/dt-missed-signed' of ↵	Arnd Bergmann	2020-06-28	19	-26/+22
\| \| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-fixes Missed sdhci patch for am3 and am4 I forgot to send a pull request earlier for converting am3 and am4 to use sdhci-omap driver instead of the old omap_hsmmc driver. There was a display subsystem related suspend and resume regression found recently and looks like I forgot to send a pull request for this patch while debugging the regression. This patch has been tested without the display subsystem, and has been in Linux next for several weeks now, so would be good to have merged for v5.8. * tag 'omap-for-v5.8/dt-missed-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Move am33xx and am43xx mmc nodes to sdhci-omap driver Link: https://lore.kernel.org/r/pull-1591637467-607254@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
\| \| \| * \| \|	ARM: dts: Move am33xx and am43xx mmc nodes to sdhci-omap driver	Faiz Abbas	2020-05-19	19	-26/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move mmc nodes to be compatible with the sdhci-omap driver. The following modifications are required for omap_hsmmc specific properties: ti,non-removable: convert to the generic mmc non-removable ti,needs-special-reset: co-opted into the sdhci-omap driver ti,dual-volt: removed. Legacy property not used in am335x or am43xx ti,needs-special-hs-handling: removed. Legacy property not used in am335x or am43xx Also since the sdhci-omap driver does not support runtime PM, explicitly disable the mmc3 instance in the dtsi. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| * \| \| \|	Merge tag 'omap-for-v5.8/fixes-merge-window-signed' of ↵	Arnd Bergmann	2020-06-28	10	-118/+100
\| \| \|\ \ \ \ \| \| \| \| \|/ / \| \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps for v5.8 The recent display subsystem (DSS) related platform data changes caused display related regressions for suspend and resume. Looks like I only tested suspend and resume before dropping the legacy platform data, and forgot to test it after dropping it. Turns out the main issue was that we no longer have platform code calling pm_runtime_suspend for DSS like we did for the legacy platform data case, and that fix is still being discussed on the dri-devel list and will get merged separately. The DSS related testing exposed a pile other other display related issues that also need fixing though: - Fix ti-sysc optional clock handling and reset status checks for devices that reset automatically in idle like DSS - Ignore ti-sysc clockactivity bit unless separately requested to avoid unexpected performance issues - Init ti-sysc framedonetv_irq to true and disable for am4 - Avoid duplicate DSS reset for legacy mode with dts data - Remove LCD timings for am4 as they cause warnings now that we're using generic panels Then there is a pile of other fixes not related to the DSS: - Fix omap_prm reset deassert as we still have drivers setting the pm_runtime_irq_safe() flag - Flush posted write for ti-sysc enable and disable - Fix droid4 spi related errors with spi flags - Fix am335x USB range and a typo for softreset - Fix dra7 timer nodes for clocks for IPU and DSP - Drop duplicate mailboxes after mismerge for dra7 * tag 'omap-for-v5.8/fixes-merge-window-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: Revert "bus: ti-sysc: Increase max softreset wait" ARM: dts: am437x-epos-evm: remove lcd timings ARM: dts: am437x-gp-evm: remove lcd timings ARM: dts: am437x-sk-evm: remove lcd timings ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag ARM: dts: Fix am33xx.dtsi USB ranges length bus: ti-sysc: Increase max softreset wait ARM: OMAP2+: Fix legacy mode dss_reset bus: ti-sysc: Fix uninitialized framedonetv_irq bus: ti-sysc: Ignore clockactivity unless specified as a quirk bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit ARM: dts: omap4-droid4: Fix spi configuration and increase rate bus: ti-sysc: Flush posted write on enable and disable soc: ti: omap-prm: use atomic iopoll instead of sleeping one Link: https://lore.kernel.org/r/pull-1591889257-410830@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
\| \| \| * \| \|	Revert "bus: ti-sysc: Increase max softreset wait"	Tony Lindgren	2020-06-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 636338d7968e47c7f2e0b772a2a825ad932883fb. This patch is not a proper fixes the i2c2 timeouts are still happening in some cases. Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| * \| \|	ARM: dts: am437x-epos-evm: remove lcd timings	Tomi Valkeinen	2020-06-11	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LCD timings now come from panel-simple. Having timings in the DT will cause a WARN. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| * \| \|	ARM: dts: am437x-gp-evm: remove lcd timings	Tomi Valkeinen	2020-06-11	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LCD timings now come from panel-simple. Having timings in the DT will cause a WARN. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| * \| \|	ARM: dts: am437x-sk-evm: remove lcd timings	Tomi Valkeinen	2020-06-09	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LCD timings now come from panel-simple. Having timings in the DT will cause a WARN. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| * \| \|	Merge branch 'fixes-v5.7' into fixes	Tony Lindgren	2020-06-08	1757	-9942/+19609
\| \| \| \|\ \ \
\| \| \| \| * \| \|	ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes	Suman Anna	2020-06-08	1	-20/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mailbox nodes defined in various dts files have been moved to common dra7-ipu-dsp-common.dtsi and dra74-ipu-dsp-common.dtsi files in commit a11a2f73b32d ("ARM: dts: dra7-ipu-dsp-common: Move mailboxes into common files"), but the nodes were erroneously left out in the dra7-evm-common.dtsi file. Fix this by removing these duplicate nodes. Fixes: a11a2f73b32d ("ARM: dts: dra7-ipu-dsp-common: Move mailboxes into common files") Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
\| \| \| \| * \| \|	ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks	Suman Anna	2020-06-08	1	-18/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 5390130f3b28 ("ARM: dts: dra7: add timer_sys_ck entries for IPU/DSP timers") was added to allow the OMAP clocksource timer driver to use the clock aliases when reconfiguring the parent clock source for the timer functional clocks after the timer_sys_ck clock aliases got cleaned up in commit a8202cd5174d ("clk: ti: dra7: drop unnecessary clock aliases"). The above patch however has missed adding the entries for couple of timers (14, 15 and 16), and also added erroneously in the parent ti-sysc nodes for couple of clocks (timers 4, 5 and 6). Fix these properly, so that any of these timers can be used with OMAP remoteproc IPU and DSP devices. The always-on timers 1 and 12 are not expected to use this clock source, so they are not modified. Fixes: 5390130f3b28 ("ARM: dts: dra7: add timer_sys_ck entries for IPU/DSP timers") Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>