linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'fixes-for-v5.7-rc2' of ↵	Greg Kroah-Hartman	2020-04-20	8	-37/+62
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: USB: fixes for v5.7-rc2 DWC3 learns how to properly set maxpacket limit and got a fix for a request completion bug. The raw gadget got a fix for copy_to/from_user() checks. Atmel got an improvement on vbus disconnect handling. We're also adding support for another SoC to the Renesas DRD driver. * tag 'fixes-for-v5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: usb: raw-gadget: Fix copy_to/from_user() checks usb: raw-gadget: fix raw_event_queue_fetch locking usb: gadget: udc: atmel: Fix vbus disconnect handling usb: dwc3: gadget: Fix request completion check usb: dwc3: gadget: Do link recovery for SS and SSP dt-bindings: usb: renesas,usb3-peri: add r8a77961 support dt-bindings: usb: renesas,usbhs: add r8a77961 support dt-bindings: usb: usb-xhci: add r8a77961 support docs: dt: qcom,dwc3.txt: fix cross-reference for a converted file usb: dwc3: gadget: Properly set maxpacket limit usb: dwc3: Fix GTXFIFOSIZ.TXFDEP macro name usb: gadget: udc: bdc: Remove unnecessary NULL checks in bdc_req_complete
\| *	usb: raw-gadget: Fix copy_to/from_user() checks	Dan Carpenter	2020-04-17	1	-24/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The copy_to/from_user() functions return the number of bytes remaining but we want to return negative error codes. I changed a couple checks in raw_ioctl_ep_read() and raw_ioctl_ep0_read() to show that we still we returning zero on error. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: raw-gadget: fix raw_event_queue_fetch locking	Andrey Konovalov	2020-04-17	1	-5/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If queue->size check in raw_event_queue_fetch() fails (which normally shouldn't happen, that check is a fail-safe), the function returns without reenabling interrupts. This patch fixes that issue, along with propagating the cause of failure to the function caller. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: gadget: udc: atmel: Fix vbus disconnect handling	Cristian Birsan	2020-04-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A DMA transfer can be in progress while vbus is lost due to a cable disconnect. For endpoints that use DMA, this condition can lead to peripheral hang. The patch ensures that endpoints are disabled before the clocks are stopped to prevent this issue. Fixes: a64ef71ddc13 ("usb: gadget: atmel_usba_udc: condition clocks to vbus state") Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: dwc3: gadget: Fix request completion check	Thinh Nguyen	2020-04-17	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A request may not be completed because not all the TRBs are prepared for it. This happens when we run out of available TRBs. When some TRBs are completed, the driver needs to prepare the rest of the TRBs for the request. The check dwc3_gadget_ep_request_completed() shouldn't be checking the amount of data received but rather the number of pending TRBs. Revise this request completion check. Cc: stable@vger.kernel.org Fixes: e0c42ce590fe ("usb: dwc3: gadget: simplify IOC handling") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: dwc3: gadget: Do link recovery for SS and SSP	Thinh Nguyen	2020-04-16	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The controller always supports link recovery for device in SS and SSP. Remove the speed limit check. Also, when the device is in RESUME or RESET state, it means the controller received the resume/reset request. The driver must send the link recovery to acknowledge the request. They are valid states for the driver to send link recovery. Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Fixes: ee5cd41c9117 ("usb: dwc3: Update speed checks for SuperSpeedPlus") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	dt-bindings: usb: renesas,usb3-peri: add r8a77961 support	Yoshihiro Shimoda	2020-04-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for r8a77961 (R-Car M3-W+). Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	dt-bindings: usb: renesas,usbhs: add r8a77961 support	Yoshihiro Shimoda	2020-04-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for r8a77961 (R-Car M3-W+). Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	dt-bindings: usb: usb-xhci: add r8a77961 support	Yoshihiro Shimoda	2020-04-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for r8a77961 (R-Car M3-W+). To avoid confusion between R-Car M3-W (R8A77960) and R-Car M3-W+ (R8A77961), this patch also updates the comment of "renesas,xhci-r8a7796". Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	docs: dt: qcom,dwc3.txt: fix cross-reference for a converted file	Mauro Carvalho Chehab	2020-04-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The qcom-qusb2-phy.txt file was converted and renamed to yaml. Update cross-reference accordingly. Fixes: 8ce65d8d38df ("dt-bindings: phy: qcom,qusb2: Convert QUSB2 phy bindings to yaml") Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: dwc3: gadget: Properly set maxpacket limit	Thinh Nguyen	2020-04-16	2	-11/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the calculation of max packet size limit for IN endpoints is too restrictive. This prevents a matching of a capable hardware endpoint during configuration. Below is the minimum recommended HW configuration to support a particular endpoint setup from the databook: For OUT endpoints, the databook recommended the minimum RxFIFO size to be at least 3x MaxPacketSize + 3x setup packets size (8 bytes each) + clock crossing margin (16 bytes). For IN endpoints, the databook recommended the minimum TxFIFO size to be at least 3x MaxPacketSize for endpoints that support burst. If the endpoint doesn't support burst or when the device is operating in USB 2.0 mode, a minimum TxFIFO size of 2x MaxPacketSize is recommended. Base on these recommendations, we can calculate the MaxPacketSize limit of each endpoint. This patch revises the IN endpoint MaxPacketSize limit and also sets the MaxPacketSize limit for OUT endpoints. Reference: Databook 3.30a section 3.2.2 and 3.2.3 Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: dwc3: Fix GTXFIFOSIZ.TXFDEP macro name	Thinh Nguyen	2020-04-16	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the macro name DWC3_GTXFIFOSIZ_TXFDEF to DWC3_GTXFIFOSIZ_TXFDEP to match with the register name GTXFIFOSIZ.TXFDEP. Fixes: 457e84b6624b ("usb: dwc3: gadget: dynamically re-size TxFifos") Fixes: 0cab8d26d6e5 ("usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields") Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
\| *	usb: gadget: udc: bdc: Remove unnecessary NULL checks in bdc_req_complete	Nathan Chancellor	2020-04-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When building with Clang + -Wtautological-pointer-compare: drivers/usb/gadget/udc/bdc/bdc_ep.c:543:28: warning: comparison of address of 'req->queue' equal to a null pointer is always false [-Wtautological-pointer-compare] if (req == NULL \|\| &req->queue == NULL \|\| &req->usb_req == NULL) ~~~~~^~~~~ ~~~~ drivers/usb/gadget/udc/bdc/bdc_ep.c:543:51: warning: comparison of address of 'req->usb_req' equal to a null pointer is always false [-Wtautological-pointer-compare] if (req == NULL \|\| &req->queue == NULL \|\| &req->usb_req == NULL) ~~~~~^~~~~~~ ~~~~ 2 warnings generated. As it notes, these statements will always evaluate to false so remove them. Fixes: efed421a94e6 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC") Link: https://github.com/ClangBuiltLinux/linux/issues/749 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
* \|	USB: Add USB_QUIRK_DELAY_CTRL_MSG and USB_QUIRK_DELAY_INIT for Corsair K70 ↵	Jonathan Cox	2020-04-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RGB RAPIDFIRE The Corsair K70 RGB RAPIDFIRE needs the USB_QUIRK_DELAY_INIT and USB_QUIRK_DELAY_CTRL_MSG to function or it will randomly not respond on boot, just like other Corsair keyboards Signed-off-by: Jonathan Cox <jonathan@jdcox.net> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200410212427.2886-1-jonathan@jdcox.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	phy: tegra: Select USB_COMMON for usb_get_maximum_speed()	Thierry Reding	2020-04-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The usb_get_maximum_speed() function is part of the usb-common module, so enable it by selecting the corresponding Kconfig symbol. While at it, also make sure to depend on USB_SUPPORT because USB_PHY requires that. This can lead to Kconfig conflicts if USB_SUPPORT is not enabled while attempting to enable PHY_TEGRA_XUSB. Reported-by: kbuild test robot <lkp@intel.com> Suggested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200330101038.2422389-1-thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	usb: typec: tcpm: Ignore CC and vbus changes in PORT_RESET change	Badhri Jagan Sridharan	2020-04-16	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After PORT_RESET, the port is set to the appropriate default_state. Ignore processing CC changes here as this could cause the port to be switched into sink states by default. echo source > /sys/class/typec/port0/port_type Before: [ 154.528547] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [ 154.528560] CC1: 0 -> 0, CC2: 3 -> 0 [state PORT_RESET, polarity 0, disconnected] [ 154.528564] state change PORT_RESET -> SNK_UNATTACHED After: [ 151.068814] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev3 NONE_AMS] [ 151.072440] CC1: 3 -> 0, CC2: 0 -> 0 [state PORT_RESET, polarity 0, disconnected] [ 151.172117] state change PORT_RESET -> PORT_RESET_WAIT_OFF [delayed 100 ms] [ 151.172136] pending state change PORT_RESET_WAIT_OFF -> SRC_UNATTACHED @ 870 ms [rev3 NONE_AMS] [ 152.060106] state change PORT_RESET_WAIT_OFF -> SRC_UNATTACHED [delayed 870 ms] [ 152.060118] Start toggling Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200402215947.176577-1-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	usb: f_fs: Clear OS Extended descriptor counts to zero in ffs_data_reset()	Udipto Goswami	2020-04-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For userspace functions using OS Descriptors, if a function also supplies Extended Property descriptors currently the counts and lengths stored in the ms_os_descs_ext_prop_{count,name_len,data_len} variables are not getting reset to 0 during an unbind or when the epfiles are closed. If the same function is re-bound and the descriptors are re-written, this results in those count/length variables to monotonically increase causing the VLA allocation in _ffs_func_bind() to grow larger and larger at each bind/unbind cycle and eventually fail to allocate. Fix this by clearing the ms_os_descs_ext_prop count & lengths to 0 in ffs_data_reset(). Fixes: f0175ab51993 ("usb: gadget: f_fs: OS descriptors support") Cc: stable@vger.kernel.org Signed-off-by: Udipto Goswami <ugoswami@codeaurora.org> Signed-off-by: Sriharsha Allenki <sallenki@codeaurora.org> Reviewed-by: Manu Gautam <mgautam@codeaurora.org> Link: https://lore.kernel.org/r/20200402044521.9312-1-sallenki@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	cdc-acm: introduce a cool down	Oliver Neukum	2020-04-16	2	-3/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Immediate submission in case of a babbling device can lead to a busy loop. Introducing a delayed work. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Tested-by: Jonas Karlsson <jonas.karlsson@actia.se> Link: https://lore.kernel.org/r/20200415151358.32664-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	cdc-acm: close race betrween suspend() and acm_softint	Oliver Neukum	2020-04-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Suspend increments a counter, then kills the URBs, then kills the scheduled work. The scheduled work, however, may reschedule the URBs. Fix this by having the work check the counter. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Tested-by: Jonas Karlsson <jonas.karlsson@actia.se> Link: https://lore.kernel.org/r/20200415151358.32664-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	UAS: fix deadlock in error handling and PM flushing work	Oliver Neukum	2020-04-16	1	-3/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A SCSI error handler and block runtime PM must not allocate memory with GFP_KERNEL. Furthermore they must not wait for tasks allocating memory with GFP_KERNEL. That means that they cannot share a workqueue with arbitrary tasks. Fix this for UAS using a private workqueue. Signed-off-by: Oliver Neukum <oneukum@suse.com> Fixes: f9dc024a2da1f ("uas: pre_reset and suspend: Fix a few races") Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200415141750.811-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	UAS: no use logging any details in case of ENODEV	Oliver Neukum	2020-04-16	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once a device is gone, the internal state does not matter anymore. There is no need to spam the logs. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Fixes: 326349f824619 ("uas: add dead request list") Link: https://lore.kernel.org/r/20200415141750.811-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	usb: raw-gadget: fix raw_event_queue_fetch locking	Andrey Konovalov	2020-04-16	1	-5/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If queue->size check in raw_event_queue_fetch() fails (which normally shouldn't happen, that check is a fail-safe), the function returns without reenabling interrupts. This patch fixes that issue, along with propagating the cause of failure to the function caller. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Link: https://lore.kernel.org/r/9f7ce7a1472cfb9447f6c5a494186fa1f2670f6f.1586270396.git.andreyknvl@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	usb: raw-gadget: Fix copy_to/from_user() checks	Dan Carpenter	2020-04-16	1	-24/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The copy_to/from_user() functions return the number of bytes remaining but we want to return negative error codes. I changed a couple checks in raw_ioctl_ep_read() and raw_ioctl_ep0_read() to show that we still we returning zero on error. Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Link: https://lore.kernel.org/r/20200406145119.GG68494@mwanda Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	usb: typec: pi3usb30532: Set switch_ / mux_desc name field to NULL	Hans de Goede	2020-04-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit ef441dd6af91 ("usb: typec: mux: Allow the muxes to be named") the typec_switch_desc and typec_mux_desc structs contain a name field. The pi3usb30532 driver allocates these structs on the stack and so far did not explicitly zero the mem used for the structs. This causes the new name fields to point to a random memory address, which in my test case happens to be a valid address leading to "interesting" mux / switch names: [root@localhost ~]# ls -l /sys/class/typec_mux/ total 0 lrwxrwxrwx. 1 root root 0 Apr 14 12:55 ''$'\r''-switch' -> ... lrwxrwxrwx. 1 root root 0 Apr 14 12:55 ''$'\320\302\006''2'$'... Explicitly initialize the structs to zero when declaring them on the stack so that any unused fields get set to 0, fixing this. Fixes: ef441dd6af91 ("usb: typec: mux: Allow the muxes to be named") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200414133313.131802-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	USB: early: Handle AMD's spec-compliant identifiers, too	Jann Horn	2020-04-16	2	-6/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a bug that causes the USB3 early console to freeze after printing a single line on AMD machines because it can't parse the Transfer TRB properly. The spec at https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/extensible-host-controler-interface-usb-xhci.pdf says in section "4.5.1 Device Context Index" that the Context Index, also known as Endpoint ID according to section "1.6 Terms and Abbreviations", is normally computed as `DCI = (Endpoint Number * 2) + Direction`, which matches the current definitions of XDBC_EPID_OUT and XDBC_EPID_IN. However, the numbering in a Debug Capability Context data structure is supposed to be different: Section "7.6.3.2 Endpoint Contexts and Transfer Rings" explains that a Debug Capability Context data structure has the endpoints mapped to indices 0 and 1. Change XDBC_EPID_OUT/XDBC_EPID_IN to the spec-compliant values, add XDBC_EPID_OUT_INTEL/XDBC_EPID_IN_INTEL with Intel's incorrect values, and let xdbc_handle_tx_event() handle both. I have verified that with this patch applied, the USB3 early console works on both an Intel and an AMD machine. Fixes: aeb9dd1de98c ("usb/early: Add driver for xhci debug capability") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20200401074619.8024-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* \|	USB: core: Fix free-while-in-use bug in the USB S-Glibrary	Alan Stern	2020-04-16	1	-1/+8
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FuzzUSB (a variant of syzkaller) found a free-while-still-in-use bug in the USB scatter-gather library: BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:26 [inline] BUG: KASAN: use-after-free in usb_hcd_unlink_urb+0x5f/0x170 drivers/usb/core/hcd.c:1607 Read of size 4 at addr ffff888065379610 by task kworker/u4:1/27 CPU: 1 PID: 27 Comm: kworker/u4:1 Not tainted 5.5.11 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Workqueue: scsi_tmf_2 scmd_eh_abort_handler Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xce/0x128 lib/dump_stack.c:118 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374 __kasan_report+0x153/0x1cb mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x152/0x1b0 mm/kasan/generic.c:192 __kasan_check_read+0x11/0x20 mm/kasan/common.c:95 atomic_read include/asm-generic/atomic-instrumented.h:26 [inline] usb_hcd_unlink_urb+0x5f/0x170 drivers/usb/core/hcd.c:1607 usb_unlink_urb+0x72/0xb0 drivers/usb/core/urb.c:657 usb_sg_cancel+0x14e/0x290 drivers/usb/core/message.c:602 usb_stor_stop_transport+0x5e/0xa0 drivers/usb/storage/transport.c:937 This bug occurs when cancellation of the S-G transfer races with transfer completion. When that happens, usb_sg_cancel() may continue to access the transfer's URBs after usb_sg_wait() has freed them. The bug is caused by the fact that usb_sg_cancel() does not take any sort of reference to the transfer, and so there is nothing to prevent the URBs from being deallocated while the routine is trying to use them. The fix is to take such a reference by incrementing the transfer's io->count field while the cancellation is in progres and decrementing it afterward. The transfer's URBs are not deallocated until io->complete is triggered, which happens when io->count reaches zero. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Kyungtae Kim <kt0755@gmail.com> CC: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2003281615140.14837-100000@netrider.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Linux 5.7-rc1v5.7-rc1	Linus Torvalds	2020-04-12	1	-2/+2
\|
*	MAINTAINERS: sort field names for all entries	Linus Torvalds	2020-04-12	1	-1974/+1974
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This sorts the actual field names too, potentially causing even more chaos and confusion at merge time if you have edited the MAINTAINERS file. But the end result is a more consistent layout, and hopefully it's a one-time pain minimized by doing this just before the -rc1 release. This was entirely scripted: ./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS --order Requested-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	MAINTAINERS: sort entries by entry name	Linus Torvalds	2020-04-12	1	-820/+820
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They are all supposed to be sorted, but people who add new entries don't always know the alphabet. Plus sometimes the entry names get edited, and people don't then re-order the entry. Let's see how painful this will be for merging purposes (the MAINTAINERS file is often edited in various different trees), but Joe claims there's relatively few patches in -next that touch this, and doing it just before -rc1 is likely the best time. Fingers crossed. This was scripted with /scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS but then I also ended up manually upper-casing a few entry names that stood out when looking at the end result. Requested-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge tag 'x86-urgent-2020-04-12' of ↵	Linus Torvalds	2020-04-12	4	-9/+79
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of three patches to fix the fallout of the newly added split lock detection feature. It addressed the case where a KVM guest triggers a split lock #AC and KVM reinjects it into the guest which is not prepared to handle it. Add proper sanity checks which prevent the unconditional injection into the guest and handles the #AC on the host side in the same way as user space detections are handled. Depending on the detection mode it either warns and disables detection for the task or kills the task if the mode is set to fatal" * tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest KVM: x86: Emulate split-lock access as a write in emulator x86/split_lock: Provide handle_guest_split_lock()
\| *	KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest	Xiaoyao Li	2020-04-11	1	-3/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two types of #AC can be generated in Intel CPUs: 1. legacy alignment check #AC 2. split lock #AC Reflect #AC back into the guest if the guest has legacy alignment checks enabled or if split lock detection is disabled. If the #AC is not a legacy one and split lock detection is enabled, then invoke handle_guest_split_lock() which will either warn and disable split lock detection for this task or force SIGBUS on it. [ tglx: Switch it to handle_guest_split_lock() and rename the misnamed helper function. ] Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20200410115517.176308876@linutronix.de
\| *	KVM: x86: Emulate split-lock access as a write in emulator	Xiaoyao Li	2020-04-11	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Emulate split-lock accesses as writes if split lock detection is on to avoid #AC during emulation, which will result in a panic(). This should never occur for a well-behaved guest, but a malicious guest can manipulate the TLB to trigger emulation of a locked instruction[1]. More discussion can be found at [2][3]. [1] https://lkml.kernel.org/r/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com [2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com [3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20200410115517.084300242@linutronix.de
\| *	x86/split_lock: Provide handle_guest_split_lock()	Thomas Gleixner	2020-04-11	2	-5/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without at least minimal handling for split lock detection induced #AC, VMX will just run into the same problem as the VMWare hypervisor, which was reported by Kenneth. It will inject the #AC blindly into the guest whether the guest is prepared or not. Provide a function for guest mode which acts depending on the host SLD mode. If mode == sld_warn, treat it like user space, i.e. emit a warning, disable SLD and mark the task accordingly. Otherwise force SIGBUS. [ bp: Add a !CPU_SUP_INTEL stub for handle_guest_split_lock(). ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20200410115516.978037132@linutronix.de Link: https://lkml.kernel.org/r/20200402123258.895628824@linutronix.de
* \|	Merge tag 'timers-urgent-2020-04-12' of ↵	Linus Torvalds	2020-04-12	3	-0/+10
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull time(keeping) updates from Thomas Gleixner: - Fix the time_for_children symlink in /proc/$PID/ so it properly reflects that it part of the 'time' namespace - Add the missing userns limit for the allowed number of time namespaces, which was half defined but the actual array member was not added. This went unnoticed as the array has an exessive empty member at the end but introduced a user visible regression as the output was corrupted. - Prevent further silent ucount corruption by adding a BUILD_BUG_ON() to catch half updated data. * tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ucount: Make sure ucounts in /proc/sys/user don't regress again time/namespace: Add max_time_namespaces ucount time/namespace: Fix time_for_children symlink
\| * \|	ucount: Make sure ucounts in /proc/sys/user don't regress again	Jan Kara	2020-04-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 769071ac9f20 "ns: Introduce Time Namespace" broke reporting of inotify ucounts (max_inotify_instances, max_inotify_watches) in /proc/sys/user because it has added UCOUNT_TIME_NAMESPACES into enum ucount_type but didn't properly update reporting in kernel/ucount.c:setup_userns_sysctls(). This problem got fixed in commit eeec26d5da82 "time/namespace: Add max_time_namespaces ucount". Add BUILD_BUG_ON to catch a similar problem in the future. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andrei Vagin <avagin@gmail.com> Link: https://lkml.kernel.org/r/20200407154643.10102-1-jack@suse.cz
\| * \|	time/namespace: Add max_time_namespaces ucount	Dmitry Safonov	2020-04-07	2	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Michael noticed that userns limit for number of time namespaces is missing. Furthermore, time namespace introduced UCOUNT_TIME_NAMESPACES, but didn't introduce an array member in user_table[]. It would make array's initialisation OOB write, but by luck the user_table array has an excessive empty member (all accesses to the array are limited with UCOUNT_COUNTS - so it silently reuses the last free member. Fixes user-visible regression: max_inotify_instances by reason of the missing UCOUNT_ENTRY() has limited max number of namespaces instead of the number of inotify instances. Fixes: 769071ac9f20 ("ns: Introduce Time Namespace") Reported-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andrei Vagin <avagin@gmail.com> Acked-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: stable@kernel.org Link: https://lkml.kernel.org/r/20200406171342.128733-1-dima@arista.com
\| * \|	time/namespace: Fix time_for_children symlink	Michael Kerrisk (man-pages)	2020-04-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking at the contents of the /proc/PID/ns/time_for_children symlink shows an anomaly: $ ls -l /proc/self/ns/* \|awk '{print $9, $10, $11}' ... /proc/self/ns/pid -> pid:[4026531836] /proc/self/ns/pid_for_children -> pid:[4026531836] /proc/self/ns/time -> time:[4026531834] /proc/self/ns/time_for_children -> time_for_children:[4026531834] /proc/self/ns/user -> user:[4026531837] ... The reference for 'time_for_children' should be a 'time' namespace, just as the reference for 'pid_for_children' is a 'pid' namespace. In other words, the above time_for_children link should read: /proc/self/ns/time_for_children -> time:[4026531834] Fixes: 769071ac9f20 ("ns: Introduce Time Namespace") Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dmitry Safonov <dima@arista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Andrei Vagin <avagin@gmail.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/a2418c48-ed80-3afe-116e-6611cb799557@gmail.com
* \| \|	Merge tag 'sched-urgent-2020-04-12' of ↵	Linus Torvalds	2020-04-12	5	-62/+51
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes/updates from Thomas Gleixner: - Deduplicate the average computations in the scheduler core and the fair class code. - Fix a raise between runtime distribution and assignement which can cause exceeding the quota by up to 70%. - Prevent negative results in the imbalanace calculation - Remove a stale warning in the workqueue code which can be triggered since the call site was moved out of preempt disabled code. It's a false positive. - Deduplicate the print macros for procfs - Add the ucmap values to the SCHED_DEBUG procfs output for completness * tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Add task uclamp values to SCHED_DEBUG procfs sched/debug: Factor out printing formats into common macros sched/debug: Remove redundant macro define sched/core: Remove unused rq::last_load_update_tick workqueue: Remove the warning in wq_worker_sleeping() sched/fair: Fix negative imbalance in imbalance calculation sched/fair: Fix race between runtime distribution and assignment sched/fair: Align rq->avg_idle and rq->avg_scan_cost
\| * \| \|	sched/debug: Add task uclamp values to SCHED_DEBUG procfs	Valentin Schneider	2020-04-08	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Requested and effective uclamp values can be a bit tricky to decipher when playing with cgroup hierarchies. Add them to a task's procfs when SCHED_DEBUG is enabled. Reviewed-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200226124543.31986-4-valentin.schneider@arm.com
\| * \| \|	sched/debug: Factor out printing formats into common macros	Valentin Schneider	2020-04-08	1	-14/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The printing macros in debug.c keep redefining the same output format. Collect each output format in a single definition, and reuse that definition in the other macros. While at it, add a layer of parentheses and replace printf's with the newly introduced macros. Reviewed-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200226124543.31986-3-valentin.schneider@arm.com
\| * \| \|	sched/debug: Remove redundant macro define	Valentin Schneider	2020-04-08	1	-12/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most printing macros for procfs are defined globally in debug.c, and they are re-defined (to the exact same thing) within proc_sched_show_task(). Get rid of the duplicate defines. Reviewed-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200226124543.31986-2-valentin.schneider@arm.com
\| * \| \|	sched/core: Remove unused rq::last_load_update_tick	Vincent Donnefort	2020-04-08	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following commit: 5e83eafbfd3b ("sched/fair: Remove the rq->cpu_load[] update code") eliminated the last use case for rq->last_load_update_tick, so remove the field as well. Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/1584710495-308969-1-git-send-email-vincent.donnefort@arm.com
\| * \| \|	workqueue: Remove the warning in wq_worker_sleeping()	Sebastian Andrzej Siewior	2020-04-08	2	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kernel test robot triggered a warning with the following race: task-ctx A interrupt-ctx B worker -> process_one_work() -> work_item() -> schedule(); -> sched_submit_work() -> wq_worker_sleeping() -> ->sleeping = 1 atomic_dec_and_test(nr_running) __schedule(); interrupt async_page_fault() -> local_irq_enable(); -> schedule(); -> sched_submit_work() -> wq_worker_sleeping() -> if (WARN_ON(->sleeping)) return -> __schedule() -> sched_update_worker() -> wq_worker_running() -> atomic_inc(nr_running); -> ->sleeping = 0; -> sched_update_worker() -> wq_worker_running() if (!->sleeping) return In this context the warning is pointless everything is fine. An interrupt before wq_worker_sleeping() will perform the ->sleeping assignment (0 -> 1 > 0) twice. An interrupt after wq_worker_sleeping() will trigger the warning and nr_running will be decremented (by A) and incremented once (only by B, A will skip it). This is the case until the ->sleeping is zeroed again in wq_worker_running(). Remove the WARN statement because this condition may happen. Document that preemption around wq_worker_sleeping() needs to be disabled to protect ->sleeping and not just as an optimisation. Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
\| * \| \|	sched/fair: Fix negative imbalance in imbalance calculation	Aubrey Li	2020-04-08	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A negative imbalance value was observed after imbalance calculation, this happens when the local sched group type is group_fully_busy, and the average load of local group is greater than the selected busiest group. Fix this problem by comparing the average load of the local and busiest group before imbalance calculation formula. Suggested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/1585201349-70192-1-git-send-email-aubrey.li@intel.com
\| * \| \|	sched/fair: Fix race between runtime distribution and assignment	Huaixin Chang	2020-04-08	1	-20/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, there is a potential race between distribute_cfs_runtime() and assign_cfs_rq_runtime(). Race happens when cfs_b->runtime is read, distributes without holding lock and finds out there is not enough runtime to charge against after distribution. Because assign_cfs_rq_runtime() might be called during distribution, and use cfs_b->runtime at the same time. Fibtest is the tool to test this race. Assume all gcfs_rq is throttled and cfs period timer runs, slow threads might run and sleep, returning unused cfs_rq runtime and keeping min_cfs_rq_runtime in their local pool. If all this happens sufficiently quickly, cfs_b->runtime will drop a lot. If runtime distributed is large too, over-use of runtime happens. A runtime over-using by about 70 percent of quota is seen when we test fibtest on a 96-core machine. We run fibtest with 1 fast thread and 95 slow threads in test group, configure 10ms quota for this group and see the CPU usage of fibtest is 17.0%, which is far more than the expected 10%. On a smaller machine with 32 cores, we also run fibtest with 96 threads. CPU usage is more than 12%, which is also more than expected 10%. This shows that on similar workloads, this race do affect CPU bandwidth control. Solve this by holding lock inside distribute_cfs_runtime(). Fixes: c06f04c70489 ("sched: Fix potential near-infinite distribute_cfs_runtime() loop") Reviewed-by: Ben Segall <bsegall@google.com> Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/lkml/20200325092602.22471-1-changhuaixin@linux.alibaba.com/
\| * \| \|	sched/fair: Align rq->avg_idle and rq->avg_scan_cost	Valentin Schneider	2020-04-08	3	-11/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sched/core.c uses update_avg() for rq->avg_idle and sched/fair.c uses an open-coded version (with the exact same decay factor) for rq->avg_scan_cost. On top of that, select_idle_cpu() expects to be able to compare these two fields. The only difference between the two is that rq->avg_scan_cost is computed using a pure division rather than a shift. Turns out it actually matters, first of all because the shifted value can be negative, and the standard has this to say about it: """ The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] If E1 has a signed type and a negative value, the resulting value is implementation-defined. """ Not only this, but (arithmetic) right shifting a negative value (using 2's complement) is not equivalent to dividing it by the corresponding power of 2. Let's look at a few examples: -4 -> 0xF..FC -4 >> 3 -> 0xF..FF == -1 != -4 / 8 -8 -> 0xF..F8 -8 >> 3 -> 0xF..FF == -1 == -8 / 8 -9 -> 0xF..F7 -9 >> 3 -> 0xF..FE == -2 != -9 / 8 Make update_avg() use a division, and export it to the private scheduler header to reuse it where relevant. Note that this still lets compilers use a shift here, but should prevent any unwanted surprise. The disassembly of select_idle_cpu() remains unchanged on arm64, and ttwu_do_wakeup() gains 2 instructions; the diff sort of looks like this: - sub x1, x1, x0 + subs x1, x1, x0 // set condition codes + add x0, x1, #0x7 + csel x0, x0, x1, mi // x0 = x1 < 0 ? x0 : x1 add x0, x3, x0, asr #3 which does the right thing (i.e. gives us the expected result while still using an arithmetic shift) Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200330090127.16294-1-valentin.schneider@arm.com
* \| \| \|	Merge tag 'perf-urgent-2020-04-12' of ↵	Linus Torvalds	2020-04-12	4	-31/+573
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixes/updates for perf: - Fix the perf event cgroup tracking which tries to track the cgroup even for disabled events. - Add Ice Lake server support for uncore events - Disable pagefaults when retrieving the physical address in the sampling code" * tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Disable page faults when getting phys address perf/x86/intel/uncore: Add Ice Lake server uncore support perf/cgroup: Correct indirection in perf_less_group_idx() perf/core: Fix event cgroup tracking
\| * \| \| \|	perf/core: Disable page faults when getting phys address	Jiri Olsa	2020-04-08	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We hit following warning when running tests on kernel compiled with CONFIG_DEBUG_ATOMIC_SLEEP=y: WARNING: CPU: 19 PID: 4472 at mm/gup.c:2381 __get_user_pages_fast+0x1a4/0x200 CPU: 19 PID: 4472 Comm: dummy Not tainted 5.6.0-rc6+ #3 RIP: 0010:__get_user_pages_fast+0x1a4/0x200 ... Call Trace: perf_prepare_sample+0xff1/0x1d90 perf_event_output_forward+0xe8/0x210 __perf_event_overflow+0x11a/0x310 __intel_pmu_pebs_event+0x657/0x850 intel_pmu_drain_pebs_nhm+0x7de/0x11d0 handle_pmi_common+0x1b2/0x650 intel_pmu_handle_irq+0x17b/0x370 perf_event_nmi_handler+0x40/0x60 nmi_handle+0x192/0x590 default_do_nmi+0x6d/0x150 do_nmi+0x2f9/0x3c0 nmi+0x8e/0xd7 While __get_user_pages_fast() is IRQ-safe, it calls access_ok(), which warns on: WARN_ON_ONCE(!in_task() && !pagefault_disabled()) Peter suggested disabling page faults around __get_user_pages_fast(), which gets rid of the warning in access_ok() call. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200407141427.3184722-1-jolsa@kernel.org
\| * \| \| \|	perf/x86/intel/uncore: Add Ice Lake server uncore support	Kan Liang	2020-04-08	3	-0/+522
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The uncore subsystem in Ice Lake server is similar to previous server. There are some differences in config register encoding and pci device IDs. The uncore PMON units in Ice Lake server include Ubox, Chabox, IIO, IRP, M2PCIE, PCU, M2M, PCIE3 and IMC. - For CHA, filter 1 register has been removed. The filter 0 register can be used by and of CHA events to be filterd by Thread/Core-ID. To do so, the control register's tid_en bit must be set to 1. - For IIO, there are some changes on event constraints. The MSR address and MSR offsets among counters are also changed. - For IRP, the MSR address and MSR offsets among counters are changed. - For M2PCIE, the counters are accessed by MSR now. Add new MSR address and MSR offsets. Change event constraints. - To determine the number of CHAs, have to read CAPID6(Low) and CAPID7 (High) now. - For M2M, update the PCICFG address and Device ID. - For UPI, update the PCICFG address, Device ID and counter address. - For M3UPI, update the PCICFG address, Device ID, counter address and event constraints. - For IMC, update the formular to calculate MMIO BAR address, which is MMIO_BASE + specific MEM_BAR offset. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/1585842411-150452-1-git-send-email-kan.liang@linux.intel.com
\| * \| \| \|	perf/cgroup: Correct indirection in perf_less_group_idx()	Ian Rogers	2020-04-08	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The void* in perf_less_group_idx() is to a member in the array which points at a perf_event, as such it is a perf_event*. Reported-By: John Sperbeck <jsperbeck@google.com> Fixes: 6eef8a7116de ("perf/core: Use min_heap in visit_groups_merge()") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200321164331.107337-1-irogers@google.com