summaryrefslogtreecommitdiffstats
path: root/drivers (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'usb-6.12-rc4' of ↵Linus Torvalds2024-10-2013-58/+140
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver fixes and new device ids for 6.12-rc4: - xhci driver fixes for a number of reported issues - new usb-serial driver ids - dwc3 driver fixes for reported problems. - usb gadget driver fixes for reported problems - typec driver fixes - MAINTAINER file updates All of these have been in linux-next this week with no reported issues" * tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Telit FN920C04 MBIM compositions USB: serial: option: add support for Quectel EG916Q-GL xhci: dbc: honor usb transfer size boundaries. usb: xhci: Fix handling errors mid TD followed by other errors xhci: Mitigate failed set dequeue pointer commands xhci: Fix incorrect stream context type macro USB: gadget: dummy-hcd: Fix "task hung" problem usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store usb: dwc3: core: Fix system suspend on TI AM62 platforms xhci: tegra: fix checked USB2 port number usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF usb: typec: altmode should keep reference to parent MAINTAINERS: usb: raw-gadget: add bug tracker link MAINTAINERS: Add an entry for the LJCA drivers
| * Merge tag 'usb-serial-6.12-rc4' of ↵Greg Kroah-Hartman2024-10-181-0/+8
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.12-rc4 Here are some new modem device ids. Everything has been in linux-next over night with no reported issues. * tag 'usb-serial-6.12-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Telit FN920C04 MBIM compositions USB: serial: option: add support for Quectel EG916Q-GL
| | * USB: serial: option: add Telit FN920C04 MBIM compositionsDaniele Palmas2024-10-171-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the following Telit FN920C04 compositions: 0x10a2: MBIM + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a2 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10a7: MBIM + tty (AT) + tty (AT) + tty (diag) T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 18 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a7 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10aa: MBIM + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 15 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10aa Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=92c4c4d8 C: #Ifs= 6 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
| | * USB: serial: option: add support for Quectel EG916Q-GLBenjamin B. Frost2024-10-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add Quectel EM916Q-GL with product ID 0x6007 T: Bus=01 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=6007 Rev= 2.00 S: Manufacturer=Quectel S: Product=EG916Q-GL C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=200mA A: FirstIf#= 4 IfCount= 2 Cls=02(comm.) Sub=06 Prot=00 I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 16 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether E: Ad=88(I) Atr=03(Int.) MxPS= 32 Ivl=32ms I: If#= 5 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether I:* If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms MI_00 Quectel USB Diag Port MI_01 Quectel USB NMEA Port MI_02 Quectel USB AT Port MI_03 Quectel USB Modem Port MI_04 Quectel USB Net Port Signed-off-by: Benjamin B. Frost <benjamin@geanix.com> Reviewed-by: Lars Melin <larsm17@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
| * | xhci: dbc: honor usb transfer size boundaries.Mathias Nyman2024-10-172-5/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Treat each completed full size write to /dev/ttyDBC0 as a separate usb transfer. Make sure the size of the TRBs matches the size of the tty write by first queuing as many max packet size TRBs as possible up to the last TRB which will be cut short to match the size of the tty write. This solves an issue where userspace writes several transfers back to back via /dev/ttyDBC0 into a kfifo before dbgtty can find available request to turn that kfifo data into TRBs on the transfer ring. The boundary between transfer was lost as xhci-dbgtty then turned everyting in the kfifo into as many 'max packet size' TRBs as possible. DbC would then send more data to the host than intended for that transfer, causing host to issue a babble error. Refuse to write more data to kfifo until previous tty write data is turned into properly sized TRBs with data size boundaries matching tty write size Tested-by: Uday M Bhat <uday.m.bhat@intel.com> Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: xhci: Fix handling errors mid TD followed by other errorsMichal Pecio2024-10-171-37/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some host controllers fail to produce the final completion event on an isochronous TD which experienced an error mid TD. We deal with it by flagging such TDs and checking if the next event points at the flagged TD or at the next one, and giving back the flagged TD if the latter. This is not enough, because the next TD may be missed by the xHC. Or there may be no next TD but a ring underrun. We also need to get such TD quickly out of the way, or errors on later TDs may be handled wrong. If the next TD experiences a Missed Service Error, we will set the skip flag on the endpoint and then attempt skipping TDs when yet another event arrives. In such scenario, we ought to report the 'error mid TD' transfer as such rather than skip it. Another problem case are Stopped events. If we see one after an error mid TD, we naively assume that it's a Force Stopped Event because it doesn't match the pending TD, but in reality it might be an ordinary Stopped event for the next TD, which we fail to recognize and handle. Fix this by moving error mid TD handling before the whole TD skipping loop. Remove unnecessary conditions, always give back the TD if the new event points to any TRB outside it or if the pointer is NULL, as may be the case in Ring Underrun and Overrun events on 1st gen hardware. Only if the pending TD isn't flagged, consider other actions like skipping. As a side effect of reordering with skip and FSE cases, error mid TD is reordered with last_td_was_short check. This is harmless, because the two cases are mutually exclusive - only one can happen in any given run of handle_tx_event(). Tested on the NEC host and a USB camera with flaky cable. Dynamic debug confirmed that Transaction Errors are sometimes seen, sometimes mid-TD, sometimes followed by Missed Service. In such cases, they were finished properly before skipping began. [Rebase on 6.12-rc1 -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | xhci: Mitigate failed set dequeue pointer commandsMathias Nyman2024-10-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid xHC host from processing a cancelled URB by always turning cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command. If the command fails then xHC will start processing the cancelled TD instead of skipping it once endpoint is restarted, causing issues like Babble error. This is not a complete solution as a failed 'Set TR Deq' command does not guarantee xHC TRB caches are cleared. Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | xhci: Fix incorrect stream context type macroMathias Nyman2024-10-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stream contex type (SCT) bitfield is used both in the stream context data structure, and in the 'Set TR Dequeue pointer' command TRB. In both cases it uses bits 3:1 The SCT_FOR_TRB(p) macro used to set the stream context type (SCT) field for the 'Set TR Dequeue pointer' command TRB incorrectly shifts the value 1 bit left before masking the three bits. Fix this by first masking and rshifting, just like the similar SCT_FOR_CTX(p) macro does This issue has not been visibile as the lost bit 3 is only used with secondary stream arrays (SSA). Xhci driver currently only supports using a primary stream array with Linear stream addressing. Fixes: 95241dbdf828 ("xhci: Set SCT field for Set TR dequeue on streams") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | USB: gadget: dummy-hcd: Fix "task hung" problemAlan Stern2024-10-171-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The syzbot fuzzer has been encountering "task hung" problems ever since the dummy-hcd driver was changed to use hrtimers instead of regular timers. It turns out that the problems are caused by a subtle difference between the timer_pending() and hrtimer_active() APIs. The changeover blindly replaced the first by the second. However, timer_pending() returns True when the timer is queued but not when its callback is running, whereas hrtimer_active() returns True when the hrtimer is queued _or_ its callback is running. This difference occasionally caused dummy_urb_enqueue() to think that the callback routine had not yet started when in fact it was almost finished. As a result the hrtimer was not restarted, which made it impossible for the driver to dequeue later the URB that was just enqueued. This caused usb_kill_urb() to hang, and things got worse from there. Since hrtimers have no API for telling when they are queued and the callback isn't running, the driver must keep track of this for itself. That's what this patch does, adding a new "timer_pending" flag and setting or clearing it at the appropriate times. Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.com/ Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") Cc: Marcello Sylvester Bauer <sylv@sylv.io> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING storeKevin Groeneveld2024-10-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The configfs store callback should return the number of bytes consumed not the total number of bytes we actually stored. These could differ if for example the passed in string had a newline we did not store. If the returned value does not match the number of bytes written the writer might assume a failure or keep trying to write the remaining bytes. For example the following command will hang trying to write the final newline over and over again (tested on bash 2.05b): echo foo > function_name Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs") Cc: stable <stable@kernel.org> Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: dwc3: core: Fix system suspend on TI AM62 platformsRoger Quadros2024-10-162-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"), system suspend is broken on AM62 TI platforms. Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY bits (hence forth called 2 SUSPHY bits) were being set during core initialization and even during core re-initialization after a system suspend/resume. These bits are required to be set for system suspend/resume to work correctly on AM62 platforms. Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget driver is not loaded and started. For Host mode, the 2 SUSPHY bits are set before the first system suspend but get cleared at system resume during core re-init and are never set again. This patch resovles these two issues by ensuring the 2 SUSPHY bits are set before system suspend and restored to the original state during system resume. Cc: stable@vger.kernel.org # v6.9+ Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/ Signed-off-by: Roger Quadros <rogerq@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Tested-by: Markus Schneider-Pargmann <msp@baylibre.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | xhci: tegra: fix checked USB2 port numberHenry Lin2024-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If USB virtualizatoin is enabled, USB2 ports are shared between all Virtual Functions. The USB2 port number owned by an USB2 root hub in a Virtual Function may be less than total USB2 phy number supported by the Tegra XUSB controller. Using total USB2 phy number as port number to check all PORTSC values would cause invalid memory access. [ 116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f ... [ 117.213640] Call trace: [ 117.216783] tegra_xusb_enter_elpg+0x23c/0x658 [ 117.222021] tegra_xusb_runtime_suspend+0x40/0x68 [ 117.227260] pm_generic_runtime_suspend+0x30/0x50 [ 117.232847] __rpm_callback+0x84/0x3c0 [ 117.237038] rpm_suspend+0x2dc/0x740 [ 117.241229] pm_runtime_work+0xa0/0xb8 [ 117.245769] process_scheduled_works+0x24c/0x478 [ 117.251007] worker_thread+0x23c/0x328 [ 117.255547] kthread+0x104/0x1b0 [ 117.259389] ret_from_fork+0x10/0x20 [ 117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100) Cc: stable@vger.kernel.org # v6.3+ Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls") Signed-off-by: Henry Lin <henryl@nvidia.com> Link: https://lore.kernel.org/r/20241014042134.27664-1-henryl@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFGPrashanth K2024-10-161-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DWC3 programming guide mentions that when operating in USB2.0 speeds, if GUSB2PHYCFG[6] or GUSB2PHYCFG[8] is set, it must be cleared prior to issuing commands and may be set again after the command completes. But currently while issuing EndXfer command without CmdIOC set, we wait for 1ms after GUSB2PHYCFG is restored. This results in cases where EndXfer command doesn't get completed and causes SMMU faults since requests are unmapped afterwards. Hence restore GUSB2PHYCFG after waiting for EndXfer command completion. Cc: stable@vger.kernel.org Fixes: 1d26ba0944d3 ("usb: dwc3: Wait unconditionally after issuing EndXfer command") Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20240924093208.2524531-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEFJonathan Marek2024-10-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This line is overwriting the result of the above switch-case. This fixes the tcpm driver getting stuck in a "Sink TX No Go" loop. Fixes: a4422ff22142 ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Cc: stable <stable@kernel.org> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241005144146.2345-1-jonathan@marek.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | usb: typec: altmode should keep reference to parentThadeu Lima de Souza Cascardo2024-10-161-0/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The altmode device release refers to its parent device, but without keeping a reference to it. When registering the altmode, get a reference to the parent and put it in the release function. Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues like this: [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) [ 46.612867] ================================================================== [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 [ 46.614538] [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 46.616042] Workqueue: events kobject_delayed_cleanup [ 46.616446] Call Trace: [ 46.616648] <TASK> [ 46.616820] dump_stack_lvl+0x5b/0x7c [ 46.617112] ? typec_altmode_release+0x38/0x129 [ 46.617470] print_report+0x14c/0x49e [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d [ 46.618807] ? typec_altmode_release+0x38/0x129 [ 46.619161] kasan_report+0x8d/0xb4 [ 46.619447] ? typec_altmode_release+0x38/0x129 [ 46.619809] ? process_scheduled_works+0x3cb/0x85f [ 46.620185] typec_altmode_release+0x38/0x129 [ 46.620537] ? process_scheduled_works+0x3cb/0x85f [ 46.620907] device_release+0xaf/0xf2 [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a [ 46.621584] process_scheduled_works+0x4f6/0x85f [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 [ 46.622353] ? hlock_class+0x31/0x9a [ 46.622647] ? lock_acquired+0x361/0x3c3 [ 46.622956] ? move_linked_works+0x46/0x7d [ 46.623277] worker_thread+0x1ce/0x291 [ 46.623582] ? __kthread_parkme+0xc8/0xdf [ 46.623900] ? __pfx_worker_thread+0x10/0x10 [ 46.624236] kthread+0x17e/0x190 [ 46.624501] ? kthread+0xfb/0x190 [ 46.624756] ? __pfx_kthread+0x10/0x10 [ 46.625015] ret_from_fork+0x20/0x40 [ 46.625268] ? __pfx_kthread+0x10/0x10 [ 46.625532] ret_from_fork_asm+0x1a/0x30 [ 46.625805] </TASK> [ 46.625953] [ 46.626056] Allocated by task 678: [ 46.626287] kasan_save_stack+0x24/0x44 [ 46.626555] kasan_save_track+0x14/0x2d [ 46.626811] __kasan_kmalloc+0x3f/0x4d [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 [ 46.627362] typec_register_port+0x23/0x491 [ 46.627698] cros_typec_probe+0x634/0xbb6 [ 46.628026] platform_probe+0x47/0x8c [ 46.628311] really_probe+0x20a/0x47d [ 46.628605] device_driver_attach+0x39/0x72 [ 46.628940] bind_store+0x87/0xd7 [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 [ 46.629574] vfs_write+0x1d6/0x29b [ 46.629856] ksys_write+0xcd/0x13b [ 46.630128] do_syscall_64+0xd4/0x139 [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 46.630820] [ 46.630946] Freed by task 48: [ 46.631182] kasan_save_stack+0x24/0x44 [ 46.631493] kasan_save_track+0x14/0x2d [ 46.631799] kasan_save_free_info+0x3f/0x4d [ 46.632144] __kasan_slab_free+0x37/0x45 [ 46.632474] kfree+0x1d4/0x252 [ 46.632725] device_release+0xaf/0xf2 [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a [ 46.633388] process_scheduled_works+0x4f6/0x85f [ 46.633764] worker_thread+0x1ce/0x291 [ 46.634065] kthread+0x17e/0x190 [ 46.634324] ret_from_fork+0x20/0x40 [ 46.634621] ret_from_fork_asm+0x1a/0x30 Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241004123738.2964524-1-cascardo@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge tag 'irq_urgent_for_v6.12_rc4' of ↵Linus Torvalds2024-10-207-31/+70
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a case for sifive-plic where an interrupt gets disabled *and* masked and remains masked when it gets reenabled later - Plug a small race in GIC-v4 where userspace can force an affinity change of a virtual CPU (vPE) in its unmapping path - Do not mix the two sets of ocelot irqchip's registers in the mask calculation of the main interrupt sticky register - Other smaller fixlets and cleanups * tag 'irq_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/renesas-rzg2l: Fix missing put_device irqchip/riscv-intc: Fix SMP=n boot with ACPI irqchip/sifive-plic: Unmask interrupt in plic_irq_enable() irqchip/gic-v4: Don't allow a VMOVP on a dying VPE irqchip/sifive-plic: Return error code on failure irqchip/riscv-imsic: Fix output text of base address irqchip/ocelot: Comment sticky register clearing code irqchip/ocelot: Fix trigger register address irqchip: Remove obsolete config ARM_GIC_V3_ITS_PCI
| * | irqchip/renesas-rzg2l: Fix missing put_deviceFabrizio Castro2024-10-151-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rzg2l_irqc_common_init() calls of_find_device_by_node(), but the corresponding put_device() call is missing. This also gets reported by make coccicheck. Make use of the cleanup interfaces from cleanup.h to call into __free_put_device(), which in turn calls into put_device when leaving function rzg2l_irqc_common_init() and variable "dev" goes out of scope. To prevent that the device is put on successful completion, assign NULL to "dev" to prevent __free_put_device() from calling into put_device() within the successful path. "make coccicheck" will still complain about missing put_device() calls, but those are false positives now. Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241011172003.1242841-1-fabrizio.castro.jz@renesas.com
| * | irqchip/riscv-intc: Fix SMP=n boot with ACPISunil V L2024-10-151-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_SMP is disabled, the static array rintc_acpi_data with size NR_CPUS is not sufficient to hold all RINTC structures passed from the firmware. All RINTC structures are required to configure IMSIC/APLIC/PLIC properly irrespective of SMP in the OS. So, allocate dynamic memory based on the number of RINTC structures in MADT to fix this issue. Fixes: f8619b66bdb1 ("irqchip/riscv-intc: Add ACPI support for AIA") Reported-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/all/20241014065739.656959-1-sunilvl@ventanamicro.com Closes: https://github.com/linux-riscv/linux-riscv/actions/runs/11280997511/job/31375229012
| * | irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()Nam Cao2024-10-081-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that an interrupt is disabled and masked at the same time. When the interrupt is enabled again by enable_irq(), only plic_irq_enable() is called, not plic_irq_unmask(). The interrupt remains masked and never raises. An example where interrupt is both disabled and masked is when handle_fasteoi_irq() is the handler, and IRQS_ONESHOT is set. The interrupt handler: 1. Mask the interrupt 2. Handle the interrupt 3. Check if interrupt is still enabled, and unmask it (see cond_unmask_eoi_irq()) If another task disables the interrupt in the middle of the above steps, the interrupt will not get unmasked, and will remain masked when it is enabled in the future. The problem is occasionally observed when PREEMPT_RT is enabled, because PREEMPT_RT adds the IRQS_ONESHOT flag. But PREEMPT_RT only makes the problem more likely to appear, the bug has been around since commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations"). Fix it by unmasking interrupt in plic_irq_enable(). Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations") Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241003084152.2422969-1-namcao@linutronix.de
| * | irqchip/gic-v4: Don't allow a VMOVP on a dying VPEMarc Zyngier2024-10-081-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kunkun Jiang reported that there is a small window of opportunity for userspace to force a change of affinity for a VPE while the VPE has already been unmapped, but the corresponding doorbell interrupt still visible in /proc/irq/. Plug the race by checking the value of vmapp_count, which tracks whether the VPE is mapped ot not, and returning an error in this case. This involves making vmapp_count common to both GICv4.1 and its v4.0 ancestor. Fixes: 64edfaa9a234 ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP") Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/c182ece6-2ba0-ce4f-3404-dba7a3ab6c52@huawei.com Link: https://lore.kernel.org/all/20241002204959.2051709-1-maz@kernel.org
| * | irqchip/sifive-plic: Return error code on failureCharlie Jenkins2024-10-021-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set error to -ENOMEM if kcalloc() fails or if irq_domain_add_linear() fails inside of plic_probe() instead of returning 0. Fixes: 4d936f10ff80 ("irqchip/sifive-plic: Probe plic driver early for Allwinner D1 platform") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240903-correct_error_codes_sifive_plic-v1-1-d929b79663a2@rivosinc.com Closes: https://lore.kernel.org/r/202409031122.yBh8HrxA-lkp@intel.com/
| * | irqchip/riscv-imsic: Fix output text of base addressAndrew Jones2024-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "per-CPU IDs ... at base ..." info log is outputting a physical address, not a PPN. Fixes: 027e125acdba ("irqchip/riscv-imsic: Add device MSI domain support for platform devices") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/all/20240909085610.46625-2-ajones@ventanamicro.com
| * | irqchip/ocelot: Comment sticky register clearing codeSergey Matsievskiy2024-10-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add comment to the sticky register clearing code. Signed-off-by: Sergey Matsievskiy <matsievskiysv@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240925184416.54204-3-matsievskiysv@gmail.com
| * | irqchip/ocelot: Fix trigger register addressSergey Matsievskiy2024-10-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Controllers, supported by this driver, have two sets of registers: * (main) interrupt registers control peripheral interrupt sources. * device interrupt registers configure per-device (network interface) interrupts and act as an extra stage before the main interrupt registers. In the driver unmask code, device trigger registers are used in the mask calculation of the main interrupt sticky register, mixing two kinds of registers. Use the main interrupt trigger register instead. Signed-off-by: Sergey Matsievskiy <matsievskiysv@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240925184416.54204-2-matsievskiysv@gmail.com
| * | irqchip: Remove obsolete config ARM_GIC_V3_ITS_PCILukas Bulwahn2024-10-021-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b5712bf89b4b ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]") moves the functionality of irq-gic-v3-its-pci-msi.c into irq-gic-v3-its-msi-parent.c, and drops the former file. With that, the config option ARM_GIC_V3_ITS_PCI is obsolete, but the definition of that config was not removed in the commit above. Remove this obsolete config ARM_GIC_V3_ITS_PCI. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240926125502.363364-1-lukas.bulwahn@redhat.com
* | | Merge tag 'for-linus-6.12a-rc4-tag' of ↵Linus Torvalds2024-10-204-7/+35
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A single fix for a build failure introduced this merge window" * tag 'for-linus-6.12a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Remove dependency between pciback and privcmd
| * | | xen: Remove dependency between pciback and privcmdJiqian Chen2024-10-184-7/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2fae6bb7be32 ("xen/privcmd: Add new syscall to get gsi from dev") adds a weak reverse dependency to the config XEN_PRIVCMD definition, that dependency causes xen-privcmd can't be loaded on domU, because dependent xen-pciback isn't always be loaded successfully on domU. To solve above problem, remove that dependency, and do not call pcistub_get_gsi_from_sbdf() directly, instead add a hook in drivers/xen/apci.c, xen-pciback register the real call function, then in privcmd_ioctl_pcidev_get_gsi call that hook. Fixes: 2fae6bb7be32 ("xen/privcmd: Add new syscall to get gsi from dev") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com> Message-ID: <20241012084537.1543059-1-Jiqian.Chen@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2024-10-195-42/+45
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Fixes all in drivers. The largest is the mpi3mr which corrects a phy count limit that should only apply to the controller but was being incorrectly applied to expander phys" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: core: Fix null-ptr-deref in target_alloc_device() scsi: mpi3mr: Validate SAS port assignments scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down scsi: ufs: core: Requeue aborted request scsi: ufs: core: Fix the issue of ICU failure
| * | | | scsi: target: core: Fix null-ptr-deref in target_alloc_device()Wang Hai2024-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a null-ptr-deref issue reported by KASAN: BUG: KASAN: null-ptr-deref in target_alloc_device+0xbc4/0xbe0 [target_core_mod] ... kasan_report+0xb9/0xf0 target_alloc_device+0xbc4/0xbe0 [target_core_mod] core_dev_setup_virtual_lun0+0xef/0x1f0 [target_core_mod] target_core_init_configfs+0x205/0x420 [target_core_mod] do_one_initcall+0xdd/0x4e0 ... entry_SYSCALL_64_after_hwframe+0x76/0x7e In target_alloc_device(), if allocing memory for dev queues fails, then dev will be freed by dev->transport->free_device(), but dev->transport is not initialized at that time, which will lead to a null pointer reference problem. Fixing this bug by freeing dev with hba->backend->ops->free_device(). Fixes: 1526d9f10c61 ("scsi: target: Make state_list per CPU") Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://lore.kernel.org/r/20241011113444.40749-1-wanghai38@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | scsi: mpi3mr: Validate SAS port assignmentsRanjan Kumar2024-10-162-17/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A sanity check on phy_mask was added in commit 3668651def2c ("scsi: mpi3mr: Sanitise num_phys"). This causes warning messages when more than 64 phys are detected and devices connected to phys greater than 64 are dropped. The phy_mask bitmap is only needed for controller phys and not required for expander phys. Controller phys can go up to a maximum of 64 and therefore u64 is good enough to contain phy_mask bitmap. To suppress those warnings and allow devices to be discovered as before the offending commit, restrict the phy_mask setting and lowest phy setting only to the controller phys. Fixes: 3668651def2c ("scsi: mpi3mr: Sanitise num_phys") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410051943.Mp9o5DlF-lkp@intel.com/ Reported-by: Alexander Motin <mav@ixsystems.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241008074353.200379-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut downSeunghwan Baek2024-10-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a history of deadlock if reboot is performed at the beginning of booting. SDEV_QUIESCE was set for all LU's scsi_devices by UFS shutdown, and at that time the audio driver was waiting on blk_mq_submit_bio() holding a mutex_lock while reading the fw binary. After that, a deadlock issue occurred while audio driver shutdown was waiting for mutex_unlock of blk_mq_submit_bio(). To solve this, set SDEV_OFFLINE for all LUs except WLUN, so that any I/O that comes down after a UFS shutdown will return an error. [ 31.907781]I[0: swapper/0: 0] 1 130705007 1651079834 11289729804 0 D( 2) 3 ffffff882e208000 * init [device_shutdown] [ 31.907793]I[0: swapper/0: 0] Mutex: 0xffffff8849a2b8b0: owner[0xffffff882e28cb00 kworker/6:0 :49] [ 31.907806]I[0: swapper/0: 0] Call trace: [ 31.907810]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.907819]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.907826]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.907834]I[0: swapper/0: 0] schedule_preempt_disabled+0x24/0x40 [ 31.907842]I[0: swapper/0: 0] __mutex_lock+0x408/0xdac [ 31.907849]I[0: swapper/0: 0] __mutex_lock_slowpath+0x14/0x24 [ 31.907858]I[0: swapper/0: 0] mutex_lock+0x40/0xec [ 31.907866]I[0: swapper/0: 0] device_shutdown+0x108/0x280 [ 31.907875]I[0: swapper/0: 0] kernel_restart+0x4c/0x11c [ 31.907883]I[0: swapper/0: 0] __arm64_sys_reboot+0x15c/0x280 [ 31.907890]I[0: swapper/0: 0] invoke_syscall+0x70/0x158 [ 31.907899]I[0: swapper/0: 0] el0_svc_common+0xb4/0xf4 [ 31.907909]I[0: swapper/0: 0] do_el0_svc+0x2c/0xb0 [ 31.907918]I[0: swapper/0: 0] el0_svc+0x34/0xe0 [ 31.907928]I[0: swapper/0: 0] el0t_64_sync_handler+0x68/0xb4 [ 31.907937]I[0: swapper/0: 0] el0t_64_sync+0x1a0/0x1a4 [ 31.908774]I[0: swapper/0: 0] 49 0 11960702 11236868007 0 D( 2) 6 ffffff882e28cb00 * kworker/6:0 [__bio_queue_enter] [ 31.908783]I[0: swapper/0: 0] Call trace: [ 31.908788]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.908796]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.908803]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.908811]I[0: swapper/0: 0] __bio_queue_enter+0xb8/0x178 [ 31.908818]I[0: swapper/0: 0] blk_mq_submit_bio+0x194/0x67c [ 31.908827]I[0: swapper/0: 0] __submit_bio+0xb8/0x19c Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek <sh8267.baek@samsung.com> Link: https://lore.kernel.org/r/20240829093913.6282-2-sh8267.baek@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | scsi: ufs: core: Requeue aborted requestPeter Wang2024-10-161-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the SQ cleanup fix, the CQ will receive a response with the corresponding tag marked as OCS: ABORTED. To align with the behavior of Legacy SDB mode, the handling of OCS: ABORTED has been changed to match that of OCS_INVALID_COMMAND_STATUS (SDB), with both returning a SCSI result of DID_REQUEUE. Furthermore, the workaround implemented before the SQ cleanup fix can be removed. Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-3-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | scsi: ufs: core: Fix the issue of ICU failurePeter Wang2024-10-161-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When setting the ICU bit without using read-modify-write, SQRTCy will restart SQ again and receive an RTC return error code 2 (Failure - SQ not stopped). Additionally, the error log has been modified so that this type of error can be observed. Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-2-peter.wang@mediatek.com Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | | Merge tag 'input-for-v6.12-rc3' of ↵Linus Torvalds2024-10-192-12/+25
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix for Zinitix driver to not fail probing if the property enabling touch keys functionality is not defined. Support for touch keys was added in 6.12 merge window so this issue does not affect users of released kernels - a couple new vendor/device IDs in xpad driver to enable support for more hardware * tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: zinitix - don't fail if linux,keycodes prop is absent Input: xpad - add support for MSI Claw A1M Input: xpad - add support for 8BitDo Ultimate 2C Wireless Controller
| * | | | | Input: zinitix - don't fail if linux,keycodes prop is absentNikita Travkin2024-10-191-12/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When initially adding the touchkey support, a mistake was made in the property parsing code. The possible negative errno from device_property_count_u32() was never checked, which was an oversight left from converting to it from the of_property as part of the review fixes. Re-add the correct handling of the absent property, in which case zero touchkeys should be assumed, which would disable the feature. Reported-by: Jakob Hauser <jahau@rocketmail.com> Tested-by: Jakob Hauser <jahau@rocketmail.com> Fixes: 075d9b22c8fe ("Input: zinitix - add touchkey support") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Nikita Travkin <nikita@trvn.ru> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com> Link: https://lore.kernel.org/r/20241004-zinitix-no-keycodes-v2-1-876dc9fea4b6@trvn.ru Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
| * | | | | Input: xpad - add support for MSI Claw A1MJohn Edwards2024-10-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add MSI Claw A1M controller to xpad_device match table when in xinput mode. Add MSI VID as XPAD_XBOX360_VENDOR. Signed-off-by: John Edwards <uejji@uejji.net> Reviewed-by: Derek J. Clark <derekjohn.clark@gmail.com> Reviewed-by: Christopher Snowhill <kode54@gmail.com> Link: https://lore.kernel.org/r/20241010232020.3292284-4-uejji@uejji.net Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
| * | | | | Input: xpad - add support for 8BitDo Ultimate 2C Wireless ControllerStefan Kerkmann2024-10-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This XBOX360 compatible gamepad uses the new product id 0x310a under the 8BitDo's vendor id 0x2dc8. The change was tested using the gamepad in a wired and wireless dongle configuration. Signed-off-by: Stefan Kerkmann <s.kerkmann@pengutronix.de> Link: https://lore.kernel.org/r/20241015-8bitdo_2c_ultimate_wireless-v1-1-9c9f9db2e995@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
* | | | | | Merge tag 'block-6.12-20241018' of git://git.kernel.dk/linuxLinus Torvalds2024-10-1912-87/+124
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Fix target passthrough identifier (Nilay) - Fix tcp locking (Hannes) - Replace list with sbitmap for tracking RDMA rsp tags (Guixen) - Remove unnecessary fallthrough statements (Tokunori) - Remove ready-without-media support (Greg) - Fix multipath partition scan deadlock (Keith) - Fix concurrent PCI reset and remove queue mapping (Maurizio) - Fabrics shutdown fixes (Nilay) - Fix for a kerneldoc warning (Keith) - Fix a race with blk-rq-qos and wakeups (Omar) - Cleanup of checking for always-set tag_set (SurajSonawane2415) - Fix for a crash with CPU hotplug notifiers (Ming) - Don't allow zero-copy ublk on unprivileged device (Ming) - Use array_index_nospec() for CDROM (Josh) - Remove dead code in drbd (David) - Tweaks to elevator loading (Breno) * tag 'block-6.12-20241018' of git://git.kernel.dk/linux: cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed() nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function nvme: make keep-alive synchronous operation nvme-loop: flush off pending I/O while shutting down loop controller nvme-pci: fix race condition between reset and nvme_dev_disable() ublk: don't allow user copy for unprivileged device blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race nvme-multipath: defer partition scanning blk-mq: setup queue ->tag_set before initializing hctx elevator: Remove argument from elevator_find_get elevator: do not request_module if elevator exists drbd: Remove unused conn_lowest_minor nvme: disable CC.CRIME (NVME_CC_CRIME) nvme: delete unnecessary fallthru comment nvmet-rdma: use sbitmap to replace rsp free list block: Fix elevator_get_default() checking for NULL q->tag_set nvme: tcp: avoid race between queue_lock lock and destroy nvmet-passthru: clear EUID/NGUID/UUID while using loop target block: fix blk_rq_map_integrity_sg kernel-doc
| * | | | | | cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed()Josh Poimboeuf2024-10-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The barrier_nospec() after the array bounds check is overkill and painfully slow for arches which implement it. Furthermore, most arches don't implement it, so they remain exposed to Spectre v1 (which can affect pretty much any CPU with branch prediction). Instead, clamp the user pointer to a valid range so it's guaranteed to be a valid array index even when the bounds check mispredicts. Fixes: 8270cb10c068 ("cdrom: Fix spectre-v1 gadget") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/1d86f4d9d8fba68e5ca64cdeac2451b95a8bf872.1729202937.git.jpoimboe@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | Merge tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme into block-6.12Jens Axboe2024-10-178-70/+113
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe fixes from Keith: "nvme fixes for Linux 6.12 - Fix target passthrough identifier (Nilay) - Fix tcp locking (Hannes) - Replace list with sbitmap for tracking RDMA rsp tags (Guixen) - Remove unnecessary fallthrough statements (Tokunori) - Remove ready-without-media support (Greg) - Fix multipath partition scan deadlock (Keith) - Fix concurrent PCI reset and remove queue mapping (Maurizio) - Fabrics shutdown fixes (Nilay)" * tag 'nvme-6.12-2024-10-18' of git://git.infradead.org/nvme: nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function nvme: make keep-alive synchronous operation nvme-loop: flush off pending I/O while shutting down loop controller nvme-pci: fix race condition between reset and nvme_dev_disable() nvme-multipath: defer partition scanning nvme: disable CC.CRIME (NVME_CC_CRIME) nvme: delete unnecessary fallthru comment nvmet-rdma: use sbitmap to replace rsp free list nvme: tcp: avoid race between queue_lock lock and destroy nvmet-passthru: clear EUID/NGUID/UUID while using loop target block: fix blk_rq_map_integrity_sg kernel-doc
| | * | | | | | nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish functionNilay Shroff2024-10-171-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We no more need acquiring ctrl->lock before accessing the NVMe controller state and instead we can now use the helper nvme_ctrl_state. So replace the use of ctrl->lock from nvme_keep_alive_finish function with nvme_ctrl_state call. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme: make keep-alive synchronous operationNilay Shroff2024-10-171-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvme keep-alive operation, which executes at a periodic interval, could potentially sneak in while shutting down a fabric controller. This may lead to a race between the fabric controller admin queue destroy code path (invoked while shutting down controller) and hw/hctx queue dispatcher called from the nvme keep-alive async request queuing operation. This race could lead to the kernel crash shown below: Call Trace: autoremove_wake_function+0x0/0xbc (unreliable) __blk_mq_sched_dispatch_requests+0x114/0x24c blk_mq_sched_dispatch_requests+0x44/0x84 blk_mq_run_hw_queue+0x140/0x220 nvme_keep_alive_work+0xc8/0x19c [nvme_core] process_one_work+0x200/0x4e0 worker_thread+0x340/0x504 kthread+0x138/0x140 start_kernel_thread+0x14/0x18 While shutting down fabric controller, if nvme keep-alive request sneaks in then it would be flushed off. The nvme_keep_alive_end_io function is then invoked to handle the end of the keep-alive operation which decrements the admin->q_usage_counter and assuming this is the last/only request in the admin queue then the admin->q_usage_counter becomes zero. If that happens then blk-mq destroy queue operation (blk_mq_destroy_ queue()) which could be potentially running simultaneously on another cpu (as this is the controller shutdown code path) would forward progress and deletes the admin queue. So, now from this point onward we are not supposed to access the admin queue resources. However the issue here's that the nvme keep-alive thread running hw/hctx queue dispatch operation hasn't yet finished its work and so it could still potentially access the admin queue resource while the admin queue had been already deleted and that causes the above crash. This fix helps avoid the observed crash by implementing keep-alive as a synchronous operation so that we decrement admin->q_usage_counter only after keep-alive command finished its execution and returns the command status back up to its caller (blk_execute_rq()). This would ensure that fabric shutdown code path doesn't destroy the fabric admin queue until keep-alive request finished execution and also keep-alive thread is not running hw/hctx queue dispatch operation. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme-loop: flush off pending I/O while shutting down loop controllerNilay Shroff2024-10-171-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While shutting down loop controller, we first quiesce the admin/IO queue, delete the admin/IO tag-set and then at last destroy the admin/IO queue. However it's quite possible that during the window between quiescing and destroying of the admin/IO queue, some admin/IO request might sneak in and if that happens then we could potentially encounter a hung task because shutdown operation can't forward progress until any pending I/O is flushed off. This commit helps ensure that before destroying the admin/IO queue, we unquiesce the admin/IO queue so that any outstanding requests, which are added after the admin/IO queue is quiesced, are now flushed to its completion. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme-pci: fix race condition between reset and nvme_dev_disable()Maurizio Lombardi2024-10-171-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme_dev_disable() modifies the dev->online_queues field, therefore nvme_pci_update_nr_queues() should avoid racing against it, otherwise we could end up passing invalid values to blk_mq_update_nr_hw_queues(). WARNING: CPU: 39 PID: 61303 at drivers/pci/msi/api.c:347 pci_irq_get_affinity+0x187/0x210 Workqueue: nvme-reset-wq nvme_reset_work [nvme] RIP: 0010:pci_irq_get_affinity+0x187/0x210 Call Trace: <TASK> ? blk_mq_pci_map_queues+0x87/0x3c0 ? pci_irq_get_affinity+0x187/0x210 blk_mq_pci_map_queues+0x87/0x3c0 nvme_pci_map_queues+0x189/0x460 [nvme] blk_mq_update_nr_hw_queues+0x2a/0x40 nvme_reset_work+0x1be/0x2a0 [nvme] Fix the bug by locking the shutdown_lock mutex before using dev->online_queues. Give up if nvme_dev_disable() is running or if it has been executed already. Fixes: 949928c1c731 ("NVMe: Fix possible queue use after freed") Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme-multipath: defer partition scanningKeith Busch2024-10-152-6/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to suppress the partition scan from occuring within the controller's scan_work context. If a path error occurs here, the IO will wait until a path becomes available or all paths are torn down, but that action also occurs within scan_work, so it would deadlock. Defer the partion scan to a different context that does not block scan_work. Reported-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme: disable CC.CRIME (NVME_CC_CRIME)Greg Joyce2024-10-091-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable NVME_CC_CRIME so that CSTS.RDY indicates that the media is ready and able to handle commands without returning NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY. Signed-off-by: Greg Joyce <gjoyce@linux.ibm.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Tested-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvme: delete unnecessary fallthru commentTokunori Ikegami2024-10-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Tokunori Ikegami <ikegami.t@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvmet-rdma: use sbitmap to replace rsp free listGuixin Liu2024-10-081-29/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can use sbitmap to manage all the nvmet_rdma_rsp instead of using free lists and spinlock, and we can use an additional tag to determine whether the nvmet_rdma_rsp is extra allocated. In addition, performance has improved: 1. testing environment is local rxe rdma devie and mem-based backstore device. 2. fio command, test the average 5 times: fio -filename=/dev/nvme0n1 --ioengine=libaio -direct=1 -size=1G -name=1 -thread -runtime=60 -time_based -rw=read -numjobs=16 -iodepth=128 -bs=4k -group_reporting 3. Before: 241k IOPS, After: 256k IOPS, an increase of about 5%. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Jens Axboe <axboe@kernel.dk>
| | * | | | | | nvme: tcp: avoid race between queue_lock lock and destroyHannes Reinecke2024-10-031-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 76d54bf20cdc ("nvme-tcp: don't access released socket during error recovery") added a mutex_lock() call for the queue->queue_lock in nvme_tcp_get_address(). However, the mutex_lock() races with mutex_destroy() in nvme_tcp_free_queue(), and causes the WARN below. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 34077 at kernel/locking/mutex.c:587 __mutex_lock+0xcf0/0x1220 Modules linked in: nvmet_tcp nvmet nvme_tcp nvme_fabrics iw_cm ib_cm ib_core pktcdvd nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr sunrpc ppdev 9pnet_virtio 9pnet pcspkr netfs parport_pc parport e1000 i2c_piix4 i2c_smbus loop fuse nfnetlink zram bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper xfs drm sym53c8xx floppy nvme scsi_transport_spi nvme_core nvme_auth serio_raw ata_generic pata_acpi dm_multipath qemu_fw_cfg [last unloaded: ib_uverbs] CPU: 3 UID: 0 PID: 34077 Comm: udisksd Not tainted 6.11.0-rc7 #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:__mutex_lock+0xcf0/0x1220 Code: 08 84 d2 0f 85 c8 04 00 00 8b 15 ef b6 c8 01 85 d2 0f 85 78 f4 ff ff 48 c7 c6 20 93 ee af 48 c7 c7 60 91 ee af e8 f0 a7 6d fd <0f> 0b e9 5e f4 ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 RSP: 0018:ffff88811305f760 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88812c652058 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: ffff88811305f8b0 R08: 0000000000000001 R09: ffffed1075c36341 R10: ffff8883ae1b1a0b R11: 0000000000010498 R12: 0000000000000000 R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88812c652058 FS: 00007f9713ae4980(0000) GS:ffff8883ae180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fcd78483c7c CR3: 0000000122c38000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn.cold+0x5b/0x1af ? __mutex_lock+0xcf0/0x1220 ? report_bug+0x1ec/0x390 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? __mutex_lock+0xcf0/0x1220 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx___mutex_lock+0x10/0x10 ? __lock_acquire+0xd6a/0x59e0 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx_nvme_tcp_get_address+0x10/0x10 [nvme_tcp] nvme_sysfs_show_address+0x81/0xc0 [nvme_core] dev_attr_show+0x42/0x80 ? __asan_memset+0x1f/0x40 sysfs_kf_seq_show+0x1f0/0x370 seq_read_iter+0x2cb/0x1130 ? rw_verify_area+0x3b1/0x590 ? __mutex_lock+0x433/0x1220 vfs_read+0x6a6/0xa20 ? lockdep_hardirqs_on+0x78/0x100 ? __pfx_vfs_read+0x10/0x10 ksys_read+0xf7/0x1d0 ? __pfx_ksys_read+0x10/0x10 ? __x64_sys_openat+0x105/0x1d0 do_syscall_64+0x93/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? __pfx_ksys_read+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? do_syscall_64+0x9f/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f9713f55cfa Code: 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 e8 74 f8 ff 48 8b 55 e8 48 8b 75 f0 41 89 c0 8b 7d f8 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 2e 44 89 c7 48 89 45 f8 e8 42 75 f8 ff 48 8b RSP: 002b:00007ffd7f512e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000055c38f316859 RCX: 00007f9713f55cfa RDX: 0000000000000fff RSI: 00007ffd7f512eb0 RDI: 0000000000000011 RBP: 00007ffd7f512e90 R08: 0000000000000000 R09: 00000000ffffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000055c38f317148 R13: 0000000000000000 R14: 00007f96f4004f30 R15: 000055c3b6b623c0 </TASK> The WARN is observed when the blktests test case nvme/014 is repeated with tcp transport. It is rare, and 200 times repeat is required to recreate in some test environments. To avoid the WARN, check the NVME_TCP_Q_LIVE flag before locking queue->queue_lock. The flag is cleared long time before the lock gets destroyed. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| | * | | | | | nvmet-passthru: clear EUID/NGUID/UUID while using loop targetNilay Shroff2024-10-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When nvme passthru is configured using loop target, the clear_ids attribute is, by default, set to true. This attribute would ensure that EUID/NGUID/UUID is cleared for the loop passthru target. The newer NVMe disk supporting the NVMe spec 1.3 or higher, typically, implements the support for "Namespace Identification Descriptor list" command. This command when issued from host returns EUID/NGUID/UUID assigned to the inquired namespace. Not clearing these values, while using nvme passthru using loop target, would result in NVMe host driver rejecting the namespace. This check was implemented in the commit 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique"). The fix implemented in this commit ensure that when host issues ns-id descriptor list command, the EUID/NGUID/UUID are cleared by passthru target. In fact, the function nvmet_passthru_override_id_descs() which clears those unique ids already exits, so we just need to ensure that ns-id descriptor list command falls through the corretc code path. And while we're at it, we also combines the three passthru admin command cases together which shares the same code. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>