From 6feb37b3b06e9049e20dcf7e23998f92c9c5be9a Mon Sep 17 00:00:00 2001 From: Chengfeng Ye Date: Tue, 27 Jun 2023 12:03:40 +0000 Subject: sctp: fix potential deadlock on &net->sctp.addr_wq_lock As &net->sctp.addr_wq_lock is also acquired by the timer sctp_addr_wq_timeout_handler() in protocal.c, the same lock acquisition at sctp_auto_asconf_init() seems should disable irq since it is called from sctp_accept() under process context. Possible deadlock scenario: sctp_accept() -> sctp_sock_migrate() -> sctp_auto_asconf_init() -> spin_lock(&net->sctp.addr_wq_lock) -> sctp_addr_wq_timeout_handler() -> spin_lock_bh(&net->sctp.addr_wq_lock); (deadlock here) This flaw was found using an experimental static analysis tool we are developing for irq-related deadlock. The tentative patch fix the potential deadlock by spin_lock_bh(). Signed-off-by: Chengfeng Ye Fixes: 34e5b0118685 ("sctp: delay auto_asconf init until binding the first addr") Acked-by: Xin Long Link: https://lore.kernel.org/r/20230627120340.19432-1-dg573847474@gmail.com Signed-off-by: Paolo Abeni --- net/sctp/socket.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'net') diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 6554a357fe33..9388d98aebc0 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -364,9 +364,9 @@ static void sctp_auto_asconf_init(struct sctp_sock *sp) struct net *net = sock_net(&sp->inet.sk); if (net->sctp.default_auto_asconf) { - spin_lock(&net->sctp.addr_wq_lock); + spin_lock_bh(&net->sctp.addr_wq_lock); list_add_tail(&sp->auto_asconf_list, &net->sctp.auto_asconf_splist); - spin_unlock(&net->sctp.addr_wq_lock); + spin_unlock_bh(&net->sctp.addr_wq_lock); sp->do_auto_asconf = 1; } } -- cgit v1.2.3 From b4ee93380b3c891fea996af8d1d3ca0e36ad31f0 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 27 Jun 2023 14:38:11 +0200 Subject: net/sched: act_ipt: add sanity checks on table name and hook locations Looks like "tc" hard-codes "mangle" as the only supported table name, but on kernel side there are no checks. This is wrong. Not all xtables targets are safe to call from tc. E.g. "nat" targets assume skb has a conntrack object assigned to it. Normally those get called from netfilter nat core which consults the nat table to obtain the address mapping. "tc" userspace either sets PRE or POSTROUTING as hook number, but there is no validation of this on kernel side, so update netlink policy to reject bogus numbers. Some targets may assume skb_dst is set for input/forward hooks, so prevent those from being used. act_ipt uses the hook number in two places: 1. the state hook number, this is fine as-is 2. to set par.hook_mask The latter is a bit mask, so update the assignment to make xt_check_target() to the right thing. Followup patch adds required checks for the skb/packet headers before calling the targets evaluation function. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal Reviewed-by: Simon Horman Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni --- net/sched/act_ipt.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) (limited to 'net') diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 5d96ffebd40f..ea7f151e7dd2 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -48,7 +48,7 @@ static int ipt_init_target(struct net *net, struct xt_entry_target *t, par.entryinfo = &e; par.target = target; par.targinfo = t->data; - par.hook_mask = hook; + par.hook_mask = 1 << hook; par.family = NFPROTO_IPV4; ret = xt_check_target(&par, t->u.target_size - sizeof(*t), 0, false); @@ -85,7 +85,8 @@ static void tcf_ipt_release(struct tc_action *a) static const struct nla_policy ipt_policy[TCA_IPT_MAX + 1] = { [TCA_IPT_TABLE] = { .type = NLA_STRING, .len = IFNAMSIZ }, - [TCA_IPT_HOOK] = { .type = NLA_U32 }, + [TCA_IPT_HOOK] = NLA_POLICY_RANGE(NLA_U32, NF_INET_PRE_ROUTING, + NF_INET_NUMHOOKS), [TCA_IPT_INDEX] = { .type = NLA_U32 }, [TCA_IPT_TARG] = { .len = sizeof(struct xt_entry_target) }, }; @@ -158,15 +159,27 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla, return -EEXIST; } } + + err = -EINVAL; hook = nla_get_u32(tb[TCA_IPT_HOOK]); + switch (hook) { + case NF_INET_PRE_ROUTING: + break; + case NF_INET_POST_ROUTING: + break; + default: + goto err1; + } + + if (tb[TCA_IPT_TABLE]) { + /* mangle only for now */ + if (nla_strcmp(tb[TCA_IPT_TABLE], "mangle")) + goto err1; + } - err = -ENOMEM; - tname = kmalloc(IFNAMSIZ, GFP_KERNEL); + tname = kstrdup("mangle", GFP_KERNEL); if (unlikely(!tname)) goto err1; - if (tb[TCA_IPT_TABLE] == NULL || - nla_strscpy(tname, tb[TCA_IPT_TABLE], IFNAMSIZ) >= IFNAMSIZ) - strcpy(tname, "mangle"); t = kmemdup(td, td->u.target_size, GFP_KERNEL); if (unlikely(!t)) -- cgit v1.2.3 From b2dc32dcba08bf55cec600caa76f4afd2e3614df Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 27 Jun 2023 14:38:12 +0200 Subject: net/sched: act_ipt: add sanity checks on skb before calling target Netfilter targets make assumptions on the skb state, for example iphdr is supposed to be in the linear area. This is normally done by IP stack, but in act_ipt case no such checks are made. Some targets can even assume that skb_dst will be valid. Make a minimum effort to check for this: - Don't call the targets eval function for non-ipv4 skbs. - Don't call the targets eval function for POSTROUTING emulation when the skb has no dst set. v3: use skb_protocol helper (Davide Caratti) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal Reviewed-by: Simon Horman Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni --- net/sched/act_ipt.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) (limited to 'net') diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index ea7f151e7dd2..a6b522b512dc 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -230,6 +230,26 @@ static int tcf_xt_init(struct net *net, struct nlattr *nla, a, &act_xt_ops, tp, flags); } +static bool tcf_ipt_act_check(struct sk_buff *skb) +{ + const struct iphdr *iph; + unsigned int nhoff, len; + + if (!pskb_may_pull(skb, sizeof(struct iphdr))) + return false; + + nhoff = skb_network_offset(skb); + iph = ip_hdr(skb); + if (iph->ihl < 5 || iph->version != 4) + return false; + + len = skb_ip_totlen(skb); + if (skb->len < nhoff + len || len < (iph->ihl * 4u)) + return false; + + return pskb_may_pull(skb, iph->ihl * 4u); +} + TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, const struct tc_action *a, struct tcf_result *res) @@ -244,9 +264,22 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, .pf = NFPROTO_IPV4, }; + if (skb_protocol(skb, false) != htons(ETH_P_IP)) + return TC_ACT_UNSPEC; + if (skb_unclone(skb, GFP_ATOMIC)) return TC_ACT_UNSPEC; + if (!tcf_ipt_act_check(skb)) + return TC_ACT_UNSPEC; + + if (state.hook == NF_INET_POST_ROUTING) { + if (!skb_dst(skb)) + return TC_ACT_UNSPEC; + + state.out = skb->dev; + } + spin_lock(&ipt->tcf_lock); tcf_lastuse_update(&ipt->tcf_tm); -- cgit v1.2.3 From 93d75d475c5dc3404292976147d063ee4d808592 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 27 Jun 2023 14:38:13 +0200 Subject: net/sched: act_ipt: zero skb->cb before calling target xtables relies on skb being owned by ip stack, i.e. with ipv4 check in place skb->cb is supposed to be IPCB. I don't see an immediate problem (REJECT target cannot be used anymore now that PRE/POSTROUTING hook validation has been fixed), but better be safe than sorry. A much better patch would be to either mark act_ipt as "depends on BROKEN" or remove it altogether. I plan to do this for -next in the near future. This tc extension is broken in the sense that tc lacks an equivalent of NF_STOLEN verdict. With NF_STOLEN, target function takes complete ownership of skb, caller cannot dereference it anymore. ACT_STOLEN cannot be used for this: it has a different meaning, caller is allowed to dereference the skb. At this time NF_STOLEN won't be returned by any targets as far as I can see, but this may change in the future. It might be possible to work around this via list of allowed target extensions known to only return DROP or ACCEPT verdicts, but this is error prone/fragile. Existing selftest only validates xt_LOG and act_ipt is restricted to ipv4 so I don't think this action is used widely. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Florian Westphal Reviewed-by: Simon Horman Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni --- net/sched/act_ipt.c | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'net') diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index a6b522b512dc..598d6e299152 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -21,6 +21,7 @@ #include #include #include +#include #include @@ -254,6 +255,7 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, const struct tc_action *a, struct tcf_result *res) { + char saved_cb[sizeof_field(struct sk_buff, cb)]; int ret = 0, result = 0; struct tcf_ipt *ipt = to_ipt(a); struct xt_action_param par; @@ -280,6 +282,8 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, state.out = skb->dev; } + memcpy(saved_cb, skb->cb, sizeof(saved_cb)); + spin_lock(&ipt->tcf_lock); tcf_lastuse_update(&ipt->tcf_tm); @@ -292,6 +296,9 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, par.state = &state; par.target = ipt->tcfi_t->u.kernel.target; par.targinfo = ipt->tcfi_t->data; + + memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); + ret = par.target->target(skb, &par); switch (ret) { @@ -312,6 +319,9 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb, break; } spin_unlock(&ipt->tcf_lock); + + memcpy(skb->cb, saved_cb, sizeof(skb->cb)); + return result; } -- cgit v1.2.3 From c1ae02d876898b1b8ca1e12c6f84d7b406263800 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Tue, 27 Jun 2023 12:42:07 +0300 Subject: net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT Currently the sja1105 tagging protocol prefers using the source port information from the VLAN header if that is available, falling back to the INCL_SRCPT option if it isn't. The VLAN header is available for all frames except for META frames initiated by the switch (containing RX timestamps), and thus, the "if (is_link_local)" branch is practically dead. The tag_8021q source port identification has become more loose ("imprecise") and will report a plausible rather than exact bridge port, when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local traffic always needs to know the precise source port. With incorrect source port reporting, for example PTP traffic over 2 bridged ports will all be seen on sockets opened on the first such port, which is incorrect. Now that the tagging protocol has been changed to make link-local frames always contain source port information, we can reverse the order of the checks so that we always give precedence to that information (which is always precise) in lieu of the tag_8021q VID which is only precise for a standalone port. Fixes: d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID") Fixes: 91495f21fcec ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging") Signed-off-by: Vladimir Oltean Reviewed-by: Simon Horman Signed-off-by: Paolo Abeni --- net/dsa/tag_sja1105.c | 38 +++++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-) (limited to 'net') diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c index a5f3b73da417..92a626a05e82 100644 --- a/net/dsa/tag_sja1105.c +++ b/net/dsa/tag_sja1105.c @@ -545,10 +545,7 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb, is_link_local = sja1105_is_link_local(skb); is_meta = sja1105_is_meta_frame(skb); - if (sja1105_skb_has_tag_8021q(skb)) { - /* Normal traffic path. */ - sja1105_vlan_rcv(skb, &source_port, &switch_id, &vbid, &vid); - } else if (is_link_local) { + if (is_link_local) { /* Management traffic path. Switch embeds the switch ID and * port ID into bytes of the destination MAC, courtesy of * the incl_srcpt options. @@ -562,16 +559,39 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb, sja1105_meta_unpack(skb, &meta); source_port = meta.source_port; switch_id = meta.switch_id; - } else { + } + + /* Normal data plane traffic and link-local frames are tagged with + * a tag_8021q VLAN which we have to strip + */ + if (sja1105_skb_has_tag_8021q(skb)) { + int tmp_source_port = -1, tmp_switch_id = -1; + + sja1105_vlan_rcv(skb, &tmp_source_port, &tmp_switch_id, &vbid, + &vid); + /* Preserve the source information from the INCL_SRCPT option, + * if available. This allows us to not overwrite a valid source + * port and switch ID with zeroes when receiving link-local + * frames from a VLAN-unaware bridged port (non-zero vbid) or a + * VLAN-aware bridged port (non-zero vid). + */ + if (source_port == -1) + source_port = tmp_source_port; + if (switch_id == -1) + switch_id = tmp_switch_id; + } else if (source_port == -1 && switch_id == -1) { + /* Packets with no source information have no chance of + * getting accepted, drop them straight away. + */ return NULL; } - if (vbid >= 1) + if (source_port != -1 && switch_id != -1) + skb->dev = dsa_master_find_slave(netdev, switch_id, source_port); + else if (vbid >= 1) skb->dev = dsa_tag_8021q_find_port_by_vbid(netdev, vbid); - else if (source_port == -1 || switch_id == -1) - skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid); else - skb->dev = dsa_master_find_slave(netdev, switch_id, source_port); + skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid); if (!skb->dev) { netdev_warn(netdev, "Couldn't decode source port\n"); return NULL; -- cgit v1.2.3 From f752a0b334bb95fe9b42ecb511e0864e2768046f Mon Sep 17 00:00:00 2001 From: Zhengping Jiang Date: Wed, 24 May 2023 17:04:15 -0700 Subject: Bluetooth: L2CAP: Fix use-after-free Fix potential use-after-free in l2cap_le_command_rej. Signed-off-by: Zhengping Jiang Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/l2cap_core.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'net') diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index c5e8798e297c..17ca13e8c044 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -6374,9 +6374,14 @@ static inline int l2cap_le_command_rej(struct l2cap_conn *conn, if (!chan) goto done; + chan = l2cap_chan_hold_unless_zero(chan); + if (!chan) + goto done; + l2cap_chan_lock(chan); l2cap_chan_del(chan, ECONNREFUSED); l2cap_chan_unlock(chan); + l2cap_chan_put(chan); done: mutex_unlock(&conn->chan_lock); -- cgit v1.2.3 From 0cb7365850bacb8c2a9975cae672d65714d8daa1 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Wed, 31 May 2023 11:04:23 +0200 Subject: Bluetooth: fix invalid-bdaddr quirk for non-persistent setup Devices that lack persistent storage for the device address can indicate this by setting the HCI_QUIRK_INVALID_BDADDR which causes the controller to be marked as unconfigured until user space has set a valid address. Once configured, the device address must be set on every setup for controllers with HCI_QUIRK_NON_PERSISTENT_SETUP to avoid marking the controller as unconfigured and requiring the address to be set again. Fixes: 740011cfe948 ("Bluetooth: Add new quirk for non-persistent setup settings") Signed-off-by: Johan Hovold Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_sync.c | 28 +++++++++++----------------- 1 file changed, 11 insertions(+), 17 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index 804cde43b4e0..b5b1b610df33 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -4626,23 +4626,17 @@ static int hci_dev_setup_sync(struct hci_dev *hdev) invalid_bdaddr = test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks); if (!ret) { - if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) { - if (!bacmp(&hdev->public_addr, BDADDR_ANY)) - hci_dev_get_bd_addr_from_property(hdev); - - if (bacmp(&hdev->public_addr, BDADDR_ANY) && - hdev->set_bdaddr) { - ret = hdev->set_bdaddr(hdev, - &hdev->public_addr); - - /* If setting of the BD_ADDR from the device - * property succeeds, then treat the address - * as valid even if the invalid BD_ADDR - * quirk indicates otherwise. - */ - if (!ret) - invalid_bdaddr = false; - } + if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks) && + !bacmp(&hdev->public_addr, BDADDR_ANY)) + hci_dev_get_bd_addr_from_property(hdev); + + if ((invalid_bdaddr || + test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) && + bacmp(&hdev->public_addr, BDADDR_ANY) && + hdev->set_bdaddr) { + ret = hdev->set_bdaddr(hdev, &hdev->public_addr); + if (!ret) + invalid_bdaddr = false; } } -- cgit v1.2.3 From 6945795bc81ab7be22750ecfb365056688f2fada Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Wed, 31 May 2023 11:04:24 +0200 Subject: Bluetooth: fix use-bdaddr-property quirk Devices that lack persistent storage for the device address can indicate this by setting the HCI_QUIRK_INVALID_BDADDR which causes the controller to be marked as unconfigured until user space has set a valid address. The related HCI_QUIRK_USE_BDADDR_PROPERTY was later added to similarly indicate that the device lacks a valid address but that one may be specified in the devicetree. As is clear from commit 7a0e5b15ca45 ("Bluetooth: Add quirk for reading BD_ADDR from fwnode property") that added and documented this quirk and commits like de79a9df1692 ("Bluetooth: btqcomsmd: use HCI_QUIRK_USE_BDADDR_PROPERTY"), the device address of controllers with this flag should be treated as invalid until user space has had a chance to configure the controller in case the devicetree property is missing. As it does not make sense to allow controllers with invalid addresses, restore the original semantics, which also makes sure that the implementation is consistent (e.g. get_missing_options() indicates that the address must be set) and matches the documentation (including comments in the code, such as, "In case any of them is set, the controller has to start up as unconfigured."). Fixes: e668eb1e1578 ("Bluetooth: hci_core: Don't stop BT if the BD address missing in dts") Signed-off-by: Johan Hovold Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_sync.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index b5b1b610df33..8561616abbe5 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -4623,16 +4623,14 @@ static int hci_dev_setup_sync(struct hci_dev *hdev) * BD_ADDR invalid before creating the HCI device or in * its setup callback. */ - invalid_bdaddr = test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks); - + invalid_bdaddr = test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks) || + test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); if (!ret) { if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks) && !bacmp(&hdev->public_addr, BDADDR_ANY)) hci_dev_get_bd_addr_from_property(hdev); - if ((invalid_bdaddr || - test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) && - bacmp(&hdev->public_addr, BDADDR_ANY) && + if (invalid_bdaddr && bacmp(&hdev->public_addr, BDADDR_ANY) && hdev->set_bdaddr) { ret = hdev->set_bdaddr(hdev, &hdev->public_addr); if (!ret) -- cgit v1.2.3 From 1728137b33c00d5a2b5110ed7aafb42e7c32e4a1 Mon Sep 17 00:00:00 2001 From: Sungwoo Kim Date: Wed, 31 May 2023 01:39:56 -0400 Subject: Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb l2cap_sock_release(sk) frees sk. However, sk's children are still alive and point to the already free'd sk's address. To fix this, l2cap_sock_release(sk) also cleans sk's children. ================================================================== BUG: KASAN: use-after-free in l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650 Read of size 8 at addr ffff888104617aa8 by task kworker/u3:0/276 CPU: 0 PID: 276 Comm: kworker/u3:0 Not tainted 6.2.0-00001-gef397bd4d5fb-dirty #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci2 hci_rx_work Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x72/0x95 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:306 [inline] print_report+0x175/0x478 mm/kasan/report.c:417 kasan_report+0xb1/0x130 mm/kasan/report.c:517 l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650 l2cap_chan_ready+0x10e/0x1e0 net/bluetooth/l2cap_core.c:1386 l2cap_config_req+0x753/0x9f0 net/bluetooth/l2cap_core.c:4480 l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:5739 [inline] l2cap_sig_channel net/bluetooth/l2cap_core.c:6509 [inline] l2cap_recv_frame+0xe2e/0x43c0 net/bluetooth/l2cap_core.c:7788 l2cap_recv_acldata+0x6ed/0x7e0 net/bluetooth/l2cap_core.c:8506 hci_acldata_packet net/bluetooth/hci_core.c:3813 [inline] hci_rx_work+0x66e/0xbc0 net/bluetooth/hci_core.c:4048 process_one_work+0x4ea/0x8e0 kernel/workqueue.c:2289 worker_thread+0x364/0x8e0 kernel/workqueue.c:2436 kthread+0x1b9/0x200 kernel/kthread.c:376 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 Allocated by task 288: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0x82/0x90 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slab_common.c:968 [inline] __kmalloc+0x5a/0x140 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] sk_prot_alloc+0x113/0x1f0 net/core/sock.c:2040 sk_alloc+0x36/0x3c0 net/core/sock.c:2093 l2cap_sock_alloc.constprop.0+0x39/0x1c0 net/bluetooth/l2cap_sock.c:1852 l2cap_sock_create+0x10d/0x220 net/bluetooth/l2cap_sock.c:1898 bt_sock_create+0x183/0x290 net/bluetooth/af_bluetooth.c:132 __sock_create+0x226/0x380 net/socket.c:1518 sock_create net/socket.c:1569 [inline] __sys_socket_create net/socket.c:1606 [inline] __sys_socket_create net/socket.c:1591 [inline] __sys_socket+0x112/0x200 net/socket.c:1639 __do_sys_socket net/socket.c:1652 [inline] __se_sys_socket net/socket.c:1650 [inline] __x64_sys_socket+0x40/0x50 net/socket.c:1650 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 288: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2e/0x50 mm/kasan/generic.c:523 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free mm/kasan/common.c:200 [inline] __kasan_slab_free+0x10a/0x190 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1781 [inline] slab_free_freelist_hook mm/slub.c:1807 [inline] slab_free mm/slub.c:3787 [inline] __kmem_cache_free+0x88/0x1f0 mm/slub.c:3800 sk_prot_free net/core/sock.c:2076 [inline] __sk_destruct+0x347/0x430 net/core/sock.c:2168 sk_destruct+0x9c/0xb0 net/core/sock.c:2183 __sk_free+0x82/0x220 net/core/sock.c:2194 sk_free+0x7c/0xa0 net/core/sock.c:2205 sock_put include/net/sock.h:1991 [inline] l2cap_sock_kill+0x256/0x2b0 net/bluetooth/l2cap_sock.c:1257 l2cap_sock_release+0x1a7/0x220 net/bluetooth/l2cap_sock.c:1428 __sock_release+0x80/0x150 net/socket.c:650 sock_close+0x19/0x30 net/socket.c:1368 __fput+0x17a/0x5c0 fs/file_table.c:320 task_work_run+0x132/0x1c0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x21/0x50 kernel/entry/common.c:296 do_syscall_64+0x4c/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc The buggy address belongs to the object at ffff888104617800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 680 bytes inside of 1024-byte region [ffff888104617800, ffff888104617c00) The buggy address belongs to the physical page: page:00000000dbca6a80 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888104614000 pfn:0x104614 head:00000000dbca6a80 order:2 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0 flags: 0x200000000010200(slab|head|node=0|zone=2) raw: 0200000000010200 ffff888100041dc0 ffffea0004212c10 ffffea0004234b10 raw: ffff888104614000 0000000000080002 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888104617980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888104617a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888104617a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888104617b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888104617b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Ack: This bug is found by FuzzBT with a modified Syzkaller. Other contributors are Ruoyu Wu and Hui Peng. Signed-off-by: Sungwoo Kim Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/l2cap_sock.c | 2 ++ 1 file changed, 2 insertions(+) (limited to 'net') diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c index eebe256104bc..947ca580bb9a 100644 --- a/net/bluetooth/l2cap_sock.c +++ b/net/bluetooth/l2cap_sock.c @@ -46,6 +46,7 @@ static const struct proto_ops l2cap_sock_ops; static void l2cap_sock_init(struct sock *sk, struct sock *parent); static struct sock *l2cap_sock_alloc(struct net *net, struct socket *sock, int proto, gfp_t prio, int kern); +static void l2cap_sock_cleanup_listen(struct sock *parent); bool l2cap_is_socket(struct socket *sock) { @@ -1415,6 +1416,7 @@ static int l2cap_sock_release(struct socket *sock) if (!sk) return 0; + l2cap_sock_cleanup_listen(sk); bt_sock_unlink(&l2cap_sk_list, sk); err = l2cap_sock_shutdown(sock, SHUT_RDWR); -- cgit v1.2.3 From 6b9545dc9f8ff01d8bc1229103960d9cd265343f Mon Sep 17 00:00:00 2001 From: Pauli Virtanen Date: Thu, 1 Jun 2023 09:34:43 +0300 Subject: Bluetooth: ISO: use hci_sync for setting CIG parameters When reconfiguring CIG after disconnection of the last CIS, LE Remove CIG shall be sent before LE Set CIG Parameters. Otherwise, it fails because CIG is in the inactive state and not configurable (Core v5.3 Vol 6 Part B Sec. 4.5.14.3). This ordering is currently wrong under suitable timing conditions, because LE Remove CIG is sent via the hci_sync queue and may be delayed, but Set CIG Parameters is via hci_send_cmd. Make the ordering well-defined by sending also Set CIG Parameters via hci_sync. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_conn.c | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 1ef952bda97d..2275e0d9f841 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -775,6 +775,11 @@ static void le_conn_timeout(struct work_struct *work) hci_abort_conn(conn, HCI_ERROR_REMOTE_USER_TERM); } +struct iso_cig_params { + struct hci_cp_le_set_cig_params cp; + struct hci_cis_params cis[0x1f]; +}; + struct iso_list_data { union { u8 cig; @@ -786,10 +791,7 @@ struct iso_list_data { u16 sync_handle; }; int count; - struct { - struct hci_cp_le_set_cig_params cp; - struct hci_cis_params cis[0x11]; - } pdu; + struct iso_cig_params pdu; }; static void bis_list(struct hci_conn *conn, void *data) @@ -1764,10 +1766,33 @@ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos) return hci_send_cmd(hdev, HCI_OP_LE_CREATE_BIG, sizeof(cp), &cp); } +static void set_cig_params_complete(struct hci_dev *hdev, void *data, int err) +{ + struct iso_cig_params *pdu = data; + + bt_dev_dbg(hdev, ""); + + if (err) + bt_dev_err(hdev, "Unable to set CIG parameters: %d", err); + + kfree(pdu); +} + +static int set_cig_params_sync(struct hci_dev *hdev, void *data) +{ + struct iso_cig_params *pdu = data; + u32 plen; + + plen = sizeof(pdu->cp) + pdu->cp.num_cis * sizeof(pdu->cis[0]); + return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_CIG_PARAMS, plen, pdu, + HCI_CMD_TIMEOUT); +} + static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos) { struct hci_dev *hdev = conn->hdev; struct iso_list_data data; + struct iso_cig_params *pdu; memset(&data, 0, sizeof(data)); @@ -1837,12 +1862,18 @@ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos) if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET || !data.pdu.cp.num_cis) return false; - if (hci_send_cmd(hdev, HCI_OP_LE_SET_CIG_PARAMS, - sizeof(data.pdu.cp) + - (data.pdu.cp.num_cis * sizeof(*data.pdu.cis)), - &data.pdu) < 0) + pdu = kzalloc(sizeof(*pdu), GFP_KERNEL); + if (!pdu) return false; + memcpy(pdu, &data.pdu, sizeof(*pdu)); + + if (hci_cmd_sync_queue(hdev, set_cig_params_sync, pdu, + set_cig_params_complete) < 0) { + kfree(pdu); + return false; + } + return true; } -- cgit v1.2.3 From db9cbcadc16e8b9f0b3ef5870f3a38ebafcbe8e0 Mon Sep 17 00:00:00 2001 From: Pauli Virtanen Date: Sat, 3 Jun 2023 00:28:12 +0300 Subject: Bluetooth: hci_event: fix Set CIG Parameters error status handling If the event has error status, return right error code and don't show incorrect "response malformed" messages. Signed-off-by: Pauli Virtanen Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_event.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'net') diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 09ba6d8987ee..1d493eefaabe 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -3812,7 +3812,8 @@ static u8 hci_cc_le_set_cig_params(struct hci_dev *hdev, void *data, bt_dev_dbg(hdev, "status 0x%2.2x", rp->status); cp = hci_sent_cmd_data(hdev, HCI_OP_LE_SET_CIG_PARAMS); - if (!cp || rp->num_handles != cp->num_cis || rp->cig_id != cp->cig_id) { + if (!rp->status && (!cp || rp->num_handles != cp->num_cis || + rp->cig_id != cp->cig_id)) { bt_dev_err(hdev, "unexpected Set CIG Parameters response data"); status = HCI_ERROR_UNSPECIFIED; } -- cgit v1.2.3 From 73f55453ea5236a586a7f1b3d5e2ee051d655351 Mon Sep 17 00:00:00 2001 From: Luiz Augusto von Dentz Date: Wed, 7 Jun 2023 12:33:47 -0700 Subject: Bluetooth: MGMT: Fix marking SCAN_RSP as not connectable When receiving a scan response there is no way to know if the remote device is connectable or not, so when it cannot be merged don't make any assumption and instead just mark it with a new flag defined as MGMT_DEV_FOUND_SCAN_RSP so userspace can tell it is a standalone SCAN_RSP. Link: https://lore.kernel.org/linux-bluetooth/CABBYNZ+CYMsDSPTxBn09Js3BcdC-x7vZFfyLJ3ppZGGwJKmUTw@mail.gmail.com/ Fixes: c70a7e4cc8d2 ("Bluetooth: Add support for Not Connectable flag for Device Found events") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- include/net/bluetooth/mgmt.h | 1 + net/bluetooth/hci_event.c | 15 +++++---------- 2 files changed, 6 insertions(+), 10 deletions(-) (limited to 'net') diff --git a/include/net/bluetooth/mgmt.h b/include/net/bluetooth/mgmt.h index a5801649f619..5e68b3dd4422 100644 --- a/include/net/bluetooth/mgmt.h +++ b/include/net/bluetooth/mgmt.h @@ -979,6 +979,7 @@ struct mgmt_ev_auth_failed { #define MGMT_DEV_FOUND_NOT_CONNECTABLE BIT(2) #define MGMT_DEV_FOUND_INITIATED_CONN BIT(3) #define MGMT_DEV_FOUND_NAME_REQUEST_FAILED BIT(4) +#define MGMT_DEV_FOUND_SCAN_RSP BIT(5) #define MGMT_EV_DEVICE_FOUND 0x0012 struct mgmt_ev_device_found { diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 1d493eefaabe..2383153d5345 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -6317,23 +6317,18 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr, return; } - /* When receiving non-connectable or scannable undirected - * advertising reports, this means that the remote device is - * not connectable and then clearly indicate this in the - * device found event. - * - * When receiving a scan response, then there is no way to + /* When receiving a scan response, then there is no way to * know if the remote device is connectable or not. However * since scan responses are merged with a previously seen * advertising report, the flags field from that report * will be used. * - * In the really unlikely case that a controller get confused - * and just sends a scan response event, then it is marked as - * not connectable as well. + * In the unlikely case that a controller just sends a scan + * response event that doesn't match the pending report, then + * it is marked as a standalone SCAN_RSP. */ if (type == LE_ADV_SCAN_RSP) - flags = MGMT_DEV_FOUND_NOT_CONNECTABLE; + flags = MGMT_DEV_FOUND_SCAN_RSP; /* If there's nothing pending either store the data from this * event or send an immediate device found event if the data -- cgit v1.2.3 From 14f0dceca60b2fc4f2388505b25f9e6f71785e05 Mon Sep 17 00:00:00 2001 From: Luiz Augusto von Dentz Date: Thu, 8 Jun 2023 11:12:18 -0700 Subject: Bluetooth: ISO: Rework sync_interval to be sync_factor This rework sync_interval to be sync_factor as having sync_interval in the order of seconds is sometimes not disarable. Wit sync_factor the application can tell how many SDU intervals it wants to send an announcement with PA, the EA interval is set to 2 times that so a factor of 24 of BIG SDU interval of 10ms would look like the following: < HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036) plen 25 Handle: 0x01 Properties: 0x0000 Min advertising interval: 480.000 msec (0x0300) Max advertising interval: 480.000 msec (0x0300) Channel map: 37, 38, 39 (0x07) Own address type: Random (0x01) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Filter policy: Allow Scan Request from Any, Allow Connect Request from Any (0x00) TX power: Host has no preference (0x7f) Primary PHY: LE 1M (0x01) Secondary max skip: 0x00 Secondary PHY: LE 2M (0x02) SID: 0x00 Scan request notifications: Disabled (0x00) < HCI Command: LE Set Periodic Advertising Parameters (0x08|0x003e) plen 7 Handle: 1 Min interval: 240.00 msec (0x00c0) Max interval: 240.00 msec (0x00c0) Properties: 0x0000 Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- include/net/bluetooth/bluetooth.h | 2 +- net/bluetooth/hci_conn.c | 4 ++-- net/bluetooth/iso.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) (limited to 'net') diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 1b4230cd42a3..af729859385e 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -185,7 +185,7 @@ struct bt_iso_ucast_qos { struct bt_iso_bcast_qos { __u8 big; __u8 bis; - __u8 sync_interval; + __u8 sync_factor; __u8 packing; __u8 framing; struct bt_iso_io_qos in; diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 2275e0d9f841..24407a974b9c 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -2075,10 +2075,10 @@ static int create_big_sync(struct hci_dev *hdev, void *data) flags |= MGMT_ADV_FLAG_SEC_2M; /* Align intervals */ - interval = qos->bcast.out.interval / 1250; + interval = (qos->bcast.out.interval / 1250) * qos->bcast.sync_factor; if (qos->bcast.bis) - sync_interval = qos->bcast.sync_interval * 1600; + sync_interval = interval * 4; err = hci_start_per_adv_sync(hdev, qos->bcast.bis, conn->le_per_adv_data_len, conn->le_per_adv_data, flags, interval, diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index 34d55a85d8f6..0e6cc57b3911 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -704,7 +704,7 @@ static struct bt_iso_qos default_qos = { .bcast = { .big = BT_ISO_QOS_BIG_UNSET, .bis = BT_ISO_QOS_BIS_UNSET, - .sync_interval = 0x00, + .sync_factor = 0x01, .packing = 0x00, .framing = 0x00, .in = DEFAULT_IO_QOS, @@ -1213,7 +1213,7 @@ static bool check_ucast_qos(struct bt_iso_qos *qos) static bool check_bcast_qos(struct bt_iso_qos *qos) { - if (qos->bcast.sync_interval > 0x07) + if (qos->bcast.sync_factor == 0x00) return false; if (qos->bcast.packing > 0x01) -- cgit v1.2.3 From d40d6f52d5bbbd7aa7092ba9fa793ad7dc253a35 Mon Sep 17 00:00:00 2001 From: Ivan Orlov Date: Tue, 20 Jun 2023 16:40:52 +0200 Subject: Bluetooth: hci_sysfs: make bt_class a static const structure Now that the driver core allows for struct class to be in read-only memory, move the bt_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at load time. Cc: Marcel Holtmann Cc: Johan Hedberg Cc: Luiz Augusto von Dentz Cc: linux-bluetooth@vger.kernel.org Suggested-by: Greg Kroah-Hartman Signed-off-by: Ivan Orlov Signed-off-by: Greg Kroah-Hartman Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_sysfs.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_sysfs.c b/net/bluetooth/hci_sysfs.c index 2934d7f4d564..15b33579007c 100644 --- a/net/bluetooth/hci_sysfs.c +++ b/net/bluetooth/hci_sysfs.c @@ -6,7 +6,9 @@ #include #include -static struct class *bt_class; +static const struct class bt_class = { + .name = "bluetooth", +}; static void bt_link_release(struct device *dev) { @@ -36,7 +38,7 @@ void hci_conn_init_sysfs(struct hci_conn *conn) BT_DBG("conn %p", conn); conn->dev.type = &bt_link; - conn->dev.class = bt_class; + conn->dev.class = &bt_class; conn->dev.parent = &hdev->dev; device_initialize(&conn->dev); @@ -104,7 +106,7 @@ void hci_init_sysfs(struct hci_dev *hdev) struct device *dev = &hdev->dev; dev->type = &bt_host; - dev->class = bt_class; + dev->class = &bt_class; __module_get(THIS_MODULE); device_initialize(dev); @@ -112,12 +114,10 @@ void hci_init_sysfs(struct hci_dev *hdev) int __init bt_sysfs_init(void) { - bt_class = class_create("bluetooth"); - - return PTR_ERR_OR_ZERO(bt_class); + return class_register(&bt_class); } void bt_sysfs_cleanup(void) { - class_destroy(bt_class); + class_unregister(&bt_class); } -- cgit v1.2.3 From 5b6d345d1b65d67624349e5de22227492c637576 Mon Sep 17 00:00:00 2001 From: Jiapeng Chong Date: Sun, 25 Jun 2023 16:45:13 +0800 Subject: Bluetooth: hci_conn: Use kmemdup() to replace kzalloc + memcpy Use kmemdup rather than duplicating its implementation. ./net/bluetooth/hci_conn.c:1880:7-14: WARNING opportunity for kmemdup. Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5597 Signed-off-by: Jiapeng Chong Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_conn.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 24407a974b9c..056f9516e46d 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -1862,12 +1862,10 @@ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos) if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET || !data.pdu.cp.num_cis) return false; - pdu = kzalloc(sizeof(*pdu), GFP_KERNEL); + pdu = kmemdup(&data.pdu, sizeof(*pdu), GFP_KERNEL); if (!pdu) return false; - memcpy(pdu, &data.pdu, sizeof(*pdu)); - if (hci_cmd_sync_queue(hdev, set_cig_params_sync, pdu, set_cig_params_complete) < 0) { kfree(pdu); -- cgit v1.2.3 From 2be22f1941d5f661aa8043261d1bae5b6696c749 Mon Sep 17 00:00:00 2001 From: Luiz Augusto von Dentz Date: Tue, 20 Jun 2023 15:41:11 -0700 Subject: Bluetooth: hci_event: Fix parsing of CIS Established Event The ISO Interval on CIS Established Event uses 1.25 ms slots: BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 2304: Time = N * 1.25 ms In addition to that this always update the QoS settings based on CIS Established Event. Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski --- net/bluetooth/hci_event.c | 49 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 34 insertions(+), 15 deletions(-) (limited to 'net') diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 2383153d5345..95816a938cea 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -6786,6 +6786,7 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data, { struct hci_evt_le_cis_established *ev = data; struct hci_conn *conn; + struct bt_iso_qos *qos; u16 handle = __le16_to_cpu(ev->handle); bt_dev_dbg(hdev, "status 0x%2.2x", ev->status); @@ -6807,21 +6808,39 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data, goto unlock; } - if (conn->role == HCI_ROLE_SLAVE) { - __le32 interval; - - memset(&interval, 0, sizeof(interval)); - - memcpy(&interval, ev->c_latency, sizeof(ev->c_latency)); - conn->iso_qos.ucast.in.interval = le32_to_cpu(interval); - memcpy(&interval, ev->p_latency, sizeof(ev->p_latency)); - conn->iso_qos.ucast.out.interval = le32_to_cpu(interval); - conn->iso_qos.ucast.in.latency = le16_to_cpu(ev->interval); - conn->iso_qos.ucast.out.latency = le16_to_cpu(ev->interval); - conn->iso_qos.ucast.in.sdu = le16_to_cpu(ev->c_mtu); - conn->iso_qos.ucast.out.sdu = le16_to_cpu(ev->p_mtu); - conn->iso_qos.ucast.in.phy = ev->c_phy; - conn->iso_qos.ucast.out.phy = ev->p_phy; + qos = &conn->iso_qos; + + /* Convert ISO Interval (1.25 ms slots) to SDU Interval (us) */ + qos->ucast.in.interval = le16_to_cpu(ev->interval) * 1250; + qos->ucast.out.interval = qos->ucast.in.interval; + + switch (conn->role) { + case HCI_ROLE_SLAVE: + /* Convert Transport Latency (us) to Latency (msec) */ + qos->ucast.in.latency = + DIV_ROUND_CLOSEST(get_unaligned_le24(ev->c_latency), + 1000); + qos->ucast.out.latency = + DIV_ROUND_CLOSEST(get_unaligned_le24(ev->p_latency), + 1000); + qos->ucast.in.sdu = le16_to_cpu(ev->c_mtu); + qos->ucast.out.sdu = le16_to_cpu(ev->p_mtu); + qos->ucast.in.phy = ev->c_phy; + qos->ucast.out.phy = ev->p_phy; + break; + case HCI_ROLE_MASTER: + /* Convert Transport Latency (us) to Latency (msec) */ + qos->ucast.out.latency = + DIV_ROUND_CLOSEST(get_unaligned_le24(ev->c_latency), + 1000); + qos->ucast.in.latency = + DIV_ROUND_CLOSEST(get_unaligned_le24(ev->p_latency), + 1000); + qos->ucast.out.sdu = le16_to_cpu(ev->c_mtu); + qos->ucast.in.sdu = le16_to_cpu(ev->p_mtu); + qos->ucast.out.phy = ev->c_phy; + qos->ucast.in.phy = ev->p_phy; + break; } if (!ev->status) { -- cgit v1.2.3 From 6ca3c005d0604e8d2b439366e3923ea58db99641 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Fri, 30 Jun 2023 19:41:18 +0300 Subject: net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode According to the synchronization rules for .ndo_get_stats() as seen in Documentation/networking/netdevices.rst, acquiring a plain spin_lock() should not be illegal, but the bridge driver implementation makes it so. After running these commands, I am being faced with the following lockdep splat: $ ip link add link swp0 name macsec0 type macsec encrypt on && ip link set swp0 up $ ip link add dev br0 type bridge vlan_filtering 1 && ip link set br0 up $ ip link set macsec0 master br0 && ip link set macsec0 up ======================================================== WARNING: possible irq lock inversion dependency detected 6.4.0-04295-g31b577b4bd4a #603 Not tainted -------------------------------------------------------- swapper/1/0 just changed the state of lock: ffff6bd348724cd8 (&br->lock){+.-.}-{3:3}, at: br_forward_delay_timer_expired+0x34/0x198 but this lock took another, SOFTIRQ-unsafe lock in the past: (&ocelot->stats_lock){+.+.}-{3:3} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Chain exists of: &br->lock --> &br->hash_lock --> &ocelot->stats_lock Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ocelot->stats_lock); local_irq_disable(); lock(&br->lock); lock(&br->hash_lock); lock(&br->lock); *** DEADLOCK *** (details about the 3 locks skipped) swp0 is instantiated by drivers/net/dsa/ocelot/felix.c, and this only matters to the extent that its .ndo_get_stats64() method calls spin_lock(&ocelot->stats_lock). Documentation/locking/lockdep-design.rst says: | A lock is irq-safe means it was ever used in an irq context, while a lock | is irq-unsafe means it was ever acquired with irq enabled. (...) | Furthermore, the following usage based lock dependencies are not allowed | between any two lock-classes:: | | -> | -> Lockdep marks br->hash_lock as softirq-safe, because it is sometimes taken in softirq context (for example br_fdb_update() which runs in NET_RX softirq), and when it's not in softirq context it blocks softirqs by using spin_lock_bh(). Lockdep marks ocelot->stats_lock as softirq-unsafe, because it never blocks softirqs from running, and it is never taken from softirq context. So it can always be interrupted by softirqs. There is a call path through which a function that holds br->hash_lock: fdb_add_hw_addr() will call a function that acquires ocelot->stats_lock: ocelot_port_get_stats64(). This can be seen below: ocelot_port_get_stats64+0x3c/0x1e0 felix_get_stats64+0x20/0x38 dsa_slave_get_stats64+0x3c/0x60 dev_get_stats+0x74/0x2c8 rtnl_fill_stats+0x4c/0x150 rtnl_fill_ifinfo+0x5cc/0x7b8 rtmsg_ifinfo_build_skb+0xe4/0x150 rtmsg_ifinfo+0x5c/0xb0 __dev_notify_flags+0x58/0x200 __dev_set_promiscuity+0xa0/0x1f8 dev_set_promiscuity+0x30/0x70 macsec_dev_change_rx_flags+0x68/0x88 __dev_set_promiscuity+0x1a8/0x1f8 __dev_set_rx_mode+0x74/0xa8 dev_uc_add+0x74/0xa0 fdb_add_hw_addr+0x68/0xd8 fdb_add_local+0xc4/0x110 br_fdb_add_local+0x54/0x88 br_add_if+0x338/0x4a0 br_add_slave+0x20/0x38 do_setlink+0x3a4/0xcb8 rtnl_newlink+0x758/0x9d0 rtnetlink_rcv_msg+0x2f0/0x550 netlink_rcv_skb+0x128/0x148 rtnetlink_rcv+0x24/0x38 the plain English explanation for it is: The macsec0 bridge port is created without p->flags & BR_PROMISC, because it is what br_manage_promisc() decides for a VLAN filtering bridge with a single auto port. As part of the br_add_if() procedure, br_fdb_add_local() is called for the MAC address of the device, and this results in a call to dev_uc_add() for macsec0 while the softirq-safe br->hash_lock is taken. Because macsec0 does not have IFF_UNICAST_FLT, dev_uc_add() ends up calling __dev_set_promiscuity() for macsec0, which is propagated by its implementation, macsec_dev_change_rx_flags(), to the lower device: swp0. This triggers the call path: dev_set_promiscuity(swp0) -> rtmsg_ifinfo() -> dev_get_stats() -> ocelot_port_get_stats64() with a calling context that lockdep doesn't like (br->hash_lock held). Normally we don't see this, because even though many drivers that can be bridge ports don't support IFF_UNICAST_FLT, we need a driver that (a) doesn't support IFF_UNICAST_FLT, *and* (b) it forwards the IFF_PROMISC flag to another driver, and (c) *that* driver implements ndo_get_stats64() using a softirq-unsafe spinlock. Condition (b) is necessary because the first __dev_set_rx_mode() calls __dev_set_promiscuity() with "bool notify=false", and thus, the rtmsg_ifinfo() code path won't be entered. The same criteria also hold true for DSA switches which don't report IFF_UNICAST_FLT. When the DSA master uses a spin_lock() in its ndo_get_stats64() method, the same lockdep splat can be seen. I think the deadlock possibility is real, even though I didn't reproduce it, and I'm thinking of the following situation to support that claim: fdb_add_hw_addr() runs on a CPU A, in a context with softirqs locally disabled and br->hash_lock held, and may end up attempting to acquire ocelot->stats_lock. In parallel, ocelot->stats_lock is currently held by a thread B (say, ocelot_check_stats_work()), which is interrupted while holding it by a softirq which attempts to lock br->hash_lock. Thread B cannot make progress because br->hash_lock is held by A. Whereas thread A cannot make progress because ocelot->stats_lock is held by B. When taking the issue at face value, the bridge can avoid that problem by simply making the ports promiscuous from a code path with a saner calling context (br->hash_lock not held). A bridge port without IFF_UNICAST_FLT is going to become promiscuous as soon as we call dev_uc_add() on it (which we do unconditionally), so why not be preemptive and make it promiscuous right from the beginning, so as to not be taken by surprise. With this, we've broken the links between code that holds br->hash_lock or br->lock and code that calls into the ndo_change_rx_flags() or ndo_get_stats64() ops of the bridge port. Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.") Signed-off-by: Vladimir Oltean Reviewed-by: Ido Schimmel Signed-off-by: David S. Miller --- net/bridge/br_if.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'net') diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 3f04b40f6056..2450690f98cf 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -166,8 +166,9 @@ void br_manage_promisc(struct net_bridge *br) * This lets us disable promiscuous mode and write * this config to hw. */ - if (br->auto_cnt == 0 || - (br->auto_cnt == 1 && br_auto_port(p))) + if ((p->dev->priv_flags & IFF_UNICAST_FLT) && + (br->auto_cnt == 0 || + (br->auto_cnt == 1 && br_auto_port(p)))) br_port_clear_promisc(p); else br_port_set_promisc(p); -- cgit v1.2.3 From a398b9ea0c3b791b7a0f4c6029a62cf628f97f22 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Sat, 1 Jul 2023 01:20:10 +0300 Subject: net: dsa: tag_sja1105: fix source port decoding in vlan_filtering=0 bridge mode There was a regression introduced by the blamed commit, where pinging to a VLAN-unaware bridge would fail with the repeated message "Couldn't decode source port" coming from the tagging protocol driver. When receiving packets with a bridge_vid as determined by dsa_tag_8021q_bridge_join(), dsa_8021q_rcv() will decode: - source_port = 0 (which isn't really valid, more like "don't know") - switch_id = 0 (which isn't really valid, more like "don't know") - vbid = value in range 1-7 Since the blamed patch has reversed the order of the checks, we are now going to believe that source_port != -1 and switch_id != -1, so they're valid, but they aren't. The minimal solution to the problem is to only populate source_port and switch_id with what dsa_8021q_rcv() came up with, if the vbid is zero, i.e. the source port information is trustworthy. Fixes: c1ae02d87689 ("net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT") Signed-off-by: Vladimir Oltean Reviewed-by: Simon Horman Signed-off-by: David S. Miller --- net/dsa/tag_sja1105.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) (limited to 'net') diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c index 92a626a05e82..db0a6ac67470 100644 --- a/net/dsa/tag_sja1105.c +++ b/net/dsa/tag_sja1105.c @@ -573,11 +573,14 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb, * if available. This allows us to not overwrite a valid source * port and switch ID with zeroes when receiving link-local * frames from a VLAN-unaware bridged port (non-zero vbid) or a - * VLAN-aware bridged port (non-zero vid). + * VLAN-aware bridged port (non-zero vid). Furthermore, the + * tag_8021q source port information is only of trust when the + * vbid is 0 (precise port). Otherwise, tmp_source_port and + * tmp_switch_id will be zeroes. */ - if (source_port == -1) + if (vbid == 0 && source_port == -1) source_port = tmp_source_port; - if (switch_id == -1) + if (vbid == 0 && switch_id == -1) switch_id = tmp_switch_id; } else if (source_port == -1 && switch_id == -1) { /* Packets with no source information have no chance of -- cgit v1.2.3 From 998127cdb4699b9d470a9348ffe9f1154346be5f Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 29 Jun 2023 16:41:50 +0000 Subject: tcp: annotate data races in __tcp_oow_rate_limited() request sockets are lockless, __tcp_oow_rate_limited() could be called on the same object from different cpus. This is harmless. Add READ_ONCE()/WRITE_ONCE() annotations to avoid a KCSAN report. Fixes: 4ce7e93cb3fe ("tcp: rate limit ACK sent by SYN_RECV request sockets") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- net/ipv4/tcp_input.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) (limited to 'net') diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 6f072095211e..57c8af1859c1 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3590,8 +3590,11 @@ static int tcp_ack_update_window(struct sock *sk, const struct sk_buff *skb, u32 static bool __tcp_oow_rate_limited(struct net *net, int mib_idx, u32 *last_oow_ack_time) { - if (*last_oow_ack_time) { - s32 elapsed = (s32)(tcp_jiffies32 - *last_oow_ack_time); + /* Paired with the WRITE_ONCE() in this function. */ + u32 val = READ_ONCE(*last_oow_ack_time); + + if (val) { + s32 elapsed = (s32)(tcp_jiffies32 - val); if (0 <= elapsed && elapsed < READ_ONCE(net->ipv4.sysctl_tcp_invalid_ratelimit)) { @@ -3600,7 +3603,10 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx, } } - *last_oow_ack_time = tcp_jiffies32; + /* Paired with the prior READ_ONCE() and with itself, + * as we might be lockless. + */ + WRITE_ONCE(*last_oow_ack_time, tcp_jiffies32); return false; /* not rate-limited: go ahead, send dupack now! */ } -- cgit v1.2.3 From f7306acec9aae9893d15e745c8791124d42ab10a Mon Sep 17 00:00:00 2001 From: Ilya Maximets Date: Mon, 3 Jul 2023 19:53:29 +0200 Subject: xsk: Honor SO_BINDTODEVICE on bind Initial creation of an AF_XDP socket requires CAP_NET_RAW capability. A privileged process might create the socket and pass it to a non-privileged process for later use. However, that process will be able to bind the socket to any network interface. Even though it will not be able to receive any traffic without modification of the BPF map, the situation is not ideal. Sockets already have a mechanism that can be used to restrict what interface they can be attached to. That is SO_BINDTODEVICE. To change the SO_BINDTODEVICE binding the process will need CAP_NET_RAW. Make xsk_bind() honor the SO_BINDTODEVICE in order to allow safer workflow when non-privileged process is using AF_XDP. The intended workflow is following: 1. First process creates a bare socket with socket(AF_XDP, ...). 2. First process loads the XSK program to the interface. 3. First process adds the socket fd to a BPF map. 4. First process ties socket fd to a particular interface using SO_BINDTODEVICE. 5. First process sends socket fd to a second process. 6. Second process allocates UMEM. 7. Second process binds socket to the interface with bind(...). 8. Second process sends/receives the traffic. All the steps above are possible today if the first process is privileged and the second one has sufficient RLIMIT_MEMLOCK and no capabilities. However, the second process will be able to bind the socket to any interface it wants on step 7 and send traffic from it. With the proposed change, the second process will be able to bind the socket only to a specific interface chosen by the first process at step 4. Fixes: 965a99098443 ("xsk: add support for bind for Rx") Signed-off-by: Ilya Maximets Signed-off-by: Daniel Borkmann Acked-by: Magnus Karlsson Acked-by: John Fastabend Acked-by: Jason Wang Link: https://lore.kernel.org/bpf/20230703175329.3259672-1-i.maximets@ovn.org --- Documentation/networking/af_xdp.rst | 9 +++++++++ net/xdp/xsk.c | 5 +++++ 2 files changed, 14 insertions(+) (limited to 'net') diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst index 247c6c4127e9..1cc35de336a4 100644 --- a/Documentation/networking/af_xdp.rst +++ b/Documentation/networking/af_xdp.rst @@ -433,6 +433,15 @@ start N bytes into the buffer leaving the first N bytes for the application to use. The final option is the flags field, but it will be dealt with in separate sections for each UMEM flag. +SO_BINDTODEVICE setsockopt +-------------------------- + +This is a generic SOL_SOCKET option that can be used to tie AF_XDP +socket to a particular network interface. It is useful when a socket +is created by a privileged process and passed to a non-privileged one. +Once the option is set, kernel will refuse attempts to bind that socket +to a different interface. Updating the value requires CAP_NET_RAW. + XDP_STATISTICS getsockopt ------------------------- diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 5a8c0dd250af..31dca4ecb2c5 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) struct sock *sk = sock->sk; struct xdp_sock *xs = xdp_sk(sk); struct net_device *dev; + int bound_dev_if; u32 flags, qid; int err = 0; @@ -899,6 +900,10 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) XDP_USE_NEED_WAKEUP)) return -EINVAL; + bound_dev_if = READ_ONCE(sk->sk_bound_dev_if); + if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex) + return -EINVAL; + rtnl_lock(); mutex_lock(&xs->mutex); if (xs->state != XSK_READY) { -- cgit v1.2.3 From 30c45b5361d39b4b793780ffac5538090b9e2eb1 Mon Sep 17 00:00:00 2001 From: Lin Ma Date: Mon, 3 Jul 2023 19:08:42 +0800 Subject: net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX The attribute TCA_PEDIT_PARMS_EX is not be included in pedit_policy and one malicious user could fake a TCA_PEDIT_PARMS_EX whose length is smaller than the intended sizeof(struct tc_pedit). Hence, the dereference in tcf_pedit_init() could access dirty heap data. static int tcf_pedit_init(...) { // ... pattr = tb[TCA_PEDIT_PARMS]; // TCA_PEDIT_PARMS is included if (!pattr) pattr = tb[TCA_PEDIT_PARMS_EX]; // but this is not // ... parm = nla_data(pattr); index = parm->index; // parm is able to be smaller than 4 bytes // and this dereference gets dirty skb_buff // data created in netlink_sendmsg } This commit adds TCA_PEDIT_PARMS_EX length in pedit_policy which avoid the above case, just like the TCA_PEDIT_PARMS. Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Lin Ma Reviewed-by: Pedro Tammela Link: https://lore.kernel.org/r/20230703110842.590282-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni --- net/sched/act_pedit.c | 1 + 1 file changed, 1 insertion(+) (limited to 'net') diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index b562fc2bb5b1..1ef8fcfa9997 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -29,6 +29,7 @@ static struct tc_action_ops act_pedit_ops; static const struct nla_policy pedit_policy[TCA_PEDIT_MAX + 1] = { [TCA_PEDIT_PARMS] = { .len = sizeof(struct tc_pedit) }, + [TCA_PEDIT_PARMS_EX] = { .len = sizeof(struct tc_pedit) }, [TCA_PEDIT_KEYS_EX] = { .type = NLA_NESTED }, }; -- cgit v1.2.3 From 1dcf6efd5f0c1f4496b3ef7ec5a7db104a53b38c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Tue, 4 Jul 2023 01:05:44 +0300 Subject: net: dsa: tag_sja1105: fix MAC DA patching from meta frames The SJA1105 manual says that at offset 4 into the meta frame payload we have "MAC destination byte 2" and at offset 5 we have "MAC destination byte 1". These are counted from the LSB, so byte 1 is h_dest[ETH_HLEN-2] aka h_dest[4] and byte 2 is h_dest[ETH_HLEN-3] aka h_dest[3]. The sja1105_meta_unpack() function decodes these the other way around, so a frame with MAC DA 01:80:c2:11:22:33 is received by the network stack as having 01:80:c2:22:11:33. Fixes: e53e18a6fe4d ("net: dsa: sja1105: Receive and decode meta frames") Signed-off-by: Vladimir Oltean Reviewed-by: Simon Horman Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- net/dsa/tag_sja1105.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'net') diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c index db0a6ac67470..ec48165673ed 100644 --- a/net/dsa/tag_sja1105.c +++ b/net/dsa/tag_sja1105.c @@ -118,8 +118,8 @@ static void sja1105_meta_unpack(const struct sk_buff *skb, * a unified unpacking command for both device series. */ packing(buf, &meta->tstamp, 31, 0, 4, UNPACK, 0); - packing(buf + 4, &meta->dmac_byte_4, 7, 0, 1, UNPACK, 0); - packing(buf + 5, &meta->dmac_byte_3, 7, 0, 1, UNPACK, 0); + packing(buf + 4, &meta->dmac_byte_3, 7, 0, 1, UNPACK, 0); + packing(buf + 5, &meta->dmac_byte_4, 7, 0, 1, UNPACK, 0); packing(buf + 6, &meta->source_port, 7, 0, 1, UNPACK, 0); packing(buf + 7, &meta->switch_id, 7, 0, 1, UNPACK, 0); } -- cgit v1.2.3 From a372d66af48506d9f7aaae2a474cd18f14d98cb8 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Tue, 4 Jul 2023 01:05:45 +0300 Subject: net: dsa: sja1105: always enable the send_meta options incl_srcpt has the limitation, mentioned in commit b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"), that frames with a MAC DA of 01:80:c2:xx:yy:zz will be received as 01:80:c2:00:00:zz unless PTP RX timestamping is enabled. The incl_srcpt option was initially unconditionally enabled, then that changed with commit 42824463d38d ("net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode"), then again with b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"). Bottom line is that it now needs to be always enabled, otherwise the driver does not have a reliable source of information regarding source_port and switch_id for link-local traffic (tag_8021q VLANs may be imprecise since now they identify an entire bridging domain when ports are not standalone). If we accept that PTP RX timestamping (and therefore, meta frame generation) is always enabled in hardware, then that limitation could be avoided and packets with any MAC DA can be properly received, because meta frames do contain the original bytes from the MAC DA of their associated link-local packet. This change enables meta frame generation unconditionally, which also has the nice side effects of simplifying the switch control path (a switch reset is no longer required on hwtstamping settings change) and the tagger data path (it no longer needs to be informed whether to expect meta frames or not - it always does). Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean Reviewed-by: Simon Horman Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- drivers/net/dsa/sja1105/sja1105.h | 2 +- drivers/net/dsa/sja1105/sja1105_main.c | 5 ++-- drivers/net/dsa/sja1105/sja1105_ptp.c | 48 +++------------------------------- include/linux/dsa/sja1105.h | 4 --- net/dsa/tag_sja1105.c | 45 ------------------------------- 5 files changed, 7 insertions(+), 97 deletions(-) (limited to 'net') diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h index fb1549a5fe32..dee35ba924ad 100644 --- a/drivers/net/dsa/sja1105/sja1105.h +++ b/drivers/net/dsa/sja1105/sja1105.h @@ -252,6 +252,7 @@ struct sja1105_private { unsigned long ucast_egress_floods; unsigned long bcast_egress_floods; unsigned long hwts_tx_en; + unsigned long hwts_rx_en; const struct sja1105_info *info; size_t max_xfer_len; struct spi_device *spidev; @@ -289,7 +290,6 @@ struct sja1105_spi_message { /* From sja1105_main.c */ enum sja1105_reset_reason { SJA1105_VLAN_FILTERING = 0, - SJA1105_RX_HWTSTAMPING, SJA1105_AGEING_TIME, SJA1105_SCHEDULING, SJA1105_BEST_EFFORT_POLICING, diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index dd154b2b9680..3529a565b4aa 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -867,11 +867,11 @@ static int sja1105_init_general_params(struct sja1105_private *priv) .mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A, .mac_flt1 = SJA1105_LINKLOCAL_FILTER_A_MASK, .incl_srcpt1 = true, - .send_meta1 = false, + .send_meta1 = true, .mac_fltres0 = SJA1105_LINKLOCAL_FILTER_B, .mac_flt0 = SJA1105_LINKLOCAL_FILTER_B_MASK, .incl_srcpt0 = true, - .send_meta0 = false, + .send_meta0 = true, /* Default to an invalid value */ .mirr_port = priv->ds->num_ports, /* No TTEthernet */ @@ -2215,7 +2215,6 @@ static int sja1105_reload_cbs(struct sja1105_private *priv) static const char * const sja1105_reset_reasons[] = { [SJA1105_VLAN_FILTERING] = "VLAN filtering", - [SJA1105_RX_HWTSTAMPING] = "RX timestamping", [SJA1105_AGEING_TIME] = "Ageing time", [SJA1105_SCHEDULING] = "Time-aware scheduling", [SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing", diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c index 30fb2cc40164..a7d41e781398 100644 --- a/drivers/net/dsa/sja1105/sja1105_ptp.c +++ b/drivers/net/dsa/sja1105/sja1105_ptp.c @@ -58,35 +58,10 @@ enum sja1105_ptp_clk_mode { #define ptp_data_to_sja1105(d) \ container_of((d), struct sja1105_private, ptp_data) -/* Must be called only while the RX timestamping state of the tagger - * is turned off - */ -static int sja1105_change_rxtstamping(struct sja1105_private *priv, - bool on) -{ - struct sja1105_ptp_data *ptp_data = &priv->ptp_data; - struct sja1105_general_params_entry *general_params; - struct sja1105_table *table; - - table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS]; - general_params = table->entries; - general_params->send_meta1 = on; - general_params->send_meta0 = on; - - ptp_cancel_worker_sync(ptp_data->clock); - skb_queue_purge(&ptp_data->skb_txtstamp_queue); - skb_queue_purge(&ptp_data->skb_rxtstamp_queue); - - return sja1105_static_config_reload(priv, SJA1105_RX_HWTSTAMPING); -} - int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr) { - struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds); struct sja1105_private *priv = ds->priv; struct hwtstamp_config config; - bool rx_on; - int rc; if (copy_from_user(&config, ifr->ifr_data, sizeof(config))) return -EFAULT; @@ -104,26 +79,13 @@ int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr) switch (config.rx_filter) { case HWTSTAMP_FILTER_NONE: - rx_on = false; + priv->hwts_rx_en &= ~BIT(port); break; default: - rx_on = true; + priv->hwts_rx_en |= BIT(port); break; } - if (rx_on != tagger_data->rxtstamp_get_state(ds)) { - tagger_data->rxtstamp_set_state(ds, false); - - rc = sja1105_change_rxtstamping(priv, rx_on); - if (rc < 0) { - dev_err(ds->dev, - "Failed to change RX timestamping: %d\n", rc); - return rc; - } - if (rx_on) - tagger_data->rxtstamp_set_state(ds, true); - } - if (copy_to_user(ifr->ifr_data, &config, sizeof(config))) return -EFAULT; return 0; @@ -131,7 +93,6 @@ int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr) int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr) { - struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds); struct sja1105_private *priv = ds->priv; struct hwtstamp_config config; @@ -140,7 +101,7 @@ int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr) config.tx_type = HWTSTAMP_TX_ON; else config.tx_type = HWTSTAMP_TX_OFF; - if (tagger_data->rxtstamp_get_state(ds)) + if (priv->hwts_rx_en & BIT(port)) config.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT; else config.rx_filter = HWTSTAMP_FILTER_NONE; @@ -413,11 +374,10 @@ static long sja1105_rxtstamp_work(struct ptp_clock_info *ptp) bool sja1105_rxtstamp(struct dsa_switch *ds, int port, struct sk_buff *skb) { - struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds); struct sja1105_private *priv = ds->priv; struct sja1105_ptp_data *ptp_data = &priv->ptp_data; - if (!tagger_data->rxtstamp_get_state(ds)) + if (!(priv->hwts_rx_en & BIT(port))) return false; /* We need to read the full PTP clock to reconstruct the Rx diff --git a/include/linux/dsa/sja1105.h b/include/linux/dsa/sja1105.h index 159e43171ccc..c177322f793d 100644 --- a/include/linux/dsa/sja1105.h +++ b/include/linux/dsa/sja1105.h @@ -48,13 +48,9 @@ struct sja1105_deferred_xmit_work { /* Global tagger data */ struct sja1105_tagger_data { - /* Tagger to switch */ void (*xmit_work_fn)(struct kthread_work *work); void (*meta_tstamp_handler)(struct dsa_switch *ds, int port, u8 ts_id, enum sja1110_meta_tstamp dir, u64 tstamp); - /* Switch to tagger */ - bool (*rxtstamp_get_state)(struct dsa_switch *ds); - void (*rxtstamp_set_state)(struct dsa_switch *ds, bool on); }; struct sja1105_skb_cb { diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c index ec48165673ed..ade3eeb2f3e6 100644 --- a/net/dsa/tag_sja1105.c +++ b/net/dsa/tag_sja1105.c @@ -58,11 +58,8 @@ #define SJA1110_TX_TRAILER_LEN 4 #define SJA1110_MAX_PADDING_LEN 15 -#define SJA1105_HWTS_RX_EN 0 - struct sja1105_tagger_private { struct sja1105_tagger_data data; /* Must be first */ - unsigned long state; /* Protects concurrent access to the meta state machine * from taggers running on multiple ports on SMP systems */ @@ -392,10 +389,6 @@ static struct sk_buff priv = sja1105_tagger_private(ds); - if (!test_bit(SJA1105_HWTS_RX_EN, &priv->state)) - /* Do normal processing. */ - return skb; - spin_lock(&priv->meta_lock); /* Was this a link-local frame instead of the meta * that we were expecting? @@ -431,12 +424,6 @@ static struct sk_buff priv = sja1105_tagger_private(ds); - /* Drop the meta frame if we're not in the right state - * to process it. - */ - if (!test_bit(SJA1105_HWTS_RX_EN, &priv->state)) - return NULL; - spin_lock(&priv->meta_lock); stampable_skb = priv->stampable_skb; @@ -472,30 +459,6 @@ static struct sk_buff return skb; } -static bool sja1105_rxtstamp_get_state(struct dsa_switch *ds) -{ - struct sja1105_tagger_private *priv = sja1105_tagger_private(ds); - - return test_bit(SJA1105_HWTS_RX_EN, &priv->state); -} - -static void sja1105_rxtstamp_set_state(struct dsa_switch *ds, bool on) -{ - struct sja1105_tagger_private *priv = sja1105_tagger_private(ds); - - if (on) - set_bit(SJA1105_HWTS_RX_EN, &priv->state); - else - clear_bit(SJA1105_HWTS_RX_EN, &priv->state); - - /* Initialize the meta state machine to a known state */ - if (!priv->stampable_skb) - return; - - kfree_skb(priv->stampable_skb); - priv->stampable_skb = NULL; -} - static bool sja1105_skb_has_tag_8021q(const struct sk_buff *skb) { u16 tpid = ntohs(eth_hdr(skb)->h_proto); @@ -552,9 +515,6 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb, */ source_port = hdr->h_dest[3]; switch_id = hdr->h_dest[4]; - /* Clear the DMAC bytes that were mangled by the switch */ - hdr->h_dest[3] = 0; - hdr->h_dest[4] = 0; } else if (is_meta) { sja1105_meta_unpack(skb, &meta); source_port = meta.source_port; @@ -785,7 +745,6 @@ static void sja1105_disconnect(struct dsa_switch *ds) static int sja1105_connect(struct dsa_switch *ds) { - struct sja1105_tagger_data *tagger_data; struct sja1105_tagger_private *priv; struct kthread_worker *xmit_worker; int err; @@ -805,10 +764,6 @@ static int sja1105_connect(struct dsa_switch *ds) } priv->xmit_worker = xmit_worker; - /* Export functions for switch driver use */ - tagger_data = &priv->data; - tagger_data->rxtstamp_get_state = sja1105_rxtstamp_get_state; - tagger_data->rxtstamp_set_state = sja1105_rxtstamp_set_state; ds->tagger_data = priv; return 0; -- cgit v1.2.3 From 3fffa15bfef48b0ad6424779c03e68ae8ace5acb Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 4 Jul 2023 22:44:33 +0200 Subject: mptcp: ensure subflow is unhashed before cleaning the backlog While tacking care of the mptcp-level listener I unintentionally moved the subflow level unhash after the subflow listener backlog cleanup. That could cause some nasty race and makes the code harder to read. Address the issue restoring the proper order of operations. Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller --- net/mptcp/protocol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'net') diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e892673deb73..489a3defdde5 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2909,10 +2909,10 @@ static void mptcp_check_listen_stop(struct sock *sk) return; lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); + tcp_set_state(ssk, TCP_CLOSE); mptcp_subflow_queue_clean(sk, ssk); inet_csk_listen_stop(ssk); mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED); - tcp_set_state(ssk, TCP_CLOSE); release_sock(ssk); } -- cgit v1.2.3 From 0226436acf2495cde4b93e7400e5a87305c26054 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 4 Jul 2023 22:44:34 +0200 Subject: mptcp: do not rely on implicit state check in mptcp_listen() Since the blamed commit, closing the first subflow resets the first subflow socket state to SS_UNCONNECTED. The current mptcp listen implementation relies only on such state to prevent touching not-fully-disconnected sockets. Incoming mptcp fastclose (or paired endpoint removal) unconditionally closes the first subflow. All the above allows an incoming fastclose followed by a listen() call to successfully race with a blocking recvmsg(), potentially causing the latter to hit a divide by zero bug in cleanup_rbuf/__tcp_select_window(). Address the issue explicitly checking the msk socket state in mptcp_listen(). An alternative solution would be moving the first subflow socket state update into mptcp_disconnect(), but in the long term the first subflow socket should be removed: better avoid relaying on it for internal consistency check. Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/414 Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller --- net/mptcp/protocol.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'net') diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 489a3defdde5..3613489eb6e3 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3703,6 +3703,11 @@ static int mptcp_listen(struct socket *sock, int backlog) pr_debug("msk=%p", msk); lock_sock(sk); + + err = -EINVAL; + if (sock->state != SS_UNCONNECTED || sock->type != SOCK_STREAM) + goto unlock; + ssock = __mptcp_nmpc_socket(msk); if (IS_ERR(ssock)) { err = PTR_ERR(ssock); -- cgit v1.2.3