diff options
author | Jakub Kicinski <kuba@kernel.org> | 2022-09-20 19:21:53 +0200 |
---|---|---|
committer | Jakub Kicinski <kuba@kernel.org> | 2022-09-20 19:21:54 +0200 |
commit | 4fa37e49114c7e20f9a23424f004ccf4155f1b47 (patch) | |
tree | ca59608a3e62d51f457e44e76631145cf375cb56 /drivers/net/ethernet/netronome | |
parent | headers: Remove some left-over license text (diff) | |
parent | tcp: Introduce optional per-netns ehash. (diff) | |
download | linux-4fa37e49114c7e20f9a23424f004ccf4155f1b47.tar.xz linux-4fa37e49114c7e20f9a23424f004ccf4155f1b47.zip |
Merge branch 'tcp-introduce-optional-per-netns-ehash'
Kuniyuki Iwashima says:
====================
tcp: Introduce optional per-netns ehash.
The more sockets we have in the hash table, the longer we spend looking
up the socket. While running a number of small workloads on the same
host, they penalise each other and cause performance degradation.
The root cause might be a single workload that consumes much more
resources than the others. It often happens on a cloud service where
different workloads share the same computing resource.
On EC2 c5.24xlarge instance (196 GiB memory and 524288 (1Mi / 2) ehash
entries), after running iperf3 in different netns, creating 24Mi sockets
without data transfer in the root netns causes about 10% performance
regression for the iperf3's connection.
thash_entries sockets length Gbps
524288 1 1 50.7
24Mi 48 45.1
It is basically related to the length of the list of each hash bucket.
For testing purposes to see how performance drops along the length,
I set 131072 (1Mi / 8) to thash_entries, and here's the result.
thash_entries sockets length Gbps
131072 1 1 50.7
1Mi 8 49.9
2Mi 16 48.9
4Mi 32 47.3
8Mi 64 44.6
16Mi 128 40.6
24Mi 192 36.3
32Mi 256 32.5
40Mi 320 27.0
48Mi 384 25.0
To resolve the socket lookup degradation, we introduce an optional
per-netns hash table for TCP, but it's just ehash, and we still share
the global bhash, bhash2 and lhash2.
With a smaller ehash, we can look up non-listener sockets faster and
isolate such noisy neighbours. Also, we can reduce lock contention.
For details, please see the last patch.
patch 1 - 4: prep for per-netns ehash
patch 5: small optimisation for netns dismantle without TIME_WAIT sockets
patch 6: add per-netns ehash
Many thanks to Eric Dumazet for reviewing and advising.
====================
Link: https://lore.kernel.org/r/20220908011022.45342-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'drivers/net/ethernet/netronome')
-rw-r--r-- | drivers/net/ethernet/netronome/nfp/crypto/tls.c | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/tls.c b/drivers/net/ethernet/netronome/nfp/crypto/tls.c index 78368e71ce83..f80f1a6953fa 100644 --- a/drivers/net/ethernet/netronome/nfp/crypto/tls.c +++ b/drivers/net/ethernet/netronome/nfp/crypto/tls.c @@ -474,6 +474,7 @@ int nfp_net_tls_rx_resync_req(struct net_device *netdev, { struct nfp_net *nn = netdev_priv(netdev); struct nfp_net_tls_offload_ctx *ntls; + struct net *net = dev_net(netdev); struct ipv6hdr *ipv6h; struct tcphdr *th; struct iphdr *iph; @@ -494,13 +495,13 @@ int nfp_net_tls_rx_resync_req(struct net_device *netdev, switch (ipv6h->version) { case 4: - sk = inet_lookup_established(dev_net(netdev), &tcp_hashinfo, + sk = inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, iph->saddr, th->source, iph->daddr, th->dest, netdev->ifindex); break; #if IS_ENABLED(CONFIG_IPV6) case 6: - sk = __inet6_lookup_established(dev_net(netdev), &tcp_hashinfo, + sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo, &ipv6h->saddr, th->source, &ipv6h->daddr, ntohs(th->dest), netdev->ifindex, 0); |