summaryrefslogtreecommitdiffstats
path: root/MAINTAINERS
diff options
context:
space:
mode:
authorRaghavendra K T <raghavendra.kt@linux.vnet.ibm.com>2015-08-30 07:59:42 +0200
committerDavid S. Miller <davem@davemloft.net>2015-08-31 06:48:58 +0200
commita3a773726c9f9ba2e87fd8ad8e36feff5f6ffd8e (patch)
treebc03244dc5119515d96c6c9a64b0c871ad554385 /MAINTAINERS
parentnet: Introduce helper functions to get the per cpu data (diff)
downloadlinux-a3a773726c9f9ba2e87fd8ad8e36feff5f6ffd8e.tar.xz
linux-a3a773726c9f9ba2e87fd8ad8e36feff5f6ffd8e.zip
net: Optimize snmp stat aggregation by walking all the percpu data at once
Docker container creation linearly increased from around 1.6 sec to 7.5 sec (at 1000 containers) and perf data showed 50% ovehead in snmp_fold_field. reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks through per cpu data of an item (iteratively for around 36 items). idea: This patch tries to aggregate the statistics by going through all the items of each cpu sequentially which is reducing cache misses. Docker creation got faster by more than 2x after the patch. Result: Before After Docker creation time 6.836s 3.25s cache miss 2.7% 1.41% perf before: 50.73% docker [kernel.kallsyms] [k] snmp_fold_field 9.07% swapper [kernel.kallsyms] [k] snooze_loop 3.49% docker [kernel.kallsyms] [k] veth_stats_one 2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock perf after: 10.57% docker docker [.] scanblock 8.37% swapper [kernel.kallsyms] [k] snooze_loop 6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field 6.67% docker [kernel.kallsyms] [k] veth_stats_one changes/ideas suggested: Using buffer in stack (Eric), Usage of memset (David), Using memcpy in place of unaligned_put (Joe). Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to '')
0 files changed, 0 insertions, 0 deletions