summaryrefslogtreecommitdiffstats
path: root/kernel/smp.c
diff options
context:
space:
mode:
authorFeng Tang <feng.tang@intel.com>2020-11-25 06:22:21 +0100
committerLinus Torvalds <torvalds@linux-foundation.org>2020-11-26 18:35:49 +0100
commit4df910620bebb5cfe234af16ac8f6474b60215fd (patch)
treee8dd4c72b76dd5b743ddbcfb41cc99df0e1c7342 /kernel/smp.c
parentMerge tag 'media/v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mc... (diff)
downloadlinux-4df910620bebb5cfe234af16ac8f6474b60215fd.tar.xz
linux-4df910620bebb5cfe234af16ac8f6474b60215fd.zip
mm: memcg: relayout structure mem_cgroup to avoid cache interference
0day reported one -22.7% regression for will-it-scale page_fault2 case [1] on a 4 sockets 144 CPU platform, and bisected to it to be caused by Waiman's optimization (commit bd0b230fe1) of saving one 'struct page_counter' space for 'struct mem_cgroup'. Initially we thought it was due to the cache alignment change introduced by the patch, but further debug shows that it is due to some hot data members ('vmstats_local', 'vmstats_percpu', 'vmstats') sit in 2 adjacent cacheline (2N and 2N+1 cacheline), and when adjacent cache line prefetch is enabled, it triggers an "extended level" of cache false sharing for 2 adjacent cache lines. So exchange the 2 member blocks, while keeping mostly the original cache alignment, which can restore and even enhance the performance, and save 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816, with 0day's default RHEL-8.3 kernel config) [1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/ Fixes: bd0b230fe145 ("mm/memcg: unify swap and memsw page counters") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'kernel/smp.c')
0 files changed, 0 insertions, 0 deletions