z3fold: use per-cpu unbuddied lists - linux


author	Vitaly Wool <vitalywool@gmail.com>	2017-09-07 01:24:47 +0200
committer	Linus Torvalds <torvalds@linux-foundation.org>	2017-09-07 02:27:30 +0200
commit	d30561c56f4114f7d6595a40498ba364ffa6e28e (patch)
tree	45b68d050718e2d3c41569d233f497ccacea6755 /mm/page_alloc.c
parent	mm, swap: don't use VMA based swap readahead if HDD is used as swap (diff)

z3fold: use per-cpu unbuddied lists

It's been noted that z3fold doesn't scale well when it's run in a large number of threads on many cores, which can be easily reproduced with the fio 'randrw' test with --numjobs=32. E.g. the result for 1 cluster (4 cores) is:

Run status group 0 (all jobs):
   READ: io=244785MB, aggrb=496883KB/s, minb=15527KB/s, ...
  WRITE: io=246735MB, aggrb=500841KB/s, minb=15651KB/s, ...

While for 8 cores (2 clusters) the result is:

Run status group 0 (all jobs):
   READ: io=244785MB, aggrb=265942KB/s, minb=8310KB/s, ...
  WRITE: io=246735MB, aggrb=268060KB/s, minb=8376KB/s, ...

The bottleneck here is the pool lock, on which many threads end up waiting. To reduce that spin lock contention, z3fold can operate only on the lists local to the current CPU whenever possible. Due to the nature of z3fold unbuddied list handling (it only takes the first entry off the list on a hot path), if the z3fold pool is big enough and balanced well enough, limiting the search to only the local unbuddied list doesn't lead to a significant compression ratio degradation (2.57x vs 2.65x in our measurements).

This patch also introduces two worker threads: one for async in-page object layout optimization and one for releasing freed pages. This is done to speed up z3fold_free(), which is often on a hot path.

The fio results for the 8-core case are now the following:

Run status group 0 (all jobs):
   READ: io=244785MB, aggrb=1568.3MB/s, minb=50182KB/s, ...
  WRITE: io=246735MB, aggrb=1580.8MB/s, minb=50582KB/s, ...

So we're in for an almost 6x performance increase.

Link: http://lkml.kernel.org/r/20170806181443.f9b65018f8bde25ef990f9e8@gmail.com
Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
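
The idea of searching only the current CPU's unbuddied lists can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the code from the patch: struct page_hdr, struct cpu_lists, pool_add_unbuddied() and pool_find_local() are hypothetical names, NCPUS and NCHUNKS are arbitrary, and a per-CPU pthread mutex stands in for whatever synchronization the kernel code actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NCPUS   4   /* hypothetical CPU count for this sketch */
#define NCHUNKS 8   /* hypothetical number of free-space size classes */

/* Stand-in for a z3fold page header: only what the sketch needs. */
struct page_hdr {
	struct page_hdr *next;
	int free_chunks;        /* free space left in the page, in chunks */
};

/* One set of unbuddied lists per CPU, indexed by free chunk count. */
struct cpu_lists {
	pthread_mutex_t lock;   /* assumption: a per-CPU lock, not the pool-wide one */
	struct page_hdr *unbuddied[NCHUNKS];
};

struct pool {
	struct cpu_lists cpu[NCPUS];
};

static void pool_init(struct pool *p)
{
	for (int c = 0; c < NCPUS; c++) {
		pthread_mutex_init(&p->cpu[c].lock, NULL);
		for (int i = 0; i < NCHUNKS; i++)
			p->cpu[c].unbuddied[i] = NULL;
	}
}

/* Put a partially filled page on the lists of the CPU that handled it. */
static void pool_add_unbuddied(struct pool *p, int cpu, struct page_hdr *pg)
{
	assert(cpu >= 0 && cpu < NCPUS);
	assert(pg->free_chunks > 0 && pg->free_chunks < NCHUNKS);

	struct cpu_lists *l = &p->cpu[cpu];

	pthread_mutex_lock(&l->lock);
	pg->next = l->unbuddied[pg->free_chunks];
	l->unbuddied[pg->free_chunks] = pg;
	pthread_mutex_unlock(&l->lock);
}

/*
 * Find a page with at least `chunks` free chunks, looking only at the
 * lists of `cpu`. Only the first entry of each list is considered.
 * Returns NULL when nothing local fits; a real allocator would then
 * allocate a fresh page instead of searching other CPUs' lists.
 */
static struct page_hdr *pool_find_local(struct pool *p, int cpu, int chunks)
{
	assert(cpu >= 0 && cpu < NCPUS);
	assert(chunks > 0 && chunks < NCHUNKS);

	struct cpu_lists *l = &p->cpu[cpu];
	struct page_hdr *pg = NULL;

	pthread_mutex_lock(&l->lock);
	for (int i = chunks; i < NCHUNKS; i++) {
		pg = l->unbuddied[i];
		if (pg) {
			l->unbuddied[i] = pg->next;   /* take the first entry off */
			pg->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&l->lock);
	return pg;
}
```

In this model, threads running on different CPUs never contend for the same lock on the lookup path. The cost is that a page with free space parked on another CPU's list is not found, which is the kind of compression ratio trade-off the commit message describes.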

Diffstat (limited to 'mm/page_alloc.c')

0 files changed, 0 insertions, 0 deletions

