author: Sebastian Andrzej Siewior <bigeasy@linutronix.de> 2016-10-16 21:08:02 +0200
committer: Arnaldo Carvalho de Melo <acme@redhat.com> 2016-10-24 16:07:45 +0200
commit: 34b753007d646482a4125a7095e1d1986d395f95
tree: 266d4c9ebb091a9bcc6f7b2268ef13abf0125ed3 /tools/perf/bench
parent: perf tools: Use normal error reporting when processing PERF_RECORD_READ events
perf bench futex: Cache align the worker struct
It popped up in perf testing that the worker consumes some amount of CPU. It boils down to the increment of `ops`, which causes cache line bouncing between the individual threads.

This patch aligns the struct to 256 bytes to ensure that no cache line is shared among CPUs. 128 bytes is the x86 worst case, and grep says that L1_CACHE_SHIFT is set to 8 (that is, 256 bytes) on s390.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161016190803.3392-1-bigeasy@linutronix.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
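The layout change described in the commit message can be checked in isolation. The following is a minimal sketch, not code from the patch: it reuses the patch's two macros, while the struct names `worker_plain` and `worker_aligned`, the helper functions, and the `unsigned long` stand-in for `pthread_t` are hypothetical and chosen only for illustration. It assumes GCC or Clang, since `__attribute__ ((aligned))` is a compiler extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same macros as in the patch. */
#define SMP_CACHE_BYTES 256
#define __cacheline_aligned __attribute__ ((aligned (SMP_CACHE_BYTES)))

/* Layout before the patch (hypothetical name, for comparison only). */
struct worker_plain {
	int tid;
	uint32_t *futex;
	unsigned long thread;	/* assumption: stand-in for pthread_t */
	unsigned long ops;
};

/* Layout after the patch: size and alignment are padded to 256 bytes. */
struct worker_aligned {
	int tid;
	uint32_t *futex;
	unsigned long thread;	/* assumption: stand-in for pthread_t */
	unsigned long ops;
} __cacheline_aligned;

static struct worker_plain plain_workers[4];
static struct worker_aligned aligned_workers[4];

/* Byte distance between the ops counters of two neighbouring plain workers. */
static size_t ops_stride_plain(void)
{
	return (size_t)((char *)&plain_workers[1].ops -
			(char *)&plain_workers[0].ops);
}

/* Byte distance between the ops counters of two neighbouring aligned workers. */
static size_t ops_stride_aligned(void)
{
	return (size_t)((char *)&aligned_workers[1].ops -
			(char *)&aligned_workers[0].ops);
}

/* Returns 1 if both addresses fall into the same SMP_CACHE_BYTES-sized block. */
static int same_cache_block(const void *a, const void *b)
{
	return ((uintptr_t)a / SMP_CACHE_BYTES) ==
	       ((uintptr_t)b / SMP_CACHE_BYTES);
}
```

With the plain layout, neighbouring `ops` counters sit only `sizeof(struct worker_plain)` bytes apart, so several of them can land in one cache line and each increment invalidates that line for the other threads. With the aligned layout the counters are a full `SMP_CACHE_BYTES` apart, which is at least as large as the line sizes the commit message cites. The cost is memory: each worker now occupies 256 bytes.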
Diffstat (limited to 'tools/perf/bench')
-rw-r--r-- tools/perf/bench/futex-hash.c | 5
1 files changed, 4 insertions, 1 deletions
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 8024cd5febd2..d9e5e80bb4d0 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -39,12 +39,15 @@ static unsigned int threads_starting;
 static struct stats throughput_stats;
 static pthread_cond_t thread_parent, thread_worker;
 
+#define SMP_CACHE_BYTES 256
+#define __cacheline_aligned __attribute__ ((aligned (SMP_CACHE_BYTES)))
+
 struct worker {
 	int tid;
 	u_int32_t *futex;
 	pthread_t thread;
 	unsigned long ops;
-};
+} __cacheline_aligned;
 
 static const struct option options[] = {
 	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),