perf core: Add a kmem_cache for struct perf_event

The kernel can allocate many struct perf_event instances when profiling. For example, 256 cpus x 8 events x 20 cgroups = 40K instances of the struct would be allocated on a large system.

The size of struct perf_event in my setup is 1152 bytes. As it's allocated by kmalloc, the actual allocation size is rounded up to 2K.

That leaves 896 bytes (~43%) of waste per instance, for a total of ~35MB with 40K instances. We can create a dedicated kmem_cache to avoid this unnecessary memory consumption.

With this change, I see the following (note this machine has 112 cpus).

  # grep perf_event /proc/slabinfo
  perf_event    224    784   1152    7    2 : tunables   24   12    8 : slabdata    112    112      0

The sixth column is pages-per-slab, which is 2, and the fifth column is obj-per-slab, which is 7. So a slab can use 1152 x 7 = 8064 bytes of its 8K, and the wasted memory is (8192 - 8064) / 7 = ~18 bytes per instance.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311115413.444407-1-namhyung@kernel.org
author: Namhyung Kim <namhyung@google.com> 2021-03-11 12:54:12 +0100
committer: Peter Zijlstra <peterz@infradead.org> 2021-03-16 21:44:42 +0100
commit: bdacfaf26da166dd56c62f23f27a4b3e71f2d89e (patch)
tree: ec997eec7ea48b7875e1de7902a72ad80e8d5196 /kernel/events
parent: perf core: Allocate perf_buffer in the target node memory (diff)
1 files changed, 6 insertions, 3 deletions
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 03db40f6cba9..f526ddb50d5e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -405,6 +405,7 @@ static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
 static struct srcu_struct pmus_srcu;
 static cpumask_var_t perf_online_mask;
+static struct kmem_cache *perf_event_cache;
 
 /*
  * perf event paranoia level:
@@ -4611,7 +4612,7 @@ static void free_event_rcu(struct rcu_head *head)
 	if (event->ns)
 		put_pid_ns(event->ns);
 	perf_event_free_filter(event);
-	kfree(event);
+	kmem_cache_free(perf_event_cache, event);
 }
 
 static void ring_buffer_attach(struct perf_event *event,
@@ -11293,7 +11294,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 			return ERR_PTR(-EINVAL);
 	}
 
-	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	event = kmem_cache_zalloc(perf_event_cache, GFP_KERNEL);
 	if (!event)
 		return ERR_PTR(-ENOMEM);
 
@@ -11497,7 +11498,7 @@ err_ns:
 		put_pid_ns(event->ns);
 	if (event->hw.target)
 		put_task_struct(event->hw.target);
-	kfree(event);
+	kmem_cache_free(perf_event_cache, event);
 
 	return ERR_PTR(err);
 }
@@ -13130,6 +13131,8 @@ void __init perf_event_init(void)
 	ret = init_hw_breakpoint();
 	WARN(ret, "hw_breakpoint initialization failed with: %d", ret);
 
+	perf_event_cache = KMEM_CACHE(perf_event, SLAB_PANIC);
+
 	/*
 	 * Build time assertion that we keep the data_head at the intended
 	 * location.  IOW, validation we got the __reserved[] size right.