diff options
author | Roland Dreier <rdreier@cisco.com> | 2006-05-15 20:41:00 +0200 |
---|---|---|
committer | Linus Torvalds <torvalds@g5.osdl.org> | 2006-05-16 16:59:32 +0200 |
commit | a4523a8b38089478f93bc053c31f678c63f5ee1b (patch) | |
tree | 96f828650d2234aac76fe39ea38b7c7250c49349 | |
parent | [PATCH] x86_64: Don't schedule on exception stack on preemptive kernels (diff) | |
download | linux-a4523a8b38089478f93bc053c31f678c63f5ee1b.tar.xz linux-a4523a8b38089478f93bc053c31f678c63f5ee1b.zip |
[PATCH] slab: Fix kmem_cache_destroy() on NUMA
With CONFIG_NUMA set, kmem_cache_destroy() may fail and say "Can't
free all objects." The problem is caused by sequences such as the
following (suppose we are on a NUMA machine with two nodes, 0 and 1):
* Allocate an object from cache on node 0.
* Free the object on node 1. The object is put into node 1's alien
array_cache for node 0.
* Call kmem_cache_destroy(), which ultimately ends up in __cache_shrink().
* __cache_shrink() does drain_cpu_caches(), which loops through all nodes.
For each node it drains the shared array_cache and then handles the
alien array_cache for the other node.
However this means that node 0's shared array_cache will be drained,
and then node 1 will move the contents of its alien[0] array_cache
into that same shared array_cache. node 0's shared array_cache is
never looked at again, so the objects left there will appear to be in
use when __cache_shrink() calls __node_shrink() for node 0. So
__node_shrink() will return 1 and kmem_cache_destroy() will fail.
This patch fixes this by having drain_cpu_caches() do
drain_alien_cache() on every node before it does drain_array() on the
nodes' shared array_caches.
The problem was originally reported by Or Gerlitz <ogerlitz@voltaire.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-rw-r--r-- | mm/slab.c | 11 |
1 files changed, 7 insertions, 4 deletions
diff --git a/mm/slab.c b/mm/slab.c index b1d643b5238d..d31a06bfbea5 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2200,11 +2200,14 @@ static void drain_cpu_caches(struct kmem_cache *cachep) check_irq_on(); for_each_online_node(node) { l3 = cachep->nodelists[node]; - if (l3) { + if (l3 && l3->alien) + drain_alien_cache(cachep, l3->alien); + } + + for_each_online_node(node) { + l3 = cachep->nodelists[node]; + if (l3) drain_array(cachep, l3, l3->shared, 1, node); - if (l3->alien) - drain_alien_cache(cachep, l3->alien); - } } } |