author Huang Ying <ying.huang@intel.com> 2021-09-02 23:59:33 +0200
committer Linus Torvalds <torvalds@linux-foundation.org> 2021-09-03 18:58:16 +0200
commit 20b51af15e014cac63b58a4f8b8b323ac35bccce (patch)
tree 1c718dff92e64dfef017c46ba2662c844353bce8 /mm/vmscan.c
parent mm/vmscan: never demote for memcg reclaim
mm/migrate: add sysfs interface to enable reclaim migration
Some method is obviously needed to enable reclaim-based migration.

Just like traditional autonuma, some workloads will benefit, such as workloads with more "static" configurations where hot pages stay hot and cold pages stay cold. If pages come and go from the hot and cold sets, the benefits of this approach will be more limited.

The benefits are truly workload-based and *not* hardware-based. We do not believe that there is a viable threshold where certain hardware configurations should have this mechanism enabled while others do not.

To be conservative, earlier work defaulted to disabling reclaim-based migration and did not include a mechanism to enable it. This proposes adding a new sysfs file

  /sys/kernel/mm/numa/demotion_enabled

as a method to enable it.

We are open to any alternative that allows end users to enable this mechanism or disable it if workload harm is detected (just like traditional autonuma).

Once this is enabled, page demotion may move data to a NUMA node that does not fall into the cpuset of the allocating process. This could be construed to violate the guarantees of cpusets. However, since this is an opt-in mechanism, the assumption is that anyone enabling it is content to relax the guarantees.

Link: https://lkml.kernel.org/r/20210721063926.3024591-9-ying.huang@intel.com
Link: https://lkml.kernel.org/r/20210715055145.195411-10-ying.huang@intel.com
Signed-off-by: Huang Ying <ying.huang@intel.com>
Originally-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
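
For illustration only, here is a minimal userspace sketch of how an administrator's tool might flip the sysfs file named in the commit message. The helper names set_demotion_enabled() and get_demotion_enabled() are hypothetical and not part of the kernel or any library. The sketch assumes the file accepts "1" and "0" on write, and it tolerates either "1"/"0" or "true"/"false" on read because the exact output format is not stated here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Path taken from the commit message. */
#define DEMOTION_SYSFS_PATH "/sys/kernel/mm/numa/demotion_enabled"

/*
 * Hypothetical helper: write "1" or "0" to the knob at `path`.
 * Returns 0 on success, -1 on failure (for example, no permission,
 * or a kernel without this interface).
 */
int set_demotion_enabled(const char *path, int enable)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(enable ? "1\n" : "0\n", f) >= 0;
	/* Buffered write errors surface at close time, so check it too. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the knob back.
 * Returns 1 if enabled, 0 if disabled, -1 on error or unknown content.
 */
int get_demotion_enabled(const char *path)
{
	char buf[16];
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (buf[0] == '1' || strncmp(buf, "true", 4) == 0)
		return 1;
	if (buf[0] == '0' || strncmp(buf, "false", 5) == 0)
		return 0;
	return -1;
}
```

A caller would typically run as root and use set_demotion_enabled(DEMOTION_SYSFS_PATH, 1) to opt in, then set_demotion_enabled(DEMOTION_SYSFS_PATH, 0) if the workload is harmed. Taking the path as a parameter keeps the helpers testable against an ordinary file.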
Diffstat (limited to 'mm/vmscan.c')
-rw-r--r-- mm/vmscan.c 5
1 files changed, 3 insertions, 2 deletions
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 43289f5f8488..2255025f1891 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -524,6 +524,8 @@ static long add_nr_deferred(long nr, struct shrinker *shrinker,
 static bool can_demote(int nid, struct scan_control *sc)
 {
+	if (!numa_demotion_enabled)
+		return false;
 	if (sc) {
 		if (sc->no_demotion)
 			return false;
@@ -534,8 +536,7 @@ static bool can_demote(int nid, struct scan_control *sc)
 	if (next_demotion_node(nid) == NUMA_NO_NODE)
 		return false;
-	// FIXME: actually enable this later in the series
-	return false;
+	return true;
 }
 static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
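
As a reading aid, the gating order that the two hunks above give can_demote() can be modeled in plain userspace C. This is a simplified sketch, not kernel code: every name ending in _model is hypothetical, the demotion target table is invented for the example, and the lines of can_demote() that fall between the two hunks are not shown in the diff and are therefore not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE    (-1)  /* same value the kernel uses */
#define MODEL_MAX_NODES 4     /* hypothetical node count for this sketch */

/* Stand-in for the one scan_control field the diff touches. */
struct scan_control_model {
	bool no_demotion;
};

/* Stand-in for the global toggle behind the sysfs file; off by default. */
bool numa_demotion_enabled_model;

/* Invented topology: node 0 demotes to node 1, other nodes have no target. */
int demotion_target_model[MODEL_MAX_NODES] = {
	1, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

int next_demotion_node_model(int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return demotion_target_model[nid];
}

bool can_demote_model(int nid, const struct scan_control_model *sc)
{
	/* New in this commit: the opt-in switch is checked first. */
	if (!numa_demotion_enabled_model)
		return false;
	/* A reclaim pass may forbid demotion for itself. */
	if (sc && sc->no_demotion)
		return false;
	/* (Checks between the two hunks are omitted from this model.) */
	/* No lower tier to demote to. */
	if (next_demotion_node_model(nid) == NUMA_NO_NODE)
		return false;
	/* New in this commit: the final answer is now true, not false. */
	return true;
}
```

With the toggle off, can_demote_model() returns false for every node, which matches the conservative default the commit message describes. Once the toggle is on, the result depends only on the per-scan flag and on whether the node has a demotion target.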