summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorTejun Heo <tj@kernel.org>2017-02-02 19:50:35 +0100
committerTejun Heo <tj@kernel.org>2017-02-02 19:50:35 +0100
commit63f1ca59453aadae81f702840c7ac6ea8b9f9262 (patch)
tree5dc559d4dc33c8e1408f757d8853b23033ea118f /Documentation
parentcgroup: drop the matching uid requirement on migration for cgroup v2 (diff)
parentrdmacg: Fixed uninitialized current resource usage (diff)
downloadlinux-63f1ca59453aadae81f702840c7ac6ea8b9f9262.tar.xz
linux-63f1ca59453aadae81f702840c7ac6ea8b9f9262.zip
Merge branch 'cgroup/for-4.11-rdmacg' into cgroup/for-4.11
Merge in to resolve conflicts in Documentation/cgroup-v2.txt. The conflicts are from multiple section additions and trivial to resolve. Signed-off-by: Tejun Heo <tj@kernel.org>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/cgroup-v1/rdma.txt109
-rw-r--r--Documentation/cgroup-v2.txt46
2 files changed, 151 insertions, 4 deletions
diff --git a/Documentation/cgroup-v1/rdma.txt b/Documentation/cgroup-v1/rdma.txt
new file mode 100644
index 000000000000..af618171e0eb
--- /dev/null
+++ b/Documentation/cgroup-v1/rdma.txt
@@ -0,0 +1,109 @@
+ RDMA Controller
+ ----------------
+
+Contents
+--------
+
+1. Overview
+ 1-1. What is RDMA controller?
+ 1-2. Why RDMA controller needed?
+ 1-3. How is RDMA controller implemented?
+2. Usage Examples
+
+1. Overview
+
+1-1. What is RDMA controller?
+-----------------------------
+
+RDMA controller allows user to limit RDMA/IB specific resources that a given
+set of processes can use. These processes are grouped using RDMA controller.
+
+RDMA controller defines two resources which can be limited for processes of a
+cgroup.
+
+1-2. Why RDMA controller needed?
+--------------------------------
+
+Currently user space applications can easily take away all the rdma verb
+specific resources such as AH, CQ, QP, MR etc. Due to which other applications
+in other cgroup or kernel space ULPs may not even get chance to allocate any
+rdma resources. This can leads to service unavailability.
+
+Therefore RDMA controller is needed through which resource consumption
+of processes can be limited. Through this controller different rdma
+resources can be accounted.
+
+1-3. How is RDMA controller implemented?
+----------------------------------------
+
+RDMA cgroup allows limit configuration of resources. Rdma cgroup maintains
+resource accounting per cgroup, per device using resource pool structure.
+Each such resource pool is limited up to 64 resources in given resource pool
+by rdma cgroup, which can be extended later if required.
+
+This resource pool object is linked to the cgroup css. Typically there
+are 0 to 4 resource pool instances per cgroup, per device in most use cases.
+But nothing limits to have it more. At present hundreds of RDMA devices per
+single cgroup may not be handled optimally, however there is no
+known use case or requirement for such configuration either.
+
+Since RDMA resources can be allocated from any process and can be freed by any
+of the child processes which shares the address space, rdma resources are
+always owned by the creator cgroup css. This allows process migration from one
+to other cgroup without major complexity of transferring resource ownership;
+because such ownership is not really present due to shared nature of
+rdma resources. Linking resources around css also ensures that cgroups can be
+deleted after processes migrated. This allow progress migration as well with
+active resources, even though that is not a primary use case.
+
+Whenever RDMA resource charging occurs, owner rdma cgroup is returned to
+the caller. Same rdma cgroup should be passed while uncharging the resource.
+This also allows process migrated with active RDMA resource to charge
+to new owner cgroup for new resource. It also allows to uncharge resource of
+a process from previously charged cgroup which is migrated to new cgroup,
+even though that is not a primary use case.
+
+Resource pool object is created in following situations.
+(a) User sets the limit and no previous resource pool exist for the device
+of interest for the cgroup.
+(b) No resource limits were configured, but IB/RDMA stack tries to
+charge the resource. So that it correctly uncharge them when applications are
+running without limits and later on when limits are enforced during uncharging,
+otherwise usage count will drop to negative.
+
+Resource pool is destroyed if all the resource limits are set to max and
+it is the last resource getting deallocated.
+
+User should set all the limit to max value if it intents to remove/unconfigure
+the resource pool for a particular device.
+
+IB stack honors limits enforced by the rdma controller. When application
+query about maximum resource limits of IB device, it returns minimum of
+what is configured by user for a given cgroup and what is supported by
+IB device.
+
+Following resources can be accounted by rdma controller.
+ hca_handle Maximum number of HCA Handles
+ hca_object Maximum number of HCA Objects
+
+2. Usage Examples
+-----------------
+
+(a) Configure resource limit:
+echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
+echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max
+
+(b) Query resource limit:
+cat /sys/fs/cgroup/rdma/2/rdma.max
+#Output:
+mlx4_0 hca_handle=2 hca_object=2000
+ocrdma1 hca_handle=3 hca_object=max
+
+(c) Query current usage:
+cat /sys/fs/cgroup/rdma/2/rdma.current
+#Output:
+mlx4_0 hca_handle=1 hca_object=20
+ocrdma1 hca_handle=1 hca_object=23
+
+(d) Delete resource limit:
+echo echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 1d101423ca92..3b8449f8ac7e 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -49,8 +49,10 @@ CONTENTS
5-3-2. Writeback
5-4. PID
5-4-1. PID Interface Files
- 5-5. Misc
- 5-5-1. perf_event
+ 5-5. RDMA
+ 5-5-1. RDMA Interface Files
+ 5-6. Misc
+ 5-6-1. perf_event
6. Namespace
6-1. Basics
6-2. The Root and Views
@@ -1160,9 +1162,45 @@ through fork() or clone(). These will return -EAGAIN if the creation
of a new process would cause a cgroup policy to be violated.
-5-5. Misc
+5-5. RDMA
-5-5-1. perf_event
+The "rdma" controller regulates the distribution and accounting of
+of RDMA resources.
+
+5-5-1. RDMA Interface Files
+
+ rdma.max
+ A readwrite nested-keyed file that exists for all the cgroups
+ except root that describes current configured resource limit
+ for a RDMA/IB device.
+
+ Lines are keyed by device name and are not ordered.
+ Each line contains space separated resource name and its configured
+ limit that can be distributed.
+
+ The following nested keys are defined.
+
+ hca_handle Maximum number of HCA Handles
+ hca_object Maximum number of HCA Objects
+
+ An example for mlx4 and ocrdma device follows.
+
+ mlx4_0 hca_handle=2 hca_object=2000
+ ocrdma1 hca_handle=3 hca_object=max
+
+ rdma.current
+ A read-only file that describes current resource usage.
+ It exists for all the cgroup except root.
+
+ An example for mlx4 and ocrdma device follows.
+
+ mlx4_0 hca_handle=1 hca_object=20
+ ocrdma1 hca_handle=1 hca_object=23
+
+
+5-6. Misc
+
+5-6-1. perf_event
perf_event controller, if not mounted on a legacy hierarchy, is
automatically enabled on the v2 hierarchy so that perf events can