author: Matthew Wilcox <mawilcox@microsoft.com> 2017-12-22 15:32:16 +0100
committer: Jonathan Corbet <corbet@lwn.net> 2018-01-01 20:40:27 +0100
commit: 14ebc28e07e68ff412aa42f7d8b67969e2f63d00
tree: 7d5aa02eba4a70f4d8da61e51eae16e0c7956567 /Documentation/errseq.rst
parent: w1_netlink.h: add support for nested structs
errseq: Add to documentation tree

- Move errseq.rst into core-api
- Add errseq to the core-api index
- Promote the header to a more prominent header type, otherwise we get
  three entries in the table of contents.
- Reformat the table to look nicer and be a little more proportional
  in terms of horizontal width per bit (the SF bit is still
  disproportionately large, but there's no way to fix that).
- Include errseq kernel-doc in the errseq.rst
- Neaten some kernel-doc markup

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Diffstat (limited to 'Documentation/errseq.rst')
-rw-r--r-- Documentation/errseq.rst | 149
1 files changed, 0 insertions, 149 deletions
diff --git a/Documentation/errseq.rst b/Documentation/errseq.rst
deleted file mode 100644
index 4c29bd5afbc5..000000000000
--- a/Documentation/errseq.rst
+++ /dev/null
@@ -1,149 +0,0 @@
-The errseq_t datatype
-=====================
-An errseq_t is a way of recording errors in one place, and allowing any
-number of "subscribers" to tell whether it has changed since a previous
-point where it was sampled.
-
-The initial use case for this is tracking errors for file
-synchronization syscalls (fsync, fdatasync, msync and sync_file_range),
-but it may be usable in other situations.
-
-It's implemented as an unsigned 32-bit value. The low order bits are
-designated to hold an error code (between 1 and MAX_ERRNO). The upper bits
-are used as a counter. This is done with atomics instead of locking so that
-these functions can be called from any context.
-
-Note that there is a risk of collisions if new errors are being recorded
-frequently, since we have so few bits to use as a counter.
-
-To mitigate this, the bit between the error value and counter is used as
-a flag to tell whether the value has been sampled since a new value was
-recorded. That allows us to avoid bumping the counter if no one has
-sampled it since the last time an error was recorded.
-
-Thus we end up with a value that looks something like this::
-
-    bit:  31..13            12     11..0
-         +-----------------+----+----------------+
-         |     counter     | SF |      errno     |
-         +-----------------+----+----------------+
-
-The general idea is for "watchers" to sample an errseq_t value and keep
-it as a running cursor. That value can later be used to tell whether
-any new errors have occurred since that sampling was done, and atomically
-record the state at the time that it was checked. This allows us to
-record errors in one place, and then have a number of "watchers" that
-can tell whether the value has changed since they last checked it.
-
-A new errseq_t should always be zeroed out. An errseq_t value of all zeroes
-is the special (but common) case where there has never been an error. An all
-zero value thus serves as the "epoch" if one wishes to know whether there
-has ever been an error set since it was first initialized.
-
-API usage
-=========
-Let me tell you a story about a worker drone. Now, he's a good worker
-overall, but the company is a little...management heavy. He has to
-report to 77 supervisors today, and tomorrow the "big boss" is coming in
-from out of town and he's sure to test the poor fellow too.
-
-They're all handing him work to do -- so much he can't keep track of who
-handed him what, but that's not really a big problem. The supervisors
-just want to know when he's finished all of the work they've handed him so
-far and whether he made any mistakes since they last asked.
-
-He might have made the mistake on work they didn't actually hand him,
-but he can't keep track of things at that level of detail; all he can
-remember is the most recent mistake that he made.
-
-Here's our worker_drone representation::
-
- struct worker_drone {
- errseq_t wd_err; /* for recording errors */
- };
-
-Every day, the worker_drone starts out with a blank slate::
-
- struct worker_drone wd;
-
- wd.wd_err = (errseq_t)0;
-
-The supervisors come in and get an initial read for the day. They
-don't care about anything that happened before their watch begins::
-
- struct supervisor {
- errseq_t s_wd_err; /* private "cursor" for wd_err */
- spinlock_t s_wd_err_lock; /* protects s_wd_err */
- };
-
- struct supervisor su;
-
- su.s_wd_err = errseq_sample(&wd.wd_err);
- spin_lock_init(&su.s_wd_err_lock);
-
-Now they start handing him tasks to do. Every few minutes they ask him to
-finish up all of the work they've handed him so far. Then they ask him
-whether he made any mistakes on any of it::
-
- spin_lock(&su.s_wd_err_lock);
- err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
- spin_unlock(&su.s_wd_err_lock);
-
-Up to this point, that just keeps returning 0.
-
-Now, the owners of this company are quite miserly and have given him
-substandard equipment with which to do his job. Occasionally it
-glitches and he makes a mistake. He sighs a heavy sigh, and marks it
-down::
-
- errseq_set(&wd.wd_err, -EIO);
-
-...and then gets back to work. The supervisors eventually poll again
-and they each get the error when they next check. Subsequent calls will
-return 0, until another error is recorded, at which point it's reported
-to each of them once.
-
-Note that the supervisors can't tell how many mistakes he made, only
-whether one was made since they last checked, and the latest value
-recorded.
-
-Occasionally the big boss comes in for a spot check and asks the worker
-to do a one-off job for him. He's not really watching the worker
-full-time like the supervisors, but he does need to know whether a
-mistake occurred while his job was processing.
-
-He can just sample the current errseq_t in the worker, and then use that
-to tell whether an error has occurred later::
-
- errseq_t since = errseq_sample(&wd.wd_err);
- /* submit some work and wait for it to complete */
- err = errseq_check(&wd.wd_err, since);
-
-Since he's just going to discard "since" after that point, he doesn't
-need to advance it here. He also doesn't need any locking since it's
-not usable by anyone else.
-
-Serializing errseq_t cursor updates
-===================================
-Note that the errseq_t API does not protect the errseq_t cursor during a
-check_and_advance operation. Only the canonical error code is handled
-atomically. In a situation where more than one task might be using the
-same errseq_t cursor at the same time, it's important to serialize
-updates to that cursor.
-
-If that's not done, then it's possible for the cursor to go backward
-in which case the same error could be reported more than once.
-
-Because of this, it's often advantageous to first do an errseq_check to
-see if anything has changed, and only later do an
-errseq_check_and_advance after taking the lock. e.g.::
-
- if (errseq_check(&wd.wd_err, READ_ONCE(su.s_wd_err))) {
- /* su.s_wd_err is protected by s_wd_err_lock */
- spin_lock(&su.s_wd_err_lock);
- err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
- spin_unlock(&su.s_wd_err_lock);
- }
-
-That avoids the spinlock in the common case where nothing has changed
-since the last time it was checked.
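
For readers who want to experiment with the semantics described in the removed document, the following is a minimal user-space sketch of the bit layout (errno in bits 11..0, the SF flag in bit 12, the counter in bits 31..13) and of the sample/set/check flow from the worker drone story. It is not the kernel's implementation: every `model_*` name is hypothetical, the code is single-threaded and uses plain loads and stores where the kernel uses atomics, and the choice to mark the value as seen when sampling is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical single-threaded model; NOT the kernel's errseq code. */
typedef uint32_t model_errseq_t;

#define MODEL_MAX_ERRNO 4095u       /* errno field: bits 11..0 */
#define MODEL_SEEN      (1u << 12)  /* SF flag: bit 12 */
#define MODEL_CTR_INC   (1u << 13)  /* counter: bits 31..13 */

/* Record a negative errno; bump the counter only if the old value was seen. */
model_errseq_t model_set(model_errseq_t *eseq, int err)
{
    model_errseq_t old = *eseq;
    model_errseq_t new_val;

    if (err >= 0 || err < -(int)MODEL_MAX_ERRNO)
        return old;                 /* ignore invalid error codes */

    /* Keep the counter, clear errno and SF, store the new errno. */
    new_val = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (model_errseq_t)-err;
    if (old & MODEL_SEEN)
        new_val += MODEL_CTR_INC;
    *eseq = new_val;
    return new_val;
}

/* Take a cursor; assumption: sampling a non-zero value marks it as seen. */
model_errseq_t model_sample(model_errseq_t *eseq)
{
    if (*eseq != 0)
        *eseq |= MODEL_SEEN;
    return *eseq;
}

/* Report the current error if the value changed since the cursor was taken. */
int model_check(const model_errseq_t *eseq, model_errseq_t since)
{
    model_errseq_t cur = *eseq;

    return cur == since ? 0 : -(int)(cur & MODEL_MAX_ERRNO);
}

/* Like model_check(), but also mark the value seen and advance the cursor. */
int model_check_and_advance(model_errseq_t *eseq, model_errseq_t *since)
{
    if (*eseq == *since)
        return 0;
    *eseq |= MODEL_SEEN;
    *since = *eseq;
    return -(int)(*since & MODEL_MAX_ERRNO);
}

/* Replays the worker drone story with two supervisors. */
int model_drone_story(void)
{
    model_errseq_t wd_err = 0;
    model_errseq_t cursor_a = model_sample(&wd_err);
    model_errseq_t cursor_b = model_sample(&wd_err);

    assert(model_check_and_advance(&wd_err, &cursor_a) == 0);

    /* Two mistakes before anyone looks: the counter is not bumped. */
    model_set(&wd_err, -EIO);
    model_set(&wd_err, -ENOSPC);
    assert((wd_err >> 13) == 0);
    assert(model_check(&wd_err, cursor_b) == -ENOSPC);

    /* Each supervisor sees the latest error exactly once. */
    assert(model_check_and_advance(&wd_err, &cursor_a) == -ENOSPC);
    assert(model_check_and_advance(&wd_err, &cursor_b) == -ENOSPC);
    assert(model_check_and_advance(&wd_err, &cursor_a) == 0);

    /* A new error after the old one was seen bumps the counter. */
    model_set(&wd_err, -EIO);
    assert((wd_err >> 13) == 1);
    assert(model_check_and_advance(&wd_err, &cursor_a) == -EIO);
    return 0;
}
```

Calling `model_drone_story()` from a `main()` replays the story. The counter stays at zero while nobody has observed the error, which is the collision mitigation the SF flag is meant to provide. Concurrent use would additionally need atomic compare-and-exchange on the shared value and a lock around each cursor, as the text describes.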