summaryrefslogtreecommitdiffstats
path: root/Documentation/trace/ring-buffer-design.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/trace/ring-buffer-design.txt')
-rw-r--r--Documentation/trace/ring-buffer-design.txt955
1 files changed, 0 insertions, 955 deletions
diff --git a/Documentation/trace/ring-buffer-design.txt b/Documentation/trace/ring-buffer-design.txt
deleted file mode 100644
index 2d53c6f25b91..000000000000
--- a/Documentation/trace/ring-buffer-design.txt
+++ /dev/null
@@ -1,955 +0,0 @@
- Lockless Ring Buffer Design
- ===========================
-
-Copyright 2009 Red Hat Inc.
- Author: Steven Rostedt <srostedt@redhat.com>
- License: The GNU Free Documentation License, Version 1.2
- (dual licensed under the GPL v2)
-Reviewers: Mathieu Desnoyers, Huang Ying, Hidetoshi Seto,
- and Frederic Weisbecker.
-
-
-Written for: 2.6.31
-
-Terminology used in this Document
----------------------------------
-
-tail - where new writes happen in the ring buffer.
-
-head - where new reads happen in the ring buffer.
-
-producer - the task that writes into the ring buffer (same as writer)
-
-writer - same as producer
-
-consumer - the task that reads from the buffer (same as reader)
-
-reader - same as consumer.
-
-reader_page - A page outside the ring buffer used solely (for the most part)
- by the reader.
-
-head_page - a pointer to the page that the reader will use next
-
-tail_page - a pointer to the page that will be written to next
-
-commit_page - a pointer to the page with the last finished non-nested write.
-
-cmpxchg - hardware-assisted atomic transaction that performs the following:
-
- A = B if previous A == C
-
- R = cmpxchg(A, C, B) is saying that we replace A with B if and only if
- current A is equal to C, and we put the old (current) A into R
-
- R gets the previous A regardless if A is updated with B or not.
-
- To see if the update was successful a compare of R == C may be used.
-
-The Generic Ring Buffer
------------------------
-
-The ring buffer can be used in either an overwrite mode or in
-producer/consumer mode.
-
-Producer/consumer mode is where if the producer were to fill up the
-buffer before the consumer could free up anything, the producer
-will stop writing to the buffer. This will lose most recent events.
-
-Overwrite mode is where if the producer were to fill up the buffer
-before the consumer could free up anything, the producer will
-overwrite the older data. This will lose the oldest events.
-
-No two writers can write at the same time (on the same per-cpu buffer),
-but a writer may interrupt another writer, but it must finish writing
-before the previous writer may continue. This is very important to the
-algorithm. The writers act like a "stack". The way interrupts works
-enforces this behavior.
-
-
- writer1 start
- <preempted> writer2 start
- <preempted> writer3 start
- writer3 finishes
- writer2 finishes
- writer1 finishes
-
-This is very much like a writer being preempted by an interrupt and
-the interrupt doing a write as well.
-
-Readers can happen at any time. But no two readers may run at the
-same time, nor can a reader preempt/interrupt another reader. A reader
-cannot preempt/interrupt a writer, but it may read/consume from the
-buffer at the same time as a writer is writing, but the reader must be
-on another processor to do so. A reader may read on its own processor
-and can be preempted by a writer.
-
-A writer can preempt a reader, but a reader cannot preempt a writer.
-But a reader can read the buffer at the same time (on another processor)
-as a writer.
-
-The ring buffer is made up of a list of pages held together by a linked list.
-
-At initialization a reader page is allocated for the reader that is not
-part of the ring buffer.
-
-The head_page, tail_page and commit_page are all initialized to point
-to the same page.
-
-The reader page is initialized to have its next pointer pointing to
-the head page, and its previous pointer pointing to a page before
-the head page.
-
-The reader has its own page to use. At start up time, this page is
-allocated but is not attached to the list. When the reader wants
-to read from the buffer, if its page is empty (like it is on start-up),
-it will swap its page with the head_page. The old reader page will
-become part of the ring buffer and the head_page will be removed.
-The page after the inserted page (old reader_page) will become the
-new head page.
-
-Once the new page is given to the reader, the reader could do what
-it wants with it, as long as a writer has left that page.
-
-A sample of how the reader page is swapped: Note this does not
-show the head page in the buffer, it is for demonstrating a swap
-only.
-
- +------+
- |reader| RING BUFFER
- |page |
- +------+
- +---+ +---+ +---+
- | |-->| |-->| |
- | |<--| |<--| |
- +---+ +---+ +---+
- ^ | ^ |
- | +-------------+ |
- +-----------------+
-
-
- +------+
- |reader| RING BUFFER
- |page |-------------------+
- +------+ v
- | +---+ +---+ +---+
- | | |-->| |-->| |
- | | |<--| |<--| |<-+
- | +---+ +---+ +---+ |
- | ^ | ^ | |
- | | +-------------+ | |
- | +-----------------+ |
- +------------------------------------+
-
- +------+
- |reader| RING BUFFER
- |page |-------------------+
- +------+ <---------------+ v
- | ^ +---+ +---+ +---+
- | | | |-->| |-->| |
- | | | | | |<--| |<-+
- | | +---+ +---+ +---+ |
- | | | ^ | |
- | | +-------------+ | |
- | +-----------------------------+ |
- +------------------------------------+
-
- +------+
- |buffer| RING BUFFER
- |page |-------------------+
- +------+ <---------------+ v
- | ^ +---+ +---+ +---+
- | | | | | |-->| |
- | | New | | | |<--| |<-+
- | | Reader +---+ +---+ +---+ |
- | | page ----^ | |
- | | | |
- | +-----------------------------+ |
- +------------------------------------+
-
-
-
-It is possible that the page swapped is the commit page and the tail page,
-if what is in the ring buffer is less than what is held in a buffer page.
-
-
- reader page commit page tail page
- | | |
- v | |
- +---+ | |
- | |<----------+ |
- | |<------------------------+
- | |------+
- +---+ |
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-This case is still valid for this algorithm.
-When the writer leaves the page, it simply goes into the ring buffer
-since the reader page still points to the next location in the ring
-buffer.
-
-
-The main pointers:
-
- reader page - The page used solely by the reader and is not part
- of the ring buffer (may be swapped in)
-
- head page - the next page in the ring buffer that will be swapped
- with the reader page.
-
- tail page - the page where the next write will take place.
-
- commit page - the page that last finished a write.
-
-The commit page only is updated by the outermost writer in the
-writer stack. A writer that preempts another writer will not move the
-commit page.
-
-When data is written into the ring buffer, a position is reserved
-in the ring buffer and passed back to the writer. When the writer
-is finished writing data into that position, it commits the write.
-
-Another write (or a read) may take place at anytime during this
-transaction. If another write happens it must finish before continuing
-with the previous write.
-
-
- Write reserve:
-
- Buffer page
- +---------+
- |written |
- +---------+ <--- given back to writer (current commit)
- |reserved |
- +---------+ <--- tail pointer
- | empty |
- +---------+
-
- Write commit:
-
- Buffer page
- +---------+
- |written |
- +---------+
- |written |
- +---------+ <--- next position for write (current commit)
- | empty |
- +---------+
-
-
- If a write happens after the first reserve:
-
- Buffer page
- +---------+
- |written |
- +---------+ <-- current commit
- |reserved |
- +---------+ <--- given back to second writer
- |reserved |
- +---------+ <--- tail pointer
-
- After second writer commits:
-
-
- Buffer page
- +---------+
- |written |
- +---------+ <--(last full commit)
- |reserved |
- +---------+
- |pending |
- |commit |
- +---------+ <--- tail pointer
-
- When the first writer commits:
-
- Buffer page
- +---------+
- |written |
- +---------+
- |written |
- +---------+
- |written |
- +---------+ <--(last full commit and tail pointer)
-
-
-The commit pointer points to the last write location that was
-committed without preempting another write. When a write that
-preempted another write is committed, it only becomes a pending commit
-and will not be a full commit until all writes have been committed.
-
-The commit page points to the page that has the last full commit.
-The tail page points to the page with the last write (before
-committing).
-
-The tail page is always equal to or after the commit page. It may
-be several pages ahead. If the tail page catches up to the commit
-page then no more writes may take place (regardless of the mode
-of the ring buffer: overwrite and produce/consumer).
-
-The order of pages is:
-
- head page
- commit page
- tail page
-
-Possible scenario:
- tail page
- head page commit page |
- | | |
- v v v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-There is a special case that the head page is after either the commit page
-and possibly the tail page. That is when the commit (and tail) page has been
-swapped with the reader page. This is because the head page is always
-part of the ring buffer, but the reader page is not. Whenever there
-has been less than a full page that has been committed inside the ring buffer,
-and a reader swaps out a page, it will be swapping out the commit page.
-
-
- reader page commit page tail page
- | | |
- v | |
- +---+ | |
- | |<----------+ |
- | |<------------------------+
- | |------+
- +---+ |
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
- ^
- |
- head page
-
-
-In this case, the head page will not move when the tail and commit
-move back into the ring buffer.
-
-The reader cannot swap a page into the ring buffer if the commit page
-is still on that page. If the read meets the last commit (real commit
-not pending or reserved), then there is nothing more to read.
-The buffer is considered empty until another full commit finishes.
-
-When the tail meets the head page, if the buffer is in overwrite mode,
-the head page will be pushed ahead one. If the buffer is in producer/consumer
-mode, the write will fail.
-
-Overwrite mode:
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
- ^
- |
- head page
-
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
- ^
- |
- head page
-
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
- ^
- |
- head page
-
-Note, the reader page will still point to the previous head page.
-But when a swap takes place, it will use the most recent head page.
-
-
-Making the Ring Buffer Lockless:
---------------------------------
-
-The main idea behind the lockless algorithm is to combine the moving
-of the head_page pointer with the swapping of pages with the reader.
-State flags are placed inside the pointer to the page. To do this,
-each page must be aligned in memory by 4 bytes. This will allow the 2
-least significant bits of the address to be used as flags, since
-they will always be zero for the address. To get the address,
-simply mask out the flags.
-
- MASK = ~3
-
- address & MASK
-
-Two flags will be kept by these two bits:
-
- HEADER - the page being pointed to is a head page
-
- UPDATE - the page being pointed to is being updated by a writer
- and was or is about to be a head page.
-
-
- reader page
- |
- v
- +---+
- | |------+
- +---+ |
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-
-The above pointer "-H->" would have the HEADER flag set. That is
-the next page is the next page to be swapped out by the reader.
-This pointer means the next page is the head page.
-
-When the tail page meets the head pointer, it will use cmpxchg to
-change the pointer to the UPDATE state:
-
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-"-U->" represents a pointer in the UPDATE state.
-
-Any access to the reader will need to take some sort of lock to serialize
-the readers. But the writers will never take a lock to write to the
-ring buffer. This means we only need to worry about a single reader,
-and writes only preempt in "stack" formation.
-
-When the reader tries to swap the page with the ring buffer, it
-will also use cmpxchg. If the flag bit in the pointer to the
-head page does not have the HEADER flag set, the compare will fail
-and the reader will need to look for the new head page and try again.
-Note, the flags UPDATE and HEADER are never set at the same time.
-
-The reader swaps the reader page as follows:
-
- +------+
- |reader| RING BUFFER
- |page |
- +------+
- +---+ +---+ +---+
- | |--->| |--->| |
- | |<---| |<---| |
- +---+ +---+ +---+
- ^ | ^ |
- | +---------------+ |
- +-----H-------------+
-
-The reader sets the reader page next pointer as HEADER to the page after
-the head page.
-
-
- +------+
- |reader| RING BUFFER
- |page |-------H-----------+
- +------+ v
- | +---+ +---+ +---+
- | | |--->| |--->| |
- | | |<---| |<---| |<-+
- | +---+ +---+ +---+ |
- | ^ | ^ | |
- | | +---------------+ | |
- | +-----H-------------+ |
- +--------------------------------------+
-
-It does a cmpxchg with the pointer to the previous head page to make it
-point to the reader page. Note that the new pointer does not have the HEADER
-flag set. This action atomically moves the head page forward.
-
- +------+
- |reader| RING BUFFER
- |page |-------H-----------+
- +------+ v
- | ^ +---+ +---+ +---+
- | | | |-->| |-->| |
- | | | |<--| |<--| |<-+
- | | +---+ +---+ +---+ |
- | | | ^ | |
- | | +-------------+ | |
- | +-----------------------------+ |
- +------------------------------------+
-
-After the new head page is set, the previous pointer of the head page is
-updated to the reader page.
-
- +------+
- |reader| RING BUFFER
- |page |-------H-----------+
- +------+ <---------------+ v
- | ^ +---+ +---+ +---+
- | | | |-->| |-->| |
- | | | | | |<--| |<-+
- | | +---+ +---+ +---+ |
- | | | ^ | |
- | | +-------------+ | |
- | +-----------------------------+ |
- +------------------------------------+
-
- +------+
- |buffer| RING BUFFER
- |page |-------H-----------+ <--- New head page
- +------+ <---------------+ v
- | ^ +---+ +---+ +---+
- | | | | | |-->| |
- | | New | | | |<--| |<-+
- | | Reader +---+ +---+ +---+ |
- | | page ----^ | |
- | | | |
- | +-----------------------------+ |
- +------------------------------------+
-
-Another important point: The page that the reader page points back to
-by its previous pointer (the one that now points to the new head page)
-never points back to the reader page. That is because the reader page is
-not part of the ring buffer. Traversing the ring buffer via the next pointers
-will always stay in the ring buffer. Traversing the ring buffer via the
-prev pointers may not.
-
-Note, the way to determine a reader page is simply by examining the previous
-pointer of the page. If the next pointer of the previous page does not
-point back to the original page, then the original page is a reader page:
-
-
- +--------+
- | reader | next +----+
- | page |-------->| |<====== (buffer page)
- +--------+ +----+
- | | ^
- | v | next
- prev | +----+
- +------------->| |
- +----+
-
-The way the head page moves forward:
-
-When the tail page meets the head page and the buffer is in overwrite mode
-and more writes take place, the head page must be moved forward before the
-writer may move the tail page. The way this is done is that the writer
-performs a cmpxchg to convert the pointer to the head page from the HEADER
-flag to have the UPDATE flag set. Once this is done, the reader will
-not be able to swap the head page from the buffer, nor will it be able to
-move the head page, until the writer is finished with the move.
-
-This eliminates any races that the reader can have on the writer. The reader
-must spin, and this is why the reader cannot preempt the writer.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The following page will be made into the new head page.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-After the new head page has been set, we can set the old head page
-pointer back to NORMAL.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-After the head page has been moved, the tail page may now move forward.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-
-The above are the trivial updates. Now for the more complex scenarios.
-
-
-As stated before, if enough writes preempt the first write, the
-tail page may make it all the way around the buffer and meet the commit
-page. At this time, we must start dropping writes (usually with some kind
-of warning to the user). But what happens if the commit was still on the
-reader page? The commit page is not part of the ring buffer. The tail page
-must account for this.
-
-
- reader page commit page
- | |
- v |
- +---+ |
- | |<----------+
- | |
- | |------+
- +---+ |
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
- ^
- |
- tail page
-
-If the tail page were to simply push the head page forward, the commit when
-leaving the reader page would not be pointing to the correct page.
-
-The solution to this is to test if the commit page is on the reader page
-before pushing the head page. If it is, then it can be assumed that the
-tail page wrapped the buffer, and we must drop new writes.
-
-This is not a race condition, because the commit page can only be moved
-by the outermost writer (the writer that was preempted).
-This means that the commit will not move while a writer is moving the
-tail page. The reader cannot swap the reader page if it is also being
-used as the commit page. The reader can simply check that the commit
-is off the reader page. Once the commit page leaves the reader page
-it will never go back on it unless a reader does another swap with the
-buffer page that is also the commit page.
-
-
-Nested writes
--------------
-
-In the pushing forward of the tail page we must first push forward
-the head page if the head page is the next page. If the head page
-is not the next page, the tail page is simply updated with a cmpxchg.
-
-Only writers move the tail page. This must be done atomically to protect
-against nested writers.
-
- temp_page = tail_page
- next_page = temp_page->next
- cmpxchg(tail_page, temp_page, next_page)
-
-The above will update the tail page if it is still pointing to the expected
-page. If this fails, a nested write pushed it forward, the current write
-does not need to push it.
-
-
- temp page
- |
- v
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-Nested write comes in and moves the tail page forward:
-
- tail page (moved by nested writer)
- temp page |
- | |
- v v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The above would fail the cmpxchg, but since the tail page has already
-been moved forward, the writer will just try again to reserve storage
-on the new tail page.
-
-But the moving of the head page is a bit more complex.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The write converts the head page pointer to UPDATE.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-But if a nested writer preempts here, it will see that the next
-page is a head page, but it is also nested. It will detect that
-it is nested and will save that information. The detection is the
-fact that it sees the UPDATE flag instead of a HEADER or NORMAL
-pointer.
-
-The nested writer will set the new head page pointer.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-But it will not reset the update back to normal. Only the writer
-that converted a pointer from HEAD to UPDATE will convert it back
-to NORMAL.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-After the nested writer finishes, the outermost writer will convert
-the UPDATE pointer to NORMAL.
-
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-
-It can be even more complex if several nested writes came in and moved
-the tail page ahead several pages:
-
-
-(first writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-H->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The write converts the head page pointer to UPDATE.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-Next writer comes in, and sees the update and sets up the new
-head page.
-
-(second writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The nested writer moves the tail page forward. But does not set the old
-update page to NORMAL because it is not the outermost writer.
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-Another writer preempts and sees the page after the tail page is a head page.
-It changes it from HEAD to UPDATE.
-
-(third writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-U->| |--->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The writer will move the head page forward:
-
-
-(third writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-U->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-But now that the third writer did change the HEAD flag to UPDATE it
-will convert it to normal:
-
-
-(third writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-
-Then it will move the tail page, and return back to the second writer.
-
-
-(second writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-
-The second writer will fail to move the tail page because it was already
-moved, so it will try again and add its data to the new tail page.
-It will return to the first writer.
-
-
-(first writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-The first writer cannot know atomically if the tail page moved
-while it updates the HEAD page. It will then update the head page to
-what it thinks is the new head page.
-
-
-(first writer)
-
- tail page
- |
- v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-Since the cmpxchg returns the old value of the pointer the first writer
-will see it succeeded in updating the pointer from NORMAL to HEAD.
-But as we can see, this is not good enough. It must also check to see
-if the tail page is either where it use to be or on the next page:
-
-
-(first writer)
-
- A B tail page
- | | |
- v v v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |-H->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-If tail page != A and tail page != B, then it must reset the pointer
-back to NORMAL. The fact that it only needs to worry about nested
-writers means that it only needs to check this after setting the HEAD page.
-
-
-(first writer)
-
- A B tail page
- | | |
- v v v
- +---+ +---+ +---+ +---+
-<---| |--->| |-U->| |--->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-
-Now the writer can update the head page. This is also why the head page must
-remain in UPDATE and only reset by the outermost writer. This prevents
-the reader from seeing the incorrect head page.
-
-
-(first writer)
-
- A B tail page
- | | |
- v v v
- +---+ +---+ +---+ +---+
-<---| |--->| |--->| |--->| |-H->
---->| |<---| |<---| |<---| |<---
- +---+ +---+ +---+ +---+
-