Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/Changes | 15
-rw-r--r-- Documentation/DocBook/Makefile | 3
-rw-r--r-- Documentation/DocBook/genericirq.tmpl | 474
-rw-r--r-- Documentation/DocBook/kernel-api.tmpl | 5
-rw-r--r-- Documentation/IRQ.txt | 22
-rw-r--r-- Documentation/RCU/torture.txt | 34
-rw-r--r-- Documentation/README.DAC960 | 6
-rw-r--r-- Documentation/feature-removal-schedule.txt | 75
-rw-r--r-- Documentation/filesystems/devfs/ChangeLog | 1977
-rw-r--r-- Documentation/filesystems/devfs/README | 1959
-rw-r--r-- Documentation/filesystems/devfs/ToDo | 40
-rw-r--r-- Documentation/filesystems/devfs/boot-options | 65
-rw-r--r-- Documentation/initrd.txt | 24
-rw-r--r-- Documentation/ioctl-number.txt | 1
-rw-r--r-- Documentation/kernel-parameters.txt | 17
-rw-r--r-- Documentation/keys-request-key.txt | 54
-rw-r--r-- Documentation/keys.txt | 29
-rw-r--r-- Documentation/pi-futex.txt | 121
-rw-r--r-- Documentation/robust-futexes.txt | 2
-rw-r--r-- Documentation/rt-mutex-design.txt | 781
-rw-r--r-- Documentation/rt-mutex.txt | 79
-rw-r--r-- Documentation/sound/alsa/ALSA-Configuration.txt | 106
-rw-r--r-- Documentation/video4linux/README.pvrusb2 | 212
-rw-r--r-- Documentation/watchdog/pcwd-watchdog.txt | 75
-rw-r--r-- Documentation/watchdog/src/watchdog-simple.c | 15
-rw-r--r-- Documentation/watchdog/src/watchdog-test.c | 68
-rw-r--r-- Documentation/watchdog/watchdog-api.txt | 56
-rw-r--r-- Documentation/watchdog/watchdog.txt | 23
28 files changed, 2093 insertions, 4245 deletions
diff --git a/Documentation/Changes b/Documentation/Changes
index b02f476c2973..488272074c36 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -181,8 +181,8 @@ Intel IA32 microcode
--------------------
A driver has been added to allow updating of Intel IA32 microcode,
-accessible as both a devfs regular file and as a normal (misc)
-character device. If you are not using devfs you may need to:
+accessible as a normal (misc) character device. If you are not using
+udev you may need to:
mkdir /dev/cpu
mknod /dev/cpu/microcode c 10 184
@@ -201,7 +201,9 @@ with programs using shared memory.
udev
----
udev is a userspace application for populating /dev dynamically with
-only entries for devices actually present. udev replaces devfs.
+only entries for devices actually present. udev replaces the basic
+functionality of devfs, while allowing persistent device naming for
+devices.
FUSE
----
@@ -231,18 +233,13 @@ The PPP driver has been restructured to support multilink and to
enable it to operate over diverse media layers. If you use PPP,
upgrade pppd to at least 2.4.0.
-If you are not using devfs, you must have the device file /dev/ppp
+If you are not using udev, you must have the device file /dev/ppp
which can be made by:
mknod /dev/ppp c 108 0
as root.
-If you use devfsd and build ppp support as modules, you will need
-the following in your /etc/devfsd.conf file:
-
-LOOKUP PPP MODLOAD
-
Isdn4k-utils
------------
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 5a2882d275ba..66e1cf733571 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -10,7 +10,8 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
procfs-guide.xml writing_usb_driver.xml \
kernel-api.xml journal-api.xml lsm.xml usb.xml \
- gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml
+ gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
+ genericirq.xml
###
# The build process is as follows (targets):
diff --git a/Documentation/DocBook/genericirq.tmpl b/Documentation/DocBook/genericirq.tmpl
new file mode 100644
index 000000000000..0f4a4b6321e4
--- /dev/null
+++ b/Documentation/DocBook/genericirq.tmpl
@@ -0,0 +1,474 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="Generic-IRQ-Guide">
+ <bookinfo>
+ <title>Linux generic IRQ handling</title>
+
+ <authorgroup>
+ <author>
+ <firstname>Thomas</firstname>
+ <surname>Gleixner</surname>
+ <affiliation>
+ <address>
+ <email>tglx@linutronix.de</email>
+ </address>
+ </affiliation>
+ </author>
+ <author>
+ <firstname>Ingo</firstname>
+ <surname>Molnar</surname>
+ <affiliation>
+ <address>
+ <email>mingo@elte.hu</email>
+ </address>
+ </affiliation>
+ </author>
+ </authorgroup>
+
+ <copyright>
+ <year>2005-2006</year>
+ <holder>Thomas Gleixner</holder>
+ </copyright>
+ <copyright>
+ <year>2005-2006</year>
+ <holder>Ingo Molnar</holder>
+ </copyright>
+
+ <legalnotice>
+ <para>
+ This documentation is free software; you can redistribute
+ it and/or modify it under the terms of the GNU General Public
+ License version 2 as published by the Free Software Foundation.
+ </para>
+
+ <para>
+ This program is distributed in the hope that it will be
+ useful, but WITHOUT ANY WARRANTY; without even the implied
+ warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ See the GNU General Public License for more details.
+ </para>
+
+ <para>
+ You should have received a copy of the GNU General Public
+ License along with this program; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ MA 02111-1307 USA
+ </para>
+
+ <para>
+ For more details see the file COPYING in the source
+ distribution of Linux.
+ </para>
+ </legalnotice>
+ </bookinfo>
+
+<toc></toc>
+
+ <chapter id="intro">
+ <title>Introduction</title>
+ <para>
+ The generic interrupt handling layer is designed to provide a
+ complete abstraction of interrupt handling for device drivers.
+ It is able to handle all the different types of interrupt controller
+ hardware. Device drivers use generic API functions to request, enable,
+ disable and free interrupts. The drivers do not have to know anything
+ about interrupt hardware details, so they can be used on different
+ platforms without code changes.
+ </para>
+ <para>
+ This documentation is provided to developers who want to implement
+ an interrupt subsystem for their architecture, with the help
+ of the generic IRQ handling layer.
+ </para>
+ </chapter>
+
+ <chapter id="rationale">
+ <title>Rationale</title>
+ <para>
+ The original implementation of interrupt handling in Linux uses
+ the __do_IRQ() super-handler, which is able to deal with every
+ type of interrupt logic.
+ </para>
+ <para>
+ Originally, Russell King identified different types of handlers to
+ build a quite universal set for the ARM interrupt handler
+ implementation in Linux 2.5/2.6. He distinguished between:
+ <itemizedlist>
+ <listitem><para>Level type</para></listitem>
+ <listitem><para>Edge type</para></listitem>
+ <listitem><para>Simple type</para></listitem>
+ </itemizedlist>
+ In the SMP world of the __do_IRQ() super-handler another type
+ was identified:
+ <itemizedlist>
+ <listitem><para>Per CPU type</para></listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ This split implementation of highlevel IRQ handlers allows us to
+ optimize the flow of the interrupt handling for each specific
+ interrupt type. This reduces complexity in that particular codepath
+ and allows the optimized handling of a given type.
+ </para>
+ <para>
+ The original general IRQ implementation used hw_interrupt_type
+ structures and their ->ack(), ->end() [etc.] callbacks to
+ differentiate the flow control in the super-handler. This leads to
+ a mix of flow logic and lowlevel hardware logic, and it also leads
+ to unnecessary code duplication: for example in i386, there is an
+ ioapic_level_irq and an ioapic_edge_irq irq-type which share many
+ of the lowlevel details but have different flow handling.
+ </para>
+ <para>
+ A more natural abstraction is the clean separation of the
+ 'irq flow' and the 'chip details'.
+ </para>
+ <para>
+ Analysing a couple of architectures' IRQ subsystem implementations
+ reveals that most of them can use a generic set of 'irq flow'
+ methods and only need to add the chip level specific code.
+ The separation is also valuable for (sub)architectures
+ which need specific quirks in the irq flow itself but not in the
+ chip-details - and thus provides a more transparent IRQ subsystem
+ design.
+ </para>
+ <para>
+ Each interrupt descriptor is assigned its own highlevel flow
+ handler, which is normally one of the generic
+ implementations. (This highlevel flow handler implementation also
+ makes it simple to provide demultiplexing handlers which can be
+ found in embedded platforms on various architectures.)
+ </para>
+ <para>
+ The separation makes the generic interrupt handling layer more
+ flexible and extensible. For example, a (sub)architecture can
+ use a generic irq-flow implementation for 'level type' interrupts
+ and add a (sub)architecture specific 'edge type' implementation.
+ </para>
+ <para>
+ To make the transition to the new model easier and prevent the
+ breakage of existing implementations, the __do_IRQ() super-handler
+ is still available. This leads to a kind of duality for the time
+ being. Over time the new model should be used in more and more
+ architectures, as it enables smaller and cleaner IRQ subsystems.
+ </para>
+ </chapter>
+ <chapter id="bugs">
+ <title>Known Bugs And Assumptions</title>
+ <para>
+ None (knock on wood).
+ </para>
+ </chapter>
+
+ <chapter id="Abstraction">
+ <title>Abstraction layers</title>
+ <para>
+ There are three main levels of abstraction in the interrupt code:
+ <orderedlist>
+ <listitem><para>Highlevel driver API</para></listitem>
+ <listitem><para>Highlevel IRQ flow handlers</para></listitem>
+ <listitem><para>Chiplevel hardware encapsulation</para></listitem>
+ </orderedlist>
+ </para>
+ <sect1>
+ <title>Interrupt control flow</title>
+ <para>
+ Each interrupt is described by an interrupt descriptor structure
+ irq_desc. The interrupt is referenced by an 'unsigned int' numeric
+ value which selects the corresponding interrupt description structure
+ in the descriptor structures array.
+ The descriptor structure contains status information and pointers
+ to the interrupt flow method and the interrupt chip structure
+ which are assigned to this interrupt.
+ </para>
+ <para>
+ Whenever an interrupt triggers, the lowlevel arch code calls into
+ the generic interrupt code by calling desc->handle_irq().
+ This highlevel IRQ handling function only uses desc->chip primitives
+ referenced by the assigned chip descriptor structure.
+ </para>
+ </sect1>
+ <sect1>
+ <title>Highlevel Driver API</title>
+ <para>
+ The highlevel Driver API consists of the following functions:
+ <itemizedlist>
+ <listitem><para>request_irq()</para></listitem>
+ <listitem><para>free_irq()</para></listitem>
+ <listitem><para>disable_irq()</para></listitem>
+ <listitem><para>enable_irq()</para></listitem>
+ <listitem><para>disable_irq_nosync() (SMP only)</para></listitem>
+ <listitem><para>synchronize_irq() (SMP only)</para></listitem>
+ <listitem><para>set_irq_type()</para></listitem>
+ <listitem><para>set_irq_wake()</para></listitem>
+ <listitem><para>set_irq_data()</para></listitem>
+ <listitem><para>set_irq_chip()</para></listitem>
+ <listitem><para>set_irq_chip_data()</para></listitem>
+ </itemizedlist>
+ See the autogenerated function documentation for details.
+ </para>
+ </sect1>
+ <sect1>
+ <title>Highlevel IRQ flow handlers</title>
+ <para>
+ The generic layer provides a set of pre-defined irq-flow methods:
+ <itemizedlist>
+ <listitem><para>handle_level_irq</para></listitem>
+ <listitem><para>handle_edge_irq</para></listitem>
+ <listitem><para>handle_simple_irq</para></listitem>
+ <listitem><para>handle_percpu_irq</para></listitem>
+ </itemizedlist>
+ The interrupt flow handlers (either predefined or architecture
+ specific) are assigned to specific interrupts by the architecture
+ either during bootup or during device initialization.
+ </para>
+ <sect2>
+ <title>Default flow implementations</title>
+ <sect3>
+ <title>Helper functions</title>
+ <para>
+ The helper functions call the chip primitives and
+ are used by the default flow implementations.
+ The following helper functions are implemented (simplified excerpt):
+ <programlisting>
+default_enable(irq)
+{
+ desc->chip->unmask(irq);
+}
+
+default_disable(irq)
+{
+ if (!delay_disable(irq))
+ desc->chip->mask(irq);
+}
+
+default_ack(irq)
+{
+ chip->ack(irq);
+}
+
+default_mask_ack(irq)
+{
+ if (chip->mask_ack) {
+ chip->mask_ack(irq);
+ } else {
+ chip->mask(irq);
+ chip->ack(irq);
+ }
+}
+
+noop(irq)
+{
+}
+
+ </programlisting>
+ </para>
+ </sect3>
+ </sect2>
+ <sect2>
+ <title>Default flow handler implementations</title>
+ <sect3>
+ <title>Default Level IRQ flow handler</title>
+ <para>
+ handle_level_irq provides a generic implementation
+ for level-triggered interrupts.
+ </para>
+ <para>
+ The following control flow is implemented (simplified excerpt):
+ <programlisting>
+desc->chip->start();
+handle_IRQ_event(desc->action);
+desc->chip->end();
+ </programlisting>
+ </para>
+ </sect3>
+ <sect3>
+ <title>Default Edge IRQ flow handler</title>
+ <para>
+ handle_edge_irq provides a generic implementation
+ for edge-triggered interrupts.
+ </para>
+ <para>
+ The following control flow is implemented (simplified excerpt):
+ <programlisting>
+if (desc->status &amp; running) {
+ desc->chip->hold();
+ desc->status |= pending | masked;
+ return;
+}
+desc->chip->start();
+desc->status |= running;
+do {
+ if (desc->status &amp; masked)
+ desc->chip->enable();
+ desc->status &amp;= ~pending;
+ handle_IRQ_event(desc->action);
+} while (desc->status &amp; pending);
+desc->status &amp;= ~running;
+desc->chip->end();
+ </programlisting>
+ </para>
+ </sect3>
+ <sect3>
+ <title>Default simple IRQ flow handler</title>
+ <para>
+ handle_simple_irq provides a generic implementation
+ for simple interrupts.
+ </para>
+ <para>
+ Note: The simple flow handler does not call any
+ handler/chip primitives.
+ </para>
+ <para>
+ The following control flow is implemented (simplified excerpt):
+ <programlisting>
+handle_IRQ_event(desc->action);
+ </programlisting>
+ </para>
+ </sect3>
+ <sect3>
+ <title>Default per CPU flow handler</title>
+ <para>
+ handle_percpu_irq provides a generic implementation
+ for per CPU interrupts.
+ </para>
+ <para>
+ Per CPU interrupts are only available on SMP and
+ the handler provides a simplified version without
+ locking.
+ </para>
+ <para>
+ The following control flow is implemented (simplified excerpt):
+ <programlisting>
+desc->chip->start();
+handle_IRQ_event(desc->action);
+desc->chip->end();
+ </programlisting>
+ </para>
+ </sect3>
+ </sect2>
+ <sect2>
+ <title>Quirks and optimizations</title>
+ <para>
+ The generic functions are intended for 'clean' architectures and chips,
+ which have no platform-specific IRQ handling quirks. If an architecture
+ needs to implement quirks on the 'flow' level then it can do so by
+ overriding the highlevel irq-flow handler.
+ </para>
+ </sect2>
+ <sect2>
+ <title>Delayed interrupt disable</title>
+ <para>
+ This per interrupt selectable feature, which was introduced by Russell
+ King in the ARM interrupt implementation, does not mask an interrupt
+ at the hardware level when disable_irq() is called. The interrupt is
+ kept enabled and is masked in the flow handler when an interrupt event
+ happens. This prevents losing edge interrupts on hardware which does
+ not store an edge interrupt event while the interrupt is disabled at
+ the hardware level. When an interrupt arrives while the IRQ_DISABLED
+ flag is set, then the interrupt is masked at the hardware level and
+ the IRQ_PENDING bit is set. When the interrupt is re-enabled by
+ enable_irq() the pending bit is checked and if it is set, the
+ interrupt is resent either via hardware or by a software resend
+ mechanism. (It's necessary to enable CONFIG_HARDIRQS_SW_RESEND when
+ you want to use the delayed interrupt disable feature and your
+ hardware is not capable of retriggering an interrupt.)
+ The delayed interrupt disable can be runtime enabled, per interrupt,
+ by setting the IRQ_DELAYED_DISABLE flag in the irq_desc status field.
+ </para>
+ </sect2>
+ </sect1>
+ <sect1>
+ <title>Chiplevel hardware encapsulation</title>
+ <para>
+ The chip level hardware descriptor structure irq_chip
+ contains all the direct chip relevant functions, which
+ can be utilized by the irq flow implementations.
+ <itemizedlist>
+ <listitem><para>ack()</para></listitem>
+ <listitem><para>mask_ack() - Optional, recommended for performance</para></listitem>
+ <listitem><para>mask()</para></listitem>
+ <listitem><para>unmask()</para></listitem>
+ <listitem><para>retrigger() - Optional</para></listitem>
+ <listitem><para>set_type() - Optional</para></listitem>
+ <listitem><para>set_wake() - Optional</para></listitem>
+ </itemizedlist>
+ These primitives are strictly intended to mean what they say: ack means
+ ACK, masking means masking of an IRQ line, etc. It is up to the flow
+ handler(s) to use these basic units of lowlevel functionality.
+ </para>
+ </sect1>
+ </chapter>
+
+ <chapter id="doirq">
+ <title>__do_IRQ entry point</title>
+ <para>
+ The original implementation __do_IRQ() is an alternative entry
+ point for all types of interrupts.
+ </para>
+ <para>
+ This handler turned out not to be suitable for all
+ interrupt hardware and was therefore reimplemented with split
+ functionality for edge/level/simple/percpu interrupts. This is not
+ only a functional optimization. It also shortens code paths for
+ interrupts.
+ </para>
+ <para>
+ To make use of the split implementation, replace the call to
+ __do_IRQ by a call to desc->handle_irq() and associate
+ the appropriate handler function to desc->handle_irq().
+ In most cases the generic handler implementations should
+ be sufficient.
+ </para>
+ </chapter>
+
+ <chapter id="locking">
+ <title>Locking on SMP</title>
+ <para>
+ The locking of chip registers is up to the architecture that
+ defines the chip primitives. There is a chip->lock field that can be used
+ for serialization, but the generic layer does not touch it. The per-irq
+ structure is protected via desc->lock, by the generic layer.
+ </para>
+ </chapter>
+ <chapter id="structs">
+ <title>Structures</title>
+ <para>
+ This chapter contains the autogenerated documentation of the structures which are
+ used in the generic IRQ layer.
+ </para>
+!Iinclude/linux/irq.h
+ </chapter>
+
+ <chapter id="pubfunctions">
+ <title>Public Functions Provided</title>
+ <para>
+ This chapter contains the autogenerated documentation of the kernel API functions
+ which are exported.
+ </para>
+!Ekernel/irq/manage.c
+!Ekernel/irq/chip.c
+ </chapter>
+
+ <chapter id="intfunctions">
+ <title>Internal Functions Provided</title>
+ <para>
+ This chapter contains the autogenerated documentation of the internal functions.
+ </para>
+!Ikernel/irq/handle.c
+!Ikernel/irq/chip.c
+ </chapter>
+
+ <chapter id="credits">
+ <title>Credits</title>
+ <para>
+ The following people have contributed to this document:
+ <orderedlist>
+ <listitem><para>Thomas Gleixner<email>tglx@linutronix.de</email></para></listitem>
+ <listitem><para>Ingo Molnar<email>mingo@elte.hu</email></para></listitem>
+ </orderedlist>
+ </para>
+ </chapter>
+</book>
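
As an illustration of the edge flow excerpt in genericirq.tmpl above, the following is a minimal user-space sketch in C, not kernel code. All names in it (`mock_desc`, the `ST_*` flags, `mock_handle_edge()`, `nested_edge_action()`, `demo_edge_reentry()`) are hypothetical and exist only for this example. The chip primitives are reduced to counters, and the "second edge while running" case is simulated by re-entering the flow function from inside the action, which assumes nothing about how real hardware or the kernel delivers the nested interrupt.

```c
/* Hypothetical status bits mirroring running/pending/masked in the excerpt. */
enum { ST_RUNNING = 1u << 0, ST_PENDING = 1u << 1, ST_MASKED = 1u << 2 };

struct mock_desc {
    unsigned int status;
    int handled;       /* how many times the action has completed */
    int mask_calls;    /* stands in for the chip "hold" step (mask + ack) */
    int unmask_calls;  /* stands in for the chip "enable" step */
    void (*action)(struct mock_desc *desc);
};

void mock_handle_edge(struct mock_desc *desc)
{
    /* A new edge arrived while an earlier one is still being handled:
     * mask the line, remember the event, and let the running loop pick
     * it up. */
    if (desc->status & ST_RUNNING) {
        desc->mask_calls++;
        desc->status |= ST_PENDING | ST_MASKED;
        return;
    }

    desc->status |= ST_RUNNING;
    do {
        /* Re-enable the line if the early-return path masked it. */
        if (desc->status & ST_MASKED) {
            desc->unmask_calls++;
            desc->status &= ~ST_MASKED;
        }
        desc->status &= ~ST_PENDING;
        desc->action(desc);
        desc->handled++;
    } while (desc->status & ST_PENDING);
    desc->status &= ~ST_RUNNING;
}

/* Action that simulates exactly one extra edge arriving during its first run. */
void nested_edge_action(struct mock_desc *desc)
{
    if (desc->handled == 0)
        mock_handle_edge(desc);  /* re-entry takes the early-return path */
}

/* Runs one outer interrupt and returns how many times the action ran. */
int demo_edge_reentry(struct mock_desc *desc)
{
    desc->status = 0;
    desc->handled = 0;
    desc->mask_calls = 0;
    desc->unmask_calls = 0;
    desc->action = nested_edge_action;
    mock_handle_edge(desc);
    return desc->handled;
}
```

Calling `demo_edge_reentry()` on a `struct mock_desc` runs the action twice: once for the original edge and once for the edge recorded as pending, with one mask and one unmask in between. The point of the sketch is only that the pending bit lets the loop replay an edge that would otherwise be lost.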
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index 3630a0d7695f..1ae4dc0fd856 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -348,11 +348,6 @@ X!Earch/i386/kernel/mca.c
</sect1>
</chapter>
- <chapter id="devfs">
- <title>The Device File System</title>
-!Efs/devfs/base.c
- </chapter>
-
<chapter id="sysfs">
<title>The Filesystem for Exporting Kernel Objects</title>
!Efs/sysfs/file.c
diff --git a/Documentation/IRQ.txt b/Documentation/IRQ.txt
new file mode 100644
index 000000000000..1011e7175021
--- /dev/null
+++ b/Documentation/IRQ.txt
@@ -0,0 +1,22 @@
+What is an IRQ?
+
+An IRQ is an interrupt request from a device.
+Currently they can come in over a pin, or over a packet.
+Several devices may be connected to the same pin thus
+sharing an IRQ.
+
+An IRQ number is a kernel identifier used to talk about a hardware
+interrupt source. Typically this is an index into the global irq_desc
+array, but except for what linux/interrupt.h implements the details
+are architecture specific.
+
+An IRQ number is an enumeration of the possible interrupt sources on a
+machine. Typically what is enumerated is the number of input pins on
+all of the interrupt controllers in the system. In the case of ISA
+what is enumerated are the 16 input pins on the two i8259 interrupt
+controllers.
+
+Architectures can assign additional meaning to the IRQ numbers, and
+are encouraged to in the case where there is any manual configuration
+of the hardware involved. The ISA IRQs are a classic example of
+assigning this kind of additional meaning.
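
The IRQ.txt text above describes an IRQ number as, typically, an index into a descriptor array, and notes that several devices may share one pin. A rough user-space sketch of that idea in C follows. It is not the kernel's implementation: `toy_irq_desc`, `toy_irq_table`, `toy_request_irq()`, `toy_dispatch()`, `toy_dev`, and the `TOY_*` limits are all hypothetical names chosen for this example.

```c
#include <stddef.h>

#define TOY_NR_IRQS    16  /* assumption: sized like the 16 ISA input pins */
#define TOY_MAX_SHARED 4   /* arbitrary sharing limit for this sketch */

/* A handler returns 1 if its device raised the interrupt, 0 otherwise. */
typedef int (*toy_handler_t)(void *dev);

struct toy_irq_desc {
    toy_handler_t handlers[TOY_MAX_SHARED];
    void *devs[TOY_MAX_SHARED];
    size_t count;
};

/* The IRQ number is simply an index into this table. */
struct toy_irq_desc toy_irq_table[TOY_NR_IRQS];

/* Attach a handler to an IRQ line; returns 0 on success, -1 on error. */
int toy_request_irq(unsigned int irq, toy_handler_t handler, void *dev)
{
    struct toy_irq_desc *desc;

    if (irq >= TOY_NR_IRQS || handler == NULL)
        return -1;
    desc = &toy_irq_table[irq];
    if (desc->count == TOY_MAX_SHARED)
        return -1;
    desc->handlers[desc->count] = handler;
    desc->devs[desc->count] = dev;
    desc->count++;
    return 0;
}

/* Call every handler sharing the line; returns how many claimed it,
 * or -1 for an out-of-range IRQ number. */
int toy_dispatch(unsigned int irq)
{
    struct toy_irq_desc *desc;
    int claimed = 0;
    size_t i;

    if (irq >= TOY_NR_IRQS)
        return -1;
    desc = &toy_irq_table[irq];
    for (i = 0; i < desc->count; i++)
        claimed += desc->handlers[i](desc->devs[i]);
    return claimed;
}

/* Example device: its handler claims the interrupt only if "raised" is set. */
struct toy_dev {
    int raised;
    int serviced;
};

int toy_dev_handler(void *dev)
{
    struct toy_dev *d = dev;

    if (!d->raised)
        return 0;
    d->raised = 0;
    d->serviced++;
    return 1;
}
```

Registering two `toy_dev` instances on the same IRQ number and then calling `toy_dispatch()` invokes both handlers, and only the device whose `raised` flag is set reports the interrupt as its own. That is the essence of a shared line: the number identifies the source pin, not the device.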
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index e4c38152f7f7..a4948591607d 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -7,7 +7,7 @@ The CONFIG_RCU_TORTURE_TEST config option is available for all RCU
implementations. It creates an rcutorture kernel module that can
be loaded to run a torture test. The test periodically outputs
status messages via printk(), which can be examined via the dmesg
-command (perhaps grepping for "rcutorture"). The test is started
+command (perhaps grepping for "torture"). The test is started
when the module is loaded, and stops when the module is unloaded.
However, actually setting this config option to "y" results in the system
@@ -35,6 +35,19 @@ stat_interval The number of seconds between output of torture
be printed -only- when the module is unloaded, and this
is the default.
+shuffle_interval
+ The number of seconds to keep the test threads affinitied
+ to a particular subset of the CPUs. Used in conjunction
+ with test_no_idle_hz.
+
+test_no_idle_hz Whether or not to test the ability of RCU to operate in
+ a kernel that disables the scheduling-clock interrupt to
+ idle CPUs. Boolean parameter, "1" to test, "0" otherwise.
+
+torture_type The type of RCU to test: "rcu" for the rcu_read_lock()
+ API, "rcu_bh" for the rcu_read_lock_bh() API, and "srcu"
+ for the "srcu_read_lock()" API.
+
verbose Enable debug printk()s. Default is disabled.
@@ -42,14 +55,14 @@ OUTPUT
The statistics output is as follows:
- rcutorture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
- rcutorture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
- rcutorture: Reader Pipe: 1466408 9747 0 0 0 0 0 0 0 0 0
- rcutorture: Reader Batch: 1464477 11678 0 0 0 0 0 0 0 0
- rcutorture: Free-Block Circulation: 1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
- rcutorture: --- End of test
+ rcu-torture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
+ rcu-torture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
+ rcu-torture: Reader Pipe: 1466408 9747 0 0 0 0 0 0 0 0 0
+ rcu-torture: Reader Batch: 1464477 11678 0 0 0 0 0 0 0 0
+ rcu-torture: Free-Block Circulation: 1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
+ rcu-torture: --- End of test
-The command "dmesg | grep rcutorture:" will extract this information on
+The command "dmesg | grep torture:" will extract this information on
most systems. On more esoteric configurations, it may be necessary to
use other commands to access the output of the printk()s used by
the RCU torture test. The printk()s use KERN_ALERT, so they should
@@ -115,8 +128,9 @@ The following script may be used to torture RCU:
modprobe rcutorture
sleep 100
rmmod rcutorture
- dmesg | grep rcutorture:
+ dmesg | grep torture:
The output can be manually inspected for the error flag of "!!!".
One could of course create a more elaborate script that automatically
-checked for such errors.
+checked for such errors. The "rmmod" command forces a "SUCCESS" or
+"FAILURE" indication to be printk()ed.
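
The torture.txt text above suggests that the check for the "!!!" error flag could be automated. One possible sketch in C is shown below. It assumes the captured dmesg output is already in a NUL-terminated buffer, and it treats any line containing both "torture:" and "!!!" as an error line; the function name `count_torture_errors()` and the 512-byte line limit are choices made for this example, not part of the rcutorture module.

```c
#include <string.h>

/*
 * Count the lines in a NUL-terminated log buffer that contain both the
 * "torture:" tag and the "!!!" error flag.
 */
int count_torture_errors(const char *log)
{
    int errors = 0;

    while (*log != '\0') {
        const char *eol = strchr(log, '\n');
        size_t len = eol ? (size_t)(eol - log) : strlen(log);
        char line[512];

        if (len >= sizeof(line))
            len = sizeof(line) - 1;  /* truncate overly long lines */
        memcpy(line, log, len);
        line[len] = '\0';

        if (strstr(line, "torture:") && strstr(line, "!!!"))
            errors++;

        if (eol == NULL)
            break;                   /* last line had no trailing newline */
        log = eol + 1;
    }
    return errors;
}
```

A caller would read the kernel log into memory (for example from a file produced by `dmesg`) and treat a nonzero return value as a failed run.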
diff --git a/Documentation/README.DAC960 b/Documentation/README.DAC960
index 98ea617a0dd6..0e8f618ab534 100644
--- a/Documentation/README.DAC960
+++ b/Documentation/README.DAC960
@@ -78,9 +78,9 @@ also known as "System Drives", and Drive Groups are also called "Packs". Both
terms are in use in the Mylex documentation; I have chosen to standardize on
the more generic "Logical Drive" and "Drive Group".
-DAC960 RAID disk devices are named in the style of the Device File System
-(DEVFS). The device corresponding to Logical Drive D on Controller C is
-referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
+DAC960 RAID disk devices are named in the style of the obsolete Device File
+System (DEVFS). The device corresponding to Logical Drive D on Controller C
+is referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1
through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on
Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI
disks the device names will not change in the event of a disk drive failure.
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 027285d0c26c..1cbbb8e28999 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -6,17 +6,6 @@ be removed from this file.
---------------------------
-What: devfs
-When: July 2005
-Files: fs/devfs/*, include/linux/devfs_fs*.h and assorted devfs
- function calls throughout the kernel tree
-Why: It has been unmaintained for a number of years, has unfixable
- races, contains a naming policy within the kernel that is
- against the LSB, and can be replaced by using udev.
-Who: Greg Kroah-Hartman <greg@kroah.com>
-
----------------------------
-
What: RAW driver (CONFIG_RAW_DRIVER)
When: December 2005
Why: declared obsolete since kernel 2.6.3
@@ -132,16 +121,6 @@ Who: NeilBrown <neilb@suse.de>
---------------------------
-What: au1x00_uart driver
-When: January 2006
-Why: The 8250 serial driver now has the ability to deal with the differences
- between the standard 8250 family of UARTs and their slightly strange
- brother on Alchemy SOCs. The loss of features is not considered an
- issue.
-Who: Ralf Baechle <ralf@linux-mips.org>
-
----------------------------
-
What: eepro100 network driver
When: January 2007
Why: replaced by the e100 driver
@@ -177,6 +156,16 @@ Who: Jean Delvare <khali@linux-fr.org>
---------------------------
+What: Unused EXPORT_SYMBOL/EXPORT_SYMBOL_GPL exports
+ (temporary transition config option provided until then)
+ The transition config option will also be removed at the same time.
+When: before 2.6.19
+Why: Unused symbols are both increasing the size of the kernel binary
+ and are often a sign of "wrong API"
+Who: Arjan van de Ven <arjan@linux.intel.com>
+
+---------------------------
+
What: remove EXPORT_SYMBOL(tasklist_lock)
When: August 2006
Files: kernel/fork.c
@@ -224,3 +213,47 @@ Why: The interface no longer has any callers left in the kernel. It
Who: Nick Piggin <npiggin@suse.de>
---------------------------
+
+What: Support for the MIPS EV96100 evaluation board
+When: September 2006
+Why: Has not built since at least November 15, 2003; apparently
+ no user base left.
+Who: Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What: Support for the Momentum / PMC-Sierra Jaguar ATX evaluation board
+When: September 2006
+Why: Has not built for quite some time, and was never popular,
+ due to the platform being replaced by successor models. Apparently
+ no user base left. It also is one of the last users of
+ WANT_PAGE_VIRTUAL.
+Who: Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What: Support for the Momentum Ocelot, Ocelot 3, Ocelot C and Ocelot G
+When: September 2006
+Why: Some no longer build and apparently there is no user base left
+ for these platforms.
+Who: Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What: Support for MIPS Technologies' Atlas and SEAD evaluation boards
+When: September 2006
+Why: Some no longer build and apparently there is no user base left
+ for these platforms. Hardware out of production for several years.
+Who: Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
+
+What: Support for the IT8172-based platforms, ITE 8172G and Globespan IVR
+When: September 2006
+Why: Code has not built since at least 2.6.0; apparently there is
+ no user base left for these platforms. Hardware out of production
+ for several years and hardly a trace of the manufacturer left on
+ the net.
+Who: Ralf Baechle <ralf@linux-mips.org>
+
+---------------------------
diff --git a/Documentation/filesystems/devfs/ChangeLog b/Documentation/filesystems/devfs/ChangeLog
deleted file mode 100644
index e5aba5246d7c..000000000000
--- a/Documentation/filesystems/devfs/ChangeLog
+++ /dev/null
@@ -1,1977 +0,0 @@
-/* -*- auto-fill -*- */
-===============================================================================
-Changes for patch v1
-
-- creation of devfs
-
-- modified miscellaneous character devices to support devfs
-===============================================================================
-Changes for patch v2
-
-- bug fix with manual inode creation
-===============================================================================
-Changes for patch v3
-
-- bugfixes
-
-- documentation improvements
-
-- created a couple of scripts (one to save&restore a devfs and the
- other to set up compatibility symlinks)
-
-- devfs support for SCSI discs. New name format is: sd_hHcCiIlL
-===============================================================================
-Changes for patch v4
-
-- bugfix for the directory reading code
-
-- bugfix for compilation with kerneld
-
-- devfs support for generic hard discs
-
-- rationalisation of the various watchdog drivers
-===============================================================================
-Changes for patch v5
-
-- support for mounting directly from entries in the devfs (it doesn't
- need to be mounted to do this), including the root filesystem.
- Mounting of swap partitions also works. Hence, now if you set
- CONFIG_DEVFS_ONLY to 'Y' then you won't be able to access your discs
- via ordinary device nodes. Naturally, the default is 'N' so that you
- can still use your old device nodes. If you want to mount from devfs
- entries, make sure you use: append = "root=/dev/sd_..." in your
- lilo.conf. It seems LILO looks for the device number (major&minor)
- and writes that into the kernel image :-(
-
-- support for character memory devices (/dev/null, /dev/zero, /dev/full
- and so on). Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-===============================================================================
-Changes for patch v6
-
-- support for subdirectories
-
-- support for symbolic links (created by devfs_mk_symlink(), no
- support yet for creation via symlink(2))
-
-- SCSI disc naming now cast in stone, with the format:
- /dev/sd/c0b1t2u3 controller=0, bus=1, ID=2, LUN=3, whole disc
- /dev/sd/c0b1t2u3p4 controller=0, bus=1, ID=2, LUN=3, 4th partition
-
-- loop devices now appear in devfs
-
-- tty devices, console, serial ports, etc. now appear in devfs
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- bugs with mounting devfs-only devices now fixed
-===============================================================================
-Changes for patch v7
-
-- SCSI CD-ROMS, tapes and generic devices now appear in devfs
-===============================================================================
-Changes for patch v8
-
-- bugfix with no-rewind SCSI tapes
-
-- RAMDISCs now appear in devfs
-
-- better cleaning up of devfs entries created by various modules
-
-- interface change to <devfs_register>
-===============================================================================
-Changes for patch v9
-
-- the v8 patch was corrupted somehow, which would affect the patch for
- linux/fs/filesystems.c
- I've also fixed the v8 patch file on the WWW
-
-- MetaDevices (/dev/md*) should now appear in devfs
-===============================================================================
-Changes for patch v10
-
-- bugfix in meta device support for devfs
-
-- created this ChangeLog file
-
-- added devfs support to the floppy driver
-
-- added support for creating sockets in a devfs
-===============================================================================
-Changes for patch v11
-
-- added DEVFS_FL_HIDE_UNREG flag
-
-- incorporated better patch for ttyname() in libc 5.4.43 from H.J. Lu.
-
-- interface change to <devfs_mk_symlink>
-
-- support for creating symlinks with symlink(2)
-
-- parallel port printer (/dev/lp*) now appears in devfs
-===============================================================================
-Changes for patch v12
-
-- added inode check to <devfs_fill_file> function
-
-- improved devfs support when mounting from devfs
-
-- added call to <<release>> operation when removing swap areas on
- devfs devices
-
-- increased NR_SUPER to 128 to support large numbers of devfs mounts
- (for chroot(2) gaols)
-
-- fixed bug in SCSI disc support: was generating incorrect minors if
- SCSI ID's did not start at 0 and increase by 1
-
-- support symlink traversal when mounting root
-===============================================================================
-Changes for patch v13
-
-- added devfs support to soundcard driver
- Thanks to Eric Dumas <dumas@linux.eu.org> and
- C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- added devfs support to the joystick driver
-
-- loop driver now has its own subdirectory "/dev/loop/"
-
-- created <devfs_get_flags> and <devfs_set_flags> functions
-
-- fix problem with SCSI disc compatibility names (sd{a,b,c,d,e,f})
- which assumes ID's start at 0 and increase by 1. Also only create
- devfs entries for SCSI disc partitions which actually exist
- Show new names in partition check
- Thanks to Jakub Jelinek <jj@sunsite.ms.mff.cuni.cz>
-===============================================================================
-Changes for patch v14
-
-- bug fix in floppy driver: would not compile without
- CONFIG_DEVFS_FS='Y'
- Thanks to Jurgen Botz <jbotz@nova.botz.org>
-
-- bug fix in loop driver
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- do not create devfs entries for printers not configured
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- do not create devfs entries for serial ports not present
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- ensure <tty_register_devfs> is exported from tty_io.c
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- allow unregistering of devfs symlink entries
-
-- fixed bug in SCSI disc naming introduced in last patch version
-===============================================================================
-Changes for patch v15
-
-- ported to kernel 2.1.81
-===============================================================================
-Changes for patch v16
-
-- created <devfs_set_symlink_destination> function
-
-- moved DEVFS_SUPER_MAGIC into header file
-
-- added DEVFS_FL_HIDE flag
-
-- created <devfs_get_maj_min>
-
-- created <devfs_get_handle_from_inode>
-
-- fixed bugs in searching by major&minor
-
-- changed interface to <devfs_unregister>, <devfs_fill_file> and
- <devfs_find_handle>
-
-- fixed inode times when symlink created with symlink(2)
-
-- change tty driver to do auto-creation of devfs entries
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to
- devfs
-
-- updated libc 5.4.43 patch for ttyname()
-===============================================================================
-Changes for patch v17
-
-- added CONFIG_DEVFS_TTY_COMPAT
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- bugfix in devfs support for drivers/char/lp.c
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- clean up serial driver so that PCMCIA devices unregister correctly
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to
- devfs [was missing in patch v16]
-
-- updated libc 5.4.43 patch for ttyname() [was missing in patch v16]
-
-- all SCSI devices now registered in /dev/sg
-
-- support removal of devfs entries via unlink(2)
-===============================================================================
-Changes for patch v18
-
-- added floppy/?u720 floppy entry
-
-- fixed kerneld support for entries in devfs subdirectories
-
-- incorporated latest patch for ttyname() in libc 5.4.43 from H.J. Lu.
-===============================================================================
-Changes for patch v19
-
-- bug fix when looking up unregistered entries: kerneld was not called
-
-- fixes for kernel 2.1.86 (now requires 2.1.86)
-===============================================================================
-Changes for patch v20
-
-- only create available floppy entries
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new IDE naming scheme following SCSI format (i.e. /dev/id/c0b0t0u0p1
- instead of /dev/hda1)
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new XT disc naming scheme following SCSI format (i.e. /dev/xd/c0t0p1
- instead of /dev/xda1)
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- new non-standard CD-ROM names (i.e. /dev/sbp/c#t#)
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- allow symlink traversal when mounting the root filesystem
-
-- Create entries for MD devices at MD init
- Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v21
-
-- ported to kernel 2.1.91
-===============================================================================
-Changes for patch v22
-
-- SCSI host number patch ("scsihosts=" kernel option)
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-===============================================================================
-Changes for patch v23
-
-- Fixed persistence bug with device numbers for manually created
- device files
-
-- Fixed problem with recreating symlinks with different content
-
-- Added CONFIG_DEVFS_MOUNT (mount devfs on /dev at boot time)
-===============================================================================
-Changes for patch v24
-
-- Switched from CONFIG_KERNELD to CONFIG_KMOD: module autoloading
- should now work again
-
-- Hide entries which are manually unlinked
-
-- Always invalidate devfs dentry cache when registering entries
-
-- Support removal of devfs directories via rmdir(2)
-
-- Ensure directories created by <devfs_mk_dir> are visible
-
-- Default no access for "other" for floppy device
-===============================================================================
-Changes for patch v25
-
-- Updates to CREDITS file and minor IDE numbering change
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- Invalidate devfs dentry cache when making directories
-
-- Invalidate devfs dentry cache when removing entries
-
-- More informative message if root FS mount fails when devfs
- configured
-
-- Fixed persistence bug with fifos
-===============================================================================
-Changes for patch v26
-
-- ported to kernel 2.1.97
-
-- Changed serial directory from "/dev/serial" to "/dev/tts" and
- "/dev/consoles" to "/dev/vc" to be more friendly to new procps
-===============================================================================
-Changes for patch v27
-
-- Added support for IDE4 and IDE5
- Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl>
-
-- Documented "scsihosts=" boot parameter
-
-- Print process command when debugging kerneld/kmod
-
-- Added debugging for register/unregister/change operations
-
-- Added "devfs=" boot options
-
-- Hide unregistered entries by default
-===============================================================================
-Changes for patch v28
-
-- No longer lock/unlock superblock in <devfs_put_super> (cope with
- recent VFS interface change)
-
-- Do not automatically change ownership/protection of /dev/tty
-
-- Drop negative dentries when they are released
-
-- Manage dcache more efficiently
-===============================================================================
-Changes for patch v29
-
-- Added DEVFS_FL_AUTO_DEVNUM flag
-===============================================================================
-Changes for patch v30
-
-- No longer set unnecessary methods
-
-- Ported to kernel 2.1.99-pre3
-===============================================================================
-Changes for patch v31
-
-- Added PID display to <call_kerneld> debugging message
-
-- Added "diread" and "diwrite" options
-
-- Ported to kernel 2.1.102
-
-- Fixed persistence problem with permissions
-===============================================================================
-Changes for patch v32
-
-- Fixed devfs support in drivers/block/md.c
-===============================================================================
-Changes for patch v33
-
-- Support legacy device nodes
-
-- Fixed bug where recreated inodes were hidden
-
-- New IDE naming scheme: everything is under /dev/ide
-===============================================================================
-Changes for patch v34
-
-- Improved debugging in <get_vfs_inode>
-
-- Prevent duplicate calls to <devfs_mk_dir> in SCSI layer
-
-- No longer free old dentries in <devfs_mk_dir>
-
-- Free all dentries for a given entry when deleting inodes
-===============================================================================
-Changes for patch v35
-
-- Ported to kernel 2.1.105 (sound driver changes)
-===============================================================================
-Changes for patch v36
-
-- Fixed sound driver port
-===============================================================================
-Changes for patch v37
-
-- Minor documentation tweaks
-===============================================================================
-Changes for patch v38
-
-- More documentation tweaks
-
-- Fix for sound driver port
-
-- Removed ttyname-patch (grab libc 5.4.44 instead)
-
-- Ported to kernel 2.1.107-pre2 (loop driver fix)
-===============================================================================
-Changes for patch v39
-
-- Ported to kernel 2.1.107 (hd.c hunk broke due to spelling "fixes"). Sigh
-
-- Removed many #ifdef's, replaced with trickery in include/devfs_fs.h
-===============================================================================
-Changes for patch v40
-
-- Fix for sound driver port
-
-- Limit auto-device numbering to majors 128 to 239
-===============================================================================
-Changes for patch v41
-
-- Fixed inode times persistence problem
-===============================================================================
-Changes for patch v42
-
-- Ported to kernel 2.1.108 (drivers/scsi/hosts.c hunk broke)
-===============================================================================
-Changes for patch v43
-
-- Fixed spelling in <devfs_readlink> debug
-
-- Fixed bug in <devfs_setup> parsing "dilookup"
-
-- More #ifdef's removed
-
-- Supported Sparc keyboard (/dev/kbd)
-
-- Supported DSP56001 digital signal processor (/dev/dsp56k)
-
-- Supported Apple Desktop Bus (/dev/adb)
-
-- Supported Coda network file system (/dev/cfs*)
-===============================================================================
-Changes for patch v44
-
-- Fixed devfs inode leak when manually recreating inodes
-
-- Fixed permission persistence problem when recreating inodes
-===============================================================================
-Changes for patch v45
-
-- Ported to kernel 2.1.110
-===============================================================================
-Changes for patch v46
-
-- Ported to kernel 2.1.112-pre1
-
-- Removed harmless "unused variable" compiler warning
-
-- Fixed modes for manually recreated device nodes
-===============================================================================
-Changes for patch v47
-
-- Added NULL devfs inode warning in <devfs_read_inode>
-
-- Force all inode nlink values to 1
-===============================================================================
-Changes for patch v48
-
-- Added "dimknod" option
-
-- Set inode nlink to 0 when freeing dentries
-
-- Added support for virtual console capture devices (/dev/vcs*)
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Fixed modes for manually recreated symlinks
-===============================================================================
-Changes for patch v49
-
-- Ported to kernel 2.1.113
-===============================================================================
-Changes for patch v50
-
-- Fixed bugs in recreated directories and symlinks
-===============================================================================
-Changes for patch v51
-
-- Improved robustness of rc.devfs script
- Thanks to Roderich Schupp <rsch@experteam.de>
-
-- Fixed bugs in recreated device nodes
-
-- Fixed bug in currently unused <devfs_get_handle_from_inode>
-
-- Defined new <devfs_handle_t> type
-
-- Improved debugging when getting entries
-
-- Fixed bug where directories could be emptied
-
-- Ported to kernel 2.1.115
-===============================================================================
-Changes for patch v52
-
-- Replaced dummy .epoch inode with .devfsd character device
-
-- Modified rc.devfs to take account of above change
-
-- Removed spurious driver warning messages when CONFIG_DEVFS_FS=n
-
-- Implemented devfsd protocol revision 0
-===============================================================================
-Changes for patch v53
-
-- Ported to kernel 2.1.116 (kmod change broke hunk)
-
-- Updated Documentation/Configure.help
-
-- Test and tty pattern patch for rc.devfs script
- Thanks to Roderich Schupp <rsch@experteam.de>
-
-- Added soothing message to warning in <devfs_d_iput>
-===============================================================================
-Changes for patch v54
-
-- Ported to kernel 2.1.117
-
-- Fixed default permissions in sound driver
-
-- Added support for frame buffer devices (/dev/fb*)
-===============================================================================
-Changes for patch v55
-
-- Ported to kernel 2.1.119
-
-- Use GCC extensions for structure initialisations
-
-- Implemented async open notification
-
-- Incremented devfsd protocol revision to 1
-===============================================================================
-Changes for patch v56
-
-- Ported to kernel 2.1.120-pre3
-
-- Moved async open notification to end of <devfs_open>
-===============================================================================
-Changes for patch v57
-
-- Ported to kernel 2.1.121
-
-- Prepended "/dev/" to module load request
-
-- Renamed <call_kerneld> to <call_kmod>
-
-- Created sample modules.conf file
-===============================================================================
-Changes for patch v58
-
-- Fixed typo "AYSNC" -> "ASYNC"
-===============================================================================
-Changes for patch v59
-
-- Added open flag for files
-===============================================================================
-Changes for patch v60
-
-- Ported to kernel 2.1.123-pre2
-===============================================================================
-Changes for patch v61
-
-- Set i_blocks=0 and i_blksize=1024 in <devfs_read_inode>
-===============================================================================
-Changes for patch v62
-
-- Ported to kernel 2.1.123
-===============================================================================
-Changes for patch v63
-
-- Ported to kernel 2.1.124-pre2
-===============================================================================
-Changes for patch v64
-
-- Fixed Unix98 pty support
-
-- Increased buffer size in <get_partition_list> to avoid crash and
- burn
-===============================================================================
-Changes for patch v65
-
-- More Unix98 pty support fixes
-
-- Added test for empty <<name>> in <devfs_find_handle>
-
-- Renamed <generate_path> to <devfs_generate_path> and published
-
-- Created /dev/root symlink
- Thanks to Roderich Schupp <rsch@ExperTeam.de>
- with further modifications by me
-===============================================================================
-Changes for patch v66
-
-- Yet more Unix98 pty support fixes (now tested)
-
-- Created <devfs_get_fops>
-
-- Support media change checks when CONFIG_DEVFS_ONLY=y
-
-- Abolished Unix98-style PTY names for old PTY devices
-===============================================================================
-Changes for patch v67
-
-- Added inline declaration for dummy <devfs_generate_path>
-
-- Removed spurious "unable to register... in devfs" messages when
- CONFIG_DEVFS_FS=n
-
-- Fixed misc. devices when CONFIG_DEVFS_FS=n
-
-- Limit auto-device numbering to majors 144 to 239
-===============================================================================
-Changes for patch v68
-
-- Hide unopened virtual consoles from directory listings
-
-- Added support for video capture devices
-
-- Ported to kernel 2.1.125
-===============================================================================
-Changes for patch v69
-
-- Fix for CONFIG_VT=n
-===============================================================================
-Changes for patch v70
-
-- Added support for non-OSS/Free sound cards
-===============================================================================
-Changes for patch v71
-
-- Ported to kernel 2.1.126-pre2
-===============================================================================
-Changes for patch v72
-
-- #ifdef's for CONFIG_DEVFS_DISABLE_OLD_NAMES removed
-===============================================================================
-Changes for patch v73
-
-- CONFIG_DEVFS_DISABLE_OLD_NAMES replaced with "nocompat" boot option
-
-- CONFIG_DEVFS_BOOT_OPTIONS removed: boot options always available
-===============================================================================
-Changes for patch v74
-
-- Removed CONFIG_DEVFS_MOUNT and "mount" boot option and replaced with
- "nomount" boot option
-
-- Documentation updates
-
-- Updated sample modules.conf
-===============================================================================
-Changes for patch v75
-
-- Updated sample modules.conf
-
-- Remount devfs after initrd finishes
-
-- Ported to kernel 2.1.127
-
-- Added support for ISDN
- Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v76
-
-- Updated an email address in ChangeLog
-
-- CONFIG_DEVFS_ONLY replaced with "only" boot option
-===============================================================================
-Changes for patch v77
-
-- Added DEVFS_FL_REMOVABLE flag
-
-- Check for disc change when listing directories with removable media
- devices
-
-- Use DEVFS_FL_REMOVABLE in sd.c
-
-- Ported to kernel 2.1.128
-===============================================================================
-Changes for patch v78
-
-- Only call <scan_dir_for_removable> on first call to <devfs_readdir>
-
-- Ported to kernel 2.1.129-pre5
-
-- ISDN support improvements
- Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-===============================================================================
-Changes for patch v79
-
-- Ported to kernel 2.1.130
-
-- Renamed miscdevice "apm" to "apm_bios" to be consistent with
- devices.txt
-===============================================================================
-Changes for patch v80
-
-- Ported to kernel 2.1.131
-
-- Updated <devfs_rmdir> for VFS change in 2.1.131
-===============================================================================
-Changes for patch v81
-
-- Fixed permissions on /dev/ptmx
-===============================================================================
-Changes for patch v82
-
-- Ported to kernel 2.1.132-pre4
-
-- Changed initial permissions on /dev/pts/*
-
-- Created <devfs_mk_compat>
-
-- Added "symlinks" boot option
-
-- Changed devfs_register_blkdev() back to register_blkdev() for IDE
-
-- Check for partitions on removable media in <devfs_lookup>
-===============================================================================
-Changes for patch v83
-
-- Fixed support for ramdisc when using string-based root FS name
-
-- Ported to kernel 2.2.0-pre1
-===============================================================================
-Changes for patch v84
-
-- Ported to kernel 2.2.0-pre7
-===============================================================================
-Changes for patch v85
-
-- Compile fixes for driver/sound/sound_common.c (non-module) and
- drivers/isdn/isdn_common.c
- Thanks to Christophe Leroy <christophe.leroy5@capway.com>
-
-- Added support for registering regular files
-
-- Created <devfs_set_file_size>
-
-- Added /dev/cpu/mtrr as an alternative interface to /proc/mtrr
-
-- Update devfs inodes from entries if not changed through FS
-===============================================================================
-Changes for patch v86
-
-- Ported to kernel 2.2.0-pre9
-===============================================================================
-Changes for patch v87
-
-- Fixed bug when mounting non-devfs devices in a devfs
-===============================================================================
-Changes for patch v88
-
-- Fixed <devfs_fill_file> to only initialise temporary inodes
-
-- Trap for NULL fops in <devfs_register>
-
-- Return -ENODEV in <devfs_fill_file> for non-driver inodes
-
-- Fixed bug when unswapping non-devfs devices in a devfs
-===============================================================================
-Changes for patch v89
-
-- Switched to C data types in include/linux/devfs_fs.h
-
-- Switched from PATH_MAX to DEVFS_PATHLEN
-
-- Updated Documentation/filesystems/devfs/modules.conf to take account
- of reverse scanning (!) by modprobe
-
-- Ported to kernel 2.2.0
-===============================================================================
-Changes for patch v90
-
-- CONFIG_DEVFS_DISABLE_OLD_TTY_NAMES replaced with "nottycompat" boot
- option
-
-- CONFIG_DEVFS_TTY_COMPAT removed: existing "symlinks" boot option now
- controls this. This means you must have libc 5.4.44 or later, or a
- recent version of libc 6 if you use the "symlinks" option
-===============================================================================
-Changes for patch v91
-
-- Switch from <devfs_mk_symlink> to <devfs_mk_compat> in
- drivers/char/vc_screen.c to fix problems with Midnight Commander
-===============================================================================
-Changes for patch v92
-
-- Ported to kernel 2.2.2-pre5
-===============================================================================
-Changes for patch v93
-
-- Modified <sd_name> in drivers/scsi/sd.c to cope with devices that
- don't exist (which happens with new RAID autostart code printk()s)
-===============================================================================
-Changes for patch v94
-
-- Fixed bug in joystick driver: only first joystick was registered
-===============================================================================
-Changes for patch v95
-
-- Fixed another bug in joystick driver
-
-- Fixed <devfsd_read> to not overrun event buffer
-===============================================================================
-Changes for patch v96
-
-- Ported to kernel 2.2.5-2
-
-- Created <devfs_auto_unregister>
-
-- Fixed bugs: compatibility entries were not unregistered for:
- loop driver
- floppy driver
- RAMDISC driver
- IDE tape driver
- SCSI CD-ROM driver
- SCSI HDD driver
-===============================================================================
-Changes for patch v97
-
-- Fixed bugs: compatibility entries were not unregistered for:
- ALSA sound driver
- partitions in generic disc driver
-
-- Don't return unregistered entries in <devfs_find_handle>
-
-- Panic in <devfs_unregister> if entry unregistered
-
-- Don't panic in <devfs_auto_unregister> for duplicates
-===============================================================================
-Changes for patch v98
-
-- Don't unregister already unregistered entries in <unregister>
-
-- Register entry in <sd_detect>
-
-- Unregister entry in <sd_detach>
-
-- Changed to <devfs_*register_chrdev> in drivers/char/tty_io.c
-
-- Ported to kernel 2.2.7
-===============================================================================
-Changes for patch v99
-
-- Ported to kernel 2.2.8
-
-- Fixed bug in drivers/scsi/sd.c when >16 SCSI discs
-
-- Disable warning messages when unable to read partition table for
- removable media
-===============================================================================
-Changes for patch v100
-
-- Ported to kernel 2.3.1-pre5
-
-- Added "oops-on-panic" boot option
-
-- Improved debugging in <devfs_register> and <devfs_unregister>
-
-- Register entry in <sr_detect>
-
-- Unregister entry in <sr_detach>
-
-- Register entry in <sg_detect>
-
-- Unregister entry in <sg_detach>
-
-- Added support for ALSA drivers
-===============================================================================
-Changes for patch v101
-
-- Ported to kernel 2.3.2
-===============================================================================
-Changes for patch v102
-
-- Update serial driver to register PCMCIA entries
- Thanks to Roch-Alexandre Nomine-Beguin <roch@samarkand.infini.fr>
-
-- Updated an email address in ChangeLog
-
-- Hide virtual console capture entries from directory listings when
- corresponding console device is not open
-===============================================================================
-Changes for patch v103
-
-- Ported to kernel 2.3.3
-===============================================================================
-Changes for patch v104
-
-- Added documentation for some functions
-
-- Added "doc" target to fs/devfs/Makefile
-
-- Added "v4l" directory for video4linux devices
-
-- Replaced call to <devfs_unregister> in <sd_detach> with call to
- <devfs_register_partitions>
-
-- Moved registration for sr and sg drivers from detect() to attach()
- methods
-
-- Register entries in <st_attach> and unregister in <st_detach>
-
-- Work around IDE driver treating CD-ROM as gendisk
-
-- Use <sed> instead of <tr> in rc.devfs
-
-- Updated ToDo list
-
-- Removed "oops-on-panic" boot option: now always Oops
-===============================================================================
-Changes for patch v105
-
-- Unregister SCSI host from <scsi_host_no_list> in <scsi_unregister>
- Thanks to Zoltán Böszörményi <zboszor@mail.externet.hu>
-
-- Don't save /dev/log in rc.devfs
-
-- Ported to kernel 2.3.4-pre1
-===============================================================================
-Changes for patch v106
-
-- Fixed silly typo in drivers/scsi/st.c
-
-- Improved debugging in <devfs_register>
-===============================================================================
-Changes for patch v107
-
-- Added "diunlink" and "nokmod" boot options
-
-- Removed superfluous warning message in <devfs_d_iput>
-===============================================================================
-Changes for patch v108
-
-- Remove entries when unloading sound module
-===============================================================================
-Changes for patch v109
-
-- Ported to kernel 2.3.6-pre2
-===============================================================================
-Changes for patch v110
-
-- Took account of change to <d_alloc_root>
-===============================================================================
-Changes for patch v111
-
-- Created separate event queue for each mounted devfs
-
-- Removed <devfs_invalidate_dcache>
-
-- Created new ioctl()s for devfsd
-
-- Incremented devfsd protocol revision to 3
-
-- Fixed bug when re-creating directories: contents were lost
-
-- Block access to inodes until devfsd updates permissions
-===============================================================================
-Changes for patch v112
-
-- Modified patch so it applies against 2.3.5 and 2.3.6
-
-- Updated an email address in ChangeLog
-
-- Do not automatically change ownership/protection of /dev/tty<n>
-
-- Updated sample modules.conf
-
-- Switched to sending process uid/gid to devfsd
-
-- Renamed <call_kmod> to <try_modload>
-
-- Added DEVFSD_NOTIFY_LOOKUP event
-
-- Added DEVFSD_NOTIFY_CHANGE event
-
-- Added DEVFSD_NOTIFY_CREATE event
-
-- Incremented devfsd protocol revision to 4
-
-- Moved kernel-specific stuff to include/linux/devfs_fs_kernel.h
-===============================================================================
-Changes for patch v113
-
-- Ported to kernel 2.3.9
-
-- Restricted permissions on some block devices
-===============================================================================
-Changes for patch v114
-
-- Added support for /dev/netlink
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Return EISDIR rather than EINVAL for read(2) on directories
-
-- Ported to kernel 2.3.10
-===============================================================================
-Changes for patch v115
-
-- Added support for all remaining character devices
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Cleaned up netlink support
-===============================================================================
-Changes for patch v116
-
-- Added support for /dev/parport%d
- Thanks to Tim Waugh <tim@cyberelk.demon.co.uk>
-
-- Fixed parallel port ATAPI tape driver
-
-- Fixed Atari SLM laser printer driver
-===============================================================================
-Changes for patch v117
-
-- Added support for COSA card
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Fixed drivers/char/ppdev.c: missing #include <linux/init.h>
-
-- Fixed drivers/char/ftape/zftape/zftape-init.c
- Thanks to Vladimir Popov <mashgrad@usa.net>
-===============================================================================
-Changes for patch v118
-
-- Ported to kernel 2.3.15-pre3
-
-- Fixed bug in loop driver
-
-- Unregister /dev/lp%d entries in drivers/char/lp.c
- Thanks to Maciej W. Rozycki <macro@ds2.pg.gda.pl>
-===============================================================================
-Changes for patch v119
-
-- Ported to kernel 2.3.16
-===============================================================================
-Changes for patch v120
-
-- Fixed bug in drivers/scsi/scsi.c
-
-- Added /dev/ppp
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Ported to kernel 2.3.17
-===============================================================================
-Changes for patch v121
-
-- Fixed bug in drivers/block/loop.c
-
-- Ported to kernel 2.3.18
-===============================================================================
-Changes for patch v122
-
-- Ported to kernel 2.3.19
-===============================================================================
-Changes for patch v123
-
-- Ported to kernel 2.3.20
-===============================================================================
-Changes for patch v124
-
-- Ported to kernel 2.3.21
-===============================================================================
-Changes for patch v125
-
-- Created <devfs_get_info>, <devfs_set_info>,
- <devfs_get_first_child> and <devfs_get_next_sibling>
- Added <<dir>> parameter to <devfs_register>, <devfs_mk_compat>,
- <devfs_mk_dir> and <devfs_find_handle>
- Work sponsored by SGI
-
-- Fixed apparent bug in COSA driver
-
-- Re-instated "scsihosts=" boot option
-===============================================================================
-Changes for patch v126
-
-- Always create /dev/pts if CONFIG_UNIX98_PTYS=y
-
-- Fixed call to <devfs_mk_dir> in drivers/block/ide-disk.c
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Allow multiple unregistrations
-
-- Created /dev/scsi hierarchy
- Work sponsored by SGI
-===============================================================================
-Changes for patch v127
-
-Work sponsored by SGI
-
-- No longer disable devpts if devfs enabled (caveat emptor)
-
-- Added flags array to struct gendisk and removed code from
- drivers/scsi/sd.c
-
-- Created /dev/discs hierarchy
-===============================================================================
-Changes for patch v128
-
-Work sponsored by SGI
-
-- Created /dev/cdroms hierarchy
-===============================================================================
-Changes for patch v129
-
-Work sponsored by SGI
-
-- Removed compatibility entries for sound devices
-
-- Removed compatibility entries for printer devices
-
-- Removed compatibility entries for video4linux devices
-
-- Removed compatibility entries for parallel port devices
-
-- Removed compatibility entries for frame buffer devices
-===============================================================================
-Changes for patch v130
-
-Work sponsored by SGI
-
-- Added major and minor number to devfsd protocol
-
-- Incremented devfsd protocol revision to 5
-
-- Removed compatibility entries for SoundBlaster CD-ROMs
-
-- Removed compatibility entries for netlink devices
-
-- Removed compatibility entries for SCSI generic devices
-
-- Removed compatibility entries for SCSI tape devices
-===============================================================================
-Changes for patch v131
-
-Work sponsored by SGI
-
-- Support info pointer for all devfs entry types
-
-- Added <<info>> parameter to <devfs_mk_dir> and <devfs_mk_symlink>
-
-- Removed /dev/st hierarchy
-
-- Removed /dev/sg hierarchy
-
-- Removed compatibility entries for loop devices
-
-- Removed compatibility entries for IDE tape devices
-
-- Removed compatibility entries for SCSI CD-ROMs
-
-- Removed /dev/sr hierarchy
-===============================================================================
-Changes for patch v132
-
-Work sponsored by SGI
-
-- Removed compatibility entries for floppy devices
-
-- Removed compatibility entries for RAMDISCs
-
-- Removed compatibility entries for meta-devices
-
-- Removed compatibility entries for SCSI discs
-
-- Created <devfs_make_root>
-
-- Removed /dev/sd hierarchy
-
-- Support "../" when searching devfs namespace
-
-- Created /dev/ide/host* hierarchy
-
-- Supported IDE hard discs in /dev/ide/host* hierarchy
-
-- Removed compatibility entries for IDE discs
-
-- Removed /dev/ide/hd hierarchy
-
-- Supported IDE CD-ROMs in /dev/ide/host* hierarchy
-
-- Removed compatibility entries for IDE CD-ROMs
-
-- Removed /dev/ide/cd hierarchy
-===============================================================================
-Changes for patch v133
-
-Work sponsored by SGI
-
-- Created <devfs_get_unregister_slave>
-
-- Fixed bug in fs/partitions/check.c when rescanning
-===============================================================================
-Changes for patch v134
-
-Work sponsored by SGI
-
-- Removed /dev/sd, /dev/sr, /dev/st and /dev/sg directories
-
-- Removed /dev/ide/hd directory
-
-- Exported <devfs_get_parent>
-
-- Created <devfs_register_tape> and /dev/tapes hierarchy
-
-- Removed /dev/ide/mt hierarchy
-
-- Removed /dev/ide/fd hierarchy
-
-- Ported to kernel 2.3.25
-===============================================================================
-Changes for patch v135
-
-Work sponsored by SGI
-
-- Removed compatibility entries for virtual console capture devices
-
-- Removed unused <devfs_set_symlink_destination>
-
-- Removed compatibility entries for serial devices
-
-- Removed compatibility entries for console devices
-
-- Do not hide entries from devfsd or children
-
-- Removed DEVFS_FL_TTY_COMPAT flag
-
-- Removed "nottycompat" boot option
-
-- Removed <devfs_mk_compat>
-===============================================================================
-Changes for patch v136
-
-Work sponsored by SGI
-
-- Moved BSD pty devices to /dev/pty
-
-- Added DEVFS_FL_WAIT flag
-===============================================================================
-Changes for patch v137
-
-Work sponsored by SGI
-
-- Really fixed bug in fs/partitions/check.c when rescanning
-
-- Support new "disc" naming scheme in <get_removable_partition>
-
-- Allow NULL fops in <devfs_register>
-
-- Removed redundant name functions in SCSI disc and IDE drivers
-===============================================================================
-Changes for patch v138
-
-Work sponsored by SGI
-
-- Fixed old bugs in drivers/block/paride/pt.c, drivers/char/tpqic02.c,
- drivers/net/wan/cosa.c and drivers/scsi/scsi.c
- Thanks to Sergey Kubushin <ksi@ksi-linux.com>
-
-- Fall back to major table if NULL fops given to <devfs_register>
-===============================================================================
-Changes for patch v139
-
-Work sponsored by SGI
-
-- Corrected and moved <get_blkfops> and <get_chrfops> declarations
- from arch/alpha/kernel/osf_sys.c to include/linux/fs.h
-
-- Removed name function from struct gendisk
-
-- Updated devfs FAQ
-===============================================================================
-Changes for patch v140
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.27
-===============================================================================
-Changes for patch v141
-
-Work sponsored by SGI
-
-- Bug fix in arch/m68k/atari/joystick.c
-
-- Moved ISDN and capi devices to /dev/isdn
-===============================================================================
-Changes for patch v142
-
-Work sponsored by SGI
-
-- Bug fix in drivers/block/ide-probe.c (patch confusion)
-===============================================================================
-Changes for patch v143
-
-Work sponsored by SGI
-
-- Bug fix in drivers/block/blkpg.c:partition_name()
-===============================================================================
-Changes for patch v144
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.29
-
-- Removed calls to <devfs_register> from cdu31a, cm206, mcd and mcdx
- CD-ROM drivers: generic driver handles this now
-
-- Moved joystick devices to /dev/joysticks
-===============================================================================
-Changes for patch v145
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.30-pre3
-
-- Register whole-disc entry even for invalid partition tables
-
-- Fixed bug in mounting root FS when initrd enabled
-
-- Fixed device entry leak with IDE CD-ROMs
-
-- Fixed compile problem with drivers/isdn/isdn_common.c
-
-- Moved COSA devices to /dev/cosa
-
-- Support fifos when unregistering
-
-- Created <devfs_register_series> and used in many drivers
-
-- Moved Coda devices to /dev/coda
-
-- Moved parallel port IDE tapes to /dev/pt
-
-- Moved parallel port IDE generic devices to /dev/pg
-===============================================================================
-Changes for patch v146
-
-Work sponsored by SGI
-
-- Removed obsolete DEVFS_FL_COMPAT and DEVFS_FL_TOLERANT flags
-
-- Fixed compile problem with fs/coda/psdev.c
-
-- Reinstate change to <devfs_register_blkdev> in
- drivers/block/ide-probe.c now that fs/isofs/inode.c is fixed
-
-- Switched to <devfs_register_blkdev> in drivers/block/floppy.c,
- drivers/scsi/sr.c and drivers/block/md.c
-
-- Moved DAC960 devices to /dev/dac960
-===============================================================================
-Changes for patch v147
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.32-pre4
-===============================================================================
-Changes for patch v148
-
-Work sponsored by SGI
-
-- Removed kmod support: use devfsd instead
-
-- Moved miscellaneous character devices to /dev/misc
-===============================================================================
-Changes for patch v149
-
-Work sponsored by SGI
-
-- Ensure include/linux/joystick.h is OK for user-space
-
-- Improved debugging in <get_vfs_inode>
-
-- Ensure dentries created by devfsd will be cleaned up
-===============================================================================
-Changes for patch v150
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.34
-===============================================================================
-Changes for patch v151
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.35-pre1
-
-- Created <devfs_get_name>
-===============================================================================
-Changes for patch v152
-
-Work sponsored by SGI
-
-- Updated sample modules.conf
-
-- Ported to kernel 2.3.36-pre1
-===============================================================================
-Changes for patch v153
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.42
-
-- Removed <devfs_fill_file>
-===============================================================================
-Changes for patch v154
-
-Work sponsored by SGI
-
-- Took account of device number changes for /dev/fb*
-===============================================================================
-Changes for patch v155
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.43-pre8
-
-- Moved /dev/tty0 to /dev/vc/0
-
-- Moved sequence number formatting from <_tty_make_name> to drivers
-===============================================================================
-Changes for patch v156
-
-Work sponsored by SGI
-
-- Fixed breakage in drivers/scsi/sd.c due to recent SCSI changes
-===============================================================================
-Changes for patch v157
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.45
-===============================================================================
-Changes for patch v158
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.46-pre2
-===============================================================================
-Changes for patch v159
-
-Work sponsored by SGI
-
-- Fixed drivers/block/md.c
- Thanks to Mike Galbraith <mikeg@weiden.de>
-
-- Documentation fixes
-
-- Moved device registration from <lp_init> to <lp_register>
- Thanks to Tim Waugh <twaugh@redhat.com>
-===============================================================================
-Changes for patch v160
-
-Work sponsored by SGI
-
-- Fixed drivers/char/joystick/joystick.c
- Thanks to Vojtech Pavlik <vojtech@suse.cz>
-
-- Documentation updates
-
-- Fixed arch/i386/kernel/mtrr.c if procfs and devfs not enabled
-
-- Fixed drivers/char/stallion.c
-===============================================================================
-Changes for patch v161
-
-Work sponsored by SGI
-
-- Remove /dev/ide when ide-mod is unloaded
-
-- Fixed bug in drivers/block/ide-probe.c when secondary but no primary
-
-- Added DEVFS_FL_NO_PERSISTENCE flag
-
-- Used new DEVFS_FL_NO_PERSISTENCE flag for Unix98 pty slaves
-
-- Removed unnecessary call to <update_devfs_inode_from_entry> in
- <devfs_readdir>
-
-- Only set auto-ownership for /dev/pty/s*
-===============================================================================
-Changes for patch v162
-
-Work sponsored by SGI
-
-- Set inode->i_size to correct size for symlinks
- Thanks to Jeremy Fitzhardinge <jeremy@goop.org>
-
-- Only give lookup() method to directories to comply with new VFS
- assumptions
-
-- Remove unnecessary tests in symlink methods
-
-- Don't kill existing block ops in <devfs_read_inode>
-
-- Restore auto-ownership for /dev/pty/m*
-===============================================================================
-Changes for patch v163
-
-Work sponsored by SGI
-
-- Don't create missing directories in <devfs_find_handle>
-
-- Removed Documentation/filesystems/devfs/mk-devlinks
-
-- Updated Documentation/filesystems/devfs/README
-===============================================================================
-Changes for patch v164
-
-Work sponsored by SGI
-
-- Fixed CONFIG_DEVFS breakage in drivers/char/serial.c introduced in
- linux-2.3.99-pre6-7
-===============================================================================
-Changes for patch v165
-
-Work sponsored by SGI
-
-- Ported to kernel 2.3.99-pre6
-===============================================================================
-Changes for patch v166
-
-Work sponsored by SGI
-
-- Added CONFIG_DEVFS_MOUNT
-===============================================================================
-Changes for patch v167
-
-Work sponsored by SGI
-
-- Updated Documentation/filesystems/devfs/README
-
-- Updated sample modules.conf
-===============================================================================
-Changes for patch v168
-
-Work sponsored by SGI
-
-- Disabled multi-mount capability (use VFS bindings instead)
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v169
-
-Work sponsored by SGI
-
-- Removed multi-mount code
-
-- Removed compatibility macros: VFS has changed too much
-===============================================================================
-Changes for patch v170
-
-Work sponsored by SGI
-
-- Updated README from master HTML file
-
-- Merged devfs inode into devfs entry
-===============================================================================
-Changes for patch v171
-
-Work sponsored by SGI
-
-- Updated sample modules.conf
-
-- Removed dead code in <devfs_register> which used to call
- <free_dentries>
-
-- Ported to kernel 2.4.0-test2-pre3
-===============================================================================
-Changes for patch v172
-
-Work sponsored by SGI
-
-- Changed interface to <devfs_register>
-
-- Changed interface to <devfs_register_series>
-===============================================================================
-Changes for patch v173
-
-Work sponsored by SGI
-
-- Simplified interface to <devfs_mk_symlink>
-
-- Simplified interface to <devfs_mk_dir>
-
-- Simplified interface to <devfs_find_handle>
-===============================================================================
-Changes for patch v174
-
-Work sponsored by SGI
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v175
-
-Work sponsored by SGI
-
-- DocBook update for fs/devfs/base.c
- Thanks to Tim Waugh <twaugh@redhat.com>
-
-- Removed stale fs/tunnel.c (was never used or completed)
-===============================================================================
-Changes for patch v176
-
-Work sponsored by SGI
-
-- Updated ToDo list
-
-- Removed sample modules.conf: now distributed with devfsd
-
-- Updated README from master HTML file
-
-- Ported to kernel 2.4.0-test3-pre4 (which had devfs-patch-v174)
-===============================================================================
-Changes for patch v177
-
-- Updated README from master HTML file
-
-- Documentation cleanups
-
-- Ensure <devfs_generate_path> terminates string for root entry
- Thanks to Tim Jansen <tim@tjansen.de>
-
-- Exported <devfs_get_name> to modules
-
-- Make <devfs_mk_symlink> send events to devfsd
-
-- Cleaned up option processing in <devfs_setup>
-
-- Fixed bugs in handling symlinks: could leak or cause Oops
-
-- Cleaned up directory handling by separating fops
- Thanks to Alexander Viro <viro@parcelfarce.linux.theplanet.co.uk>
-===============================================================================
-Changes for patch v178
-
-- Fixed handling of inverted options in <devfs_setup>
-===============================================================================
-Changes for patch v179
-
-- Adjusted <try_modload> to account for <devfs_generate_path> fix
-===============================================================================
-Changes for patch v180
-
-- Fixed !CONFIG_DEVFS_FS stub declaration of <devfs_get_info>
-===============================================================================
-Changes for patch v181
-
-- Answered question posed by Al Viro and removed his comments from <devfs_open>
-
-- Moved setting of registered flag after other fields are changed
-
-- Fixed race between <devfsd_close> and <devfsd_notify_one>
-
-- Global VFS changes added bogus BKL to devfsd_close(): removed
-
-- Widened locking in <devfs_readlink> and <devfs_follow_link>
-
-- Replaced <devfsd_read> stack usage with <devfsd_ioctl> kmalloc
-
-- Simplified locking in <devfsd_ioctl> and fixed memory leak
-===============================================================================
-Changes for patch v182
-
-- Created <devfs_*alloc_major> and <devfs_*alloc_devnum>
-
-- Removed broken devnum allocation and use <devfs_alloc_devnum>
-
-- Fixed old devnum leak by calling new <devfs_dealloc_devnum>
-
-- Created <devfs_*alloc_unique_number>
-
-- Fixed number leak for /dev/cdroms/cdrom%d
-
-- Fixed number leak for /dev/discs/disc%d
-===============================================================================
-Changes for patch v183
-
-- Fixed bug in <devfs_setup> which could hang boot process
-===============================================================================
-Changes for patch v184
-
-- Documentation typo fix for fs/devfs/util.c
-
-- Fixed drivers/char/stallion.c for devfs
-
-- Added DEVFSD_NOTIFY_DELETE event
-
-- Updated README from master HTML file
-
-- Removed #include <asm/segment.h> from fs/devfs/base.c
-===============================================================================
-Changes for patch v185
-
-- Made <block_semaphore> and <char_semaphore> in fs/devfs/util.c
- private
-
-- Fixed inode table races by removing it and using inode->u.generic_ip
- instead
-
-- Moved <devfs_read_inode> into <get_vfs_inode>
-
-- Moved <devfs_write_inode> into <devfs_notify_change>
-===============================================================================
-Changes for patch v186
-
-- Fixed race in <devfs_do_symlink> for uni-processor
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v187
-
-- Fixed drivers/char/stallion.c for devfs
-
-- Fixed drivers/char/rocket.c for devfs
-
-- Fixed bug in <devfs_alloc_unique_number>: limited to 128 numbers
-===============================================================================
-Changes for patch v188
-
-- Updated major masks in fs/devfs/util.c up to Linus' "no new majors"
- proclamation. Block: were 126 now 122 free, char: were 26 now 19 free
-
-- Updated README from master HTML file
-
-- Removed remnant of multi-mount support in <devfs_mknod>
-
-- Removed unused DEVFS_FL_SHOW_UNREG flag
-===============================================================================
-Changes for patch v189
-
-- Removed nlink field from struct devfs_inode
-
-- Removed auto-ownership for /dev/pty/* (BSD ptys) and used
- DEVFS_FL_CURRENT_OWNER|DEVFS_FL_NO_PERSISTENCE for /dev/pty/s* (just
- like Unix98 pty slaves) and made /dev/pty/m* rw-rw-rw- access
-===============================================================================
-Changes for patch v190
-
-- Updated README from master HTML file
-
-- Replaced BKL with global rwsem to protect symlink data (quick and
- dirty hack)
-===============================================================================
-Changes for patch v191
-
-- Replaced global rwsem for symlink with per-link refcount
-===============================================================================
-Changes for patch v192
-
-- Removed unnecessary #ifdef CONFIG_DEVFS_FS from arch/i386/kernel/mtrr.c
-
-- Ported to kernel 2.4.10-pre11
-
-- Set inode->i_mapping->a_ops for block nodes in <get_vfs_inode>
-===============================================================================
-Changes for patch v193
-
-- Went back to global rwsem for symlinks (refcount scheme no good)
-===============================================================================
-Changes for patch v194
-
-- Fixed overrun in <devfs_link> by removing function (not needed)
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v195
-
-- Fixed buffer underrun in <try_modload>
-
-- Moved down_read() from <search_for_entry_in_dir> to <find_entry>
-===============================================================================
-Changes for patch v196
-
-- Fixed race in <devfsd_ioctl> when setting event mask
- Thanks to Kari Hurtta <hurtta@leija.mh.fmi.fi>
-
-- Avoid deadlock in <devfs_follow_link> by using temporary buffer
-===============================================================================
-Changes for patch v197
-
-- First release of new locking code for devfs core (v1.0)
-
-- Fixed bug in drivers/cdrom/cdrom.c
-===============================================================================
-Changes for patch v198
-
-- Discard temporary buffer, now use "%s" for dentry names
-
-- Don't generate path in <try_modload>: use fake entry instead
-
-- Use "existing" directory in <_devfs_make_parent_for_leaf>
-
-- Use slab cache rather than fixed buffer for devfsd events
-===============================================================================
-Changes for patch v199
-
-- Removed obsolete usage of DEVFS_FL_NO_PERSISTENCE
-
-- Send DEVFSD_NOTIFY_REGISTERED events in <devfs_mk_dir>
-
-- Fixed locking bug in <devfs_d_revalidate_wait> due to typo
-
-- Do not send CREATE, CHANGE, ASYNC_OPEN or DELETE events from devfsd
- or children
-===============================================================================
-Changes for patch v200
-
-- Ported to kernel 2.5.1-pre2
-===============================================================================
-Changes for patch v201
-
-- Fixed bug in <devfsd_read>: was dereferencing freed pointer
-===============================================================================
-Changes for patch v202
-
-- Fixed bug in <devfsd_close>: was dereferencing freed pointer
-
-- Added process group check for devfsd privileges
-===============================================================================
-Changes for patch v203
-
-- Use SLAB_ATOMIC in <devfsd_notify_de> from <devfs_d_delete>
-===============================================================================
-Changes for patch v204
-
-- Removed long obsolete rc.devfs
-
-- Return old entry in <devfs_mk_dir> for 2.4.x kernels
-
-- Updated README from master HTML file
-
-- Increment refcount on module in <check_disc_changed>
-
-- Created <devfs_get_handle> and exported <devfs_put>
-
-- Increment refcount on module in <devfs_get_ops>
-
-- Created <devfs_put_ops> and used where needed to fix races
-
-- Added clarifying comments in response to preliminary EMC code review
-
-- Added poisoning to <devfs_put>
-
-- Improved debugging messages
-
-- Fixed unregister bugs in drivers/md/lvm-fs.c
-===============================================================================
-Changes for patch v205
-
-- Corrected (made useful) debugging message in <unregister>
-
-- Moved <kmem_cache_create> in <mount_devfs_fs> to <init_devfs_fs>
-
-- Fixed drivers/md/lvm-fs.c to create "lvm" entry
-
-- Added magic number to guard against scribbling drivers
-
-- Only return old entry in <devfs_mk_dir> if a directory
-
-- Defined macros for error and debug messages
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v206
-
-- Added support for multiple Compaq cpqarray controllers
-
-- Fixed (rare, old) race in <devfs_lookup>
-===============================================================================
-Changes for patch v207
-
-- Fixed deadlock bug in <devfs_d_revalidate_wait>
-
-- Tag VFS deletable in <devfs_mk_symlink> if handle ignored
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v208
-
-- Added KERN_* to remaining messages
-
-- Cleaned up declaration of <stat_read>
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v209
-
-- Updated README from master HTML file
-
-- Removed silently introduced calls to lock_kernel() and
- unlock_kernel() due to recent VFS locking changes. BKL isn't
- required in devfs
-
-- Changed <devfs_rmdir> to allow later additions if not yet empty
-
-- Added calls to <devfs_register_partitions> in drivers/block/blkpg.c
- <add_partition> and <del_partition>
-
-- Fixed bug in <devfs_alloc_unique_number>: was clearing beyond
- bitfield
-
-- Fixed bitfield data type for <devfs_*alloc_devnum>
-
-- Made major bitfield type and initialiser 64 bit safe
-===============================================================================
-Changes for patch v210
-
-- Updated fs/devfs/util.c to fix shift warning on 64 bit machines
- Thanks to Anton Blanchard <anton@samba.org>
-
-- Updated README from master HTML file
-===============================================================================
-Changes for patch v211
-
-- Do not put miscellaneous character devices in /dev/misc if they
- specify their own directory (i.e. contain a '/' character)
-
-- Copied macro for error messages from fs/devfs/base.c to
- fs/devfs/util.c and made use of this macro
-
-- Removed 2.4.x compatibility code from fs/devfs/base.c
-===============================================================================
-Changes for patch v212
-
-- Added BKL to <devfs_open> because drivers still need it
-===============================================================================
-Changes for patch v213
-
-- Protected <scan_dir_for_removable> and <get_removable_partition>
- from changing directory contents
-===============================================================================
-Changes for patch v214
-
-- Switched to ISO C structure field initialisers
-
-- Switch to set_current_state() and move before add_wait_queue()
-
-- Updated README from master HTML file
-
-- Fixed devfs entry leak in <devfs_readdir> when *readdir fails
-===============================================================================
-Changes for patch v215
-
-- Created <devfs_find_and_unregister>
-
-- Switched many functions from <devfs_find_handle> to
- <devfs_find_and_unregister>
-
-- Switched many functions from <devfs_find_handle> to <devfs_get_handle>
-===============================================================================
-Changes for patch v216
-
-- Switched arch/ia64/sn/io/hcl.c from <devfs_find_handle> to
- <devfs_get_handle>
-
-- Removed deprecated <devfs_find_handle>
-===============================================================================
-Changes for patch v217
-
-- Exported <devfs_find_and_unregister> and <devfs_only> to modules
-
-- Updated README from master HTML file
-
-- Fixed module unload race in <devfs_open>
-===============================================================================
-Changes for patch v218
-
-- Removed DEVFS_FL_AUTO_OWNER flag
-
-- Switched lingering structure field initialiser to ISO C
-
-- Added locking when setting/clearing flags
-
-- Documentation fix in fs/devfs/util.c
diff --git a/Documentation/filesystems/devfs/README b/Documentation/filesystems/devfs/README
deleted file mode 100644
index aabfba24bc2e..000000000000
--- a/Documentation/filesystems/devfs/README
+++ /dev/null
@@ -1,1959 +0,0 @@
-Devfs (Device File System) FAQ
-
-
-Linux Devfs (Device File System) FAQ
-Richard Gooch
-20-AUG-2002
-
-
-Document languages:
-
-
-
-
-
-
-
------------------------------------------------------------------------------
-
-NOTE: the master copy of this document is available online at:
-
-http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
-and looks much better than the text version distributed with the
-kernel sources. A mirror site is available at:
-
-http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html
-
-There is also an optional daemon that may be used with devfs. You can
-find out more about it at:
-
-http://www.atnf.csiro.au/~rgooch/linux/
-
-A mailing list is available which you may subscribe to. Send email
-to majordomo@oss.sgi.com with the following line in the body of the
-message:
-  subscribe devfs
-To unsubscribe, send this message body instead:
-  unsubscribe devfs
-
-The list is archived at
-
-http://oss.sgi.com/projects/devfs/archive/.
-
------------------------------------------------------------------------------
-
-Contents
-
-
-What is it?
-
-Why do it?
-
-Who else does it?
-
-How it works
-
-Operational issues (essential reading)
-
-Instructions for the impatient
-Permissions persistence across reboots
-Dealing with drivers without devfs support
-All the way with Devfs
-Other Issues
-Kernel Naming Scheme
-Devfsd Naming Scheme
-Old Compatibility Names
-SCSI Host Probing Issues
-
-
-
-Device drivers currently ported
-
-Allocation of Device Numbers
-
-Questions and Answers
-
-Making things work
-Alternatives to devfs
-What I don't like about devfs
-How to report bugs
-Strange kernel messages
-Compilation problems with devfsd
-
-
-Other resources
-
-Translations of this document
-
-
------------------------------------------------------------------------------
-
-
-What is it?
-
-Devfs is an alternative to "real" character and block special devices
-on your root filesystem. Kernel device drivers can register devices by
-name rather than major and minor numbers. These devices will appear in
-devfs automatically, with whatever default ownership and
-protection the driver specified. A daemon (devfsd) can be used to
-override these defaults. Devfs has been in the kernel since 2.3.46.
-
-NOTE that devfs is entirely optional. If you prefer the old
-disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the
-default). In this case, nothing will change. ALSO NOTE that if you do
-enable devfs, the defaults are such that full compatibility is
-maintained with the old device names.
-
-There are two aspects to devfs: one is the underlying device
-namespace, which is a namespace just like any mounted filesystem. The
-other aspect is the filesystem code which provides a view of the
-device namespace. The reason I make a distinction is because devfs
-can be mounted many times, with each mount showing the same device
-namespace. Changes made are global to all mounted devfs filesystems.
-Also, because the devfs namespace exists without any devfs mounts, you
-can easily mount the root filesystem by referring to an entry in the
-devfs namespace.
-
-
-The cost of devfs is a small increase in kernel code size and memory
-usage. About 7 pages of code (some of that in __init sections) and 72
-bytes for each entry in the namespace. A modest system has only a
-couple of hundred device entries, so this costs a few more
-pages. Compare this with the suggestion to put /dev on a
-ramdisc.
-
-On a typical machine, the cost is under 0.2 percent. On a modest
-system with 64 MBytes of RAM, the cost is under 0.1 percent. The
-accusations of "bloatware" levelled at devfs are not justified.
-
------------------------------------------------------------------------------
-
-
-Why do it?
-
-There are several problems that devfs addresses. Some of these
-problems are more serious than others (depending on your point of
-view), and some can be solved without devfs. However, the totality of
-these problems really calls out for devfs.
-
-The choice is between a patchwork of inefficient user-space solutions,
-which are complex and likely to be fragile, and a simple and efficient
-devfs which is robust.
-
-There have been many counter-proposals to devfs, all seeking to
-provide some of the benefits without actually implementing devfs. So
-far there has been an absence of code and no proposed alternative has
-been able to provide all the features that devfs does. Further,
-alternative proposals require far more complexity in user-space (and
-still deliver less functionality than devfs). Some people have the
-mantra of reducing "kernel bloat", but don't consider the effects on
-user-space.
-
-A good solution limits the total complexity of kernel-space and
-user-space.
-
-
-Major&minor allocation
-
-The existing scheme requires the allocation of major and minor device
-numbers for each and every device. This means that a central
-co-ordinating authority is required to issue these device numbers
-(unless you're developing a "private" device driver), in order to
-preserve uniqueness. Devfs shifts the burden to a namespace. This may
-not seem like a huge benefit, but actually it is. Since driver authors
-will naturally choose a device name which reflects the functionality
-of the device, there is far less potential for namespace conflict.
-Solving this requires a kernel change.
-
-/dev management
-
-Because you currently access devices through device nodes, these must
-be created by the system administrator. For standard devices you can
-usually find a MAKEDEV programme which creates all (hundreds!) of
-these nodes. This means that changes in the kernel must be reflected by
-changes in the MAKEDEV programme, or else the system administrator
-creates device nodes by hand.
-
-The basic problem is that there are two separate databases of
-major and minor numbers. One is in the kernel and one is in /dev (or
-in a MAKEDEV programme, if you want to look at it that way). This is
-duplication of information, which is not good practice.
-Solving this requires a kernel change.
-
-/dev growth
-
-A typical /dev has over 1200 nodes! Most of these devices simply don't
-exist because the hardware is not available. A huge /dev increases the
-time to access devices (I'm just referring to the dentry lookup times
-and the time taken to read inodes off disc: the next subsection shows
-some more horrors).
-
-An example of how big /dev can grow is if we consider SCSI devices:
-
-host 6 bits (say up to 64 hosts on a really big machine)
-channel 4 bits (say up to 16 SCSI buses per host)
-id 4 bits
-lun 3 bits
-partition 6 bits
-TOTAL 23 bits
-
-
-This requires 8 Mega (8 * 1024 * 1024) inodes if we want to store all
-possible device nodes. Even if we scrap everything but id and partition
-and assume a single host adapter with a single SCSI bus and only one
-logical unit per SCSI target (id), that's still 10 bits or 1024
-inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so
-that's 256 kBytes of inode storage on disc (assuming real inodes take
-a similar amount of space as VFS inodes). This is actually not so bad,
-because disc is cheap these days. Embedded systems would care about
-256 kBytes of /dev inodes, but you could argue that embedded systems
-would have hand-tuned /dev directories. I've had to do just that on my
-embedded systems, but I would rather just leave it to devfs.
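The node counts in the SCSI example follow directly from the bit widths in the table. A minimal sketch of that arithmetic is shown here; the dictionary and function names are made up for illustration, and the 256-byte inode size is the figure quoted above for kernel 2.1.78.

```python
# Bit widths from the SCSI addressing table in the text.
SCSI_FIELD_BITS = {"host": 6, "channel": 4, "id": 4, "lun": 3, "partition": 6}
VFS_INODE_BYTES = 256  # approximate VFS inode size quoted for kernel 2.1.78


def node_count(fields) -> int:
    """Number of distinct device nodes addressable by the given fields."""
    return 1 << sum(SCSI_FIELD_BITS[name] for name in fields)


def inode_storage_bytes(fields) -> int:
    """Inode storage needed to keep one node per addressable device."""
    return node_count(fields) * VFS_INODE_BYTES


if __name__ == "__main__":
    print(node_count(SCSI_FIELD_BITS))                  # 8388608 (2**23)
    print(node_count(["id", "partition"]))              # 1024 (2**10)
    print(inode_storage_bytes(["id", "partition"]))     # 262144 (256 kBytes)
```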
-
-Another issue is the time taken to look up an inode when first
-referenced. Not only does this take time in scanning through a list in
-memory, but also the seek times to read the inodes off disc.
-This could be solved in user-space using a clever programme which
-scanned the kernel logs and deleted /dev entries which are not
-available and created them when they were available. This programme
-would need to be run every time a new module was loaded, which would
-slow things down a lot.
-
-There is an existing programme called scsidev which will automatically
-create device nodes for SCSI devices. It can do this by scanning files
-in /proc/scsi. Unfortunately, to extend this idea to other device
-nodes would require significant modifications to existing drivers (so
-they too would provide information in /proc). This is a non-trivial
-change (I should know: devfs has had to do something similar). Once
-you go to this much effort, you may as well use devfs itself (which
-also provides this information). Furthermore, such a system would
-likely be implemented in an ad-hoc fashion, as different drivers will
-provide their information in different ways.
-
-Devfs is much cleaner, because it (naturally) has a uniform mechanism
-to provide this information: the device nodes themselves!
-
-
-Node to driver file_operations translation
-
-There is an important difference between the way disc-based character
-and block nodes and devfs entries make the connection between an entry
-in /dev and the actual device driver.
-
-With the current 8 bit major and minor numbers the connection between
-disc-based c&b nodes and per-major drivers is done through a
-fixed-length table of 128 entries. The various filesystem types set
-the inode operations for c&b nodes to {chr,blk}dev_inode_operations,
-so when a device is opened a few quick levels of indirection bring us
-to the driver file_operations.
-
-For miscellaneous character devices a second step is required: there
-is a scan for the driver entry with the same minor number as the file
-that was opened, and the appropriate minor open method is called. This
-scanning is done *every time* you open a device node. Potentially, you
-may be searching through dozens of misc. entries before you find your
-open method. While not an enormous performance overhead, this does
-seem pointless.
-
-Linux *must* move beyond the 8 bit major and minor barrier,
-somehow. If we simply increase each to 16 bits, then the indexing
-scheme used for major driver lookup becomes untenable, because the
-major tables (one each for character and block devices) would need to
-be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit
-systems). So we would have to use a scheme like that used for
-miscellaneous character devices, which means the search time goes up
-linearly with the average number of major device drivers on your
-system. Not all "devices" are hardware, some are higher-level drivers
-like KGI, so you can get more "devices" without adding hardware.
-You can improve this by creating an ordered (balanced:-)
-binary tree, in which case your search time becomes log(N).
-Alternatively, you can use hashing to speed up the search.
-But why do that search at all if you don't have to? Once again, it
-seems pointless.
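The table sizes and search costs discussed above can be reproduced with a short calculation. This sketch assumes each table slot holds a single pointer (4 bytes on x86, 8 bytes on 64 bit systems), which is what the quoted figures imply; the function names are hypothetical.

```python
MAJOR_BITS = 16   # the proposed widened major number
TABLES = 2        # one table each for character and block devices


def major_table_bytes(pointer_bytes: int) -> int:
    """Total size of directly indexed major tables, one pointer per slot."""
    return TABLES * (1 << MAJOR_BITS) * pointer_bytes


def linear_probes(drivers: int) -> int:
    """Worst-case comparisons when scanning an unordered driver list."""
    return drivers


def tree_probes(drivers: int) -> int:
    """Worst-case comparisons in a balanced binary tree of that many drivers."""
    return drivers.bit_length()


if __name__ == "__main__":
    print(major_table_bytes(4))                    # 524288 (512 kBytes)
    print(major_table_bytes(8))                    # 1048576 (1 MByte)
    print(linear_probes(64), tree_probes(64))      # 64 7
```

The tree reduces the search to roughly log(N) steps, but as the text argues, a name-based lookup that is resolved once and then cached avoids the search altogether.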
-
-Note that devfs doesn't use the major&minor system. For devfs
-entries, the connection is done when you look up the /dev entry. When
-devfs_register() is called, an entry holding the entry name and the
-file_operations is appended to an internal table. If the dentry cache doesn't
-have the /dev entry already, this internal table is scanned to get the
-file_operations, and an inode is created. If the dentry cache already
-has the entry, there is *no lookup time* (other than the dentry scan
-itself, but we can't avoid that anyway, and besides Linux dentries
-cream other OS's which don't have them:-). Furthermore, the number of
-node entries in a devfs is only the number of available device
-entries, not the number of *conceivable* entries. Even if you remove
-unnecessary entries in a disc-based /dev, the number of conceivable
-entries remains the same: you just limit yourself in order to save
-space.
-
-Devfs provides a fast connection between a VFS node and the device
-driver, in a scalable way.
-
-/dev as a system administration tool
-
-Right now /dev contains a list of conceivable devices, most of which I
-don't have. Devfs only shows those devices available on my
-system. This means that listing /dev is a handy way of checking what
-devices are available.
-
-Major&minor size
-
-Existing major and minor numbers are limited to 8 bits each. This is
-now a limiting factor for some drivers, particularly the SCSI disc
-driver, which consumes a single major number. Only 16 discs are
-supported, and each disc may have only 15 partitions. Maybe this isn't
-a problem for you, but some of us are building huge Linux systems with
-disc arrays. With devfs an arbitrary pointer can be associated with
-each device entry, which can be used to give an effective 32 bit
-device identifier (i.e. that's like having a 32 bit minor
-number). Since this is private to the kernel, there are no C library
-compatibility issues which you would have with increasing major and
-minor number sizes. See the section on "Allocation of Device Numbers"
-for details on maintaining compatibility with userspace.
-
-Solving this requires a kernel change.
-
-Since writing this, the kernel has been modified so that the SCSI disc
-driver has more major numbers allocated to it and now supports up to
-128 discs. Since these major numbers are non-contiguous (a result of
-unplanned expansion), the implementation is a little more cumbersome
-than originally.
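The limits of 16 discs and 15 partitions fall out of splitting an 8 bit minor number into two 4 bit halves. The following is an illustrative sketch of that split, not the driver's actual code; the function names are hypothetical.

```python
MINOR_BITS = 8       # existing minor numbers are 8 bits wide
PARTITION_BITS = 4   # low 4 bits: 0 is the whole disc, 1-15 are partitions


def decode_sd_minor(minor: int):
    """Split a minor number into (disc index, partition number)."""
    if not 0 <= minor < (1 << MINOR_BITS):
        raise ValueError("minor number must fit in 8 bits")
    return minor >> PARTITION_BITS, minor & ((1 << PARTITION_BITS) - 1)


def majors_needed(discs: int) -> int:
    """Major numbers required to address the given number of discs."""
    discs_per_major = 1 << (MINOR_BITS - PARTITION_BITS)   # 16
    return -(-discs // discs_per_major)                    # ceiling division


if __name__ == "__main__":
    print(decode_sd_minor(17))    # (1, 1): second disc, first partition
    print(decode_sd_minor(255))   # (15, 15): last disc, last partition
    print(majors_needed(128))     # 8
```

With this layout, supporting 128 discs takes 8 major numbers, which is why the later expansion needed several additional majors.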
-
-Just like the changes to IPv4 to fix impending limitations in the
-address space, people find ways around the limitations. In the long
-run, however, solutions like IPv6 or devfs can't be put off forever.
-
-Read-only root filesystem
-
-Having your device nodes on the root filesystem means that you can't
-operate properly with a read-only root filesystem. This is because you
-want to change ownerships and protections of tty devices. Existing
-practice prevents you using a CD-ROM as your root filesystem for a
-*real* system. Sure, you can boot off a CD-ROM, but you can't change
-tty ownerships, so it's only good for installing.
-
-Also, you can't use a shared NFS root filesystem for a cluster of
-discless Linux machines (having tty ownerships changed on a common
-/dev is not good). Nor can you embed your root filesystem in a
-ROM-FS.
-
-You can get around this by creating a RAMDISC at boot time, making
-an ext2 filesystem in it, mounting it somewhere and copying the
-contents of /dev into it, then unmounting it and mounting it over
-/dev.
-
-A devfs is a cleaner way of solving this.
-
-Non-Unix root filesystem
-
-Non-Unix filesystems (such as NTFS) can't be used for a root
-filesystem because they variously don't support character and block
-special files or symbolic links. You can't have a separate disc-based
-or RAMDISC-based filesystem mounted on /dev because you need device
-nodes before you can mount these. Devfs can be mounted without any
-device nodes. Devlinks won't work because symlinks aren't supported.
-An alternative solution is to use initrd to mount a RAMDISC initial
-root filesystem (which is populated with a minimal set of device
-nodes), and then construct a new /dev in another RAMDISC, and finally
-switch to your non-Unix root filesystem. This requires clever boot
-scripts and a fragile and conceptually complex boot procedure.
-
-Devfs solves this in a robust and conceptually simple way.
-
-PTY security
-
-Current pseudo-tty (pty) devices are owned by root and read-writable
-by everyone. The user of a pty-pair cannot change
-ownership/protections without being suid-root.
-
-This could be solved with a secure user-space daemon which runs as
-root and does the actual creation of pty-pairs. Such a daemon would
-require modification to *every* programme that wants to use this new
-mechanism. It also slows down creation of pty-pairs.
-
-An alternative is to create a new open_pty() syscall which does much
-the same thing as the user-space daemon. Once again, this requires
-modifications to pty-handling programmes.
-
-The devfs solution allows a device driver to "tag" certain device
-files so that when an unopened device is opened, the ownerships are
-changed to the current euid and egid of the opening process, and the
-protections are changed to the default registered by the driver. When
-the device is closed ownership is set back to root and protections are
-set back to read-write for everybody. No programme need be changed.
-The devpts filesystem provides this auto-ownership feature for Unix98
-ptys. It doesn't support old-style pty devices, nor does it have all
-the other features of devfs.
-
-Intelligent device management
-
-Devfs implements a simple yet powerful protocol for communication with
-a device management daemon (devfsd) which runs in user space. It is
-possible to send a message (either synchronously or asynchronously) to
-devfsd on any event, such as registration/unregistration of device
-entries, opening and closing devices, looking up inodes, scanning
-directories and more. This has many possibilities. Some of these are
-already implemented. See:
-
-
-http://www.atnf.csiro.au/~rgooch/linux/
-
-Device entry registration events can be used by devfsd to change
-permissions of newly-created device nodes. This is one mechanism to
-control device permissions.
-
-Device entry registration/unregistration events can be used to run
-programmes or scripts. This can be used to provide automatic mounting
-of filesystems when new block device media is inserted into the
-drive.
-
-Asynchronous device open and close events can be used to implement
-clever permissions management. For example, the default permissions on
-/dev/dsp do not allow everybody to read from the device. This is
-sensible, as you don't want some remote user recording what you say at
-your console. However, the console user is also prevented from
-recording. This behaviour is not desirable. With asynchronous device
-open and close events, you can have devfsd run a programme or script
-when console devices are opened to change the ownerships for *other*
-device nodes (such as /dev/dsp). On closure, you can run a different
-script to restore permissions. An advantage of this scheme over
-modifying the C library tty handling is that this works even if your
-programme crashes (how many times have you seen the utmp database with
-lingering entries for non-existent logins?).
-
-Synchronous device open events can be used to perform intelligent
-device access protections. Before the device driver open() method is
-called, the daemon must first validate the open attempt, by running an
-external programme or script. This is far more flexible than access
-control lists, as access can be determined on the basis of other
-system conditions instead of just the UID and GID.
-
-Inode lookup events can be used to authenticate module autoload
-requests. Instead of using kmod directly, the event is sent to
-devfsd which can implement an arbitrary authentication before loading
-the module itself.
-
-Inode lookup events can also be used to construct arbitrary
-namespaces, without having to resort to populating devfs with symlinks
-to devices that don't exist.
-
-Speculative Device Scanning
-
-Consider an application (like cdparanoia) that wants to find all
-CD-ROM devices on the system (SCSI, IDE and other types), whether or
-not their respective modules are loaded. The application must
-speculatively open certain device nodes (such as /dev/sr0 for the SCSI
-CD-ROMs) in order to make sure the module is loaded. This requires
-that all Linux distributions follow the standard device naming scheme
-(last time I looked RedHat did things differently). Devfs solves the
-naming problem.
-
-The same application also wants to see which devices are actually
-available on the system. With the existing system it needs to read the
-/dev directory and speculatively open each /dev/sr* device to
-determine if the device exists or not. With a large /dev this is an
-inefficient operation, especially if there are many /dev/sr* nodes. A
-solution like scsidev could reduce the number of /dev/sr* entries (but
-of course that also requires all that inefficient directory scanning).
-
-With devfs, the application can open the /dev/sr directory
-(which triggers the module autoloading if required), and proceed to
-read /dev/sr. Since only the available devices will have
-entries, there are no inefficiencies in directory scanning or device
-openings.
-
------------------------------------------------------------------------------
-
-Who else does it?
-
-FreeBSD has a devfs implementation. Solaris and AIX each have a
-pseudo-devfs (something akin to scsidev but for all devices, with some
-unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's
-IRIX 6.4 and above also have a device filesystem.
-
-While we shouldn't just automatically do something because others do
-it, we should not ignore the work of others either. FreeBSD has a lot
-of competent people working on it, so their opinion should not be
-blithely ignored.
-
------------------------------------------------------------------------------
-
-
-How it works
-
-Registering device entries
-
-For every entry (device node) in a devfs-based /dev a driver must call
-devfs_register(). This adds the name of the device entry, the
-file_operations structure pointer and a few other things to an
-internal table. Device entries may be added and removed at any
-time. When a device entry is registered, it automagically appears in
-any mounted devfs'.
-
-Inode lookup
-
-When a lookup operation on an entry is performed and if there is no
-driver information for that entry devfs will attempt to call
-devfsd. If still no driver information can be found then a negative
-dentry is yielded and the next stage operation will be called by the
-VFS (such as create() or mknod() inode methods). If driver information
-can be found, an inode is created (if one does not exist already) and
-all is well.
-
-Manually creating device nodes
-
-The mknod() method allows you to create an ordinary named pipe in the
-devfs, or you can create a character or block special inode if one
-does not already exist. You may wish to create a character or block
-special inode so that you can set permissions and ownership. Later, if
-a device driver registers an entry with the same name, the
-permissions, ownership and times are retained. This is how you can set
-the protections on a device even before the driver is loaded. Once you
-create an inode it appears in the directory listing.
-
-Unregistering device entries
-
-A device driver calls devfs_unregister() to unregister an entry.
-
-Chroot() gaols
-
-2.2.x kernels
-
-The semantics of inode creation are different when devfs is mounted
-with the "explicit" option. Now, when a device entry is registered, it
-will not appear until you use mknod() to create the device. It doesn't
-matter if you mknod() before or after the device is registered with
-devfs_register(). The purpose of this behaviour is to support
-chroot(2) gaols, where you want to mount a minimal devfs inside the
-gaol. Only the devices you specifically want to be available (through
-your mknod() setup) will be accessible.
-
-2.4.x kernels
-
-As of kernel 2.3.99, the VFS has had the ability to rebind parts of
-the global filesystem namespace into another part of the namespace.
-This now works even at the leaf-node level, which means that
-individual files and device nodes may be bound into other parts of the
-namespace. This is like making links, but better, because it works
-across filesystems (unlike hard links) and works through chroot()
-gaols (unlike symbolic links).
-
-Because of these improvements to the VFS, the multi-mount capability
-in devfs is no longer needed. The administrator may create a minimal
-device tree inside a chroot(2) gaol by using VFS bindings. As this
-provides most of the features of the devfs multi-mount capability, I
-removed the multi-mount support code (after issuing an RFC). This
-yielded code size reductions and simplifications.
-
-If you want to construct a minimal chroot() gaol, the following
-command should suffice:
-
-mount --bind /dev/null /gaol/dev/null
-
-
-Repeat for other device nodes you want to expose. Simple!
-
------------------------------------------------------------------------------
-
-
-Operational issues
-
-
-Instructions for the impatient
-
-Nobody likes reading documentation. People just want to get in there
-and play. So this section tells you quickly the steps you need to take
-to run with devfs mounted over /dev. Skip these steps and you will end
-up with a nearly unbootable system. Subsequent sections describe the
-issues in more detail, and discuss non-essential configuration
-options.
-
-Devfsd
-OK, if you're reading this, I assume you want to play with
-devfs. First you should ensure that /usr/src/linux contains a
-recent kernel source tree. Then you need to compile devfsd, the device
-management daemon, available at
-
-http://www.atnf.csiro.au/~rgooch/linux/.
-Because the kernel has a naming scheme
-which is quite different from the old naming scheme, you need to
-install devfsd so that software and configuration files that use the
-old naming scheme will not break.
-
-Compile and install devfsd. You will be provided with a default
-configuration file /etc/devfsd.conf which will provide
-compatibility symlinks for the old naming scheme. Don't change this
-config file unless you know what you're doing. Even if you think you
-do know what you're doing, don't change it until you've followed all
-the steps below and booted a devfs-enabled system and verified that it
-works.
-
-Now edit your main system boot script so that devfsd is started at the
-very beginning (before any filesystem
-checks). /etc/rc.d/rc.sysinit is often the main boot script
-on systems with SysV-style boot scripts. On systems with BSD-style
-boot scripts it is often /etc/rc. Also check
-/sbin/rc.
-
-NOTE that the line you put into the boot
-script should be exactly:
-
-/sbin/devfsd /dev
-
-DO NOT use some special daemon-launching
-programme, otherwise the boot script may not wait for devfsd to finish
-initialising.
-
-System Libraries
-There may still be some problems because of broken software making
-assumptions about device names. In particular, some software does not
-handle devices which are symbolic links. If you are running a libc 5
-based system, install libc 5.4.44 (if you have libc 5.4.46, go back to
-libc 5.4.44, which is actually correct). If you are running a glibc
-based system, make sure you have glibc 2.1.3 or later.
-
-/etc/securetty
-PAM (Pluggable Authentication Modules) is supposed to be a flexible
-mechanism for providing better user authentication and access to
-services. Unfortunately, it's also fragile, complex and undocumented
-(check out RedHat 6.1, and probably other distributions as well). PAM
-has problems with symbolic links. Append the following lines to your
-/etc/securetty file:
-
-vc/1
-vc/2
-vc/3
-vc/4
-vc/5
-vc/6
-vc/7
-vc/8
-
-This will not weaken security. If you have a version of util-linux
-earlier than 2.10.h, please upgrade to 2.10.h or later. If you
-absolutely cannot upgrade, then also append the following lines to
-your /etc/securetty file:
-
-1
-2
-3
-4
-5
-6
-7
-8
-
-This may potentially weaken security by allowing root logins over the
-network (a password is still required, though). However, since there
-are problems with dealing with symlinks, I'm suspicious of the level
-of security offered in any case.
-
-XFree86
-While not essential, it's probably a good idea to upgrade to XFree86
-4.0, as patches went in to make it more devfs-friendly. If you don't,
-you'll probably need to apply the following patch to
-/etc/security/console.perms so that ordinary users can run
-startx. Note that not all distributions have this file (e.g. Debian),
-so if it's not present, don't worry about it.
-
---- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999
-+++ /etc/security/console.perms Fri Feb 25 23:53:55 2000
-@@ -14,7 +14,7 @@
- # man 5 console.perms
-
- # file classes -- these are regular expressions
--<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-+<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-
- # device classes -- these are shell-style globs
- <floppy>=/dev/fd[0-1]*
-
-If the patch does not apply, then replace the line:
-
-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-
-with:
-
-<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
-
-
-Disable devpts
-I've had a report of devpts mounted on /dev/pts not working
-correctly. Since devfs will also manage /dev/pts, there is no
-need to mount devpts as well. You should either edit your
-/etc/fstab so devpts is not mounted, or disable devpts from
-your kernel configuration.
-
-Unsupported drivers
-Not all drivers have devfs support. If you depend on one of these
-drivers, you will need to create a script or tarfile that you can use
-at boot time to create device nodes as appropriate. There is a
-section which describes this. Another
-section lists the drivers which have
-devfs support.
-
-/dev/mouse
-
-Many distributions configure /dev/mouse to be the mouse device
-for XFree86 and GPM. I actually think this is a bad idea, because it
-adds another level of indirection. When looking at a config file, if
-you see /dev/mouse you're left wondering which mouse
-is being referred to. Hence I recommend putting the actual mouse
-device (for example /dev/psaux) into your
-/etc/X11/XF86Config file (and similarly for the GPM
-configuration file).
-
-Alternatively, use the same technique used for unsupported drivers
-described above.
-
-The Kernel
-Finally, you need to make sure devfs is compiled into your kernel. Set
-CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y by
-using your favourite configuration tool (e.g. make config or
-make xconfig), then run make clean and recompile your kernel and
-modules. At boot, devfs will be mounted onto /dev.
-
-If you encounter problems booting (for example if you forgot a
-configuration step), you can pass devfs=nomount at the kernel
-boot command line. This will prevent the kernel from mounting devfs at
-boot time onto /dev.
-
-In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting
-devfs onto /dev is completely safe, and requires no
-configuration changes. One exception to take note of is when
-LABEL= directives are used in /etc/fstab. In this
-case you will be unable to boot properly. This is because the
-mount(8) programme uses /proc/partitions as part of
-the volume label search process, and the device names it finds are not
-available, because setting CONFIG_DEVFS_FS=y changes the names in
-/proc/partitions, irrespective of whether devfs is mounted.
-
-Now you've finished all the steps required. You're now ready to boot
-your shiny new kernel. Enjoy.
-
-Changing the configuration
-
-OK, you've now booted a devfs-enabled system, and everything works.
-Now you may feel like changing the configuration (common targets are
-/etc/fstab and /etc/devfsd.conf). Since you have a
-system that works, if you make any changes and it doesn't work, you
-now know that you only have to restore your configuration files to the
-default and it will work again.
-
-
-Permissions persistence across reboots
-
-If you don't use mknod(2) to create a device file, nor use chmod(2) or
-chown(2) to change the ownerships/permissions, the inode ctime will
-remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime
-later than this has had its ownership/permissions changed. Hence, a
-simple script or programme may be used to tar up all changed inodes,
-prior to shutdown. Although effective, many consider this approach a
-kludge.
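The "simple script" mentioned above could look something like the following. This is a minimal sketch that only lists the inodes whose ctime is past the epoch, leaving the archiving step to tar or a similar tool; the function name is hypothetical and the default path of /dev is an assumption.

```python
import os
import sys


def changed_nodes(root):
    """Return the paths under root whose inode ctime is later than the epoch."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat() so that symlinks are examined, not their targets
            if os.lstat(path).st_ctime > 0:
                changed.append(path)
    return sorted(changed)


if __name__ == "__main__":
    # Print one path per line, suitable for feeding to an archiver.
    root = sys.argv[1] if len(sys.argv) > 1 else "/dev"
    for path in changed_nodes(root):
        print(path)
```

On an ordinary disc-based filesystem every inode has a ctime later than the epoch, so the filter is only meaningful when run against a mounted devfs.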
-
-A much better approach is to use devfsd to save and restore
-permissions. It may be configured to record changes in permissions and
-will save them in a database (in fact a directory tree), and restore
-these upon boot. This is an efficient method and results in immediate
-saving of current permissions (unlike the tar approach, which saves
-permissions at some unspecified future time).
-
-The default configuration file supplied with devfsd has config entries
-which you may uncomment to enable persistence management.
-
-If you decide to use the tar approach anyway, be aware that tar will
-first unlink(2) an inode before creating a new device node. The
-unlink(2) has the effect of breaking the connection between a devfs
-entry and the device driver. If you use the "devfs=only" boot option,
-you lose access to the device driver, requiring you to reload the
-module. I consider this a bug in tar (there is no real need to
-unlink(2) the inode first).
-
-Alternatively, you can use devfsd to provide more sophisticated
-management of device permissions. You can use devfsd to store
-permissions for whole groups of devices with a single configuration
-entry, rather than the conventional single entry per device entry.
-
-Permissions database stored in mounted-over /dev
-
-If you wish to save and restore your device permissions into the
-disc-based /dev while still mounting devfs onto /dev
-you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or
-later), which has the VFS binding facility. You need to do the
-following to set this up:
-
-
-
-make sure the kernel does not mount devfs at boot time
-
-
-make sure you have a correct /dev/console entry in your
-root file-system (where your disc-based /dev lives)
-
-create the /dev-state directory
-
-
-add the following lines near the very beginning of your boot
-scripts:
-
-mount --bind /dev /dev-state
-mount -t devfs none /dev
-devfsd /dev
-
-
-
-
-add the following lines to your /etc/devfsd.conf file:
-
-REGISTER ^pt[sy] IGNORE
-CREATE ^pt[sy] IGNORE
-CHANGE ^pt[sy] IGNORE
-DELETE ^pt[sy] IGNORE
-REGISTER .* COPY /dev-state/$devname $devpath
-CREATE .* COPY $devpath /dev-state/$devname
-CHANGE .* COPY $devpath /dev-state/$devname
-DELETE .* CFUNCTION GLOBAL unlink /dev-state/$devname
-RESTORE /dev-state
-
-Note that the sample devfsd.conf file contains these lines,
-as well as other sample configurations you may find useful. See the
-devfsd distribution.
-
-
-reboot.
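The ^pt[sy] pattern in the devfsd.conf lines above is an ordinary regular expression matched against the device name. As a rough illustration of which names it selects, here is a sketch using Python's re module rather than devfsd's own matcher; the sample device names are assumptions chosen for the example.

```python
import re

# The pattern used by the IGNORE entries in the sample configuration.
PTY_PATTERN = re.compile(r"^pt[sy]")


def is_ignored(devname: str) -> bool:
    """True if the device name would be selected by the ^pt[sy] entries."""
    return PTY_PATTERN.search(devname) is not None


if __name__ == "__main__":
    for name in ("pts/0", "pty/m0", "tts/0", "vc/1"):
        print(name, is_ignored(name))
```

Names beginning with pts or pty are selected, so pseudo-terminal entries are kept out of the /dev-state database while everything else falls through to the COPY entries.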
-
-
-
-
-Permissions database stored in normal directory
-
-If you are using an older kernel which doesn't support VFS binding,
-then you won't be able to have the permissions database in a
-mounted-over /dev. However, you can still use a regular
-directory to store the database. The sample /etc/devfsd.conf
-file above may still be used. You will need to create the
-/dev-state directory prior to installing devfsd. If you have
-old permissions in /dev, then just copy (or move) the device
-nodes over to the new directory.
-
-Which method is better?
-
-The best method is to have the permissions database stored in the
-mounted-over /dev. This is because you will not need to copy
-device nodes over to /dev-state, and because it allows you to
-switch between devfs and non-devfs kernels, without requiring you to
-copy permissions between /dev-state (for devfs) and
-/dev (for non-devfs).
-
-
-Dealing with drivers without devfs support
-
-Currently, not all device drivers in the kernel have been modified to
-use devfs. Device drivers which do not yet have devfs support will not
-automagically appear in devfs. The simplest way to create device nodes
-for these drivers is to unpack a tarfile containing the required
-device nodes. You can do this in your boot scripts. All your drivers
-will now work as before.
-
-Hopefully for most people devfs will have enough support so that they
-can mount devfs directly over /dev without losing most functionality
-(i.e. losing access to various devices). As of 22-JAN-1998 (devfs
-patch version 10) I am now running this way. All the devices I have
-are available in devfs, so I don't lose anything.
-
-WARNING: if your configuration requires the old-style device names
-(e.g. /dev/hda1 or /dev/sda1), you must install devfsd and configure
-it to maintain compatibility entries. It is almost certain that you
-will require this. Note that the kernel creates a compatibility entry
-for the root device, so you don't need initrd.
-
-Note that you no longer need to mount devpts if you use Unix98 PTYs,
-as devfs can manage /dev/pts itself. This saves you some RAM, as you
-don't need to compile and install devpts. Note that some versions of
-glibc have a bug with Unix98 pty handling on devfs systems. Contact
-the glibc maintainers for a fix. Glibc 2.1.3 has the fix.
-
-Note also that apart from editing /etc/fstab, other things will need
-to be changed if you *don't* install devfsd. Some software (like the X
-server) hard-wires device names in its source. It really is much
-easier to install devfsd so that compatibility entries are created.
-You can then slowly migrate your system to using the new device names
-(for example, by starting with /etc/fstab), and then limiting the
-compatibility entries that devfsd creates.
-
-IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD
-BEFORE YOU BOOT A DEVFS-ENABLED KERNEL!
-
-Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of
-reports back. Many of these are because people are trying to run
-without devfsd, and hence some things break. Please just run devfsd if
-things break. I want to concentrate on real bugs rather than
-misconfiguration problems at the moment. If people are willing to fix
-bugs/false assumptions in other code (i.e. glibc, X server) and submit
-that to the respective maintainers, that would be great.
-
-
-All the way with Devfs
-
-The devfs kernel patch creates a rationalised device tree. As stated
-above, if you want to keep using the old /dev naming scheme,
-you just need to configure devfsd appropriately (see the man
-page). People who prefer the old names can ignore this section. For
-those of us who like the rationalised names and an uncluttered
-/dev, read on.
-
-If you don't run devfsd, or don't enable compatibility entry
-management, then you will have to configure your system to use the new
-names. For example, you will then need to edit your
-/etc/fstab to use the new disc naming scheme. If you want to
-be able to boot non-devfs kernels, you will need compatibility
-symlinks in the underlying disc-based /dev pointing back to
-the old-style names for when you boot a kernel without devfs.
-
-You can selectively decide which devices you want compatibility
-entries for. For example, you may only want compatibility entries for
-BSD pseudo-terminal devices (otherwise you'll have to patch your C
-library or use Unix98 ptys instead). It's just a matter of putting
-the correct regular expression into /etc/devfsd.conf.
-
-There are other choices of naming schemes that you may prefer. For
-example, I don't use the kernel-supplied
-names, because they are too verbose. A common misconception is
-that the kernel-supplied names are meant to be used directly in
-configuration files. This is not the case. They are designed to
-reflect the layout of the devices attached and to provide easy
-classification.
-
-If you like the kernel-supplied names, that's fine. If you don't then
-you should be using devfsd to construct a namespace more to your
-liking. Devfsd has built-in code to construct a
-namespace that is both logical and easy to
-manage. In essence, it creates a convenient abbreviation of the
-kernel-supplied namespace.
-
-You are of course free to build your own namespace. Devfsd has all the
-infrastructure required to make this easy for you. All you need do is
-write a script. You can even write some C code and devfsd can load the
-shared object as a callable extension.
-
-
-Other Issues
-
-The init programme
-Another thing to take note of is whether your init programme
-creates a Unix socket /dev/telinit. Some versions of init
-create /dev/telinit so that the telinit programme can
-communicate with the init process. If you have such a system you need
-to make sure that devfs is mounted over /dev *before* init
-starts. In other words, you can't leave the mounting of devfs to
-/etc/rc, since this is executed after init. Other
-versions of init require a named pipe /dev/initctl
-which must exist *before* init starts. Once again, you need to
-mount devfs and then create the named pipe *before* init
-starts.
-
-The default behaviour now is not to mount devfs onto /dev at
-boot time for 2.3.x and later kernels. You can correct this with the
-"devfs=mount" boot option. This solves any problems with init,
-and also prevents the dreaded:
-
-Cannot open initial console
-
-message. For 2.2.x kernels where you need to apply the devfs patch,
-the default is to mount.
-
-If you have automatic mounting of devfs onto /dev then you
-may need to create /dev/initctl in your boot scripts. The
-following lines should suffice:
-
-mknod /dev/initctl p
-kill -SIGUSR1 1 # tell init that /dev/initctl now exists
-
-Alternatively, if you don't want the kernel to mount devfs onto
-/dev then the following procedure is a
-guideline for how to get around /dev/initctl problems:
-
-# cd /sbin
-# mv init init.real
-# cat > init
-#! /bin/sh
-mount -n -t devfs none /dev
-mknod /dev/initctl p
-exec /sbin/init.real "$@"
-[control-D]
-# chmod a+x init
-
-Note that newer versions of init create /dev/initctl
-automatically, so you don't have to worry about this.
-
-Module autoloading
-You will need to configure devfsd to enable module
-autoloading. The following lines should be placed in your
-/etc/devfsd.conf file:
-
-LOOKUP .* MODLOAD
-
-
-As of devfsd-v1.3.10, a generic /etc/modules.devfs
-configuration file is installed, which is used by the MODLOAD
-action. This should be sufficient for most configurations. If you
-require further configuration, edit your /etc/modules.conf
-file. The way module autoloading works with devfs is:
-
-
-a process attempts to lookup a device node (e.g. /dev/fred)
-
-
-if that device node does not exist, the full pathname is passed to
-devfsd as a string
-
-
-devfsd will pass the string to the modprobe programme (provided the
-configuration line shown above is present), and specifies that
-/etc/modules.devfs is the configuration file
-
-
-/etc/modules.devfs includes /etc/modules.conf to
-access local configurations
-
-modprobe will search its configuration files, looking for an alias
-that translates the pathname into a module name
-
-
-the resulting module name is then used to load the module.
-
-
-If you wanted a lookup of /dev/fred to load the
-mymod module, you would require the following configuration
-line in /etc/modules.conf:
-
-alias /dev/fred mymod
-
-The /etc/modules.devfs configuration file provides many such
-aliases for standard device names. If you look closely at this file,
-you will note that some modules require multiple alias configuration
-lines. This is required to support module autoloading for old and new
-device names.
-
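The lookup chain described above can be sketched in a few lines of Python. This is only an illustration of the alias step, not how modprobe is implemented: the helper names `parse_aliases` and `module_for` are hypothetical, and the sketch assumes plain `alias <pathname> <module>` lines with no include handling.

```python
def parse_aliases(conf_text):
    """Collect 'alias <name> <module>' lines from modules.conf-style text."""
    aliases = {}
    for line in conf_text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) == 3 and fields[0] == "alias":
            aliases[fields[1]] = fields[2]
    return aliases


def module_for(pathname, aliases):
    """Return the module name aliased to a device pathname, or None."""
    return aliases.get(pathname)


conf = """
# local configuration (hypothetical example)
alias /dev/fred mymod
"""
print(module_for("/dev/fred", parse_aliases(conf)))
```

A pathname with no matching alias yields None, which corresponds to the case where no module is loaded for the lookup.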
-Mounting root off a devfs device
-If you wish to mount root off a devfs device when you pass the
-"devfs=only" boot option, then you need to pass in the
-"root=<device>" option to the kernel when booting. If you use
-LILO, then you must have this in lilo.conf:
-
-append = "root=<device>"
-
-Surprised? Yep, so was I. It turns out if you have (as most people
-do):
-
-root = <device>
-
-
-then LILO will determine the device number of <device> and will
-write that device number into a special place in the kernel image
-before starting the kernel, and the kernel will use that device number
-to mount the root filesystem. So, using the "append" variety ensures
-that LILO passes the root filesystem device as a string, which devfs
-can then use.
-
-Note that this isn't an issue if you don't pass "devfs=only".
-
-TTY issues
-The ttyname(3) function in some versions of the C library makes
-false assumptions about device entries which are symbolic links. The
-tty(1) programme is one that depends on this function. I've
-written a patch to libc 5.4.43 which fixes this. This has been
-included in libc 5.4.44 and a similar fix is in glibc 2.1.3.
-
-
-Kernel Naming Scheme
-
-The kernel provides a default naming scheme. This scheme is designed
-to make it easy to search for specific devices or device types, and to
-view the available devices. Some device types (such as hard discs),
-have a directory of entries, making it easy to see what devices of
-that class are available. Often, the entries are symbolic links into a
-directory tree that reflects the topology of available devices. The
-topological tree is useful for finding how your devices are arranged.
-
-Below is a list of the naming schemes for the most common drivers. A
-list of reserved device names is
-available for reference. Please send email to
-rgooch@atnf.csiro.au to obtain an allocation. Please be
-patient (the maintainer is busy). An alternative name may be allocated
-instead of the requested name, at the discretion of the maintainer.
-
-Disc Devices
-
-All discs, whether SCSI, IDE or whatever, are placed under the
-/dev/discs hierarchy:
-
- /dev/discs/disc0 first disc
- /dev/discs/disc1 second disc
-
-
-Each of these entries is a symbolic link to the directory for that
-device. The device directory contains:
-
- disc for the whole disc
- part* for individual partitions
-
-
-CD-ROM Devices
-
-All CD-ROMs, whether SCSI, IDE or whatever, are placed under the
-/dev/cdroms hierarchy:
-
- /dev/cdroms/cdrom0 first CD-ROM
- /dev/cdroms/cdrom1 second CD-ROM
-
-
-Each of these entries is a symbolic link to the real device entry for
-that device.
-
-Tape Devices
-
-All tapes, whether SCSI, IDE or whatever, are placed under the
-/dev/tapes hierarchy:
-
- /dev/tapes/tape0 first tape
- /dev/tapes/tape1 second tape
-
-
-Each of these entries is a symbolic link to the directory for that
-device. The device directory contains:
-
- mt for mode 0
- mtl for mode 1
- mtm for mode 2
- mta for mode 3
- mtn for mode 0, no rewind
- mtln for mode 1, no rewind
- mtmn for mode 2, no rewind
- mtan for mode 3, no rewind
-
-
-SCSI Devices
-
-To uniquely identify any SCSI device requires the following
-information:
-
- controller (host adapter)
- bus (SCSI channel)
- target (SCSI ID)
- unit (Logical Unit Number)
-
-
-All SCSI devices are placed under /dev/scsi (assuming devfs
-is mounted on /dev). Hence, a SCSI device with the following
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
- /dev/scsi/host1/bus2/target3/lun4 device directory
-
-
-Inside this directory, a number of device entries may be created,
-depending on which SCSI device-type drivers were installed.
-
-See the section on the disc naming scheme to see what entries the SCSI
-disc driver creates.
-
-See the section on the tape naming scheme to see what entries the SCSI
-tape driver creates.
-
-The SCSI CD-ROM driver creates:
-
- cd
-
-
-The SCSI generic driver creates:
-
- generic
-
-
-IDE Devices
-
-To uniquely identify any IDE device requires the following
-information:
-
- controller
- bus (aka. primary/secondary)
- target (aka. master/slave)
- unit
-
-
-All IDE devices are placed under /dev/ide, and use a similar
-naming scheme to the SCSI subsystem.
-
-XT Hard Discs
-
-All XT discs are placed under /dev/xd. The first XT disc has
-the directory /dev/xd/disc0.
-
-TTY devices
-
-The tty devices now appear as:
-
- New name Old-name Device Type
- -------- -------- -----------
- /dev/tts/{0,1,...} /dev/ttyS{0,1,...} Serial ports
- /dev/cua/{0,1,...} /dev/cua{0,1,...} Call out devices
- /dev/vc/0 /dev/tty Current virtual console
- /dev/vc/{1,2,...} /dev/tty{1...63} Virtual consoles
- /dev/vcc/{0,1,...} /dev/vcs{1...63} Virtual consoles
- /dev/pty/m{0,1,...} /dev/ptyp?? PTY masters
- /dev/pty/s{0,1,...} /dev/ttyp?? PTY slaves
-
-
-RAMDISCS
-
-The RAMDISCS are placed in their own directory, and are named thus:
-
- /dev/rd/{0,1,2,...}
-
-
-Meta Devices
-
-The meta devices are placed in their own directory, and are named
-thus:
-
- /dev/md/{0,1,2,...}
-
-
-Floppy discs
-
-Floppy discs are placed in the /dev/floppy directory.
-
-Loop devices
-
-Loop devices are placed in the /dev/loop directory.
-
-Sound devices
-
-Sound devices are placed in the /dev/sound directory
-(audio, sequencer, ...).
-
-
-Devfsd Naming Scheme
-
-Devfsd provides a naming scheme which is a convenient abbreviation of
-the kernel-supplied namespace. In some
-cases, the kernel-supplied naming scheme is quite convenient, so
-devfsd does not provide another naming scheme. The convenience names
-that devfsd creates are in fact the same names as the original devfs
-kernel patch created (before Linus mandated the Big Name
-Change). These are referred to as "new compatibility entries".
-
-In order to configure devfsd to create these convenience names, the
-following lines should be placed in your /etc/devfsd.conf:
-
-REGISTER .* MKNEWCOMPAT
-UNREGISTER .* RMNEWCOMPAT
-
-This will cause devfsd to create (and destroy) symbolic links which
-point to the kernel-supplied names.
-
-SCSI Hard Discs
-
-All SCSI discs are placed under /dev/sd (assuming devfs is
-mounted on /dev). Hence, a SCSI disc with the following
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
- /dev/sd/c1b2t3u4 for the whole disc
- /dev/sd/c1b2t3u4p5 for the 5th partition
- /dev/sd/c1b2t3u4p5s6 for the 6th slice in the 5th partition
-
-
-SCSI Tapes
-
-All SCSI tapes are placed under /dev/st. A similar naming
-scheme is used as for SCSI discs. A SCSI tape with the
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
- /dev/st/c1b2t3u4m0 for mode 0
- /dev/st/c1b2t3u4m1 for mode 1
- /dev/st/c1b2t3u4m2 for mode 2
- /dev/st/c1b2t3u4m3 for mode 3
- /dev/st/c1b2t3u4m0n for mode 0, no rewind
- /dev/st/c1b2t3u4m1n for mode 1, no rewind
- /dev/st/c1b2t3u4m2n for mode 2, no rewind
- /dev/st/c1b2t3u4m3n for mode 3, no rewind
-
-
-SCSI CD-ROMs
-
-All SCSI CD-ROMs are placed under /dev/sr. A similar naming
-scheme is used as for SCSI discs. A SCSI CD-ROM with the
-parameters: c=1,b=2,t=3,u=4 would appear as:
-
- /dev/sr/c1b2t3u4
-
-
-SCSI Generic Devices
-
-The generic (aka. raw) interfaces for all SCSI devices are placed under
-/dev/sg. A similar naming scheme is used as for SCSI discs. A
-SCSI generic device with the parameters: c=1,b=2,t=3,u=4 would appear
-as:
-
- /dev/sg/c1b2t3u4
-
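As a rough illustration of how the kernel-supplied SCSI directory and the devfsd convenience names relate, the sketch below builds both from the same four parameters. The function names are hypothetical, and only the patterns shown in this document are covered (slices and tape modes are left out).

```python
def kernel_scsi_dir(c, b, t, u):
    """Kernel-supplied device directory for controller, bus, target, unit."""
    return f"/dev/scsi/host{c}/bus{b}/target{t}/lun{u}"


def devfsd_scsi_name(kind, c, b, t, u, partition=None):
    """Devfsd convenience name; kind is 'sd', 'sr' or 'sg' in this sketch."""
    name = f"/dev/{kind}/c{c}b{b}t{t}u{u}"
    if partition is not None:
        name += f"p{partition}"
    return name


print(kernel_scsi_dir(1, 2, 3, 4))
print(devfsd_scsi_name("sd", 1, 2, 3, 4, partition=5))
```

With c=1,b=2,t=3,u=4 this reproduces the /dev/scsi/host1/bus2/target3/lun4 and /dev/sd/c1b2t3u4p5 examples used in the text.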
-
-IDE Hard Discs
-
-All IDE discs are placed under /dev/ide/hd, using a similar
-convention to SCSI discs. The following mappings exist between the new
-and the old names:
-
- /dev/hda /dev/ide/hd/c0b0t0u0
- /dev/hdb /dev/ide/hd/c0b0t1u0
- /dev/hdc /dev/ide/hd/c0b1t0u0
- /dev/hdd /dev/ide/hd/c0b1t1u0
-
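The four mappings in the table follow a simple pattern: the drive letter selects the bus (primary or secondary) and the target (master or slave). A minimal sketch, assuming only the four names listed above and using a hypothetical helper name:

```python
def ide_new_name(old_name):
    """Map /dev/hda../dev/hdd to the devfsd convenience name."""
    prefix = "/dev/hd"
    if not old_name.startswith(prefix) or len(old_name) != len(prefix) + 1:
        raise ValueError(f"not an old-style IDE disc name: {old_name}")
    index = ord(old_name[-1]) - ord("a")
    if not 0 <= index <= 3:
        # Names beyond hdd are not covered by the table in this document.
        raise ValueError(f"only hda..hdd are handled: {old_name}")
    bus, target = divmod(index, 2)  # bus: primary/secondary, target: master/slave
    return f"/dev/ide/hd/c0b{bus}t{target}u0"


for old in ("/dev/hda", "/dev/hdb", "/dev/hdc", "/dev/hdd"):
    print(old, ide_new_name(old))
```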
-
-IDE Tapes
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/mt directory.
-
-IDE CD-ROM
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/cd directory.
-
-IDE Floppies
-
-A similar naming scheme is used as for IDE discs. The entries will
-appear in the /dev/ide/fd directory.
-
-XT Hard Discs
-
-All XT discs are placed under /dev/xd. The first XT disc
-would appear as /dev/xd/c0t0.
-
-
-Old Compatibility Names
-
-The old compatibility names are the legacy device names, such as
-/dev/hda, /dev/sda, /dev/rtc and so on.
-Devfsd can be configured to create compatibility symlinks so that you
-may continue to use the old names in your configuration files and so
-that old applications will continue to function correctly.
-
-In order to configure devfsd to create these legacy names, the
-following lines should be placed in your /etc/devfsd.conf:
-
-REGISTER .* MKOLDCOMPAT
-UNREGISTER .* RMOLDCOMPAT
-
-This will cause devfsd to create (and destroy) symbolic links which
-point to the kernel-supplied names.
-
-
------------------------------------------------------------------------------
-
-
-Device drivers currently ported
-
-- All miscellaneous character devices support devfs (this is done
- transparently through misc_register())
-
-- SCSI discs and generic hard discs
-
-- Character memory devices (null, zero, full and so on)
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Loop devices (/dev/loop?)
-
-- TTY devices (console, serial ports, terminals and pseudo-terminals)
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- SCSI tapes (/dev/scsi and /dev/tapes)
-
-- SCSI CD-ROMs (/dev/scsi and /dev/cdroms)
-
-- SCSI generic devices (/dev/scsi)
-
-- RAMDISCS (/dev/ram?)
-
-- Meta Devices (/dev/md*)
-
-- Floppy discs (/dev/floppy)
-
-- Parallel port printers (/dev/printers)
-
-- Sound devices (/dev/sound)
- Thanks to Eric Dumas <dumas@linux.eu.org> and
- C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Joysticks (/dev/joysticks)
-
-- Sparc keyboard (/dev/kbd)
-
-- DSP56001 digital signal processor (/dev/dsp56k)
-
-- Apple Desktop Bus (/dev/adb)
-
-- Coda network file system (/dev/cfs*)
-
-- Virtual console capture devices (/dev/vcc)
- Thanks to Dennis Hou <smilax@mindmeld.yi.org>
-
-- Frame buffer devices (/dev/fb)
-
-- Video capture devices (/dev/v4l)
-
-
------------------------------------------------------------------------------
-
-
-Allocation of Device Numbers
-
-Devfs allows you to write a driver which doesn't need to allocate a
-device number (major&minor numbers) for the internal operation of the
-kernel. However, there are a number of userspace programmes that use
-the device number as a unique handle for a device. An example is the
-find programme, which uses device numbers to determine whether
-an inode is on a different filesystem than another inode. The device
-number used is the one for the block device which a filesystem is
-using. To preserve compatibility with userspace programmes, block
-devices using devfs need to have unique device numbers allocated to
-them. Furthermore, POSIX specifies device numbers, so some kind of
-device number needs to be presented to userspace.
-
-The simplest option (especially when porting drivers to devfs) is to
-keep using the old major and minor numbers. Devfs will take whatever
-values are given for major&minor and pass them onto userspace.
-
-This device number is a 16 bit number, so this leaves plenty of space
-for large numbers of discs and partitions. This scheme can also be
-used for character devices, in particular the tty devices, which are
-currently limited to 256 pseudo-ttys (this limits the total number of
-simultaneous xterms and remote logins). Note that the device number
-is limited to the range 36864-61439 (majors 144-239), in order to
-avoid any possible conflicts with existing official allocations.
-
-Please note that using dynamically allocated block device numbers may
-break the NFS daemons (both user and kernel mode), which expect dev_t
-for a given device to be constant over the lifetime of remote mounts.
-
-A final note on this scheme: since it doesn't increase the size of
-device numbers, there are no compatibility issues with userspace.
-
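The range quoted above follows from packing an 8 bit major and an 8 bit minor into one 16 bit number. A short worked check, with hypothetical helper names:

```python
def make_dev(major, minor):
    """Pack 8 bit major and minor numbers into a 16 bit device number."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def split_dev(dev):
    """Recover (major, minor) from a 16 bit device number."""
    return dev >> 8, dev & 0xFF


# Lowest and highest device numbers for majors 144-239.
print(make_dev(144, 0), make_dev(239, 255))  # prints "36864 61439"
```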
------------------------------------------------------------------------------
-
-
-Questions and Answers
-
-
-Making things work
-Alternatives to devfs
-What I don't like about devfs
-How to report bugs
-Strange kernel messages
-Compilation problems with devfsd
-
-
-
-Making things work
-
-Here are some common questions and answers.
-
-
-
-Devfsd doesn't start
-
-Make sure you have compiled and installed devfsd
-Make sure devfsd is being started from your boot
-scripts
-Make sure you have configured your kernel to enable devfs (see
-below)
-Make sure devfs is mounted (see below)
-
-
-Devfsd is not managing all my permissions
-
-Make sure you are capturing the appropriate events. For example,
-device entries created by the kernel generate REGISTER events,
-but those created by devfsd generate CREATE events.
-
-
-Devfsd is not capturing all REGISTER events
-
-See the previous entry: you may need to capture CREATE events.
-
-
-X will not start
-
-Make sure you followed the steps
-outlined above.
-
-
-Why don't my network devices appear in devfs?
-
-This is not a bug. Network devices have their own, completely separate
-namespace. They are accessed via socket(2) and
-setsockopt(2) calls, and thus require no device nodes. I have
-raised the possibility of moving network devices into the device
-namespace, but have had no response.
-
-
-How can I test if I have devfs compiled into my kernel?
-
-All filesystems built-in or currently loaded are listed in
-/proc/filesystems. If you see a devfs entry, then
-you know that devfs was compiled into your kernel. If you have
-correctly configured and rebuilt your kernel, then devfs will be
-built-in. If you think you've configured it in, but
-/proc/filesystems doesn't show it, you've made a mistake.
-Common mistakes include:
-
-Using a 2.2.x kernel without applying the devfs patch (if you
-don't know how to patch your kernel, use 2.4.x instead, don't bother
-asking me how to patch)
-Forgetting to set CONFIG_EXPERIMENTAL=y
-Forgetting to set CONFIG_DEVFS_FS=y
-Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs
-to be automatically mounted at boot)
-Editing your .config manually, instead of using make
-config or make xconfig
-Forgetting to run make dep; make clean after changing the
-configuration and before compiling
-Forgetting to compile your kernel and modules
-Forgetting to install your kernel
-Forgetting to install your modules
-
-Please check twice that you've done all these steps before sending in
-a bug report.
-
-
-
-How can I test if devfs is mounted on /dev?
-
-The device filesystem will always create an entry called
-".devfsd", which is used to communicate with the daemon. Even
-if the daemon is not running, this entry will exist. Testing for the
-existence of this entry is the approved method of determining if devfs
-is mounted or not. Note that the type of entry (i.e. regular file,
-character device, named pipe, etc.) may change without notice. Only
-the existence of the entry should be relied upon.
-
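A minimal sketch of that existence test, assuming devfs would be mounted on /dev. The function name is hypothetical, and the check deliberately ignores the type of the entry, since only its existence may be relied upon.

```python
import os


def devfs_mounted(dev_dir="/dev"):
    """Return True if the .devfsd entry exists under dev_dir."""
    # lexists() reports existence without caring what kind of entry it is.
    return os.path.lexists(os.path.join(dev_dir, ".devfsd"))


print(devfs_mounted())
```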
-
-When I start devfsd, I see the error:
-Error opening file: ".devfsd" No such file or directory?
-
-This means that devfs is not mounted. Make sure you have devfs mounted.
-
-
-How do I mount devfs?
-
-First make sure you have devfs compiled into your kernel (see
-above). Then you will either need to:
-
-set CONFIG_DEVFS_MOUNT=y in your kernel config
-pass devfs=mount to your boot loader
-mount devfs manually in your boot scripts with:
-mount -t devfs none /dev
-
-
-
-Mount by volume LABEL=<label> doesn't work with
-devfs
-
-Most probably you are not mounting devfs onto /dev. What
-happens is that if your kernel config has CONFIG_DEVFS_FS=y
-then the contents of /proc/partitions will have the devfs
-names (such as scsi/host0/bus0/target0/lun0/part1). The
-contents of /proc/partitions are used by mount(8) when
-mounting by volume label. If devfs is not mounted on /dev,
-then mount(8) will fail to find devices. The solution is to
-make sure that devfs is mounted on /dev. See above for how to
-do that.
-
-
-I have extra or incorrect entries in /dev
-
-You may have stale entries in your dev-state area. Check for a
-RESTORE configuration line in your devfsd configuration
-(typically /etc/devfsd.conf). If you have this line, check
-the contents of the specified directory for stale entries. Remove
-any entries which are incorrect, then reboot.
-
-
-I get "Unable to open initial console" messages at boot
-
-This usually happens when you don't have devfs automounted onto
-/dev at boot time, and there is no valid
-/dev/console entry on your root file-system. Create a valid
-/dev/console device node.
-
-
-
-
-
-Alternatives to devfs
-
-I've attempted to collate all the anti-devfs proposals and explain
-their limitations. Under construction.
-
-
-Why not just pass device create/remove events to a daemon?
-
-Here the suggestion is to develop an API in the kernel so that devices
-can register create and remove events, and a daemon listens for those
-events. The daemon would then populate/depopulate /dev (which
-resides on disc).
-
-This has several limitations:
-
-
-it only works for modules loaded and unloaded (or devices inserted
-and removed) after the kernel has finished booting. Without a database
-of events, there is no way the daemon could fully populate
-/dev
-
-
-if you add a database to this scheme, the question is then how to
-present that database to user-space. If you make it a list of strings
-with embedded event codes which are passed through a pipe to the
-daemon, then this is only of use to the daemon. I would argue that the
-natural way to present this data is via a filesystem (since many of
-the events will be of a hierarchical nature), such as devfs.
-Presenting the data as a filesystem makes it easy for the user to see
-what is available and also makes it easy to write scripts to scan the
-"database"
-
-
-the tight binding between device nodes and drivers is no longer
-possible (requiring the otherwise perfectly avoidable
-table lookups)
-
-
-you cannot catch inode lookup events on /dev which means
-that module autoloading requires device nodes to be created. This is a
-problem, particularly for drivers where only a few inodes are created
-from a potentially large set
-
-
-this technique can't be used when the root FS is mounted
-read-only
-
-
-
-
-Just implement a better scsidev
-
-This suggestion involves taking the scsidev programme and
-extending it to scan for all devices, not just SCSI devices. The
-scsidev programme works by scanning /proc/scsi
-
-Problems:
-
-
-the kernel does not currently provide a list of all devices
-available. Not all drivers register entries in /proc or
-generate kernel messages
-
-
-there is no uniform mechanism to register devices other than the
-devfs API
-
-
-implementing such an API is then the same as the
-proposal above
-
-
-
-
-Put /dev on a ramdisc
-
-This suggestion involves creating a ramdisc and populating it with
-device nodes and then mounting it over /dev.
-
-Problems:
-
-
-
-this doesn't help when mounting the root filesystem, since you
-still need a device node to do that
-
-
-if you want to use this technique for the root device node as
-well, you need to use initrd. This complicates the booting sequence
-and makes it significantly harder to administer and configure. The
-initrd is essentially opaque, robbing the system administrator of easy
-configuration
-
-
-insufficient information is available to correctly populate the
-ramdisc. So we come back to the
-proposal above to "solve" this
-
-
-a ramdisc-based solution would take more kernel memory, since the
-backing store would be (at best) normal VFS inodes and dentries, which
-take 284 bytes and 112 bytes, respectively, for each entry. Compare
-that to 72 bytes for devfs
-
-
-
-
-Do nothing: there's no problem
-
-Sometimes people can be heard to claim that the existing scheme is
-fine. This is what they're ignoring:
-
-
-device number size (8 bits each for major and minor) is a real
-limitation, and must be fixed somehow. Systems with large numbers of
-SCSI devices, for example, will continue to consume the remaining
-unallocated major numbers. USB will also need to push beyond the 8 bit
-minor limitation
-
-
-simply increasing the device number size is insufficient. Apart
-from causing a lot of pain, it doesn't solve the management issues
-of a /dev with thousands or more device nodes
-
-
-ignoring the problem of a huge /dev will not make it go
-away, and dismisses the legitimacy of a large number of people who
-want a dynamic /dev
-
-
-the standard response then becomes: "write a device management
-daemon", which brings us back to the
-proposal above
-
-
-
-
-What I don't like about devfs
-
-Here are some common complaints about devfs, and some suggestions and
-solutions that may make it more palatable for you. I can't please
-everybody, but I do try :-)
-
-I hate the naming scheme
-
-First, remember that no naming scheme will please everybody. You hate
-the scheme, others love it. Who's to say who's right and who's wrong?
-Ultimately, the person who writes the code gets to choose, and what
-exists now is a combination of the choices made by the
-devfs author and the
-kernel maintainer (Linus).
-
-However, not all is lost. If you want to create your own naming
-scheme, it is a simple matter to write a standalone script, hack
-devfsd, or write a script called by devfsd. You can create whatever
-naming scheme you like.
-
-Further, if you want to remove all traces of the devfs naming scheme
-from /dev, you can mount devfs elsewhere (say
-/devfs) and populate /dev with links into
-/devfs. This population can be automated using devfsd if you
-wish.
-
-You can even use the VFS binding facility to make the links, rather
-than using symbolic links. This way, you don't even have to see the
-"destination" of these symbolic links.
-
-Devfs puts policy into the kernel
-
-There's already policy in the kernel. Device numbers are in fact
-policy (why should the kernel dictate what device numbers I use?).
-Face it, some policy has to be in the kernel. The real difference
-between device names as policy and device numbers as policy is that
-no one will use device numbers directly, because device
-numbers are devoid of meaning to humans and are ugly. At least with
-the devfs device names, (even though you can add your own naming
-scheme) some people will use the devfs-supplied names directly. This
-offends some people :-)
-
-Devfs is bloatware
-
-This is not even remotely true. As shown above,
-both code and data size are quite modest.
-
-
-How to report bugs
-
-If you have (or think you have) a bug with devfs, please follow the
-steps below:
-
-
-
-make sure you have enabled debugging output when configuring your
-kernel. You will need to set (at least) the following config options:
-
-CONFIG_DEVFS_DEBUG=y
-CONFIG_DEBUG_KERNEL=y
-CONFIG_DEBUG_SLAB=y
-
-
-
-please make sure you have the latest devfs patches applied. The
-latest kernel version might not have the latest devfs patches applied
-yet (Linus is very busy)
-
-
-save a copy of your complete kernel logs (preferably by
-using the dmesg programme) for later inclusion in your bug
-report. You may need to use the -s switch to increase the
-internal buffer size so you can capture all the boot messages.
-Don't edit or trim the dmesg output
-
-
-
-
-try booting with devfs=dall passed to the kernel boot
-command line (read the documentation on your bootloader on how to do
-this), and save the result to a file. This may be quite verbose, and
-it may overflow the messages buffer, but try to get as much of it as
-you can
-
-
-send a copy of your devfsd configuration file(s)
-
-send the bug report to me first.
-Don't expect that I will see it if you post it to the linux-kernel
-mailing list. Include all the information listed above, plus
-anything else that you think might be relevant. Put the string
-devfs somewhere in the subject line, so my mail filters mark
-it as urgent
-
-
-
-
-Here is a general guide on how to ask questions in a way that greatly
-improves your chances of getting a reply:
-
-http://www.tuxedo.org/~esr/faqs/smart-questions.html. If you have
-a bug to report, you should also read
-
-http://www.chiark.greenend.org.uk/~sgtatham/bugs.html.
-
-
-Strange kernel messages
-
-You may see devfs-related messages in your kernel logs. Below are some
-messages and what they mean (and what you should do about them, if
-anything).
-
-
-
-devfs_register(fred): could not append to parent, err: -17
-
-You need to check what the error code means, but usually 17 means
-EEXIST. This means that a driver attempted to create an entry
-fred in a directory, but there already was an entry with that
-name. This is often caused by flawed boot scripts which untar a bunch
-of inodes into /dev, as a way to restore permissions. This
-message is harmless, as the device nodes will still
-provide access to the driver (unless you use the devfs=only
-boot option, which is only for dedicated souls:-). If you want to get
-rid of these annoying messages, upgrade to devfsd-v1.3.20 and use the
-recommended RESTORE directive to restore permissions.
-
-
-devfs_mk_dir(bill): using old entry in dir: c1808724 ""
-
-This is similar to the message above, except that a driver attempted
-to create a directory named bill, and the parent directory
-has an entry with the same name. In this case, to ensure that drivers
-continue to work properly, the old entry is re-used and given to the
-driver. In 2.5 kernels, the driver is given a NULL entry, and thus,
-under rare circumstances, may not create the required device nodes.
-The solution is the same as above.
-
-
-
-
-
-Compilation problems with devfsd
-
-Usually, you can compile devfsd just by typing in
-make in the source directory, followed by a make
-install (as root). Sometimes, you may have problems, particularly
-on broken configurations.
-
-
-
-error messages relating to DEVFSD_NOTIFY_DELETE
-
-This happened because you have an ancient set of kernel headers
-installed in /usr/include/linux or /usr/src/linux.
-Install kernel 2.4.10 or later. You may need to pass the
-KERNEL_DIR variable to make (if you did not install
-the new kernel sources as /usr/src/linux), or you may copy
-the devfs_fs.h file in the kernel source tree into
-/usr/include/linux.
-
-
-
-
------------------------------------------------------------------------------
-
-
-Other resources
-
-
-
-Douglas Gilbert has written a useful document at
-
-http://www.torque.net/sg/devfs_scsi.html which
-explores the SCSI subsystem and how it interacts with devfs
-
-
-Douglas Gilbert has written another useful document at
-
-http://www.torque.net/scsi/SCSI-2.4-HOWTO/ which
-discusses the Linux SCSI subsystem in 2.4.
-
-
-Johannes Erdfelt has started a discussion paper on Linux and
-hot-swap devices, describing what the requirements are for a scalable
-solution and how and why he's used devfs+devfsd. Note that this is an
-early draft only, available in plain text form at:
-
-http://johannes.erdfelt.com/hotswap.txt.
-Johannes has promised a HTML version will follow.
-
-
-I presented an invited
-paper
-at the
-
-2nd Annual Storage Management Workshop held in Miami, Florida,
-U.S.A. in October 2000.
-
-
-
-
------------------------------------------------------------------------------
-
-
-Translations of this document
-
-This document has been translated into other languages.
-
-
-
-
-The document master (in English) by rgooch@atnf.csiro.au is
-available at
-
-http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
-
-
-
-A Korean translation by viatoris@nownuri.net is available at
-
-http://your.destiny.pe.kr/devfs/devfs.html
-
-
-
-
------------------------------------------------------------------------------
-Most flags courtesy of ITA's
-Flags of All Countries
-used with permission.
diff --git a/Documentation/filesystems/devfs/ToDo b/Documentation/filesystems/devfs/ToDo
deleted file mode 100644
index afd5a8f2c19b..000000000000
--- a/Documentation/filesystems/devfs/ToDo
+++ /dev/null
@@ -1,40 +0,0 @@
- Device File System (devfs) ToDo List
-
- Richard Gooch <rgooch@atnf.csiro.au>
-
- 3-JUL-2000
-
-This is a list of things to be done for better devfs support in the
-Linux kernel. If you'd like to contribute to the devfs, please have a
-look at this list for anything that is unallocated. Also, if there are
-items missing (surely), please contact me so I can add them to the
-list (preferably with your name attached to them:-).
-
-
-- >256 ptys
- Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
-
-- Amiga floppy driver (drivers/block/amiflop.c)
-
-- Atari floppy driver (drivers/block/ataflop.c)
-
-- SWIM3 (Super Woz Integrated Machine 3) floppy driver (drivers/block/swim3.c)
-
-- Amiga ZorroII ramdisc driver (drivers/block/z2ram.c)
-
-- Parallel port ATAPI CD-ROM (drivers/block/paride/pcd.c)
-
-- Parallel port ATAPI floppy (drivers/block/paride/pf.c)
-
-- AP1000 block driver (drivers/ap1000/ap.c, drivers/ap1000/ddv.c)
-
-- Archimedes floppy (drivers/acorn/block/fd1772.c)
-
-- MFM hard drive (drivers/acorn/block/mfmhd.c)
-
-- I2O block device (drivers/message/i2o/i2o_block.c)
-
-- ST-RAM device (arch/m68k/atari/stram.c)
-
-- Raw devices
-
diff --git a/Documentation/filesystems/devfs/boot-options b/Documentation/filesystems/devfs/boot-options
deleted file mode 100644
index df3d33b03e0a..000000000000
--- a/Documentation/filesystems/devfs/boot-options
+++ /dev/null
@@ -1,65 +0,0 @@
-/* -*- auto-fill -*- */
-
- Device File System (devfs) Boot Options
-
- Richard Gooch <rgooch@atnf.csiro.au>
-
- 18-AUG-2001
-
-
-When CONFIG_DEVFS_DEBUG is enabled, you can pass several boot options
-to the kernel to debug devfs. The boot options are prefixed by
-"devfs=", and are separated by commas. Spaces are not allowed. The
-syntax looks like this:
-
-devfs=<option1>,<option2>,<option3>
-
-and so on. For example, if you wanted to turn on debugging for module
-load requests and device registration, you would do:
-
-devfs=dmod,dreg
-
-You may prefix "no" to any option. This will invert the option.
-
-
-Debugging Options
-=================
-
-These require CONFIG_DEVFS_DEBUG to be enabled.
-Note that all debugging options have 'd' as the first character. By
-default all options are off. All debugging output is sent to the
-kernel logs. The debugging options do not take effect until the devfs
-version message appears (just prior to the root filesystem being
-mounted).
-
-These are the options:
-
-dmod print module load requests to <request_module>
-
-dreg print device register requests to <devfs_register>
-
-dunreg print device unregister requests to <devfs_unregister>
-
-dchange print device change requests to <devfs_set_flags>
-
-dilookup print inode lookup requests
-
-diget print VFS inode allocations
-
-diunlink print inode unlinks
-
-dichange print inode changes
-
-dimknod print calls to mknod(2)
-
-dall some debugging turned on
-
-
-Other Options
-=============
-
-These control the default behaviour of devfs. The options are:
-
-mount mount devfs onto /dev at boot time
-
-only disable non-devfs device nodes for devfs-capable drivers
diff --git a/Documentation/initrd.txt b/Documentation/initrd.txt
index 7de1c80cd719..b1b6440237a6 100644
--- a/Documentation/initrd.txt
+++ b/Documentation/initrd.txt
@@ -67,8 +67,7 @@ initrd adds the following new options:
as the last process has closed it, all data is freed and /dev/initrd
can't be opened anymore.
- root=/dev/ram0 (without devfs)
- root=/dev/rd/0 (with devfs)
+ root=/dev/ram0
initrd is mounted as root, and the normal boot procedure is followed,
with the RAM disk still mounted as root.
@@ -90,8 +89,7 @@ you're building an install floppy), the root file system creation
procedure should create the /initrd directory.
If initrd will not be mounted in some cases, its content is still
-accessible if the following device has been created (note that this
-does not work if using devfs):
+accessible if the following device has been created:
# mknod /dev/initrd b 1 250
# chmod 400 /dev/initrd
@@ -119,8 +117,7 @@ We'll describe the loopback device method:
(if space is critical, you may want to use the Minix FS instead of Ext2)
3) mount the file system, e.g.
# mount -t ext2 -o loop initrd /mnt
- 4) create the console device (not necessary if using devfs, but it can't
- hurt to do it anyway):
+ 4) create the console device:
# mkdir /mnt/dev
# mknod /mnt/dev/console c 5 1
5) copy all the files that are needed to properly use the initrd
@@ -152,12 +149,7 @@ have to be given:
root=/dev/ram0 init=/linuxrc rw
-if not using devfs, or
-
- root=/dev/rd/0 init=/linuxrc rw
-
-if using devfs. (rw is only necessary if writing to the initrd file
-system.)
+(rw is only necessary if writing to the initrd file system.)
With LOADLIN, you simply execute
@@ -217,9 +209,9 @@ following command:
# exec chroot . what-follows <dev/console >dev/console 2>&1
Where what-follows is a program under the new root, e.g. /sbin/init
-If the new root file system will be used with devfs and has no valid
-/dev directory, devfs must be mounted before invoking chroot in order to
-provide /dev/console.
+If the new root file system will be used with udev and has no valid
+/dev directory, udev must be initialized before invoking chroot in order
+to provide /dev/console.
Note: implementation details of pivot_root may change with time. In order
to ensure compatibility, the following points should be observed:
@@ -236,7 +228,7 @@ Now, the initrd can be unmounted and the memory allocated by the RAM
disk can be freed:
# umount /initrd
-# blockdev --flushbufs /dev/ram0 # /dev/rd/0 if using devfs
+# blockdev --flushbufs /dev/ram0
It is also possible to use initrd with an NFS-mounted root, see the
pivot_root(8) man page for details.
diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt
index 1543802ef53e..edc04d74ae23 100644
--- a/Documentation/ioctl-number.txt
+++ b/Documentation/ioctl-number.txt
@@ -119,7 +119,6 @@ Code Seq# Include File Comments
'c' 00-7F linux/comstats.h conflict!
'c' 00-7F linux/coda.h conflict!
'd' 00-FF linux/char/drm/drm/h conflict!
-'d' 00-1F linux/devfs_fs.h conflict!
'd' 00-DF linux/video_decoder.h conflict!
'd' F0-FF linux/digi1.h
'e' all linux/digi1.h conflict!
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 2e352a605fcf..86e9282d1c20 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -35,7 +35,6 @@ parameter is applicable:
APM Advanced Power Management support is enabled.
AX25 Appropriate AX.25 support is enabled.
CD Appropriate CD support is enabled.
- DEVFS devfs support is enabled.
DRM Direct Rendering Management support is enabled.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
@@ -440,9 +439,6 @@ running once the system is up.
Format: <area>[,<node>]
See also Documentation/networking/decnet.txt.
- devfs= [DEVFS]
- See Documentation/filesystems/devfs/boot-options.
-
dhash_entries= [KNL]
Set number of hash buckets for dentry cache.
@@ -1669,6 +1665,10 @@ running once the system is up.
usbhid.mousepoll=
[USBHID] The interval which mice are to be polled at.
+ vdso= [IA-32]
+ vdso=1: enable VDSO (default)
+ vdso=0: disable VDSO mapping
+
video= [FB] Frame buffer configuration
See Documentation/fb/modedb.txt.
@@ -1685,9 +1685,14 @@ running once the system is up.
decrease the size and leave more room for directly
mapped kernel RAM.
- vmhalt= [KNL,S390]
+ vmhalt= [KNL,S390] Perform z/VM CP command after system halt.
+ Format: <command>
+
+ vmpanic= [KNL,S390] Perform z/VM CP command after kernel panic.
+ Format: <command>
- vmpoff= [KNL,S390]
+ vmpoff= [KNL,S390] Perform z/VM CP command after power off.
+ Format: <command>
waveartist= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma2>
diff --git a/Documentation/keys-request-key.txt b/Documentation/keys-request-key.txt
index 22488d791168..c1f64fdf84cb 100644
--- a/Documentation/keys-request-key.txt
+++ b/Documentation/keys-request-key.txt
@@ -3,16 +3,23 @@
===================
The key request service is part of the key retention service (refer to
-Documentation/keys.txt). This document explains more fully how that the
-requesting algorithm works.
+Documentation/keys.txt). This document explains more fully how the requesting
+algorithm works.
The process starts by either the kernel requesting a service by calling
-request_key():
+request_key*():
struct key *request_key(const struct key_type *type,
const char *description,
const char *callout_string);
+or:
+
+ struct key *request_key_with_auxdata(const struct key_type *type,
+ const char *description,
+ const char *callout_string,
+ void *aux);
+
Or by userspace invoking the request_key system call:
key_serial_t request_key(const char *type,
@@ -20,16 +27,26 @@ Or by userspace invoking the request_key system call:
const char *callout_info,
key_serial_t dest_keyring);
-The main difference between the two access points is that the in-kernel
-interface does not need to link the key to a keyring to prevent it from being
-immediately destroyed. The kernel interface returns a pointer directly to the
-key, and it's up to the caller to destroy the key.
+The main difference between the access points is that the in-kernel interface
+does not need to link the key to a keyring to prevent it from being immediately
+destroyed. The kernel interface returns a pointer directly to the key, and
+it's up to the caller to destroy the key.
+
+The request_key_with_auxdata() call is like the in-kernel request_key() call,
+except that it permits auxiliary data to be passed to the upcaller (the default
+is NULL). This is only useful for those key types that define their own upcall
+mechanism rather than using /sbin/request-key.
The userspace interface links the key to a keyring associated with the process
to prevent the key from going away, and returns the serial number of the key to
the caller.
+The following example assumes that the key types involved don't define their
+own upcall mechanisms. If they do, then those should be substituted for the
+forking and execution of /sbin/request-key.
+
+
===========
THE PROCESS
===========
@@ -40,8 +57,8 @@ A request proceeds in the following manner:
interface].
(2) request_key() searches the process's subscribed keyrings to see if there's
- a suitable key there. If there is, it returns the key. If there isn't, and
- callout_info is not set, an error is returned. Otherwise the process
+ a suitable key there. If there is, it returns the key. If there isn't,
+ and callout_info is not set, an error is returned. Otherwise the process
proceeds to the next step.
(3) request_key() sees that A doesn't have the desired key yet, so it creates
@@ -62,7 +79,7 @@ A request proceeds in the following manner:
instantiation.
(7) The program may want to access another key from A's context (say a
- Kerberos TGT key). It just requests the appropriate key, and the keyring
+ Kerberos TGT key). It just requests the appropriate key, and the keyring
search notes that the session keyring has auth key V in its bottom level.
This will permit it to then search the keyrings of process A with the
@@ -79,10 +96,11 @@ A request proceeds in the following manner:
(10) The program then exits 0 and request_key() deletes key V and returns key
U to the caller.
-This also extends further. If key W (step 7 above) didn't exist, key W would be
-created uninstantiated, another auth key (X) would be created (as per step 3)
-and another copy of /sbin/request-key spawned (as per step 4); but the context
-specified by auth key X will still be process A, as it was in auth key V.
+This also extends further. If key W (step 7 above) didn't exist, key W would
+be created uninstantiated, another auth key (X) would be created (as per step
+3) and another copy of /sbin/request-key spawned (as per step 4); but the
+context specified by auth key X will still be process A, as it was in auth key
+V.
This is because process A's keyrings can't simply be attached to
/sbin/request-key at the appropriate places because (a) execve will discard two
@@ -118,17 +136,17 @@ A search of any particular keyring proceeds in the following fashion:
(2) It considers all the non-keyring keys within that keyring and, if any key
matches the criteria specified, calls key_permission(SEARCH) on it to see
- if the key is allowed to be found. If it is, that key is returned; if
+ if the key is allowed to be found. If it is, that key is returned; if
not, the search continues, and the error code is retained if of higher
priority than the one currently set.
(3) It then considers all the keyring-type keys in the keyring it's currently
- searching. It calls key_permission(SEARCH) on each keyring, and if this
+ searching. It calls key_permission(SEARCH) on each keyring, and if this
grants permission, it recurses, executing steps (2) and (3) on that
keyring.
The process stops immediately a valid key is found with permission granted to
-use it. Any error from a previous match attempt is discarded and the key is
+use it. Any error from a previous match attempt is discarded and the key is
returned.
When search_process_keyrings() is invoked, it performs the following searches
@@ -153,7 +171,7 @@ The moment one succeeds, all pending errors are discarded and the found key is
returned.
Only if all these fail does the whole thing fail with the highest priority
-error. Note that several errors may have come from LSM.
+error. Note that several errors may have come from LSM.
The error priority is:
diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 61c0fad2fe2f..e373f0212843 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -780,6 +780,17 @@ payload contents" for more information.
See also Documentation/keys-request-key.txt.
+(*) To search for a key, passing auxiliary data to the upcaller, call:
+
+ struct key *request_key_with_auxdata(const struct key_type *type,
+ const char *description,
+ const char *callout_string,
+ void *aux);
+
+ This is identical to request_key(), except that the auxiliary data is
+ passed to the key_type->request_key() op if it exists.
+
+
(*) When it is no longer required, the key should be released using:
void key_put(struct key *key);
@@ -1031,6 +1042,24 @@ The structure has a number of fields, some of which are mandatory:
as might happen when the userspace buffer is accessed.
+ (*) int (*request_key)(struct key *key, struct key *authkey, const char *op,
+ void *aux);
+
+ This method is optional. If provided, request_key() and
+ request_key_with_auxdata() will invoke this function rather than
+ upcalling to /sbin/request-key to operate upon a key of this type.
+
+ The aux parameter is as passed to request_key_with_auxdata() or is NULL
+ otherwise. Also passed are the key to be operated upon, the
+ authorisation key for this operation and the operation type (currently
+ only "create").
+
+ This function should return only when the upcall is complete. Upon return
+ the authorisation key will be revoked, and the target key will be
+ negatively instantiated if it is still uninstantiated. The error will be
+ returned to the caller of request_key*().
+
+
============================
REQUEST-KEY CALLBACK SERVICE
============================
diff --git a/Documentation/pi-futex.txt b/Documentation/pi-futex.txt
new file mode 100644
index 000000000000..5d61dacd21f6
--- /dev/null
+++ b/Documentation/pi-futex.txt
@@ -0,0 +1,121 @@
+Lightweight PI-futexes
+----------------------
+
+We are calling them lightweight for 3 reasons:
+
+ - in the user-space fastpath a PI-enabled futex involves no kernel work
+ (or any other PI complexity) at all. No registration, no extra kernel
+ calls - just pure fast atomic ops in userspace.
+
+ - even in the slowpath, the system call and scheduling pattern is very
+ similar to normal futexes.
+
+ - the in-kernel PI implementation is streamlined around the mutex
+ abstraction, with strict rules that keep the implementation
+ relatively simple: only a single owner may own a lock (i.e. no
+ read-write lock support), only the owner may unlock a lock, no
+ recursive locking, etc.
+
+Priority Inheritance - why?
+---------------------------
+
+The short reply: user-space PI helps achieve/improve determinism for
+user-space applications. In the best-case, it can help achieve
+determinism and well-bound latencies. Even in the worst-case, PI will
+improve the statistical distribution of locking related application
+delays.
+
+The longer reply:
+-----------------
+
+Firstly, sharing locks between multiple tasks is a common programming
+technique that often cannot be replaced with lockless algorithms. As we
+can see it in the kernel [which is a quite complex program in itself],
+lockless structures are rather the exception than the norm - the current
+ratio of lockless vs. locky code for shared data structures is somewhere
+between 1:10 and 1:100. Lockless is hard, and the complexity of lockless
+algorithms often endangers the ability to do robust reviews of said code.
+I.e. critical RT apps often choose lock structures to protect critical
+data structures, instead of lockless algorithms. Furthermore, there are
+cases (like shared hardware, or other resource limits) where lockless
+access is mathematically impossible.
+
+Media players (such as Jack) are an example of reasonable application
+design with multiple tasks (with multiple priority levels) sharing
+short-held locks: for example, a highprio audio playback thread is
+combined with medium-prio construct-audio-data threads and low-prio
+display-colory-stuff threads. Add video and decoding to the mix and
+we've got even more priority levels.
+
+So once we accept that synchronization objects (locks) are an
+unavoidable fact of life, and once we accept that multi-task userspace
+apps have a very fair expectation of being able to use locks, we've got
+to think about how to offer the option of a deterministic locking
+implementation to user-space.
+
+Most of the technical counter-arguments against doing priority
+inheritance only apply to kernel-space locks. But user-space locks are
+different, there we cannot disable interrupts or make the task
+non-preemptible in a critical section, so the 'use spinlocks' argument
+does not apply (user-space spinlocks have the same priority inversion
+problems as other user-space locking constructs). Fact is, pretty much
+the only technique that currently enables good determinism for userspace
+locks (such as futex-based pthread mutexes) is priority inheritance:
+
+Currently (without PI), if a high-prio and a low-prio task share a lock
+[this is a quite common scenario for most non-trivial RT applications],
+even if all critical sections are coded carefully to be deterministic
+(i.e. all critical sections are short in duration and only execute a
+limited number of instructions), the kernel cannot guarantee any
+deterministic execution of the high-prio task: any medium-priority task
+could preempt the low-prio task while it holds the shared lock and
+executes the critical section, and could delay it indefinitely.
+
+Implementation:
+---------------
+
+As mentioned before, the userspace fastpath of PI-enabled pthread
+mutexes involves no kernel work at all - they behave quite similarly to
+normal futex-based locks: a 0 value means unlocked, and a value==TID
+means locked. (This is the same method as used by list-based robust
+futexes.) Userspace uses atomic ops to lock/unlock these mutexes without
+entering the kernel.
+
+To handle the slowpath, we have added two new futex ops:
+
+ FUTEX_LOCK_PI
+ FUTEX_UNLOCK_PI
+
+If the lock-acquire fastpath fails, [i.e. an atomic transition from 0 to
+TID fails], then FUTEX_LOCK_PI is called. The kernel does all the
+remaining work: if there is no futex-queue attached to the futex address
+yet then the code looks up the task that owns the futex [it has put its
+own TID into the futex value], and attaches a 'PI state' structure to
+the futex-queue. The pi_state includes an rt-mutex, which is a PI-aware,
+kernel-based synchronization object. The 'other' task is made the owner
+of the rt-mutex, and the FUTEX_WAITERS bit is atomically set in the
+futex value. Then this task tries to lock the rt-mutex, on which it
+blocks. Once it returns, it has the mutex acquired, and it sets the
+futex value to its own TID and returns. Userspace has no other work to
+perform - it now owns the lock, and futex value contains
+FUTEX_WAITERS|TID.
+
+If the unlock side fastpath succeeds, [i.e. userspace manages to do a
+TID -> 0 atomic transition of the futex value], then no kernel work is
+triggered.
+
+If the unlock fastpath fails (because the FUTEX_WAITERS bit is set),
+then FUTEX_UNLOCK_PI is called, and the kernel unlocks the futex on the
+behalf of userspace - and it also unlocks the attached
+pi_state->rt_mutex and thus wakes up any potential waiters.
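The fastpath/slowpath split described above can be sketched from the
userspace side roughly as follows. This is a minimal sketch, not the glibc
implementation: the helper names pi_futex_lock() and pi_futex_unlock() are
made up for illustration, and error handling (EINTR, dead owners, timeouts)
is left out. Only FUTEX_LOCK_PI, FUTEX_UNLOCK_PI and the 0/TID value
convention come from the text.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: lock a PI futex word (0 == unlocked, TID == owner). */
int pi_futex_lock(_Atomic uint32_t *futex)
{
	uint32_t expected = 0;
	uint32_t tid = (uint32_t)syscall(SYS_gettid);

	/* Fastpath: atomic 0 -> TID transition, no kernel entry at all. */
	if (atomic_compare_exchange_strong(futex, &expected, tid))
		return 0;

	/* Slowpath: the kernel queues us on the rt-mutex and does the PI work. */
	return (int)syscall(SYS_futex, futex, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Hypothetical helper: unlock a PI futex word owned by the calling thread. */
int pi_futex_unlock(_Atomic uint32_t *futex)
{
	uint32_t expected = (uint32_t)syscall(SYS_gettid);

	/* Fastpath: atomic TID -> 0; this fails if FUTEX_WAITERS is set. */
	if (atomic_compare_exchange_strong(futex, &expected, 0))
		return 0;

	/* Slowpath: let the kernel hand the lock over and wake a waiter. */
	return (int)syscall(SYS_futex, futex, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```

In the uncontended case both helpers return 0 without ever calling
futex(2); the syscall is only reached when the compare-and-exchange fails.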
+
+Note that under this approach, contrary to previous PI-futex approaches,
+there is no prior 'registration' of a PI-futex. [which is not quite
+possible anyway, due to existing ABI properties of pthread mutexes.]
+
+Also, under this scheme, 'robustness' and 'PI' are two orthogonal
+properties of futexes, and all four combinations are possible: futex,
+robust-futex, PI-futex, robust+PI-futex.
+
+More details about priority inheritance can be found in
+Documentation/rt-mutex.txt.
diff --git a/Documentation/robust-futexes.txt b/Documentation/robust-futexes.txt
index df82d75245a0..76e8064b8c3a 100644
--- a/Documentation/robust-futexes.txt
+++ b/Documentation/robust-futexes.txt
@@ -95,7 +95,7 @@ comparison. If the thread has registered a list, then normally the list
is empty. If the thread/process crashed or terminated in some incorrect
way then the list might be non-empty: in this case the kernel carefully
walks the list [not trusting it], and marks all locks that are owned by
-this thread with the FUTEX_OWNER_DEAD bit, and wakes up one waiter (if
+this thread with the FUTEX_OWNER_DIED bit, and wakes up one waiter (if
any).
The list is guaranteed to be private and per-thread at do_exit() time,
diff --git a/Documentation/rt-mutex-design.txt b/Documentation/rt-mutex-design.txt
new file mode 100644
index 000000000000..c472ffacc2f6
--- /dev/null
+++ b/Documentation/rt-mutex-design.txt
@@ -0,0 +1,781 @@
+#
+# Copyright (c) 2006 Steven Rostedt
+# Licensed under the GNU Free Documentation License, Version 1.2
+#
+
+RT-mutex implementation design
+------------------------------
+
+This document tries to describe the design of the rtmutex.c implementation.
+It doesn't describe the reasons why rtmutex.c exists. For that please see
+Documentation/rt-mutex.txt. This document does explain problems that happen
+without this code, but only to give the background needed to understand
+what the code is actually doing.
+
+The goal of this document is to help others understand the priority
+inheritance (PI) algorithm that is used, as well as reasons for the
+decisions that were made to implement PI in the manner that was done.
+
+
+Unbounded Priority Inversion
+----------------------------
+
+Priority inversion is when a lower priority process executes while a higher
+priority process wants to run. This happens for several reasons, and
+most of the time it can't be helped. Anytime a high priority process wants
+to use a resource that a lower priority process has (a mutex for example),
+the high priority process must wait until the lower priority process is done
+with the resource. This is a priority inversion. What we want to prevent
+is something called unbounded priority inversion. That is when the high
+priority process is prevented from running by a lower priority process for
+an undetermined amount of time.
+
+The classic example of unbounded priority inversion is where you have three
+processes, let's call them processes A, B, and C, where A is the highest
+priority process, C is the lowest, and B is in between. A tries to grab a lock
+that C owns and must wait and lets C run to release the lock. But in the
+meantime, B executes, and since B is of a higher priority than C, it preempts C,
+but by doing so, it is in fact preempting A which is a higher priority process.
+Now there's no way of knowing how long A will be sleeping waiting for C
+to release the lock, because for all we know, B is a CPU hog and will
+never give C a chance to release the lock. This is called unbounded priority
+inversion.
+
+Here's a little ASCII art to show the problem.
+
+ grab lock L1 (owned by C)
+ |
+A ---+
+ C preempted by B
+ |
+C +----+
+
+B +-------->
+ B now keeps A from running.
+
+
+Priority Inheritance (PI)
+-------------------------
+
+There are several ways to solve this issue, but other ways are out of scope
+for this document. Here we only discuss PI.
+
+PI is where a process inherits the priority of another process if the other
+process blocks on a lock owned by the current process. To make this easier
+to understand, let's use the previous example, with processes A, B, and C again.
+
+This time, when A blocks on the lock owned by C, C would inherit the priority
+of A. So now if B becomes runnable, it would not preempt C, since C now has
+the high priority of A. As soon as C releases the lock, it loses its
+inherited priority, and A then can continue with the resource that C had.
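The A, B, C example can be modelled with a small toy sketch. Everything
here (toy_task, toy_mutex and the helper names) is hypothetical and only
illustrates the idea; it is not how rtmutex.c is written. The sketch
assumes the kernel convention that a lower number means a higher priority,
and it allows only one waiter per mutex.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task {
	int normal_prio;	/* priority the task was given */
	int prio;		/* effective priority, possibly inherited */
};

struct toy_mutex {
	struct toy_task *owner;
	struct toy_task *waiter;	/* single waiter, for simplicity */
};

/* Task t blocks on m: the owner inherits t's priority if it is higher. */
void toy_block_on(struct toy_mutex *m, struct toy_task *t)
{
	assert(m->owner != NULL && m->owner != t);
	m->waiter = t;
	if (t->prio < m->owner->prio)
		m->owner->prio = t->prio;
}

/* The owner releases m: it drops the boost and the waiter takes over. */
void toy_unlock(struct toy_mutex *m)
{
	struct toy_task *old = m->owner;

	old->prio = old->normal_prio;
	m->owner = m->waiter;
	m->waiter = NULL;
}

/* Would 'candidate' preempt 'running' under strict priority scheduling? */
int toy_preempts(const struct toy_task *candidate,
		 const struct toy_task *running)
{
	return candidate->prio < running->prio;
}
```

With A at priority 1, B at 5 and C at 10, B preempts C before A blocks on
the mutex, but no longer does once C has inherited A's priority.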
+
+Terminology
+-----------
+
+Here I explain some terminology that is used in this document to help describe
+the design that is used to implement PI.
+
+PI chain - The PI chain is an ordered series of locks and processes that cause
+ processes to inherit priorities from a previous process that is
+ blocked on one of its locks. This is described in more detail
+ later in this document.
+
+mutex - In this document, to differentiate between locks that implement
+ PI and spin locks that are used in the PI code, from now on
+ the PI locks will be called mutexes.
+
+lock - In this document from now on, I will use the term lock when
+ referring to spin locks that are used to protect parts of the PI
+ algorithm. These locks disable preemption for UP (when
+ CONFIG_PREEMPT is enabled) and on SMP prevent multiple CPUs from
+ entering critical sections simultaneously.
+
+spin lock - Same as lock above.
+
+waiter - A waiter is a struct that is stored on the stack of a blocked
+ process. Since the scope of the waiter is within the code for
+ a process being blocked on the mutex, it is fine to allocate
+ the waiter on the process's stack (local variable). This
+ structure holds a pointer to the task, as well as the mutex that
+ the task is blocked on. It also has the plist node structures to
+ place the task in the waiter_list of a mutex as well as the
+ pi_list of a mutex owner task (described below).
+
+ waiter is sometimes used in reference to the task that is waiting
+ on a mutex. This is the same as waiter->task.
+
+waiters - A list of processes that are blocked on a mutex.
+
+top waiter - The highest priority process waiting on a specific mutex.
+
+top pi waiter - The highest priority process waiting on one of the mutexes
+ that a specific process owns.
+
+Note: task and process are used interchangeably in this document, mostly to
+ differentiate between two processes that are being described together.
+
+
+PI chain
+--------
+
+The PI chain is a list of processes and mutexes that may cause priority
+inheritance to take place. Multiple chains may converge, but a chain
+would never diverge, since a process can't be blocked on more than one
+mutex at a time.
+
+Example:
+
+ Process: A, B, C, D, E
+ Mutexes: L1, L2, L3, L4
+
+ A owns: L1
+ B blocked on L1
+ B owns L2
+ C blocked on L2
+ C owns L3
+ D blocked on L3
+ D owns L4
+ E blocked on L4
+
+The chain would be:
+
+ E->L4->D->L3->C->L2->B->L1->A
+
+To show where two chains merge, we could add another process F and
+another mutex L5 where B owns L5 and F is blocked on mutex L5.
+
+The chain for F would be:
+
+ F->L5->B->L1->A
+
+Since a process may own more than one mutex, but never be blocked on more than
+one, the chains merge.
+
+Here we show both chains:
+
+ E->L4->D->L3->C->L2-+
+ |
+ +->B->L1->A
+ |
+ F->L5-+
+
+For PI to work, the processes at the right end of these chains (or we may
+also call it the Top of the chain) must be equal to or higher in priority
+than the processes to the left or below in the chain.
+
+Also since a mutex may have more than one process blocked on it, we can
+have multiple chains merge at mutexes. If we add another process G that is
+blocked on mutex L2:
+
+ G->L2->B->L1->A
+
+And once again, to show how this can grow I will show the merging chains
+again.
+
+ E->L4->D->L3->C-+
+ +->L2-+
+ | |
+ G-+ +->B->L1->A
+ |
+ F->L5-+
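Following a chain from a blocked process to the top can be sketched as a
simple loop. The types and the chain_top() helper below are hypothetical
and exist only to mirror the diagrams above; the sketch also assumes the
chain has no cycle (that is, no deadlock), otherwise the loop would never
terminate.

```c
#include <assert.h>
#include <stddef.h>

struct chain_lock;

struct chain_proc {
	const char *name;
	struct chain_lock *blocked_on;	/* at most one mutex, or NULL */
};

struct chain_lock {
	const char *name;
	struct chain_proc *owner;	/* NULL if the mutex is free */
};

/*
 * Walk process -> mutex -> owner until reaching a process that is not
 * blocked. Returns that process (the top of the chain) and stores the
 * number of mutexes crossed in *depth when depth is non-NULL.
 */
struct chain_proc *chain_top(struct chain_proc *p, int *depth)
{
	int hops = 0;

	assert(p != NULL);
	while (p->blocked_on && p->blocked_on->owner) {
		p = p->blocked_on->owner;
		hops++;
	}
	if (depth)
		*depth = hops;
	return p;
}
```

Because a process has a single blocked_on pointer, a walk can only ever
merge into another chain and never fork, which matches the diagrams.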
+
+
+Plist
+-----
+
+Before I go further and talk about how the PI chain is stored through lists
+on both mutexes and processes, I'll explain the plist. This is similar to
+the struct list_head functionality that is already in the kernel.
+The implementation of plist is out of scope for this document, but it is
+very important to understand what it does.
+
+There are a few differences between plist and list, the most important one
+being that plist is a priority sorted linked list. This means that the
+priorities of the plist are sorted, such that it takes O(1) to retrieve the
+highest priority item in the list. Obviously this is useful to store processes
+based on their priorities.
+
+Another difference, which is important for implementation, is that, unlike
+list, the head of the list is a different element than the nodes of a list.
+So the head of the list is declared as struct plist_head and nodes that will
+be added to the list are declared as struct plist_node.
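To make the two properties above concrete, here is a toy priority-sorted
list with separate head and node types. It is only a sketch: the toy_plist
names are made up, this is not the kernel's plist API, and insertion here
is a plain O(n) walk. It assumes a lower number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

struct toy_plist_node {
	int prio;
	struct toy_plist_node *next;
};

/* The head is a different type than the nodes it links together. */
struct toy_plist_head {
	struct toy_plist_node *first;
};

/* Insert node so the list stays sorted, highest priority first. */
void toy_plist_add(struct toy_plist_head *head, struct toy_plist_node *node)
{
	struct toy_plist_node **pos = &head->first;

	assert(node != NULL);
	/* Skip higher or equal priority nodes so equal ones stay FIFO. */
	while (*pos && (*pos)->prio <= node->prio)
		pos = &(*pos)->next;
	node->next = *pos;
	*pos = node;
}

/* O(1): the highest priority item is always at the front. */
struct toy_plist_node *toy_plist_first(const struct toy_plist_head *head)
{
	return head->first;
}
```

Retrieving the highest priority entry never needs a search, which is the
property the waiter lists described later rely on.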
+
+
+Mutex Waiter List
+-----------------
+
+Every mutex keeps track of all the waiters that are blocked on itself. The mutex
+has a plist to store these waiters by priority. This list is protected by
+a spin lock that is located in the struct of the mutex. This lock is called
+wait_lock. Since the modification of the waiter list is never done in
+interrupt context, the wait_lock can be taken without disabling interrupts.
+
+
+Task PI List
+------------
+
+To keep track of the PI chains, each process has its own PI list. This is
+a list of all top waiters of the mutexes that are owned by the process.
+Note that this list only holds the top waiters and not all waiters that are
+blocked on mutexes owned by the process.
+
+The top of the task's PI list is always the highest priority task that
+is waiting on a mutex that is owned by the task. So if the task has
+inherited a priority, it will always be the priority of the task that is
+at the top of this list.
+
+This list is stored in the task structure of a process as a plist called
+pi_list. This list is protected by a spin lock also in the task structure,
+called pi_lock. This lock may also be taken in interrupt context, so when
+locking the pi_lock, interrupts must be disabled.
+
+
+Depth of the PI Chain
+---------------------
+
+The maximum depth of the PI chain is not dynamic, and could actually be
+defined. But it is very complex to figure out, since it depends on all
+the nesting of mutexes. Let's look at the example where we have 3 mutexes,
+L1, L2, and L3, and four separate functions func1, func2, func3 and func4.
+The following shows a locking order of L1->L2->L3, but may not actually
+be directly nested that way.
+
+void func1(void)
+{
+ mutex_lock(L1);
+
+ /* do anything */
+
+ mutex_unlock(L1);
+}
+
+void func2(void)
+{
+ mutex_lock(L1);
+ mutex_lock(L2);
+
+ /* do something */
+
+ mutex_unlock(L2);
+ mutex_unlock(L1);
+}
+
+void func3(void)
+{
+ mutex_lock(L2);
+ mutex_lock(L3);
+
+ /* do something else */
+
+ mutex_unlock(L3);
+ mutex_unlock(L2);
+}
+
+void func4(void)
+{
+ mutex_lock(L3);
+
+ /* do something again */
+
+ mutex_unlock(L3);
+}
+
+Now we add 4 processes that run each of these functions separately.
+Processes A, B, C, and D which run functions func1, func2, func3 and func4
+respectively, and such that D runs first and A last. With D being preempted
+in func4 in the "do something again" area, we have a locking that follows:
+
+D owns L3
+ C blocked on L3
+ C owns L2
+ B blocked on L2
+ B owns L1
+ A blocked on L1
+
+And thus we have the chain A->L1->B->L2->C->L3->D.
+
+This gives us a PI depth of 4 (four processes), but looking at any of the
+functions individually, it seems as though they only have at most a locking
+depth of two. So, although the locking depth is defined at compile time,
+it still is very difficult to find the possibilities of that depth.
+
+Now since mutexes can be defined by user-land applications, we don't want a DoS
+type of application that nests large amounts of mutexes to create a large
+PI chain, and have the code holding spin locks while looking at a large
+amount of data. So to prevent this, the implementation not only implements
+a maximum lock depth, but also only holds at most two different locks at a
+time, as it walks the PI chain. More about this below.
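+
+As a rough sketch of what a depth limit means, the following walks a chain
+like the one above and counts the processes on it. The sketch_* structures
+and the max_depth parameter are assumptions made for this example. The real
+walk also takes locks and handles priorities, which are left out here.
+
```c
#include <assert.h>
#include <stddef.h>

struct sketch_mutex;

/* Hypothetical, stripped-down task: only what the walk needs. */
struct sketch_task {
	struct sketch_mutex *blocked_on;	/* NULL if not blocked */
};

struct sketch_mutex {
	struct sketch_task *owner;		/* NULL if unowned */
};

/*
 * Count the processes on the chain starting at task, so that
 * A->L1->B->L2->C->L3->D gives 4.  Returns -1 as soon as the walk
 * would go past max_depth processes.
 */
static int sketch_chain_depth(const struct sketch_task *task, int max_depth)
{
	int depth = 1;

	assert(task != NULL);
	while (task->blocked_on && task->blocked_on->owner) {
		if (depth >= max_depth)
			return -1;
		task = task->blocked_on->owner;
		depth++;
	}
	return depth;
}
```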
+
+
+Mutex owner and flags
+---------------------
+
+The mutex structure contains a pointer to the owner of the mutex. If the
+mutex is not owned, this owner is set to NULL. Since all architectures
+have the task structure on at least a four byte alignment (and if this is
+not true, the rtmutex.c code will be broken!), this allows for the two
+least significant bits to be used as flags. This part is also described
+in Documentation/rt-mutex.txt, but will also be briefly described here.
+
+Bit 0 is used as the "Pending Owner" flag. This is described later.
+Bit 1 is used as the "Has Waiters" flag. This is also described later
+ in more detail, but is set whenever there are waiters on a mutex.
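+
+A small sketch of how two flag bits can share a word with an aligned
+pointer. The macro and function names are invented for this example and
+are not the ones used in rtmutex.c. It assumes the task structure is at
+least four byte aligned, as stated above.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PENDING_OWNER	((uintptr_t)1)	/* bit 0 */
#define SKETCH_HAS_WAITERS	((uintptr_t)2)	/* bit 1 */
#define SKETCH_FLAG_MASK	((uintptr_t)3)

/* Hypothetical task type; anything with >= 4 byte alignment works. */
struct sketch_task {
	long prio;
};

/* Combine an aligned task pointer with the flag bits. */
static uintptr_t sketch_pack_owner(struct sketch_task *task, uintptr_t flags)
{
	/* The two low bits of the pointer must be free for the flags. */
	assert(((uintptr_t)task & SKETCH_FLAG_MASK) == 0);
	return (uintptr_t)task | (flags & SKETCH_FLAG_MASK);
}

/* Strip the flag bits to get the task pointer back. */
static struct sketch_task *sketch_owner_task(uintptr_t owner)
{
	return (struct sketch_task *)(owner & ~SKETCH_FLAG_MASK);
}

static int sketch_owner_pending(uintptr_t owner)
{
	return (owner & SKETCH_PENDING_OWNER) != 0;
}

static int sketch_has_waiters(uintptr_t owner)
{
	return (owner & SKETCH_HAS_WAITERS) != 0;
}
```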
+
+
+cmpxchg Tricks
+--------------
+
+Some architectures implement an atomic cmpxchg (Compare and Exchange). This
+is used (when applicable) to keep the fast path of grabbing and releasing
+mutexes short.
+
+cmpxchg is basically the following function performed atomically:
+
+unsigned long _cmpxchg(unsigned long *A, unsigned long *B, unsigned long *C)
+{
+ unsigned long T = *A;
+ if (*A == *B) {
+ *A = *C;
+ }
+ return T;
+}
+#define cmpxchg(a,b,c) _cmpxchg(&a,&b,&c)
+
+This is really nice to have, since it allows you to only update a variable
+if the variable is what you expect it to be. You know if it succeeded if
+the return value (the old value of A) is equal to B.
+
+The macro rt_mutex_cmpxchg is used to try to lock and unlock mutexes. If
+the architecture does not support CMPXCHG, then this macro is simply set
+to fail every time. But if CMPXCHG is supported, then this will
+help out extremely to keep the fast path short.
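+
+For readers who want to experiment in user space, here is a sketch of the
+same idea built on C11 atomics rather than the kernel's cmpxchg. The
+sketch_* names are made up, and the owner word is simplified to an
+unsigned long where 0 means "no owner" and bit 1 stands for "Has Waiters".
+
```c
#include <assert.h>
#include <stdatomic.h>

/* Returns the old value; the store happened only if it equals expected. */
static unsigned long sketch_cmpxchg(_Atomic unsigned long *ptr,
				    unsigned long expected,
				    unsigned long desired)
{
	unsigned long old = expected;

	/* On failure, the current value of *ptr is written into old. */
	atomic_compare_exchange_strong(ptr, &old, desired);
	return old;
}

/* Fast lock: succeeds only if the owner word was 0 (no owner, no flags). */
static int sketch_fast_lock(_Atomic unsigned long *owner, unsigned long me)
{
	assert(me != 0);
	return sketch_cmpxchg(owner, 0UL, me) == 0UL;
}

/* Fast unlock: succeeds only if the word is exactly me, with no flags. */
static int sketch_fast_unlock(_Atomic unsigned long *owner, unsigned long me)
{
	assert(me != 0);
	return sketch_cmpxchg(owner, me, 0UL) == me;
}
```
+
+If a waiter has set a flag bit in the owner word, the fast unlock compare
+fails and the caller has to fall back to a slow path.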
+
+The use of rt_mutex_cmpxchg with the flags in the owner field helps optimize
+the system for architectures that support it. This will also be explained
+later in this document.
+
+
+Priority adjustments
+--------------------
+
+The implementation of the PI code in rtmutex.c has several places where a
+process must adjust its priority. With the help of the pi_list of a
+process, it is rather easy to know what needs to be adjusted.
+
+The functions implementing the task adjustments are rt_mutex_adjust_prio,
+__rt_mutex_adjust_prio (same as the former, but expects the task pi_lock
+to already be taken), rt_mutex_getprio, and rt_mutex_setprio.
+
+rt_mutex_getprio and rt_mutex_setprio are only used in __rt_mutex_adjust_prio.
+
+rt_mutex_getprio returns the priority that the task should have. Either the
+task's own normal priority, or if a process of a higher priority is waiting on
+a mutex owned by the task, then that higher priority should be returned.
+Since the pi_list of a task holds a priority-ordered list of all the top
+waiters of all the mutexes that the task owns, rt_mutex_getprio simply needs
+to compare the top pi waiter to its own normal priority, and return the higher
+priority back.
+
+(Note: if looking at the code, you will notice that the lower number of
+ prio is returned. This is because the prio field in the task structure
+ is an inverse order of the actual priority. So a "prio" of 5 is
+ of higher priority than a "prio" of 10.)
+
+__rt_mutex_adjust_prio examines the result of rt_mutex_getprio, and if the
+result does not equal the task's current priority, then rt_mutex_setprio
+is called to adjust the priority of the task to the new priority.
+Note that rt_mutex_setprio is defined in kernel/sched.c to implement the
+actual change in priority.
+
+It is interesting to note that __rt_mutex_adjust_prio can either increase
+or decrease the priority of the task. In the case that a higher priority
+process has just blocked on a mutex owned by the task, __rt_mutex_adjust_prio
+would increase/boost the task's priority. But if a higher priority task
+were for some reason to leave the mutex (timeout or signal), this same function
+would decrease/unboost the priority of the task. That is because the pi_list
+always contains the highest priority task that is waiting on a mutex owned
+by the task, so we only need to compare the priority of that top pi waiter
+to the normal priority of the given task.
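+
+The logic described in this section can be sketched as below. This is an
+illustration only: the sketch_* types are hypothetical, the pi_list is
+reduced to a pointer to its top waiter, and a plain assignment stands in
+for rt_mutex_setprio. Locking is left out entirely.
+
```c
#include <assert.h>
#include <stddef.h>

struct sketch_waiter {
	int prio;			/* lower number = higher priority */
};

struct sketch_task {
	int normal_prio;		/* priority without any boosting */
	int prio;			/* priority currently in effect */
	const struct sketch_waiter *top_pi_waiter; /* NULL: empty pi_list */
};

/* The priority the task should have: the lower of the two numbers. */
static int sketch_getprio(const struct sketch_task *task)
{
	assert(task != NULL);
	if (!task->top_pi_waiter)
		return task->normal_prio;
	if (task->top_pi_waiter->prio < task->normal_prio)
		return task->top_pi_waiter->prio;
	return task->normal_prio;
}

/* Boosts or unboosts, depending on what the top pi waiter looks like. */
static void sketch_adjust_prio(struct sketch_task *task)
{
	int prio = sketch_getprio(task);

	if (task->prio != prio)
		task->prio = prio;	/* stand-in for rt_mutex_setprio */
}
```
+
+The same function covers both directions: adding a higher priority top
+waiter boosts the task, and removing it drops the task back to its normal
+priority.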
+
+
+High level overview of the PI chain walk
+----------------------------------------
+
+The PI chain walk is implemented by the function rt_mutex_adjust_prio_chain.
+
+The implementation has gone through several iterations, and has ended up
+with what we believe is the best. It walks the PI chain by only grabbing
+at most two locks at a time, and is very efficient.
+
+The rt_mutex_adjust_prio_chain can be used either to boost or lower process
+priorities.
+
+rt_mutex_adjust_prio_chain is called with a task to be checked for PI
+(de)boosting (the owner of a mutex that a process is blocking on), a flag to
+check for deadlocking, the mutex that the task owns, and a pointer to a waiter
+that is the process's waiter struct that is blocked on the mutex (although this
+parameter may be NULL for deboosting).
+
+For this explanation, I will not mention deadlock detection. This explanation
+will try to stay at a high level.
+
+When this function is called, there are no locks held. That also means
+that the state of the owner and lock can change when entered into this function.
+
+Before this function is called, the task has already had rt_mutex_adjust_prio
+performed on it. This means that the task is set to the priority that it
+should be at, but the plist nodes of the task's waiter have not been updated
+with the new priorities, and that this task may not be in the proper locations
+in the pi_lists and wait_lists that the task is blocked on. This function
+solves all that.
+
+A loop is entered, where task is the owner to be checked for PI changes that
+was passed by parameter (for the first iteration). The pi_lock of this task is
+taken to prevent any more changes to the pi_list of the task. This also
+prevents new tasks from completing the blocking on a mutex that is owned by this
+task.
+
+If the task is not blocked on a mutex then the loop is exited. We are at
+the top of the PI chain.
+
+A check is now done to see if the original waiter (the process that is blocked
+on the current mutex) is the top pi waiter of the task. That is, is this
+waiter on the top of the task's pi_list. If it is not, it either means that
+there is another process higher in priority that is blocked on one of the
+mutexes that the task owns, or that the waiter has just woken up via a signal
+or timeout and has left the PI chain. In either case, the loop is exited, since
+we don't need to do any more changes to the priority of the current task, or any
+task that owns a mutex that this current task is waiting on. A priority chain
+walk is only needed when a new top pi waiter is made to a task.
+
+The next check sees if the task's waiter plist node has the priority equal to
+the priority the task is set at. If they are equal, then we are done with
+the loop. Remember that the function started with the priority of the
+task adjusted, but the plist nodes that hold the task in other processes'
+pi_lists have not been adjusted.
+
+Next, we look at the mutex that the task is blocked on. The mutex's wait_lock
+is taken. This is done by a spin_trylock, because the locking order of the
+pi_lock and wait_lock goes in the opposite direction. If we fail to grab the
+lock, the pi_lock is released, and we restart the loop.
+
+Now that we have both the pi_lock of the task as well as the wait_lock of
+the mutex the task is blocked on, we update the task's waiter's plist node
+that is located on the mutex's wait_list.
+
+Now we release the pi_lock of the task.
+
+Next the owner of the mutex has its pi_lock taken, so we can update the
+task's entry in the owner's pi_list. If the task is the highest priority
+process on the mutex's wait_list, then we remove the previous top waiter
+from the owner's pi_list, and replace it with the task.
+
+Note: It is possible that the task was the current top waiter on the mutex,
+ in which case the task is not yet on the pi_list of the waiter. This
+ is OK, since plist_del does nothing if the plist node is not on any
+ list.
+
+If the task was not the top waiter of the mutex, but it was before we
+did the priority updates, that means we are deboosting/lowering the
+task. In this case, the task is removed from the pi_list of the owner,
+and the new top waiter is added.
+
+Lastly, we unlock both the pi_lock of the task, as well as the mutex's
+wait_lock, and continue the loop again. On the next iteration of the
+loop, the previous owner of the mutex will be the task that will be
+processed.
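+
+A heavily simplified sketch of the boosting direction of this walk follows.
+It uses hypothetical sketch_* structures and ignores the pi_lock, the
+wait_lock, the plists, deboosting and deadlock detection. All it shows is
+how a priority moves from a waiter to each owner up the chain, and that
+the walk stops early once an owner is already at an equal or higher
+priority.
+
```c
#include <assert.h>
#include <stddef.h>

struct sketch_mutex;

struct sketch_task {
	int prio;				/* lower number = higher priority */
	struct sketch_mutex *blocked_on;	/* NULL if not blocked */
};

struct sketch_mutex {
	struct sketch_task *owner;		/* NULL if unowned */
};

/* Propagate the priority of waiter to the owners up the chain. */
static void sketch_boost_chain(struct sketch_task *waiter)
{
	assert(waiter != NULL);
	while (waiter->blocked_on && waiter->blocked_on->owner) {
		struct sketch_task *owner = waiter->blocked_on->owner;

		/* Owner already runs at least this high: nothing to do. */
		if (owner->prio <= waiter->prio)
			break;
		owner->prio = waiter->prio;
		waiter = owner;			/* next iteration: the owner */
	}
}
```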
+
+Note: One might think that the owner of this mutex might have changed
+ since we just grabbed the mutex's wait_lock. And one could be right.
+ The important thing to remember is that the owner could not have
+ become the task that is being processed in the PI chain, since
+ we have taken that task's pi_lock at the beginning of the loop.
+ So as long as there is an owner of this mutex that is not the same
+ process as the task being worked on, we are OK.
+
+ Looking closely at the code, one might be confused. The check for the
+ end of the PI chain is when the task isn't blocked on anything or the
+ task's waiter structure "task" element is NULL. This check is
+ protected only by the task's pi_lock. But the code to unlock the mutex
+ sets the task's waiter structure "task" element to NULL with only
+ the protection of the mutex's wait_lock, which was not taken yet.
+ Isn't this a race condition if the task becomes the new owner?
+
+ The answer is No! The trick is the spin_trylock of the mutex's
+ wait_lock. If we fail that lock, we release the pi_lock of the
+ task and continue the loop, doing the end of PI chain check again.
+
+ In the code to release the lock, the wait_lock of the mutex is held
+ the entire time, and it is not let go when we grab the pi_lock of the
+ new owner of the mutex. So if the switch of a new owner were to happen
+ after the check for end of the PI chain and the grabbing of the
+ wait_lock, the unlocking code would spin on the new owner's pi_lock
+ but never give up the wait_lock. So the PI chain loop is guaranteed to
+ fail the spin_trylock on the wait_lock, release the pi_lock, and
+ try again.
+
+ If you don't quite understand the above, that's OK. You don't have to,
+ unless you really want to make a proof out of it ;)
+
+
+Pending Owners and Lock stealing
+--------------------------------
+
+One of the flags in the owner field of the mutex structure is "Pending Owner".
+What this means is that an owner was chosen by the process releasing the
+mutex, but that owner has yet to wake up and actually take the mutex.
+
+Why is this important? Why can't we just give the mutex to another process
+and be done with it?
+
+The PI code is to help with real-time processes, and to let the highest
+priority process run as long as possible with little latencies and delays.
+If a high priority process owns a mutex that a lower priority process is
+blocked on, when the mutex is released it would be given to the lower priority
+process. What if the higher priority process wants to take that mutex again?
+The high priority process would fail to take that mutex that it just gave up
+and it would need to boost the lower priority process to run with full
+latency of that critical section (since the low priority process just entered
+it).
+
+There's no reason a high priority process that gives up a mutex should be
+penalized if it tries to take that mutex again. If the new owner of the
+mutex has not woken up yet, there's no reason that the higher priority process
+could not take that mutex away.
+
+To solve this, we introduced Pending Ownership and Lock Stealing. When a
+new process is given a mutex that it was blocked on, it is only given
+pending ownership. This means that it's the new owner, unless a higher
+priority process comes in and tries to grab that mutex. If a higher priority
+process does come along and wants that mutex, we let the higher priority
+process "steal" the mutex from the pending owner (only if it is still pending)
+and continue with the mutex.
+
+
+Taking of a mutex (The walk through)
+------------------------------------
+
+OK, now let's take a look at the detailed walk through of what happens when
+taking a mutex.
+
+The first thing that is tried is the fast taking of the mutex. This is
+done when we have CMPXCHG enabled (otherwise the fast taking automatically
+fails). Only when the owner field of the mutex is NULL can the lock be
+taken with the CMPXCHG and nothing else needs to be done.
+
+If there is contention on the lock, whether it is owned or pending owner
+we go about the slow path (rt_mutex_slowlock).
+
+The slow path function is where the task's waiter structure is created on
+the stack. This is because the waiter structure is only needed for the
+scope of this function. The waiter structure holds the nodes to store
+the task on the wait_list of the mutex, and if need be, the pi_list of
+the owner.
+
+The wait_lock of the mutex is taken since the slow path of unlocking the
+mutex also takes this lock.
+
+We then call try_to_take_rt_mutex. This is where the architecture that
+does not implement CMPXCHG would always grab the lock (if there's no
+contention).
+
+try_to_take_rt_mutex is used every time the task tries to grab a mutex in the
+slow path. The first thing that is done here is an atomic setting of
+the "Has Waiters" flag of the mutex's owner field. Yes, this could really
+be false, because if the mutex has no owner, there are no waiters and
+the current task also won't have any waiters. But we don't have the lock
+yet, so we assume we are going to be a waiter. The reason for this is to
+play nice for those architectures that do have CMPXCHG. By setting this flag
+now, the owner of the mutex can't release the mutex without going into the
+slow unlock path, and it would then need to grab the wait_lock, which this
+code currently holds. So setting the "Has Waiters" flag forces the owner
+to synchronize with this code.
+
+Now that we know that we can't have any races with the owner releasing the
+mutex, we check to see if we can take the ownership. This is done if the
+mutex doesn't have an owner, or if we can steal the mutex from a pending
+owner. Let's look at the situations we have here.
+
+ 1) Has owner that is pending
+ ----------------------------
+
+ The mutex has an owner, but it hasn't woken up and the mutex flag
+ "Pending Owner" is set. The first check is to see if the owner isn't the
+ current task. This is because this function is also used for the pending
+ owner to grab the mutex. When a pending owner wakes up, it checks to see
+ if it can take the mutex, and this is done if the owner is already set to
+ itself. If so, we succeed and leave the function, clearing the "Pending
+ Owner" bit.
+
+ If the pending owner is not current, we check to see if the current priority is
+ higher than the pending owner. If not, we fail the function and return.
+
+ There's also something special about a pending owner. That is a pending owner
+ is never blocked on a mutex. So there is no PI chain to worry about. It also
+ means that if the mutex doesn't have any waiters, there's no accounting needed
+ to update the pending owner's pi_list, since we only worry about processes
+ blocked on the current mutex.
+
+ If there are waiters on this mutex, and we just stole the ownership, we need
+ to take the top waiter, remove it from the pi_list of the pending owner, and
+ add it to the current pi_list. Note that at this moment, the pending owner
+ is no longer on the list of waiters. This is fine, since the pending owner
+ would add itself back when it realizes that it had the ownership stolen
+ from itself. When the pending owner tries to grab the mutex, it will fail
+ in try_to_take_rt_mutex if the owner field points to another process.
+
+ 2) No owner
+ -----------
+
+ If there is no owner (or we successfully stole the lock), we set the owner
+ of the mutex to current, and set the flag of "Has Waiters" if the current
+ mutex actually has waiters, or we clear the flag if it doesn't. See, it was
+ OK that we set that flag early, since now it is cleared.
+
+ 3) Failed to grab ownership
+ ---------------------------
+
+ The most interesting case is when we fail to take ownership. This means that
+ there exists an owner, or there's a pending owner with equal or higher
+ priority than the current task.
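+
+The three situations above boil down to a small decision, sketched here
+with made-up sketch_* types. The real try_to_take_rt_mutex also sets the
+"Has Waiters" flag and moves the top waiter between pi_lists. None of that
+accounting is modeled, and the "Pending Owner" bit is shown as a plain
+boolean.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_task {
	int prio;			/* lower number = higher priority */
};

struct sketch_mutex {
	struct sketch_task *owner;	/* NULL when there is no owner */
	bool pending;			/* owner has not woken up yet */
};

/* Returns true if cur ends up as the real owner of the mutex. */
static bool sketch_try_to_take(struct sketch_mutex *m, struct sketch_task *cur)
{
	assert(m != NULL && cur != NULL);
	if (m->owner != NULL) {
		/* A real (non-pending) owner can never be displaced. */
		if (!m->pending)
			return false;
		/* Stealing needs a strictly higher priority than the owner. */
		if (m->owner != cur && cur->prio >= m->owner->prio)
			return false;
	}
	/* No owner, our own pending ownership, or a successful steal. */
	m->owner = cur;
	m->pending = false;
	return true;
}
```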
+
+We'll continue on the failed case.
+
+If the mutex has a timeout, we set up a timer to go off to break us out
+of this mutex if we failed to get it after a specified amount of time.
+
+Now we enter a loop that will continue to try to take ownership of the mutex, or
+fail from a timeout or signal.
+
+Once again we try to take the mutex. This will usually fail the first time
+in the loop, since it had just failed to get the mutex. But the second time
+in the loop, this would likely succeed, since the task would likely be
+the pending owner.
+
+If the mutex is TASK_INTERRUPTIBLE a check for signals and timeout is done
+here.
+
+The waiter structure has a "task" field that points to the task that is blocked
+on the mutex. This field can be NULL the first time it goes through the loop
+or if the task is a pending owner and had its mutex stolen. If the "task"
+field is NULL then we need to set up the accounting for it.
+
+Task blocks on mutex
+--------------------
+
+The accounting of a mutex and process is done with the waiter structure of
+the process. The "task" field is set to the process, and the "lock" field
+to the mutex. The plist nodes are initialized to the process's current
+priority.
+
+Since the wait_lock was taken at the entry of the slow lock, we can safely
+add the waiter to the wait_list. If the current process is the highest
+priority process currently waiting on this mutex, then we remove the
+previous top waiter process (if it exists) from the pi_list of the owner,
+and add the current process to that list. Since the pi_list of the owner
+has changed, we call rt_mutex_adjust_prio on the owner to see if the owner
+should adjust its priority accordingly.
+
+If the owner is also blocked on a lock, and had its pi_list changed
+(or deadlock checking is on), we unlock the wait_lock of the mutex and go ahead
+and run rt_mutex_adjust_prio_chain on the owner, as described earlier.
+
+Now all locks are released, and if the current process is still blocked on a
+mutex (waiter "task" field is not NULL), then we go to sleep (call schedule).
+
+Waking up in the loop
+---------------------
+
+The schedule can then wake up for a few reasons.
+ 1) we were given pending ownership of the mutex.
+ 2) we received a signal and were TASK_INTERRUPTIBLE
+ 3) we had a timeout and were TASK_INTERRUPTIBLE
+
+In any of these cases, we continue the loop and once again try to grab the
+ownership of the mutex. If we succeed, we exit the loop. Otherwise, on a
+signal or timeout we also exit the loop, and if we had the mutex stolen we
+simply add ourselves back on the lists and go back to sleep.
+
+Note: For various reasons, because of timeout and signals, the steal mutex
+ algorithm needs to be careful. This is because the current process is
+ still on the wait_list. And because of dynamic changing of priorities,
+ especially on SCHED_OTHER tasks, the current process can be the
+ highest priority task on the wait_list.
+
+Failed to get mutex on Timeout or Signal
+----------------------------------------
+
+If a timeout or signal occurred, the waiter's "task" field would not be
+NULL and the task needs to be taken off the wait_list of the mutex and perhaps
+pi_list of the owner. If this process was a high priority process, then
+the rt_mutex_adjust_prio_chain needs to be executed again on the owner,
+but this time it will be lowering the priorities.
+
+
+Unlocking the Mutex
+-------------------
+
+The unlocking of a mutex also has a fast path for those architectures with
+CMPXCHG. Since the taking of a mutex on contention always sets the
+"Has Waiters" flag of the mutex's owner, we use this to know if we need to
+take the slow path when unlocking the mutex. If the mutex doesn't have any
+waiters, the owner field of the mutex would equal the current process and
+the mutex can be unlocked by just replacing the owner field with NULL.
+
+If the owner field has the "Has Waiters" bit set (or CMPXCHG is not available),
+the slow unlock path is taken.
+
+The first thing done in the slow unlock path is to take the wait_lock of the
+mutex. This synchronizes the locking and unlocking of the mutex.
+
+A check is made to see if the mutex has waiters or not. On architectures that
+do not have CMPXCHG, this is the location that the owner of the mutex will
+determine if a waiter needs to be awoken or not. On architectures that
+do have CMPXCHG, that check is done in the fast path, but it is still needed
+in the slow path too. If a waiter of a mutex woke up because of a signal
+or timeout between the time the owner failed the fast path CMPXCHG check and
+the grabbing of the wait_lock, the mutex may not have any waiters, thus the
+owner still needs to make this check. If there are no waiters then the mutex
+owner field is set to NULL, the wait_lock is released and nothing more is
+needed.
+
+If there are waiters, then we need to wake one up and give that waiter
+pending ownership.
+
+On the wake up code, the pi_lock of the current owner is taken. The top
+waiter of the lock is found and removed from the wait_list of the mutex
+as well as the pi_list of the current owner. The task field of the new
+pending owner's waiter structure is set to NULL, and the owner field of the
+mutex is set to the new owner with the "Pending Owner" bit set, as well
+as the "Has Waiters" bit if there still are other processes blocked on the
+mutex.
+
+The pi_lock of the previous owner is released, and the new pending owner's
+pi_lock is taken. Remember that this is the trick to prevent the race
+condition in rt_mutex_adjust_prio_chain from adding itself as a waiter
+on the mutex.
+
+We now clear the "pi_blocked_on" field of the new pending owner, and if
+the mutex still has waiters pending, we add the new top waiter to the pi_list
+of the pending owner.
+
+Finally we unlock the pi_lock of the pending owner and wake it up.
+
+
+Contact
+-------
+
+For updates on this document, please email Steven Rostedt <rostedt@goodmis.org>
+
+
+Credits
+-------
+
+Author: Steven Rostedt <rostedt@goodmis.org>
+
+Reviewers: Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and Randy Dunlap
+
+Updates
+-------
+
+This document was originally written for 2.6.17-rc3-mm1
diff --git a/Documentation/rt-mutex.txt b/Documentation/rt-mutex.txt
new file mode 100644
index 000000000000..243393d882ee
--- /dev/null
+++ b/Documentation/rt-mutex.txt
@@ -0,0 +1,79 @@
+RT-mutex subsystem with PI support
+----------------------------------
+
+RT-mutexes with priority inheritance are used to support PI-futexes,
+which enable pthread_mutex_t priority inheritance attributes
+(PTHREAD_PRIO_INHERIT). [See Documentation/pi-futex.txt for more details
+about PI-futexes.]
+
+This technology was developed in the -rt tree and streamlined for
+pthread_mutex support.
+
+Basic principles:
+-----------------
+
+RT-mutexes extend the semantics of simple mutexes by the priority
+inheritance protocol.
+
+A low priority owner of a rt-mutex inherits the priority of a higher
+priority waiter until the rt-mutex is released. If the temporarily
+boosted owner blocks on a rt-mutex itself it propagates the priority
+boosting to the owner of the other rt_mutex it gets blocked on. The
+priority boosting is immediately removed once the rt_mutex has been
+unlocked.
+
+This approach allows us to shorten the block of high-prio tasks on
+mutexes which protect shared resources. Priority inheritance is not a
+magic bullet for poorly designed applications, but it allows
+well-designed applications to use userspace locks in critical parts of
+a high priority thread, without losing determinism.
+
+The enqueueing of the waiters into the rtmutex waiter list is done in
+priority order. For same priorities FIFO order is chosen. For each
+rtmutex, only the top priority waiter is enqueued into the owner's
+priority waiters list. This list too queues in priority order. Whenever
+the top priority waiter of a task changes (for example it timed out or
+got a signal), the priority of the owner task is readjusted. [The
+priority enqueueing is handled by "plists", see include/linux/plist.h
+for more details.]
+
+RT-mutexes are optimized for fastpath operations and have no internal
+locking overhead when locking an uncontended mutex or unlocking a mutex
+without waiters. The optimized fastpath operations require cmpxchg
+support. [If that is not available then the rt-mutex internal spinlock
+is used]
+
+The state of the rt-mutex is tracked via the owner field of the rt-mutex
+structure:
+
+rt_mutex->owner holds the task_struct pointer of the owner. Bit 0 and 1
+are used to keep track of the "owner is pending" and "rtmutex has
+waiters" state.
+
+ owner bit1 bit0
+ NULL 0 0 mutex is free (fast acquire possible)
+ NULL 0 1 invalid state
+ NULL 1 0 Transitional state*
+ NULL 1 1 invalid state
+ taskpointer 0 0 mutex is held (fast release possible)
+ taskpointer 0 1 task is pending owner
+ taskpointer 1 0 mutex is held and has waiters
+ taskpointer 1 1 task is pending owner and mutex has waiters
+
+Pending-ownership handling is a performance optimization:
+pending-ownership is assigned to the first (highest priority) waiter of
+the mutex, when the mutex is released. The thread is woken up and once
+it starts executing it can acquire the mutex. Until the mutex is taken
+by it (bit 0 is cleared) a competing higher priority thread can "steal"
+the mutex which puts the woken up thread back on the waiters list.
+
+The pending-ownership optimization is especially important for the
+uninterrupted workflow of high-prio tasks which repeatedly
+take/release locks that have lower-prio waiters. Without this
+optimization the higher-prio thread would ping-pong to the lower-prio
+task [because at unlock time we always assign a new owner].
+
+(*) The "mutex has waiters" bit gets set to take the lock. If the lock
+doesn't already have an owner, this bit is quickly cleared if there are
+no waiters. So this is a transitional state to synchronize with looking
+at the owner field of the mutex and the mutex owner releasing the lock.
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt
index 87d76a5c73d0..f61af23dd85d 100644
--- a/Documentation/sound/alsa/ALSA-Configuration.txt
+++ b/Documentation/sound/alsa/ALSA-Configuration.txt
@@ -472,6 +472,22 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
+ Module snd-darla20
+ ------------------
+
+ Module for Echoaudio Darla20
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
+ Module snd-darla24
+ ------------------
+
+ Module for Echoaudio Darla24
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-dt019x
-----------------
@@ -499,6 +515,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
+ Module snd-echo3g
+ -----------------
+
+ Module for Echoaudio 3G cards (Gina3G/Layla3G)
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-emu10k1
------------------
@@ -657,6 +681,22 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
+ Module snd-gina20
+ -----------------
+
+ Module for Echoaudio Gina20
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
+ Module snd-gina24
+ -----------------
+
+ Module for Echoaudio Gina24
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-gusclassic
---------------------
@@ -760,12 +800,18 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
basic fixed pin assignment w/o SPDIF
auto auto-config reading BIOS (default)
- ALC882/883/885
+ ALC882/885
3stack-dig 3-jack with SPDIF I/O
6stck-dig 6-jack digital with SPDIF I/O
auto auto-config reading BIOS (default)
- ALC861
+ ALC883/888
+ 3stack-dig 3-jack with SPDIF I/O
+ 6stack-dig 6-jack digital with SPDIF I/O
+ 6stack-dig-demo 6-stack digital for Intel demo board
+ auto auto-config reading BIOS (default)
+
+ ALC861/660
3stack 3-jack
3stack-dig 3-jack with SPDIF I/O
6stack-dig 6-jack with SPDIF I/O
@@ -937,6 +983,30 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
driver isn't configured properly or you want to try another
type for testing.
+ Module snd-indigo
+ -----------------
+
+ Module for Echoaudio Indigo
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
+ Module snd-indigodj
+ -------------------
+
+ Module for Echoaudio Indigo DJ
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
+ Module snd-indigoio
+ -------------------
+
+ Module for Echoaudio Indigo IO
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-intel8x0
-------------------
@@ -1036,6 +1106,22 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
This module supports multiple cards.
+ Module snd-layla20
+ ------------------
+
+ Module for Echoaudio Layla20
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
+ Module snd-layla24
+ ------------------
+
+ Module for Echoaudio Layla24
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-maestro3
-------------------
@@ -1056,6 +1142,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
The power-management is supported.
+ Module snd-mia
+ ---------------
+
+ Module for Echoaudio Mia
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-miro
---------------
@@ -1088,6 +1182,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
When no hotplug fw loader is available, you need to load the
firmware via mixartloader utility in alsa-tools package.
+ Module snd-mona
+ ---------------
+
+ Module for Echoaudio Mona
+
+ This module supports multiple cards.
+ The driver requires the firmware loader support on kernel.
+
Module snd-mpu401
-----------------
diff --git a/Documentation/video4linux/README.pvrusb2 b/Documentation/video4linux/README.pvrusb2
new file mode 100644
index 000000000000..c73a32c34528
--- /dev/null
+++ b/Documentation/video4linux/README.pvrusb2
@@ -0,0 +1,212 @@
+
+$Id$
+Mike Isely <isely@pobox.com>
+
+ pvrusb2 driver
+
+Background:
+
+ This driver is intended for the "Hauppauge WinTV PVR USB 2.0", which
+ is a USB 2.0 hosted TV Tuner. This driver is a work in progress.
+ Its history started with the reverse-engineering effort by Björn
+ Danielsson <pvrusb2@dax.nu> whose web page can be found here:
+
+ http://pvrusb2.dax.nu/
+
+ From there Aurelien Alleaume <slts@free.fr> began an effort to
+ create a video4linux compatible driver. I began with Aurelien's
+ last known snapshot and evolved the driver to the state it is in
+ here.
+
+ More information on this driver can be found at:
+
+ http://www.isely.net/pvrusb2.html
+
+
+ This driver has a strong separation of layers. They are very
+ roughly:
+
+ 1a. Low level wire-protocol implementation with the device.
+
+ 1b. I2C adaptor implementation and corresponding I2C client drivers
+ implemented elsewhere in V4L.
+
+ 1c. High level hardware driver implementation which coordinates all
+ activities that ensure correct operation of the device.
+
+ 2. A "context" layer which manages instancing of driver, setup,
+ tear-down, arbitration, and interaction with high level
+ interfaces appropriately as devices are hotplugged in the
+ system.
+
+ 3. High level interfaces which glue the driver to various published
+ Linux APIs (V4L, sysfs, maybe DVB in the future).
+
+ The most important shearing layer is between the top 2 layers. A
+ lot of work went into the driver to ensure that any kind of
+ conceivable API can be laid on top of the core driver. (Yes, the
+ driver internally leverages V4L to do its work but that really has
+ nothing to do with the API published by the driver to the outside
+ world.) The architecture allows for different APIs to
+ simultaneously access the driver. I have a strong sense of fairness
+ about APIs and also feel that it is a good design principle to keep
+ implementation and interface isolated from each other. Thus while
+ right now the V4L high level interface is the most complete, the
+ sysfs high level interface will work equally well for similar
+ functions, and there's no reason I see right now why it shouldn't be
+ possible to produce a DVB high level interface that can sit right
+ alongside V4L.
+
+ NOTE: Complete documentation on the pvrusb2 driver is contained in
+ the html files within the doc directory; these are exactly the same
+ as what is on the web site at the time. Browse those files
+ (especially the FAQ) before asking questions.
+
+
+Building
+
+ Building these modules essentially amounts to just running "make",
+ but you need the kernel source tree nearby and you will likely also
+ want to set a few controlling environment variables first in order
+ to link things up with that source tree. Please see the Makefile
+ here for comments that explain how to do that.
+
+
+Source file list / functional overview:
+
+ (Note: The term "module" used below generally refers to loosely
+ defined functional units within the pvrusb2 driver and bears no
+ relation to the Linux kernel's concept of a loadable module.)
+
+ pvrusb2-audio.[ch] - This is glue logic that resides between this
+ driver and the msp3400.ko I2C client driver (which is found
+ elsewhere in V4L).
+
+ pvrusb2-context.[ch] - This module implements the context for an
+ instance of the driver. Everything else eventually ties back to
+ or is otherwise instanced within the data structures implemented
+ here. Hotplugging is ultimately coordinated here. All high level
+ interfaces tie into the driver through this module. This module
+ helps arbitrate each interface's access to the actual driver core,
+ and is designed to allow concurrent access through multiple
+ instances of multiple interfaces (thus you can for example change
+ the tuner's frequency through sysfs while simultaneously streaming
+ video through V4L out to an instance of mplayer).
+
+ pvrusb2-debug.h - This header defines a printk() wrapper and a mask
+ of debugging bit definitions for the various kinds of debug
+ messages that can be enabled within the driver.
+
+ pvrusb2-debugifc.[ch] - This module implements a crude command line
+ oriented debug interface into the driver. Aside from being part
+ of the process for implementing manual firmware extraction (see
+ the pvrusb2 web site mentioned earlier), probably I'm the only one
+ who has ever used this. It is mainly a debugging aid.
+
+ pvrusb2-eeprom.[ch] - This is glue logic that resides between this
+ driver and the tveeprom.ko module, which is itself implemented
+ elsewhere in V4L.
+
+ pvrusb2-encoder.[ch] - This module implements all protocol needed to
+ interact with the Conexant mpeg2 encoder chip within the pvrusb2
+ device. It is a crude echo of corresponding logic in ivtv,
+ however the design goals (strict isolation) and physical layer
+ (proxy through USB instead of PCI) are different enough that this
+ implementation had to be completely different.
+
+ pvrusb2-hdw-internal.h - This header defines the core data structure
+ in the driver used to track ALL internal state related to control
+ of the hardware. Nobody outside of the core hardware-handling
+ modules should have any business using this header. All external
+ access to the driver should be through one of the high level
+ interfaces (e.g. V4L, sysfs, etc), and in fact even those high
+ level interfaces are restricted to the API defined in
+ pvrusb2-hdw.h and NOT this header.
+
+ pvrusb2-hdw.h - This header defines the full internal API for
+ controlling the hardware. High level interfaces (e.g. V4L, sysfs)
+ will work through here.
+
+ pvrusb2-hdw.c - This module implements all the various bits of logic
+ that handle overall control of a specific pvrusb2 device.
+ (Policy, instantiation, and arbitration of pvrusb2 devices fall
+ within the jurisdiction of pvrusb2-context, not here).
+
+ pvrusb2-i2c-chips-*.c - These modules implement the glue logic to
+ tie together and configure various I2C modules as they attach to
+ the I2C bus. There are two versions of this file. The "v4l2"
+ version is intended to be used in-tree alongside V4L, where we
+ implement just the logic that makes sense for a pure V4L
+ environment. The "all" version is intended for use outside of
+ V4L, where we might encounter other possibly "challenging" modules
+ from ivtv or older kernel snapshots (or even the support modules
+ in the standalone snapshot).
+
+ pvrusb2-i2c-cmd-v4l1.[ch] - This module implements generic V4L1
+ compatible commands to the I2C modules. It is here where state
+ changes inside the pvrusb2 driver are translated into V4L1
+ commands that are in turn sent to the various I2C modules.
+
+ pvrusb2-i2c-cmd-v4l2.[ch] - This module implements generic V4L2
+ compatible commands to the I2C modules. It is here where state
+ changes inside the pvrusb2 driver are translated into V4L2
+ commands that are in turn sent to the various I2C modules.
+
+ pvrusb2-i2c-core.[ch] - This module provides an implementation of a
+ kernel-friendly I2C adaptor driver, through which other external
+ I2C client drivers (e.g. msp3400, tuner, lirc) may connect and
+ operate corresponding chips within the pvrusb2 device. It is
+ through here that other V4L modules can reach into this driver to
+ operate specific pieces (and those modules are in turn driven by
+ glue logic which is coordinated by pvrusb2-hdw, doled out by
+ pvrusb2-context, and then ultimately made available to users
+ through one of the high level interfaces).
+
+ pvrusb2-io.[ch] - This module implements a very low level ring of
+ transfer buffers, required in order to stream data from the
+ device. This module is *very* low level. It only operates the
+ buffers and makes no attempt to define any policy or mechanism for
+ how such buffers might be used.
+
+ pvrusb2-ioread.[ch] - This module layers on top of pvrusb2-io.[ch]
+ to provide a streaming API usable by a read() system call style of
+ I/O. Right now this is the only layer on top of pvrusb2-io.[ch],
+ however the underlying architecture here was intended to allow for
+ other styles of I/O to be implemented with additional modules, like
+ mmap()'ed buffers or something even more exotic.
+
+ pvrusb2-main.c - This is the top level of the driver. Module level
+ and USB core entry points are here. This is our "main".
+
+ pvrusb2-sysfs.[ch] - This is the high level interface which ties the
+ pvrusb2 driver into sysfs. Through this interface you can do
+ everything with the driver except actually stream data.
+
+ pvrusb2-tuner.[ch] - This is glue logic that resides between this
+ driver and the tuner.ko I2C client driver (which is found
+ elsewhere in V4L).
+
+ pvrusb2-util.h - This header defines some common macros used
+ throughout the driver. These macros are not really specific to
+ the driver, but they had to go somewhere.
+
+ pvrusb2-v4l2.[ch] - This is the high level interface which ties the
+ pvrusb2 driver into video4linux. It is through here that V4L
+ applications can open and operate the driver in the usual V4L
+ ways. Note that **ALL** V4L functionality is published only
+ through here and nowhere else.
+
+ pvrusb2-video-*.[ch] - This is glue logic that resides between this
+ driver and the saa711x.ko I2C client driver (which is found
+ elsewhere in V4L). Note that saa711x.ko used to be known as
+ saa7115.ko in ivtv. There are two versions of this; one is
+ selected depending on the particular saa711[5x].ko that is found.
+
+ pvrusb2.h - This header contains compile time tunable parameters
+ (and at the moment the driver has very little that needs to be
+ tuned).
+
+
+ -Mike Isely
+ isely@pobox.com
+
diff --git a/Documentation/watchdog/pcwd-watchdog.txt b/Documentation/watchdog/pcwd-watchdog.txt
index 12187a33e310..d9ee6336c1d4 100644
--- a/Documentation/watchdog/pcwd-watchdog.txt
+++ b/Documentation/watchdog/pcwd-watchdog.txt
@@ -22,78 +22,9 @@
to run the program with an "&" to run it in the background!)
If you want to write a program to be compatible with the PC Watchdog
- driver, simply do the following:
-
--- Snippet of code --
-/*
- * Watchdog Driver Test Program
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <fcntl.h>
-#include <sys/ioctl.h>
-#include <linux/types.h>
-#include <linux/watchdog.h>
-
-int fd;
-
-/*
- * This function simply sends an IOCTL to the driver, which in turn ticks
- * the PC Watchdog card to reset its internal timer so it doesn't trigger
- * a computer reset.
- */
-void keep_alive(void)
-{
- int dummy;
-
- ioctl(fd, WDIOC_KEEPALIVE, &dummy);
-}
-
-/*
- * The main program. Run the program with "-d" to disable the card,
- * or "-e" to enable the card.
- */
-int main(int argc, char *argv[])
-{
- fd = open("/dev/watchdog", O_WRONLY);
-
- if (fd == -1) {
- fprintf(stderr, "Watchdog device not enabled.\n");
- fflush(stderr);
- exit(-1);
- }
-
- if (argc > 1) {
- if (!strncasecmp(argv[1], "-d", 2)) {
- ioctl(fd, WDIOC_SETOPTIONS, WDIOS_DISABLECARD);
- fprintf(stderr, "Watchdog card disabled.\n");
- fflush(stderr);
- exit(0);
- } else if (!strncasecmp(argv[1], "-e", 2)) {
- ioctl(fd, WDIOC_SETOPTIONS, WDIOS_ENABLECARD);
- fprintf(stderr, "Watchdog card enabled.\n");
- fflush(stderr);
- exit(0);
- } else {
- fprintf(stderr, "-d to disable, -e to enable.\n");
- fprintf(stderr, "run by itself to tick the card.\n");
- fflush(stderr);
- exit(0);
- }
- } else {
- fprintf(stderr, "Watchdog Ticking Away!\n");
- fflush(stderr);
- }
-
- while(1) {
- keep_alive();
- sleep(1);
- }
-}
--- End snippet --
+ driver, simply use or modify the watchdog test program:
+ Documentation/watchdog/src/watchdog-test.c
+
Other IOCTL functions include:
diff --git a/Documentation/watchdog/src/watchdog-simple.c b/Documentation/watchdog/src/watchdog-simple.c
new file mode 100644
index 000000000000..85cf17c48669
--- /dev/null
+++ b/Documentation/watchdog/src/watchdog-simple.c
@@ -0,0 +1,17 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+int main(int argc, const char *argv[]) {
+ int fd = open("/dev/watchdog", O_WRONLY);
+ if (fd == -1) {
+ perror("watchdog");
+ exit(1);
+ }
+ while (1) {
+ write(fd, "\0", 1);
+ fsync(fd);
+ sleep(10);
+ }
+}
diff --git a/Documentation/watchdog/src/watchdog-test.c b/Documentation/watchdog/src/watchdog-test.c
new file mode 100644
index 000000000000..65f6c19cb865
--- /dev/null
+++ b/Documentation/watchdog/src/watchdog-test.c
@@ -0,0 +1,68 @@
+/*
+ * Watchdog Driver Test Program
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <linux/types.h>
+#include <linux/watchdog.h>
+
+int fd;
+
+/*
+ * This function simply sends an IOCTL to the driver, which in turn ticks
+ * the PC Watchdog card to reset its internal timer so it doesn't trigger
+ * a computer reset.
+ */
+void keep_alive(void)
+{
+ int dummy;
+
+ ioctl(fd, WDIOC_KEEPALIVE, &dummy);
+}
+
+/*
+ * The main program. Run the program with "-d" to disable the card,
+ * or "-e" to enable the card.
+ */
+int main(int argc, char *argv[])
+{
+ fd = open("/dev/watchdog", O_WRONLY);
+
+ if (fd == -1) {
+ fprintf(stderr, "Watchdog device not enabled.\n");
+ fflush(stderr);
+ exit(-1);
+ }
+
+ if (argc > 1) {
+ if (!strncasecmp(argv[1], "-d", 2)) {
+ ioctl(fd, WDIOC_SETOPTIONS, WDIOS_DISABLECARD);
+ fprintf(stderr, "Watchdog card disabled.\n");
+ fflush(stderr);
+ exit(0);
+ } else if (!strncasecmp(argv[1], "-e", 2)) {
+ ioctl(fd, WDIOC_SETOPTIONS, WDIOS_ENABLECARD);
+ fprintf(stderr, "Watchdog card enabled.\n");
+ fflush(stderr);
+ exit(0);
+ } else {
+ fprintf(stderr, "-d to disable, -e to enable.\n");
+ fprintf(stderr, "run by itself to tick the card.\n");
+ fflush(stderr);
+ exit(0);
+ }
+ } else {
+ fprintf(stderr, "Watchdog Ticking Away!\n");
+ fflush(stderr);
+ }
+
+ while(1) {
+ keep_alive();
+ sleep(1);
+ }
+}
diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt
index 21ed51173662..958ff3d48be3 100644
--- a/Documentation/watchdog/watchdog-api.txt
+++ b/Documentation/watchdog/watchdog-api.txt
@@ -34,22 +34,7 @@ activates as soon as /dev/watchdog is opened and will reboot unless
the watchdog is pinged within a certain time, this time is called the
timeout or margin. The simplest way to ping the watchdog is to write
some data to the device. So a very simple watchdog daemon would look
-like this:
-
-#include <stdlib.h>
-#include <fcntl.h>
-
-int main(int argc, const char *argv[]) {
- int fd=open("/dev/watchdog",O_WRONLY);
- if (fd==-1) {
- perror("watchdog");
- exit(1);
- }
- while(1) {
- write(fd, "\0", 1);
- sleep(10);
- }
-}
+like this source file: see Documentation/watchdog/src/watchdog-simple.c
A more advanced driver could for example check that a HTTP server is
still responding before doing the write call to ping the watchdog.
@@ -110,7 +95,40 @@ current timeout using the GETTIMEOUT ioctl.
ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
printf("The timeout was is %d seconds\n", timeout);
-Envinronmental monitoring:
+Pretimeouts:
+
+Some watchdog timers can be set to have a trigger go off before the
+actual time they will reset the system. This can be done with an NMI,
+interrupt, or other mechanism. This allows Linux to record useful
+information (like panic information and kernel coredumps) before it
+resets.
+
+ pretimeout = 10;
+ ioctl(fd, WDIOC_SETPRETIMEOUT, &pretimeout);
+
+Note that the pretimeout is the number of seconds before the time
+when the timeout will go off. It is not the number of seconds until
+the pretimeout. So, for instance, if you set the timeout to 60 seconds
+and the pretimeout to 10 seconds, the pretimeout will go off in 50
+seconds. Setting a pretimeout to zero disables it.
+
+There is also a get function for getting the pretimeout:
+
+ ioctl(fd, WDIOC_GETPRETIMEOUT, &timeout);
+ printf("The pretimeout is %d seconds\n", timeout);
+
+Not all watchdog drivers will support a pretimeout.
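The arithmetic in the pretimeout note can be sketched as a small helper.
This is only an illustration: the name seconds_until_pretimeout is
hypothetical and not part of the watchdog API, and treating a pretimeout
that is not smaller than the timeout as invalid is an assumption made
here, not something the API text states.

```c
/*
 * Hypothetical helper, not part of the watchdog API.
 *
 * Returns the number of seconds after a ping at which the pretimeout
 * fires, i.e. timeout - pretimeout. Returns -1 when the pretimeout is
 * disabled (zero) or, as an assumption of this sketch, when it is not
 * smaller than the timeout.
 */
int seconds_until_pretimeout(int timeout, int pretimeout)
{
	if (pretimeout <= 0 || pretimeout >= timeout)
		return -1;
	return timeout - pretimeout;
}
```

With a timeout of 60 and a pretimeout of 10 the helper yields 50, which
matches the example in the text; a pretimeout of 0 yields -1.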
+
+Get the number of seconds before reboot:
+
+Some watchdog drivers have the ability to report the remaining time
+before the system will reboot. WDIOC_GETTIMELEFT is the ioctl
+that returns the number of seconds before reboot.
+
+ ioctl(fd, WDIOC_GETTIMELEFT, &timeleft);
+ printf("The time left is %d seconds\n", timeleft);
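Since only some drivers can report the remaining time, a caller may want
to check the ioctl return value before using the result. The following
is a minimal sketch; the wrapper name query_time_left is hypothetical,
and it assumes fd is a descriptor already opened on /dev/watchdog.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Hypothetical wrapper around WDIOC_GETTIMELEFT.
 *
 * On success stores the remaining seconds in *timeleft and returns 0.
 * Returns -1 if the ioctl fails, for example because the descriptor is
 * invalid or the driver does not implement the request; *timeleft is
 * left unchanged by this wrapper in that case.
 */
int query_time_left(int fd, int *timeleft)
{
	int value;

	if (ioctl(fd, WDIOC_GETTIMELEFT, &value) == -1)
		return -1;
	*timeleft = value;
	return 0;
}
```

A daemon could call this after each ping and simply skip the report when
the wrapper returns -1.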
+
+Environmental monitoring:
All watchdog drivers are required return more information about the system,
some do temperature, fan and power level monitoring, some can tell you
@@ -169,6 +187,10 @@ The watchdog saw a keepalive ping since it was last queried.
WDIOF_SETTIMEOUT Can set/get the timeout
+The watchdog can do pretimeouts.
+
+ WDIOF_PRETIMEOUT Pretimeout (in seconds), get/set
+
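Whether a driver advertises pretimeout support can be checked by testing
the WDIOF_PRETIMEOUT bit in the options field returned by the GETSUPPORT
ioctl. A minimal sketch follows; the helper names has_pretimeout and
driver_has_pretimeout are hypothetical, and fd is assumed to be a
descriptor already opened on /dev/watchdog.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Hypothetical helper: test the pretimeout bit in an info structure. */
int has_pretimeout(const struct watchdog_info *info)
{
	return (info->options & WDIOF_PRETIMEOUT) != 0;
}

/*
 * Hypothetical helper: ask the driver behind fd for its capabilities.
 * Returns 1 if pretimeouts are advertised, 0 if not, and -1 if the
 * WDIOC_GETSUPPORT ioctl itself fails.
 */
int driver_has_pretimeout(int fd)
{
	struct watchdog_info info;

	if (ioctl(fd, WDIOC_GETSUPPORT, &info) == -1)
		return -1;
	return has_pretimeout(&info);
}
```

Checking the bit first lets a program avoid calling WDIOC_SETPRETIMEOUT
on a driver that does not advertise the feature.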
For those drivers that return any bits set in the option field, the
GETSTATUS and GETBOOTSTATUS ioctls can be used to ask for the current
diff --git a/Documentation/watchdog/watchdog.txt b/Documentation/watchdog/watchdog.txt
index dffda29c8799..4b1ff69cc19a 100644
--- a/Documentation/watchdog/watchdog.txt
+++ b/Documentation/watchdog/watchdog.txt
@@ -65,28 +65,7 @@ The external event interfaces on the WDT boards are not currently supported.
Minor numbers are however allocated for it.
-Example Watchdog Driver
------------------------
-
-#include <stdio.h>
-#include <unistd.h>
-#include <fcntl.h>
-
-int main(int argc, const char *argv[])
-{
- int fd=open("/dev/watchdog",O_WRONLY);
- if(fd==-1)
- {
- perror("watchdog");
- exit(1);
- }
- while(1)
- {
- write(fd,"\0",1);
- fsync(fd);
- sleep(10);
- }
-}
+Example Watchdog Driver: see Documentation/watchdog/src/watchdog-simple.c
Contact Information