Diffstat (limited to 'docs/manual/content-negotiation.html.en')
-rw-r--r-- | docs/manual/content-negotiation.html.en | 216 |
1 files changed, 216 insertions, 0 deletions
diff --git a/docs/manual/content-negotiation.html.en b/docs/manual/content-negotiation.html.en
new file mode 100644
index 0000000000..20825f3661
--- /dev/null
+++ b/docs/manual/content-negotiation.html.en
@@ -0,0 +1,216 @@
+<html>
+<head>
+<title>Apache server Content arbitration: MultiViews and *.var files</title>
+</head>
+
+<body>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<h3>Content Arbitration: MultiViews and *.var files</h3>
+
+The HTTP standard allows clients (i.e., browsers like Mosaic or
+Netscape) to specify what data formats they are prepared to accept.
+The intention is that when information is available in multiple
+variants (e.g., in different data formats), servers can use this
+information to decide which variant to send. This feature has been
+supported in the CERN server for a while, and while it is not yet
+supported in the NCSA server, it is likely to assume a new importance
+in light of the emergence of HTML3-capable browsers. <p>
+
+The Apache module <A HREF="mod_negotiation.html">mod_negotiation</A> handles
+content negotiation in two different ways: special treatment for the
+pseudo-mime-type <code>application/x-type-map</code>, and the
+MultiViews per-directory Option (which can be set in srm.conf, or in
+.htaccess files, as usual). These features are alternate user
+interfaces to what amounts to the same piece of code (in the new file
+<code>http_mime_db.c</code>) which implements the content negotiation
+portion of the HTTP protocol. <p>
+
+Each of these features allows one of several files to satisfy a
+request, based on what the client says it's willing to accept; the
+differences are in the way the files are identified:
+
+<ul>
+ <li> A type map (i.e., a <code>*.var</code> file) names the files
+ containing the variants explicitly
+ <li> In a MultiViews search, the server does an implicit filename
+ pattern match, and chooses from among the results.
+</ul>
+
+Apache also supports a new pseudo-MIME type,
+text/x-server-parsed-html3, which is treated as text/html;level=3
+for purposes of content negotiation, and as server-side-included HTML
+elsewhere.
+
+<h3>Type maps (*.var files)</h3>
+
+A type map is a document which is typed by the server (using its
+normal suffix-based mechanisms) as
+<code>application/x-type-map</code>. Note that to use this feature,
+you've got to have an <code>AddType</code> some place which defines a
+file suffix as <code>application/x-type-map</code>; the easiest thing
+may be to stick a
+<pre>
+
+ AddType application/x-type-map var
+
+</pre>
+in <code>srm.conf</code>. See comments in the sample config files for
+details. <p>
+
+Type map files have an entry for each available variant; these entries
+consist of contiguous RFC822-format header lines. Entries for
+different variants are separated by blank lines. Blank lines are
+illegal within an entry. It is conventional to begin a map file with
+an entry for the combined entity as a whole, e.g.,
+<pre>
+
+ URI: foo; vary="type,language"
+
+ URI: foo.en.html
+ Content-type: text/html; level=2
+ Content-language: en
+
+ URI: foo.fr.html
+ Content-type: text/html; level=2
+ Content-language: fr
+
+</pre>
+If the variants have different qualities, that may be indicated by the
+"qs" parameter, as in this picture (available as jpeg, gif, or ASCII-art):
+<pre>
+
+ URI: foo; vary="type,language"
+
+ URI: foo.jpeg
+ Content-type: image/jpeg; qs=0.8
+
+ URI: foo.gif
+ Content-type: image/gif; qs=0.5
+
+ URI: foo.txt
+ Content-type: text/plain; qs=0.01
+
+</pre><p>
+
+The full list of headers recognized is:
+
+<dl>
+ <dt> <code>URI:</code>
+ <dd> URI of the file containing the variant (of the given media
+ type, encoded with the given content encoding). These are
+ interpreted as URLs relative to the map file; they must be on
+ the same server (!), and they must refer to files to which the
+ client would be granted access if they were to be requested
+ directly.
+ <dt> <code>Content-type:</code>
+ <dd> media type --- level may be specified, along with "qs". These
+ are often referred to as MIME types; typical media types are
+ <code>image/gif</code>, <code>text/plain</code>, or
+ <code>text/html; level=3</code>.
+ <dt> <code>Content-language:</code>
+ <dd> The language of the variant, specified as an internet standard
+ language code (e.g., <code>en</code> for English,
+ <code>ko</code> for Korean, etc.).
+ <dt> <code>Content-encoding:</code>
+ <dd> If the file is compressed, or otherwise encoded, rather than
+ containing the actual raw data, this says how that was done.
+ For compressed files (the only case where this generally comes
+ up), content encoding should be
+ <code>x-compress</code>, or <code>gzip</code>, as appropriate.
+ <dt> <code>Content-length:</code>
+ <dd> The size of the file. Clients can ask to receive a given media
+ type only if the variant isn't too big; specifying a content
+ length in the map allows the server to compare against these
+ thresholds without checking the actual file.
+</dl>
+
+<h3>MultiViews</h3>
+
+This is a per-directory option, meaning it can be set with an
+<code>Options</code> directive within a <code>&lt;Directory&gt;</code>
+section in <code>access.conf</code>, or (if <code>AllowOverride</code>
+is properly set) in <code>.htaccess</code> files. Note that
+<code>Options All</code> does not set <code>MultiViews</code>; you
+have to ask for it by name. (Fixing this is a one-line change to
+<code>httpd.h</code>).
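+
+<p>
+
+As a minimal sketch, a section like the following in
+<code>access.conf</code> would turn the option on for one directory.
+The path is a hypothetical placeholder, and <code>Indexes</code> is
+listed only to show <code>MultiViews</code> named alongside another
+option:
+<pre>
+
+ &lt;Directory /usr/local/etc/httpd/htdocs/docs&gt;
+ Options Indexes MultiViews
+ &lt;/Directory&gt;
+
+</pre>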
+
+<p>
+
+The effect of <code>MultiViews</code> is as follows: if the server
+receives a request for <code>/some/dir/foo</code>, if
+<code>/some/dir</code> has <code>MultiViews</code> enabled, and
+<code>/some/dir/foo</code> does <em>not</em> exist, then the server reads the
+directory looking for files named foo.*, and effectively fakes up a
+type map which names all those files, assigning them the same media
+types and content-encodings they would have if the client had asked for
+one of them by name. It then chooses the best match to the client's
+requirements, and forwards it along.
+
+<p>
+
+This also applies to searches for the file named by the
+<code>DirectoryIndex</code> directive, if the server is trying to
+index a directory; if the configuration files specify
+<pre>
+
+ DirectoryIndex index
+
+</pre> then the server will arbitrate between <code>index.html</code>
+and <code>index.html3</code> if both are present. If neither is
+present, and <code>index.cgi</code> is there, the server will run it.
+
+<p>
+
+If one of the files found by the globbing is a CGI script, it's not
+obvious what should happen. My code gives that case special
+treatment --- if the request was a POST, or a GET with QUERY_ARGS or
+PATH_INFO, the script is given an extremely high quality rating, and
+generally invoked; otherwise it is given an extremely low quality
+rating, which generally causes one of the other views (if any) to be
+retrieved. This is the only jiggering of quality ratings done by the
+MultiViews code; aside from that, all qualities in the synthesized
+type maps are 1.0.
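+
+<p>
+
+To illustrate the plain case, suppose a directory holds only the
+hypothetical files <code>foo.html</code> and <code>foo.gif</code>, with
+no CGI script among them. For a request for <code>foo</code>, the
+synthesized map would then be roughly equivalent to the following type
+map (a sketch of the effect, not something the server writes out):
+<pre>
+
+ URI: foo.html
+ Content-type: text/html; qs=1.0
+
+ URI: foo.gif
+ Content-type: image/gif; qs=1.0
+
+</pre>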
+
+<p>
+
+<B>New as of 0.8:</B> Documents in multiple languages can also be resolved through the use
+of the <code>AddLanguage</code> and <code>LanguagePriority</code>
+directives:
+
+<pre>
+AddLanguage en .en
+AddLanguage fr .fr
+AddLanguage de .de
+AddLanguage da .da
+AddLanguage el .el
+AddLanguage it .it
+
+# LanguagePriority allows you to give precedence to some languages
+# in case of a tie during content negotiation.
+# Just list the languages in decreasing order of preference.
+
+LanguagePriority en fr de
+</pre>
+
+Here, a request for "foo.html" matched against "foo.html.en" and
+"foo.html.fr" would return a French document to a browser that
+indicated a preference for French, or an English document otherwise.
+In fact, a request for "foo" matched against "foo.html.en",
+"foo.html.fr", "foo.ps.en", "foo.pdf.de", and "foo.txt.it" would do
+just what you expect: treat those suffixes as a database and compare
+the request to it, returning the best match. The languages and data
+types share the same suffix name space.
+
+<p>
+
+Note that this machinery only comes into play if the file which the
+user attempted to retrieve does <em>not</em> exist by that name; if it
+does, it is simply retrieved as usual. (So, someone who actually asks
+for <code>foo.jpeg</code>, as opposed to <code>foo</code>, never gets
+<code>foo.gif</code>).
+
+<P><HR><P>
+<A HREF="../"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="./"><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+
+</body> </html>