summaryrefslogtreecommitdiffstats
path: root/docs/manual/urlmapping.xml
blob: b061ef5f1d6634df2c0e8ee21837988c3ce3c1ad (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="urlmapping.xml.meta">

  <title>Mapping URLs to Filesystem Locations</title>

  <summary>
    <p>This document explains how the Apache HTTP Server uses the URL of a request
    to determine the filesystem location from which to serve a
    file.</p>
  </summary>

<section id="related"><title>Related Modules and Directives</title>

<related>
<modulelist>
<module>mod_actions</module>
<module>mod_alias</module>
<module>mod_dir</module>
<module>mod_imagemap</module>
<module>mod_negotiation</module>
<module>mod_proxy</module>
<module>mod_rewrite</module>
<module>mod_speling</module>
<module>mod_userdir</module>
<module>mod_vhost_alias</module>
</modulelist>
<directivelist>
<directive module="mod_alias">Alias</directive>
<directive module="mod_alias">AliasMatch</directive>
<directive module="mod_speling">CheckSpelling</directive>
<directive module="core">DocumentRoot</directive>
<directive module="core">ErrorDocument</directive>
<directive module="core">Options</directive>
<directive module="mod_proxy">ProxyPass</directive>
<directive module="mod_proxy">ProxyPassReverse</directive>
<directive module="mod_proxy">ProxyPassReverseCookieDomain</directive>
<directive module="mod_proxy">ProxyPassReverseCookiePath</directive>
<directive module="mod_alias">Redirect</directive>
<directive module="mod_alias">RedirectMatch</directive>
<directive module="mod_rewrite">RewriteCond</directive>
<directive module="mod_rewrite">RewriteRule</directive>
<directive module="mod_alias">ScriptAlias</directive>
<directive module="mod_alias">ScriptAliasMatch</directive>
<directive module="mod_userdir">UserDir</directive>
</directivelist>
</related>
</section>

<section id="documentroot"><title>DocumentRoot</title>

    <p>In deciding what file to serve for a given request, httpd's
    default behavior is to take the URL-Path for the request (the part
    of the URL following the hostname and port) and add it to the end
    of the <directive module="core">DocumentRoot</directive> specified
    in your configuration files. Therefore, the files and directories
    underneath the <directive module="core">DocumentRoot</directive>
    make up the basic document tree which will be visible from the
    web.</p>

    <p>For example, if <directive module="core">DocumentRoot</directive>
    were set to <code>/var/www/html</code> then a request for
    <code>http://www.example.com/fish/guppies.html</code> would result
    in the file <code>/var/www/html/fish/guppies.html</code> being
    served to the requesting client.</p>

    <p>httpd is also capable of <a href="vhosts/">Virtual
    Hosting</a>, where the server receives requests for more than one
    host. In this case, a different <directive
    module="core">DocumentRoot</directive> can be specified for each
    virtual host, or alternatively, the directives provided by the
    module <module>mod_vhost_alias</module> can
    be used to dynamically determine the appropriate place from which
    to serve content based on the requested IP address or
    hostname.</p>

    <p>The <directive module="core">DocumentRoot</directive> directive
    is set in your main server configuration file
    (<code>httpd.conf</code>) and, possibly, once per additional <a
    href="vhosts/">Virtual Host</a> you create.</p>
</section>

<section id="outside"><title>Files Outside the DocumentRoot</title>

    <p>There are frequently circumstances where it is necessary to
    allow web access to parts of the filesystem that are not strictly
    underneath the <directive
    module="core">DocumentRoot</directive>. httpd offers several
    different ways to accomplish this. On Unix systems, symbolic links
    can bring other parts of the filesystem under the <directive
    module="core">DocumentRoot</directive>. For security reasons,
    httpd will follow symbolic links only if the <directive
    module="core">Options</directive> setting for the relevant
    directory includes <code>FollowSymLinks</code> or
    <code>SymLinksIfOwnerMatch</code>.</p>

    <p>Alternatively, the <directive
    module="mod_alias">Alias</directive> directive will map any part
    of the filesystem into the web space. For example, with</p>

<example>Alias /docs /var/web</example>

    <p>the URL <code>http://www.example.com/docs/dir/file.html</code>
    will be served from <code>/var/web/dir/file.html</code>. The
    <directive module="mod_alias">ScriptAlias</directive> directive
    works the same way, with the additional effect that all content
    located at the target path is treated as <glossary ref="cgi"
    >CGI</glossary> scripts.</p>

    <p>For situations where you require additional flexibility, you
    can use the <directive module="mod_alias">AliasMatch</directive>
    and <directive module="mod_alias">ScriptAliasMatch</directive>
    directives to do powerful <glossary ref="regex">regular
    expression</glossary> based matching and substitution. For
    example,</p>

<example>ScriptAliasMatch ^/~([a-zA-Z0-9]+)/cgi-bin/(.+)
      /home/$1/cgi-bin/$2</example>

    <p>will map a request to
    <code>http://example.com/~user/cgi-bin/script.cgi</code> to the
    path <code>/home/user/cgi-bin/script.cgi</code> and will treat
    the resulting file as a CGI script.</p>
</section>

<section id="user"><title>User Directories</title>

    <p>Traditionally on Unix systems, the home directory of a
    particular <em>user</em> can be referred to as
    <code>~user/</code>. The module <module>mod_userdir</module>
    extends this idea to the web by allowing files under each user's
    home directory to be accessed using URLs such as the
    following.</p>

<example>http://www.example.com/~user/file.html</example>

    <p>For security reasons, it is inappropriate to give direct
    access to a user's home directory from the web. Therefore, the
    <directive module="mod_userdir">UserDir</directive> directive
    specifies a directory underneath the user's home directory
    where web files are located. Using the default setting of
    <code>Userdir public_html</code>, the above URL maps to a file
    at a directory like
    <code>/home/user/public_html/file.html</code> where
    <code>/home/user/</code> is the user's home directory as
    specified in <code>/etc/passwd</code>.</p>

    <p>There are also several other forms of the
    <code>Userdir</code> directive which you can use on systems
    where <code>/etc/passwd</code> does not contain the location of
    the home directory.</p>

    <p>Some people find the "~" symbol (which is often encoded on the
    web as <code>%7e</code>) to be awkward and prefer to use an
    alternate string to represent user directories. This functionality
    is not supported by mod_userdir. However, if users' home
    directories are structured in a regular way, then it is possible
    to use the <directive module="mod_alias">AliasMatch</directive>
    directive to achieve the desired effect. For example, to make
    <code>http://www.example.com/upages/user/file.html</code> map to
    <code>/home/user/public_html/file.html</code>, use the following
    <code>AliasMatch</code> directive:</p>

<example>AliasMatch ^/upages/([a-zA-Z0-9]+)(/(.*))?$
      /home/$1/public_html/$3</example>
</section>

<section id="redirect"><title>URL Redirection</title>

    <p>The configuration directives discussed in the above sections
    tell httpd to get content from a specific place in the filesystem
    and return it to the client. Sometimes, it is desirable instead to
    inform the client that the requested content is located at a
    different URL, and instruct the client to make a new request with
    the new URL. This is called <em>redirection</em> and is
    implemented by the <directive
    module="mod_alias">Redirect</directive> directive. For example, if
    the contents of the directory <code>/foo/</code> under the
    <directive module="core">DocumentRoot</directive> are moved
    to the new directory <code>/bar/</code>, you can instruct clients
    to request the content at the new location as follows:</p>

<example>Redirect permanent /foo/
      http://www.example.com/bar/</example>

    <p>This will redirect any URL-Path starting in
    <code>/foo/</code> to the same URL path on the
    <code>www.example.com</code> server with <code>/bar/</code>
    substituted for <code>/foo/</code>. You can redirect clients to
    any server, not only the origin server.</p>

    <p>httpd also provides a <directive
    module="mod_alias">RedirectMatch</directive> directive for more
    complicated rewriting problems. For example, to redirect requests
    for the site home page to a different site, but leave all other
    requests alone, use the following configuration:</p>

<example>RedirectMatch permanent ^/$
      http://www.example.com/startpage.html</example>

    <p>Alternatively, to temporarily redirect all pages on one site
    to a particular page on another site, use the following:</p>

<example>RedirectMatch temp .*
      http://othersite.example.com/startpage.html</example>
</section>

<section id="proxy"><title>Reverse Proxy</title>

<p>httpd also allows you to bring remote documents into the URL space
of the local server.  This technique is called <em>reverse
proxying</em> because the web server acts like a proxy server by
fetching the documents from a remote server and returning them to the
client.  It is different from normal (forward) proxying because, to the client,
it appears the documents originate at the reverse proxy server.</p>

<p>In the following example, when clients request documents under the
<code>/foo/</code> directory, the server fetches those documents from
the <code>/bar/</code> directory on <code>internal.example.com</code>
and returns them to the client as if they were from the local
server.</p>

<example>
ProxyPass /foo/ http://internal.example.com/bar/<br />
ProxyPassReverse /foo/ http://internal.example.com/bar/<br />
ProxyPassReverseCookieDomain internal.example.com public.example.com<br />
ProxyPassReverseCookiePath /foo/ /bar/
</example>

<p>The <directive module="mod_proxy">ProxyPass</directive> configures
the server to fetch the appropriate documents, while the
<directive module="mod_proxy">ProxyPassReverse</directive>
directive rewrites redirects originating at
<code>internal.example.com</code> so that they target the appropriate
directory on the local server.  Similarly, the
<directive module="mod_proxy">ProxyPassReverseCookieDomain</directive>
and <directive module="mod_proxy">ProxyPassReverseCookiePath</directive>
rewrite cookies set by the backend server.</p>
<p>It is important to note, however, that
links inside the documents will not be rewritten. So any absolute
links on <code>internal.example.com</code> will result in the client
breaking out of the proxy server and requesting directly from
<code>internal.example.com</code>. You can modify these links (and other
content) in a page as it is being served to the client using
<module>mod_substitute</module>.</p>

<example>
Substitute s/internal\.example\.com/www.example.com/i
</example>

<p>Additionally, a third-party module,
<a href="http://apache.webthing.com/mod_proxy_html/">mod_proxy_html</a>,
is available to rewrite links in HTML and XHTML.</p>
</section>

<section id="rewrite"><title>Rewriting Engine</title>

    <p>When even more powerful substitution is required, the rewriting
    engine provided by <module>mod_rewrite</module>
    can be useful. The directives provided by this module can use
    characteristics of the request such as browser type or source IP
    address in deciding from where to serve content. In addition,
    mod_rewrite can use external database files or programs to
    determine how to handle a request. The rewriting engine is capable
    of performing all three types of mappings discussed above:
    internal redirects (aliases), external redirects, and proxying.
    Many practical examples employing mod_rewrite are discussed in the
    <a href="rewrite/">detailed mod_rewrite documentation</a>.</p>
</section>

<section id="notfound"><title>File Not Found</title>

    <p>Inevitably, URLs will be requested for which no matching
    file can be found in the filesystem. This can happen for
    several reasons. In some cases, it can be a result of moving
    documents from one location to another. In this case, it is
    best to use <a href="#redirect">URL redirection</a> to inform
    clients of the new location of the resource. In this way, you
    can assure that old bookmarks and links will continue to work,
    even though the resource is at a new location.</p>

    <p>Another common cause of "File Not Found" errors is
    accidental mistyping of URLs, either directly in the browser,
    or in HTML links. httpd provides the module
    <module>mod_speling</module> (sic) to help with
    this problem. When this module is activated, it will intercept
    "File Not Found" errors and look for a resource with a similar
    filename. If one such file is found, mod_speling will send an
    HTTP redirect to the client informing it of the correct
    location. If several "close" files are found, a list of
    available alternatives will be presented to the client.</p>

    <p>An especially useful feature of mod_speling, is that it will
    compare filenames without respect to case. This can help
    systems where users are unaware of the case-sensitive nature of
    URLs and the unix filesystem. But using mod_speling for
    anything more than the occasional URL correction can place
    additional load on the server, since each "incorrect" request
    is followed by a URL redirection and a new request from the
    client.</p>

    <p><module>mod_dir</module> provides <directive module="mod_dir"
    >FallbackResource</directive>, which can be used to map virtual
    URIs to a real resource, which then serves them. This is a very
    useful replace to <module>mod_rewrite</module> when implementing
    a 'front controler'</p>

    <p>If all attempts to locate the content fail, httpd returns
    an error page with HTTP status code 404 (file not found). The
    appearance of this page is controlled with the
    <directive module="core">ErrorDocument</directive> directive
    and can be customized in a flexible manner as discussed in the
    <a href="custom-error.html">Custom error responses</a>
    document.</p>
</section>

<section id="other"><title>Other URL Mapping Modules</title>

<!-- TODO Flesh out each of the items in the list below. -->

    <p>Other modules available for URL mapping include:</p>

    <ul>
    <li><module>mod_actions</module> - Maps a request to a CGI script
    based on the request method, or resource MIME type.</li>
    <li><module>mod_dir</module> - Provides basic mapping of a trailing
    slash into an index file such as <code>index.html</code>.</li>
    <li><module>mod_imagemap</module> - Maps a request to a URL based
    on where a user clicks on an image embedded in a HTML document.</li>
    <li><module>mod_negotiation</module> - Selects an appropriate
    document based on client preferences such as language or content
    compression.</li>
    </ul>

</section>

</manualpage>