<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
              This file is generated from xml source: DO NOT EDIT
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      -->
<title>Apache mod_rewrite Technical Details - Apache HTTP Server</title>
<link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
<link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
<link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" />
<link href="../images/favicon.ico" rel="shortcut icon" /></head>
<body id="manual-page"><div id="page-header">
<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
<p class="apache">Apache HTTP Server Version 2.5</p>
<img alt="" src="../images/feather.gif" /></div>
<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div>
<div id="path">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.5</a> &gt; <a href="./">Rewrite</a></div><div id="page-content"><div id="preamble"><h1>Apache mod_rewrite Technical Details</h1>
<div class="toplang">
<p><span>Available Languages: </span><a href="../en/rewrite/tech.html" title="English">&nbsp;en&nbsp;</a> |
<a href="../fr/rewrite/tech.html" hreflang="fr" rel="alternate" title="Français">&nbsp;fr&nbsp;</a></p>
</div>

<p>This document discusses some of the technical details of mod_rewrite
and URL matching.</p>
</div>
<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#InternalAPI">API Phases</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#InternalRuleset">Ruleset Processing</a></li>
</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="remapping.html">Redirection and remapping</a></li><li><a href="access.html">Controlling access</a></li><li><a href="vhosts.html">Virtual hosts</a></li><li><a href="proxy.html">Proxying</a></li><li><a href="rewritemap.html">Using RewriteMap</a></li><li><a href="advanced.html">Advanced techniques</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="InternalAPI" id="InternalAPI">API Phases</a></h2>

    <p>The Apache HTTP Server handles requests in several phases. At
    each of these phases, one or more modules may be called upon to
    handle that portion of the request lifecycle. Phases include things
    like URL-to-filename translation, authentication, authorization,
    content, and logging. (This is not an exhaustive list.)</p>

    <p>mod_rewrite acts in two of these phases (or "hooks", as they are
    often called) to influence how URLs may be rewritten.</p>

    <p>First, it uses the URL-to-filename translation hook, which runs
    after the HTTP request has been read, but before any authorization
    starts. Second, it uses the Fixup hook, which runs after the
    authorization phases and after per-directory configuration files
    (<code>.htaccess</code> files) have been read, but before the
    content handler is called.</p>

    <p>So, after a request comes in and a corresponding server or
    virtual host has been determined, the rewriting engine starts
    processing any <code>mod_rewrite</code> directives appearing in the
    per-server configuration (i.e., in the main server configuration file
    and <code class="directive"><a href="../mod/core.html#virtualhost">&lt;VirtualHost&gt;</a></code>
    sections). This happens in the URL-to-filename phase.</p>

    <p>A few steps later, once the final data directories have been found,
    the per-directory configuration directives (<code>.htaccess</code>
    files and <code class="directive"><a href="../mod/core.html#directory">&lt;Directory&gt;</a></code> blocks) are applied. This
    happens in the Fixup phase.</p>

    <p>In each of these cases, mod_rewrite rewrites the
    <code>REQUEST_URI</code> either to a new URL, or to a filename.</p>

    <p>In per-directory context (i.e., within <code>.htaccess</code> files
    and <code>Directory</code> blocks), these rules are applied
    after a URL has already been translated to a filename. Because of
    this, the URL-path that mod_rewrite initially compares <code class="directive"><a href="../mod/mod_rewrite.html#rewriterule">RewriteRule</a></code> directives against
    is the full filesystem path to the translated filename with the current
    directory's path (including a trailing slash) removed from the front.</p>

    <p>To illustrate: if rules are in <code>/var/www/foo/.htaccess</code> and a request
    for <code>/foo/bar/baz</code> is being processed, an expression like <code>^bar/baz$</code> would
    match.</p>
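
    <p>The illustration above might look like the following sketch. It
    assumes a <code>DocumentRoot</code> of <code>/var/www</code> and that
    overrides are permitted for the directory; the target
    <code>qux.html</code> is a hypothetical name chosen only for the
    example.</p>

    <div class="example"><pre>
# /var/www/foo/.htaccess  (assumption: DocumentRoot is /var/www)
RewriteEngine On

# A request for /foo/bar/baz is compared here as "bar/baz", because the
# directory prefix has already been removed from the front.
# A pattern such as ^/foo/bar/baz$ would not match in this context.
# qux.html is a hypothetical target.
RewriteRule ^bar/baz$ qux.html
</pre></div>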

    <p>If a substitution is made in per-directory context, a new internal
    subrequest is issued with the new URL, which restarts processing of the
    request phases. If the substitution is a relative path, the <code class="directive"><a href="../mod/mod_rewrite.html#rewritebase">RewriteBase</a></code> directive
    determines the URL-path prefix prepended to the substitution.
    Because processing restarts, rules in per-directory context must be
    written so that they eventually (in some later "round" of per-directory
    rewrite processing) perform no substitution; otherwise the request loops.
    (See <a href="http://wiki.apache.org/httpd/RewriteLooping">RewriteLooping</a>
    for further discussion of this problem.)</p>
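
    <p>As a hedged sketch of one way to keep a per-directory rule from
    looping, the fragment below adds a condition that fails once the
    rewrite has already been applied. The directory names
    (<code>/foo/</code>, <code>new/</code>) are hypothetical, and the
    sketch assumes the <code>.htaccess</code> file serves the URL-path
    <code>/foo/</code>.</p>

    <div class="example"><pre>
# .htaccess in the directory served as /foo/  (hypothetical layout)
RewriteEngine On

# Prefix prepended to relative substitutions in this directory.
RewriteBase /foo/

# Guard: once the URL is already under /foo/new/, the condition fails,
# so the next round of processing performs no substitution.
RewriteCond %{REQUEST_URI} !^/foo/new/
RewriteRule ^(.*)$ new/$1 [L]
</pre></div>

    <p>Under these assumptions, a request for <code>/foo/bar</code> is
    rewritten to <code>/foo/new/bar</code>. When processing restarts, the
    rule pattern still matches, but the condition does not, so the URL is
    left unchanged.</p>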

    <p>Because of this further manipulation of the URL in per-directory
    context, rewrite rules must be written differently in that
    context. In particular, remember that the
    leading directory path is stripped from the URL that your
    rewrite rules see. The examples below show the same rewrite
    written for three different locations.</p>

    <table class="bordered">

        <tr>
            <th>Location of rule</th>
            <th>Rule</th>
        </tr>

        <tr>
            <td>VirtualHost section</td>
            <td>RewriteRule ^/images/(.+)\.jpg /images/$1.gif</td>
        </tr>

        <tr>
            <td>.htaccess file in document root</td>
            <td>RewriteRule ^images/(.+)\.jpg images/$1.gif</td>
        </tr>

        <tr>
            <td>.htaccess file in images directory</td>
            <td>RewriteRule ^(.+)\.jpg $1.gif</td>
        </tr>

    </table>

    <p>For even more insight into how mod_rewrite manipulates URLs in
    different contexts, you should consult the <a href="../mod/mod_rewrite.html#logging">log entries</a> made during 
    rewriting.</p>
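
    <p>A minimal sketch of enabling those log entries, assuming a server
    version in which mod_rewrite writes to the error log through
    per-module <code>LogLevel</code> settings (2.4 and later). The trace
    level shown is only an illustrative choice.</p>

    <div class="example"><pre>
# Keep the general log level at "alert", but raise mod_rewrite's own
# level so that rewriting decisions are recorded in the error log.
# trace3 is an example; higher trace levels log more detail.
LogLevel alert rewrite:trace3
</pre></div>

    <p>High trace levels slow the server down noticeably, so they are
    best kept for debugging.</p>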

</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="InternalRuleset" id="InternalRuleset">Ruleset Processing</a></h2>

      <p>When mod_rewrite is triggered in these two API phases, it
      reads the configured rulesets from its configuration
      structure (which was created either on startup for
      per-server context or during the directory walk of the Apache
      kernel for per-directory context). The URL rewriting
      engine is then started with the contained ruleset (one or more
      rules together with their conditions). The operation of the
      URL rewriting engine itself is exactly the same for both
      configuration contexts. Only the final result processing is
      different.</p>

      <p>The order of rules in the ruleset is important because the
      rewriting engine processes them in a particular (and not very
      obvious) order. The rewriting engine loops
      through the ruleset rule by rule (<code class="directive"><a href="../mod/mod_rewrite.html#rewriterule">RewriteRule</a></code> directives) and,
      when a particular rule matches, it optionally loops through
      the corresponding conditions (<code>RewriteCond</code>
      directives). For historical reasons the conditions are written
      first in the configuration, so the control flow is a little
      roundabout. See Figure 1 for more details.</p>
<p class="figure">
      <img src="../images/rewrite_process_uri.png" alt="Flow of RewriteRule and RewriteCond matching" /><br />
      <dfn>Figure 1:</dfn> The control flow through the rewriting ruleset
</p>
      <p>First the URL is matched against the
      <em>Pattern</em> of each rule. If the match fails, mod_rewrite
      immediately stops processing this rule and continues with the
      next rule. If the <em>Pattern</em> matches, mod_rewrite looks
      for corresponding rule conditions (<code>RewriteCond</code> directives,
      appearing immediately above the <code>RewriteRule</code> in the configuration).
      If none are present, it replaces the URL with a new value
      constructed from the string <em>Substitution</em>, and goes on
      with its rule-looping. If conditions exist, it starts an
      inner loop that processes them in the order in which they are
      listed. For conditions, the logic is different: mod_rewrite does not
      match a pattern against the current URL. Instead it first creates a
      string <em>TestString</em> by expanding variables,
      back-references, map lookups, <em>etc.</em> and then tries
      to match <em>CondPattern</em> against it. If the pattern
      doesn't match, the complete set of conditions and the
      corresponding rule fail. If the pattern matches, the
      next condition is processed, until no more conditions are
      available. If all conditions match, processing continues
      with the substitution of the URL with
      <em>Substitution</em>.</p>
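
      <p>The order described above can be annotated on a small ruleset.
      This is only a sketch for per-server context; the host name and
      the paths are hypothetical values chosen for the example.</p>

      <div class="example"><pre>
# Written first, evaluated second: the TestString %{HTTP_HOST} is
# expanded, then the CondPattern is matched against it.
RewriteCond %{HTTP_HOST} ^www\.example\.com$

# Evaluated third, and only if the previous condition matched.
RewriteCond %{REQUEST_METHOD} ^GET$

# Evaluated first: the Pattern is matched against the URL-path.
# If it fails, neither condition above is looked at.
# The Substitution is applied last, only if the Pattern and all
# conditions matched.
RewriteRule ^/docs/(.+)$ /manual/$1
</pre></div>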

</div></div>
<div class="bottomlang">
<p><span>Available Languages: </span><a href="../en/rewrite/tech.html" title="English">&nbsp;en&nbsp;</a> |
<a href="../fr/rewrite/tech.html" hreflang="fr" rel="alternate" title="Français">&nbsp;fr&nbsp;</a></p>
</div><div id="footer">
<p class="apache">Copyright 2012 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div>
</body></html>