<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision: 1673932 $ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="public_html.xml.meta">
<parentdocument href="./">How-To / Tutorials</parentdocument>

  <title>Reverse Proxy Guide</title>

  <summary>
    <p>In addition to being a "basic" web server, and providing static and
    dynamic content to end-users, Apache httpd (as well as most other web
    servers) can also act as a reverse proxy server, also-known-as a
    "gateway" server.</p>

    <p>In such scenarios, httpd itself does not generate or host the data,
    but rather the content is obtained by one or several backend servers,
    which normally have no direct connection to the external network. As
    httpd receives a request from a client, the request itself is <em>proxied</em>
    to one of these backend servers, which then handles the request, generates
    the content and then sends this content back to httpd, which then
    generates the actual HTTP response back to the client.</p>

    <p>There are numerous reasons for such an implementation, but generally
    the typical rationales are due to security, high-availability, load-balancing
    and centralized authentication/authorization. It is critical in these
    implementations that the layout, design and architecture of the backend
    infrastructure (those servers which actually handle the requests) are
    insulated and protected from the outside; as far as the client is concerned,
    the reverse proxy server <em>is</em> the sole source of all content.</p>

    <p>A typical implementation is below:</p>
    <p><img src="../images/reverse-proxy-arch.png" alt="reverse-proxy-arch" /></p>

  </summary>


  <section id="related">
  <title>Reverse Proxy</title>
  <related>
    <modulelist>
      <module>mod_proxy</module>
      <module>mod_proxy_balancer</module>
      <module>mod_proxy_hcheck</module>
    </modulelist>
    <directivelist>
      <directive module="mod_proxy">ProxyPass</directive>
      <directive module="mod_proxy">BalancerMember</directive>
    </directivelist>
  </related>
  </section>

  <section id="simple">
    <title>Simple reverse proxying</title>

    <p>
      The <directive module="mod_proxy">ProxyPass</directive>
      directive specifies the mapping of incoming requests to the backend
      server (or a cluster of servers known as a <code>Balancer</code>
      group). The simplest example proxies all requests (<code>"/"</code>)
      to a single backend:
    </p>

    <highlight language="config">
ProxyPass "/"  "http://www.example.com/"
    </highlight>

    <p>
      To ensure that any <code>Location:</code> headers generated by
      the backend are modified to point to the reverse proxy, instead of
      back to the backend itself, the <directive module="mod_proxy">ProxyPassReverse</directive>
      directive is most often required:
    </p>

    <highlight language="config">
ProxyPass "/"  "http://www.example.com/"
ProxyPassReverse "/"  "http://www.example.com/"
    </highlight>

    <p>You can also proxy only specific URIs, as shown in this example:</p>

    <highlight language="config">
ProxyPass "/images"  "http://www.example.com"
ProxyPassReverse "/images"  "http://www.example.com"
    </highlight>

    <p>In the above, any request whose path starts with <code>/images</code>
      will be proxied to the specified backend; all other requests will be
      handled locally.
    </p>
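
    <p>As a minimal sketch, a sub-path can be kept out of the proxied range
      with the special <code>!</code> target, which needs to appear before the
      more general mapping. The <code>/images/local</code> path used here is a
      hypothetical example and is not part of the setup above:</p>

    <highlight language="config">
# Assumption: /images/local is served by this httpd instance, not the backend
ProxyPass "/images/local"  "!"
ProxyPass "/images"  "http://www.example.com"
ProxyPassReverse "/images"  "http://www.example.com"
    </highlight>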
  </section>

  <section id="cluster">
    <title>Clusters and Balancers</title>

    <p>
      As useful as the above is, it still has a deficiency: should
      the (single) backend node go down or become heavily loaded, proxying
      those requests provides no real advantage. What is needed is the ability
      to define a set or group of backend servers which can handle such
      requests, and for the reverse proxy to load balance and fail over among
      them. This group is sometimes called a <em>cluster</em>, but Apache httpd's
      term is a <em>balancer</em>. One defines a balancer by leveraging the
      <directive module="mod_proxy">Proxy</directive> and
      <directive module="mod_proxy">BalancerMember</directive> directives as
      shown:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080
    ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;

ProxyPass "/images"  "balancer://myset"
ProxyPassReverse "/images"  "balancer://myset"
    </highlight>

    <p>
      The <code>balancer://</code> scheme is what tells httpd that we are creating
      a balancer set, with the name <em>myset</em>. It includes 2 backend servers,
      which httpd calls <em>BalancerMembers</em>. In this case, any requests for
      <code>/images</code> will be proxied to <em>one</em> of the 2 backends.
      The <directive module="mod_proxy">ProxySet</directive> directive
      specifies that the <em>myset</em> Balancer use a load balancing algorithm
      that balances based on I/O bytes.
    </p>

    <note type="hint"><title>Hint</title>
      <p>
      	<em>BalancerMembers</em> are also sometimes referred to as <em>workers</em>.
      </p>
   </note>
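
    <p>
      A balancer configuration such as the one above typically relies on
      several modules being loaded. The following is only a sketch: the module
      file paths are assumptions that vary by installation, and modules
      compiled statically into httpd need no <code>LoadModule</code> line.
    </p>

    <highlight language="config">
# Core proxy support and the handler for http:// backends
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
# Balancer support, its shared-memory provider, and the bytraffic method
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
    </highlight>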

  </section>

  <section id="config">
    <title>Balancer and BalancerMember configuration</title>

    <p>
      You can adjust numerous configuration details of the <em>balancers</em>
      and the <em>workers</em> via the various parameters defined in
      <directive module="mod_proxy">ProxyPass</directive>. For example,
      assuming we would want <code>http://www3.example.com:8080</code> to
      handle 3x the traffic with a timeout of 1 second, we would adjust the
      configuration as follows:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
    ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;

ProxyPass "/images"  "balancer://myset"
ProxyPassReverse "/images"  "balancer://myset"
    </highlight>
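
    <p>
      Other worker parameters can be combined in the same way. As an
      illustrative sketch, the <code>retry</code> parameter below sets how many
      seconds httpd waits before sending requests again to a worker that is in
      the error state; the value of 30 is an arbitrary assumption, not a
      recommendation:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    # Assumption: retry a failed worker after 30 seconds
    BalancerMember http://www2.example.com:8080 retry=30
    BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1 retry=30
    ProxySet lbmethod=bytraffic
&lt;/Proxy&gt;
    </highlight>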

  </section>

  <section id="failover">
    <title>Failover</title>

    <p>
      You can also fine-tune various failover scenarios, detailing which
      workers and even which balancers should be accessed in such cases. For
      example, the below setup implements 2 failover cases: In the first,
      <code>http://hstandby.example.com:8080</code> is only sent traffic
      if all other workers in the <em>myset</em> balancer are not available.
      If that worker itself is not available, only then will the
      <code>http://bkup1.example.com:8080</code> and <code>http://bkup2.example.com:8080</code>
      workers be brought into rotation:
    </p>

    <highlight language="config">
&lt;Proxy balancer://myset&gt;
    BalancerMember http://www2.example.com:8080
    BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
    BalancerMember http://hstandby.example.com:8080 status=+H
    BalancerMember http://bkup1.example.com:8080 lbset=1
    BalancerMember http://bkup2.example.com:8080 lbset=1
    ProxySet lbmethod=byrequests
&lt;/Proxy&gt;

ProxyPass "/images"  "balancer://myset"
ProxyPassReverse "/images"  "balancer://myset"
    </highlight>

    <p>
      The magic of this failover setup is setting <code>http://hstandby.example.com:8080</code>
      with the <code>+H</code> status flag, which puts it in <em>hot standby</em> mode,
      and making the 2 <code>bkup#</code> servers part of the #1 load balancer set (the
      default set is 0). For failover, hot standbys (if they exist) are used first, when all regular
      workers are unavailable; load balancer sets are always tried lowest number first.
    </p>

  </section>

  <section id="manager">
    <title>Balancer Manager</title>

    <p>
      One of the most useful features of Apache httpd's reverse proxy is
	  the embedded <em>balancer-manager</em> application. Similar to
	  <module>mod_status</module>, <em>balancer-manager</em> displays
	  the current working configuration and status of the enabled
	  balancers and workers currently in use. However, not only does it
	  display these parameters, it also allows for dynamic, runtime, on-the-fly
	  reconfiguration of almost all of them, including adding new <em>BalancerMembers</em>
	  (workers) to an existing balancer. To enable this capability, the following
	  needs to be added to your configuration:
    </p>

<highlight language="config">
&lt;Location "/balancer-manager"&gt;
    SetHandler balancer-manager
    Require host localhost
&lt;/Location&gt;
</highlight>

    <note type="warning"><title>Warning</title>
      <p>Do not enable the <em>balancer-manager</em> until you have <a
      href="mod_proxy.html#access">secured your server</a>. In
      particular, ensure that access to the URL is tightly
      restricted.</p>
    </note>
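
    <p>
      One possible way to tighten access, shown here only as a sketch, is to
      allow a specific administrative network instead of a host name. The
      <code>192.0.2.0/24</code> range is a placeholder assumption. If the
      server also proxies <code>"/"</code>, the manager URL likely needs to be
      excluded from proxying as well:
    </p>

    <highlight language="config">
# Assumption: administrators connect from the 192.0.2.0/24 network
&lt;Location "/balancer-manager"&gt;
    SetHandler balancer-manager
    Require ip 192.0.2.0/24
&lt;/Location&gt;

# Only needed when "/" is proxied; must appear before the ProxyPass "/" line
ProxyPass "/balancer-manager"  "!"
    </highlight>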

    <p>
      When the reverse proxy server is accessed at that URL
      (e.g. <code>http://rproxy.example.com/balancer-manager/</code>), you will see a
      page similar to the one below:
    </p>
    <p><img src="../images/bal-man.png" alt="balancer-manager page" /></p>

    <p>
      This form allows the devops admin to adjust various parameters, take
      workers offline, change load balancing methods and add new workers. For
      example, clicking on the balancer itself displays the following page:
    </p>
    <p><img src="../images/bal-man-b.png" alt="balancer-manager page" /></p>

    <p>
      Clicking on a worker displays this page:
    </p>
    <p><img src="../images/bal-man-w.png" alt="balancer-manager page" /></p>

    <p>
      To have these changes persist across restarts of the reverse proxy, ensure that
      <directive module="mod_proxy">BalancerPersist</directive> is enabled.
    </p>
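
    <p>
      A minimal sketch of enabling persistence, assuming the directive is
      placed in the server or virtual host configuration that defines the
      balancers:
    </p>

    <highlight language="config">
# Keep balancer-manager changes across restarts (the default is Off)
BalancerPersist On
    </highlight>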

  </section>

</manualpage>