<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<manualpage metafile="compliance.xml.meta">
<title>HTTP Protocol Compliance</title>
<summary>
<p>This document describes the mechanism to set a policy for HTTP
protocol compliance for a given URL space by the origin servers or
applications behind that URL space.</p>
<p>For those who may have received an error message from a rejected
policy, and need to know what the policy rejection means and what
they might do to fix the error, each policy is described below.</p>
</summary>
<seealso><a href="filter.html">Filters</a></seealso>
<section id="intro">
<title>Enforcing HTTP Protocol Compliance in Apache 2</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyConditional</directive>
<directive module="mod_policy">PolicyLength</directive>
<directive module="mod_policy">PolicyKeepalive</directive>
<directive module="mod_policy">PolicyType</directive>
<directive module="mod_policy">PolicyVary</directive>
<directive module="mod_policy">PolicyValidation</directive>
<directive module="mod_policy">PolicyNocache</directive>
<directive module="mod_policy">PolicyMaxage</directive>
<directive module="mod_policy">PolicyVersion</directive>
</directivelist>
</related>
<p>The HTTP protocol follows the <strong>robustness principle</strong>
as described in <a href="http://tools.ietf.org/html/rfc1122">RFC1122</a>,
which states <strong>"Be liberal in what you accept, and conservative in
what you send"</strong>. As a result of this principle, HTTP clients will
compensate for and recover from incorrect or misconfigured responses, or
responses that are uncacheable.</p>
<p>As a website is scaled up to face greater and greater traffic loads,
suboptimal or misconfigured applications or server configurations can
threaten both the stability and scalability of the website, as well as
the hosting costs associated with it. A website can also scale up to face
greater configuration complexity, and it can be increasingly difficult to
detect and keep track of suboptimally configured URL spaces on a given
server.</p>
<p>Eventually a point is reached where the principle "conservative in
what you send" needs to be enforced by the server administrator.</p>
<p>The <module>mod_policy</module> module provides a set of filters
which can be applied to a server, allowing key features of the HTTP
protocol to be explicitly tested, and non-compliant responses logged as
warnings, or rejected outright as an error. Each filter can be applied
separately, allowing the administrator to pick and choose which policies
should be enforced depending on the circumstances of their environment.
</p>
<p>The filters might be placed in testing and staging environments for
the benefit of application and website developers, or may be applied
to production servers to protect infrastructure from systems outside
the administrator's direct control.</p>
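<p>As a minimal sketch of a staging setup, the configuration below inserts
three of the filters and only logs violations. It assumes that the filters
are added with <directive module="core">SetOutputFilter</directive> and that
each policy directive accepts a <code>log</code> action, as documented for
<module>mod_policy</module>; the choice of filters is purely illustrative.</p>
<example>
# Assumption: filters are inserted by name into the output filter chain<br />
SetOutputFilter POLICY_LENGTH;POLICY_VALIDATION;POLICY_NOCACHE<br />
# Log violations as warnings without rejecting the response<br />
PolicyLength log<br />
PolicyValidation log<br />
PolicyNocache log
</example>
<p>Starting with <code>log</code> lets developers see which URL spaces would
fail before any policy is switched to <code>enforce</code>.</p>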
<p class="figure">
<img src="images/compliance-reverse-proxy.png" width="666" height="239" alt=
"Enforcing HTTP protocol compliance for an application server"/>
</p>
<p>In the above example, an Apache httpd server has been placed between
the application server and the internet at large, and configured to cache
responses from the application server. The <module>mod_policy</module>
filters have been added to enforce support for cacheable content and
conditional requests, ensuring that both <module>mod_cache</module> and
public caches on the internet are fully able to cache content created
by the RESTful application server efficiently.</p>
<p class="figure">
<img src="images/compliance-static.png" width="469" height="239" alt=
"Enforcing HTTP protocol compliance in a static server"/>
</p>
<p>In the simpler example above, a static server serving highly cacheable
content has a set of policies applied to ensure that the server configuration
conforms to a minimum level of compliance.</p>
</section>
<section id="policyconditional">
<title>Conditional Request Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyConditional</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server does not correctly respond
to a conditional request with the appropriate status code.</p>
<p>Conditional requests form the mechanism by which an HTTP cache makes
stale content fresh again, and particularly for content with short freshness
lifetimes, lack of support for conditional requests can add avoidable load
to the server.</p>
<p>Most specifically, the existence of any of the following headers in the
request makes the request conditional:</p>
<dl>
<dt><code>If-Match</code></dt>
<dd>If the provided ETag in the <code>If-Match</code> header does not match
the ETag of the response, the server should return
<code>412 Precondition Failed</code>. Full details of how to handle an
<code>If-Match</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24">
RFC2616 section 14.24</a>.</dd>
<dt><code>If-None-Match</code></dt>
<dd>If the provided ETag in the <code>If-None-Match</code> header matches
the ETag of the response, the server should return either
<code>304 Not Modified</code> for GET/HEAD requests, or
<code>412 Precondition Failed</code> for other methods. Full details of how
to handle an <code>If-None-Match</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26">
RFC2616 section 14.26</a>.</dd>
<dt><code>If-Modified-Since</code></dt>
<dd>If the <code>Last-Modified</code> header of the response is no later
than the date provided in the <code>If-Modified-Since</code> header, the server
should return <code>304 Not Modified</code>. Full details of how to handle an
<code>If-Modified-Since</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25">
RFC2616 section 14.25</a>.</dd>
<dt><code>If-Unmodified-Since</code></dt>
<dd>If the <code>Last-Modified</code> header of the response is later than
the date provided in the <code>If-Unmodified-Since</code> header, the server
should return <code>412 Precondition Failed</code>. Full details of how to
handle an <code>If-Unmodified-Since</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28">
RFC2616 section 14.28</a>.</dd>
<dt><code>If-Range</code></dt>
<dd>If the provided ETag or date in the <code>If-Range</code> header matches
the ETag or Last-Modified of the response, and a valid <code>Range</code>
is present, the server should return
<code>206 Partial Content</code>. Full details of how to handle an
<code>If-Range</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.27">
RFC2616 section 14.27</a>.</dd>
</dl>
<p>If the response is detected to have been successful (a 2xx response),
but was conditional and one of the responses above was expected instead,
this policy will be rejected. Responses that indicate a redirect or a
failure of some kind (3xx, 4xx, 5xx) will be ignored by this policy.</p>
<p>This policy is implemented by the <strong>POLICY_CONDITIONAL</strong>
filter.</p>
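<p>The following is a sketch rather than a definitive configuration. It
assumes that the filter is inserted with
<directive module="core">SetOutputFilter</directive> and that
<directive module="mod_policy">PolicyConditional</directive> takes an
<code>enforce</code> action, as documented for <module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_CONDITIONAL<br />
# Assumption: "enforce" rejects the response, while "log" would only warn<br />
# Rejects successful responses that ignored the condition in the request<br />
PolicyConditional enforce
</example>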
</section>
<section id="policylength">
<title>Content-Length Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyLength</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response does not contain
an explicit <code>Content-Length</code> header.</p>
<p>There are a number of ways of determining the length of a response
body, described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4">
RFC2616 section 4.4 Message Length</a>.</p>
<p>When the <code>Content-Length</code> header is present, the size of
the body is declared at the start of the response. If this information
is missing, an HTTP cache might choose to ignore the response, as it
does not know in advance whether the response will fit within the
cache's defined limits.</p>
<p>HTTP/1.1 defines the <code>Transfer-Encoding</code> header as an
alternative to <code>Content-Length</code>, allowing the end of the
response to be indicated to the client without the client having to
know the length beforehand. However, when HTTP/1.0 requests are
processed, and no <code>Content-Length</code> is specified, the only
mechanism available to the server to indicate the end of the response
is to drop the connection. In an environment containing load
balancers, this can cause the keepalive mechanism to be bypassed.
</p>
<p>If the response is detected to have been successful (a 2xx response),
and has a response body (this excludes <code>204 No Content</code>), and
the <code>Content-Length</code> header is missing, this policy will be
rejected. Responses that indicate a redirect or a failure of some kind
(3xx, 4xx, 5xx) will be ignored by this policy.</p>
<note type="warning">It should be noted that some modules, such as
<module>mod_proxy</module>, add their own <code>Content-Length</code>
header when a response lacking one is small enough to have been read
in one go. This may cause small responses to pass this policy, while
larger responses may fail for the same URL.</note>
<p>This policy is implemented by the <strong>POLICY_LENGTH</strong>
filter.</p>
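<p>A minimal sketch of enabling this policy might look as follows, assuming
that <directive module="mod_policy">PolicyLength</directive> accepts an
<code>enforce</code> action as documented for <module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_LENGTH<br />
# Assumption: "enforce" rejects the response, while "log" would only warn<br />
# Rejects 2xx responses that carry a body but no Content-Length header<br />
PolicyLength enforce
</example>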
</section>
<section id="policytype">
<title>Content-Type Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyType</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response does not contain
an explicit and syntactically correct <code>Content-Type</code> header
that matches the server defined pattern.</p>
<p>The media type of the body is placed in the <code>Content-Type</code>
header, and the format of the header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7">
RFC2616 section 3.7 Media Types</a>.</p>
<p>A syntactically valid content type might look as follows:</p>
<example>
Content-Type: text/html; charset=iso-8859-1
</example>
<p>Invalid content types might include:</p>
<example>
# invalid<br />
Content-Type: foo<br />
# blank<br />
Content-Type:
</example>
<p>The server administrator has the option to restrict the policy to one
or more specific types, or could specify a general wildcard type such as
<code>*/*</code>.</p>
<p>This policy is implemented by the <strong>POLICY_TYPE</strong>
filter.</p>
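<p>As a hedged illustration, the policy might be configured as shown below.
The sketch assumes that <directive module="mod_policy">PolicyType</directive>
takes an action followed by one or more types, as documented for
<module>mod_policy</module>; the specific types in the commented line are
illustrative values only.</p>
<example>
SetOutputFilter POLICY_TYPE<br />
# Accept any syntactically valid media type<br />
PolicyType enforce */*<br />
# Alternatively, restrict responses to specific types (illustrative values)<br />
# PolicyType enforce text/html application/json
</example>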
</section>
<section id="policykeepalive">
<title>Keepalive Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyKeepalive</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response does not contain
an explicit <code>Content-Length</code> header, or a
<code>Transfer-Encoding</code> of chunked.</p>
<p>There are a number of ways of determining the length of a response
body, described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4">
RFC2616 section 4.4 Message Length</a>.</p>
<p>When the <code>Content-Length</code> header is present, the size of
the body is declared at the start of the response. HTTP/1.1 defines the
<code>Transfer-Encoding</code> header as an alternative to
<code>Content-Length</code>, allowing the end of the response to be
indicated to the client without the client having to know the length
beforehand. In the absence of these two mechanisms, the only way for
a server to indicate the end of the response is to drop the connection.
In an environment containing load balancers, this can cause the keepalive
mechanism to be bypassed.
</p>
<p>Most specifically, we follow these rules:</p>
<dl>
<dt>IF</dt>
<dd>we have not marked this connection as errored;</dd>
<dt>and</dt>
<dd>the client isn't expecting 100-continue;</dd>
<dt>and</dt>
<dd>the response status does not require a close;</dd>
<dt>and</dt>
<dd>the response body has a defined length due to the status code
being 304 or 204, the request method being HEAD, already having defined
Content-Length or Transfer-Encoding: chunked, or the request version
being HTTP/1.1 and thus capable of being set as chunked</dd>
<dt>THEN</dt>
<dd>we support keepalive.</dd>
</dl>
<note type="warning">The server may choose to turn off keepalive for
various reasons, such as an imminent shutdown, or a Connection: close from
the client, or an HTTP/1.0 client request with a response with no
<code>Content-Length</code>, but for our purposes we only care that
keepalive was possible from the application, not that keepalive actually
took place.</note>
<p>It should also be noted that the Apache httpd server includes a filter
that adds chunked encoding to responses without an explicit content
length. This policy catches those cases where this filter is bypassed or
not in effect.</p>
<p>This policy is implemented by the <strong>POLICY_KEEPALIVE</strong>
filter.</p>
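<p>A sketch of a cautious rollout of this policy follows. It assumes that
<directive module="mod_policy">PolicyKeepalive</directive> accepts a
<code>log</code> action, as documented for <module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_KEEPALIVE<br />
# Assumption: "log" records a warning but still delivers the response<br />
# Flags responses with neither Content-Length nor chunked encoding<br />
PolicyKeepalive log
</example>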
</section>
<section id="policymaxage">
<title>Freshness Lifetime / Maxage Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyMaxage</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response does not have
an explicit <strong>freshness lifetime</strong> at least as long
as the server defined limit, or if the freshness lifetime is
calculated based on a heuristic.</p>
<p>The calculation of a freshness lifetime is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2">
RFC2616 section 13.2 Expiration Model</a>.</p>
<p>During the freshness lifetime, a cache does not need to contact the
origin server at all; it can simply pass the cached content as is back
to the client.</p>
<p>When the freshness lifetime is reached, the cache should contact the
origin server in an effort to check whether the content is still fresh,
and if not, replace the content.</p>
<p>When the freshness lifetime is too short, it can result in excessive
load on the server. In addition, should an outage occur that is as long as
or longer than the freshness lifetime, all cached content will become
stale, which could cause a thundering herd of traffic when the
server or network returns.</p>
<p>This policy is implemented by the <strong>POLICY_MAXAGE</strong>
filter.</p>
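<p>For illustration, a minimum freshness lifetime of one day might be
required as sketched below. This assumes that
<directive module="mod_policy">PolicyMaxage</directive> takes an action
followed by an age in seconds, as documented for <module>mod_policy</module>;
the value itself is only an example.</p>
<example>
SetOutputFilter POLICY_MAXAGE<br />
# Assumption: the limit is given in seconds; 86400 seconds is one day<br />
# Rejects responses with a shorter or purely heuristic freshness lifetime<br />
PolicyMaxage enforce 86400
</example>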
</section>
<section id="policynocache">
<title>No Cache Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyNocache</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response declares itself
uncacheable using either the <code>Cache-Control</code> or
<code>Pragma</code> headers.</p>
<p>How content may be declared uncacheable is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1">
RFC2616 section 14.9.1 What is Cacheable</a>, and within the definition
for the <code>Pragma</code> header in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32">
RFC2616 section 14.32 Pragma</a>.</p>
<p>Most specifically, should any of the following header combinations
exist in the response headers, the response will be rejected:</p>
<ul>
<li><code>Cache-Control: no-cache</code></li>
<li><code>Cache-Control: no-store</code></li>
<li><code>Cache-Control: private</code></li>
<li><code>Pragma: no-cache</code></li>
</ul>
<p>When unexpected, uncacheable content may produce unacceptable levels
of server load, or may incur significant cost. When this policy is enabled,
all server defined uncacheable content will be rejected.</p>
<p>This policy is implemented by the <strong>POLICY_NOCACHE</strong>
filter.</p>
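<p>The following sketch shows how this policy might be enabled, assuming that
<directive module="mod_policy">PolicyNocache</directive> accepts an
<code>enforce</code> action as documented for <module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_NOCACHE<br />
# Assumption: "enforce" rejects the response, while "log" would only warn<br />
# Rejects responses carrying no-cache, no-store, private or Pragma: no-cache<br />
PolicyNocache enforce
</example>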
</section>
<section id="policyvalidation">
<title>Validation Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyValidation</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response does not contain
either a syntactically correct <code>ETag</code> or
<code>Last-Modified</code> header.</p>
<p>The <code>ETag</code> header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19">
RFC2616 section 14.19 ETag</a>, and the <code>Last-Modified</code> header
is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29">
RFC2616 section 14.29 Last-Modified</a>.</p>
<p>In addition to being checked for presence, the headers are checked for
syntax.</p>
<p>An <code>ETag</code> whose value is not surrounded with quotes, optionally
prefixed with "W/" to declare it "weak", will cause the policy to be
rejected. A <code>Last-Modified</code> that is not parsed as a valid date
will cause the policy to be rejected.</p>
<p>This policy is implemented by the <strong>POLICY_VALIDATION</strong>
filter.</p>
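<p>As a minimal sketch, this policy might be enabled as shown below, assuming
that <directive module="mod_policy">PolicyValidation</directive> accepts an
<code>enforce</code> action as documented for <module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_VALIDATION<br />
# Assumption: "enforce" rejects the response, while "log" would only warn<br />
# Requires a syntactically valid ETag or Last-Modified header<br />
PolicyValidation enforce
</example>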
</section>
<section id="policyvary">
<title>Vary Header Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyVary</directive>
</directivelist>
</related>
<p>This policy will be rejected if the server response contains a
<code>Vary</code> header, and that header in turn contains a header
forbidden by the administrator.</p>
<p>The <code>Vary</code> header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44">
RFC2616 section 14.44 Vary</a>.</p>
<p>Some client provided headers, such as <code>User-Agent</code>,
can contain thousands or millions of combinations of values over a period
of time, and if the response is declared cacheable, a cache might attempt
to cache each of these responses separately, filling up the cache and
crowding out other entries in the cache. In this scenario, if so
configured, the policy will reject the response.</p>
<p>This policy is implemented by the <strong>POLICY_VARY</strong>
filter.</p>
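<p>The <code>User-Agent</code> case described above might be configured as
sketched here. The sketch assumes that
<directive module="mod_policy">PolicyVary</directive> takes an action followed
by one or more forbidden header names, as documented for
<module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_VARY<br />
# Assumption: header names listed after the action are the forbidden ones<br />
# Rejects responses whose Vary header lists User-Agent<br />
PolicyVary enforce User-Agent
</example>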
</section>
<section id="policyversion">
<title>Protocol Version Policy</title>
<related>
<modulelist>
<module>mod_policy</module>
</modulelist>
<directivelist>
<directive module="mod_policy">PolicyVersion</directive>
</directivelist>
</related>
<p>This policy will be rejected if the client request was made with a
version number lower than the version of HTTP specified.</p>
<p>This policy is typically used with RESTful applications where
control over the type of client is desired. This policy can be used
alongside the <code>POLICY_KEEPALIVE</code> filter to ensure that
HTTP/1.0 clients don't cause keepalive connections to be dropped.</p>
<p>Possible minimum versions that could be specified are:</p>
<ul><li><code>HTTP/1.1</code></li>
<li><code>HTTP/1.0</code></li>
<li><code>HTTP/0.9</code></li>
</ul>
<p>This policy is implemented by the <strong>POLICY_VERSION</strong>
filter.</p>
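<p>A hedged sketch of combining this policy with the keepalive policy follows.
It assumes that <directive module="mod_policy">PolicyVersion</directive> takes
an action followed by the minimum version, and that both filters are inserted
with <directive module="core">SetOutputFilter</directive>, as documented for
<module>mod_policy</module>.</p>
<example>
SetOutputFilter POLICY_VERSION;POLICY_KEEPALIVE<br />
# Assumption: the version argument is the lowest request version accepted<br />
# Rejects requests made with a version lower than HTTP/1.1<br />
PolicyVersion enforce HTTP/1.1<br />
PolicyKeepalive enforce
</example>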
</section>
</manualpage>