mod_cache
RFC 2616 compliant HTTP caching filter.
Extension
mod_cache.c
cache_module
This module should be used with care: when the
CacheQuickHandler directive is
at its default value of on, the Allow and Deny directives will be circumvented.
You should not enable quick handler caching for any content to which you
wish to limit access by client host name, address or environment
variable.
mod_cache implements an RFC 2616 compliant
HTTP content caching filter, with support for the caching
of content negotiated responses containing the Vary header.
RFC 2616 compliant caching provides a mechanism to verify whether
stale or expired content is still fresh, and can represent a significant
performance boost when the origin server supports conditional
requests by honouring the
If-None-Match
HTTP request header. Content is only regenerated from scratch when the content
has changed, and not when the cached entry expires.
As a filter, mod_cache can be placed in front of
content originating from any handler, including flat
files (served from a slow disk cached on a fast disk), the output
of a CGI script or dynamic content
generator, or content proxied from another
server.
In the default configuration, mod_cache inserts the
caching filter as far forward as possible within the filter stack,
utilising the quick handler to bypass all per request
processing when returning content to the client. In this mode of
operation, mod_cache may be thought of as a caching
proxy server bolted to the front of the webserver, while running within
the webserver itself.
When the quick handler is switched off using the
CacheQuickHandler directive,
it becomes possible to insert the CACHE filter at a
point in the filter stack chosen by the administrator. This provides the
opportunity to cache content before that content is personalised by the
mod_include filter, or optionally compressed by the
mod_deflate filter.
Under normal operation, mod_cache will respond to
and can be controlled by the
Cache-Control
and
Pragma
headers sent from a client in a request, or from a
server within a response. Under exceptional circumstances,
mod_cache can be configured to override these headers
and force site specific behaviour. Such behaviour is limited
to this cache only, and does not affect the operation of other caches
that may exist between the client and server, so it is not
recommended unless strictly necessary.
RFC 2616 allows for the cache to return stale data while the existing
stale entry is refreshed from the origin server, and this is supported
by mod_cache when the
CacheLock directive is suitably
configured. Such responses will contain a
Warning
HTTP header with a 110 warn-code. RFC 2616 also allows a cache to return
stale data when the attempt made to refresh the stale data returns an
error 500 or above, and this behaviour is supported by default by
mod_cache. Such responses will contain a
Warning
HTTP header with a 111 warn-code.
mod_cache requires the services of one or more
storage management modules. The following storage management modules are included in
the base Apache distribution:
- mod_cache_disk
- Implements a disk based storage manager. Headers and bodies are
stored separately on disk, in a directory structure derived from the
md5 hash of the cached URL. Multiple content negotiated responses can
be stored concurrently, however the caching of partial content is not
supported by this module. The htcacheclean tool is
provided to list cached URLs, remove cached URLs, or to maintain the size
of the disk cache within size and inode limits.
- mod_cache_socache
- Implements a shared object cache based storage manager. Headers and
bodies are stored together beneath a single key based on the URL of the
response being cached. Multiple content negotiated responses can
be stored concurrently, however the caching of partial content is not
supported by this module.
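As a rough illustration of choosing a storage manager, the sketch below enables the shared object cache backend for one URL space. The module file paths, the shmcb provider and the /foo path are assumptions made for the example; CacheSocache is the mod_cache_socache directive that selects the provider.

```apache
# Sketch only: module paths, the shmcb provider and "/foo" are assumptions
LoadModule cache_module modules/mod_cache.so
LoadModule cache_socache_module modules/mod_cache_socache.so
LoadModule socache_shmcb_module modules/mod_socache_shmcb.so

# Select the shared object cache provider used by mod_cache_socache
CacheSocache shmcb

# Cache responses at or below /foo in the shared object cache
CacheEnable socache /foo
```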
Further details, discussion, and examples, are provided in the
Caching Guide.
Sample Configuration
Sample httpd.conf
#
# Sample Cache Configuration
#
LoadModule cache_module modules/mod_cache.so
<IfModule mod_cache.c>
LoadModule cache_disk_module modules/mod_cache_disk.so
<IfModule mod_cache_disk.c>
CacheRoot c:/cacheroot
CacheEnable disk /
CacheDirLevels 5
CacheDirLength 3
</IfModule>
# When acting as a proxy, don't cache the list of security updates
CacheDisable http://security.update.server/update-list/
</IfModule>
Avoiding the Thundering Herd
When a cached entry becomes stale, mod_cache will submit
a conditional request to the backend, which is expected to confirm whether the
cached entry is still fresh, and send an updated entity if not.
A small but finite amount of time exists between the time the cached entity
becomes stale, and the time the stale entity is fully refreshed. On a busy
server, a significant number of requests might arrive during this time, and
cause a thundering herd of requests to strike the backend
suddenly and unpredictably.
To keep the thundering herd at bay, the CacheLock
directive can be used to define a directory in which locks are created for
URLs in flight. The lock is used as a hint
by other requests to either suppress an attempt to cache (someone else has
gone to fetch the entity), or to indicate that a stale entry is being refreshed
(stale content will be returned in the meantime).
Initial caching of an entry
When an entity is cached for the first time, a lock will be created for the
entity until the response has been fully cached. During the lifetime of the
lock, the cache will suppress the second and subsequent attempts to cache the
same entity. While this doesn't hold back the thundering herd, it does stop
the cache attempting to cache the same entity multiple times simultaneously.
Refreshment of a stale entry
When an entity reaches its freshness lifetime and becomes stale, a lock
will be created for the entity until the response has either been confirmed as
still fresh, or replaced by the backend. During the lifetime of the lock, the
second and subsequent incoming requests will cause stale data to be returned,
and the thundering herd is kept at bay.
Locks and Cache-Control: no-cache
Locks are used only as a hint to enable the cache to be
more gentle on backend servers, and a lock can be overridden if necessary.
If the client sends a request with a Cache-Control header forcing a reload, any
lock that may be present will be ignored, and the client's request will be
honored immediately and the cached entry refreshed.
As a further safety mechanism, locks have a configurable maximum age.
Once this age has been reached, the lock is removed, and a new request is
given the opportunity to create a new lock. This maximum age can be set using
the CacheLockMaxAge directive, and defaults
to 5 seconds.
Example configuration
Enabling the cache lock
#
# Enable the cache lock
#
<IfModule mod_cache.c>
CacheLock on
CacheLockPath /tmp/mod_cache-lock
CacheLockMaxAge 5
</IfModule>
Fine Control with the CACHE Filter
Under the default mode of cache operation, the cache runs as a quick handler,
short circuiting the majority of server processing and offering the highest
cache performance available.
In this mode, the cache bolts onto the front of the server,
acting as if a free standing RFC 2616 caching proxy had been placed in front of
the server.
While this mode offers the best performance, the administrator may find that
under certain circumstances they may want to perform further processing on the
request after the request is cached, such as to inject personalisation into the
cached page, or to apply authorization restrictions to the content. Under these
circumstances, an administrator is often forced to place independent reverse
proxy servers either behind or in front of the caching server to achieve this.
To solve this problem the CacheQuickHandler
directive can be set to off, and the server will
process all phases normally handled by a non-cached request, including the
authentication and authorization phases.
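A minimal sketch of this arrangement follows, assuming a disk cache is already configured elsewhere. The /private path and the 192.0.2.0/24 address range are placeholders, and Require ip is the 2.4-style host based access control.

```apache
# Sketch only: "/private" and the 192.0.2.0/24 range are placeholders
CacheQuickHandler off
CacheEnable disk /private

<Location "/private">
    # With the quick handler off, the authorization phase runs for
    # every request, including those answered from the cache
    Require ip 192.0.2.0/24
</Location>
```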
In addition, the administrator may optionally specify the precise point
within the filter chain where caching is to take place by adding the
CACHE filter to the output filter chain.
For example, to cache content before applying compression to the response,
place the CACHE filter before the DEFLATE
filter as in the example below:
# Cache content before optional compression
CacheQuickHandler off
AddOutputFilterByType CACHE;DEFLATE text/plain
Another option is to have content cached before personalisation is applied
by mod_include (or another content processing filter). In this
example templates containing tags understood by
mod_include are cached before being parsed:
# Cache content before mod_include and mod_deflate
CacheQuickHandler off
AddOutputFilterByType CACHE;INCLUDES;DEFLATE text/html
You may place the CACHE filter anywhere you wish within the
filter chain. In this example, content is cached after being parsed by
mod_include, but before being processed by
mod_deflate:
# Cache content between mod_include and mod_deflate
CacheQuickHandler off
AddOutputFilterByType INCLUDES;CACHE;DEFLATE text/html
Warning: If the location of the
CACHE filter in the filter chain is changed for any reason,
you may need to flush your cache to ensure that the data
served remains consistent. mod_cache is not in a position
to enforce this for you.
Cache Status and Logging
Once mod_cache has made a decision as to whether or not
an entity is to be served from cache, the detailed reason for the decision
is written to the subprocess environment within the request under the
cache-status key. This reason can be logged by the
LogFormat directive as
follows:
LogFormat "%{cache-status}e ..."
Based on the caching decision made, the reason is also written to the
subprocess environment under one of the following four keys, as appropriate:
- cache-hit
- The response was served from cache.
- cache-revalidate
- The response was stale and was successfully
revalidated, then served from cache.
- cache-miss
- The response was served from the upstream server.
- cache-invalidate
- The cached entity was invalidated by a request
method other than GET or HEAD.
This makes it possible to support conditional logging of cached requests
as per the following example:
CustomLog "cached-requests.log" common env=cache-hit
CustomLog "uncached-requests.log" common env=cache-miss
CustomLog "revalidated-requests.log" common env=cache-revalidate
CustomLog "invalidated-requests.log" common env=cache-invalidate
For module authors, a hook called cache_status is available,
allowing modules to respond to the caching outcomes above in customised
ways.
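The detailed reason can also be appended to an ordinary access log line. The following is a sketch only: the cachestatus nickname and the log file name are assumptions, and the remaining format fields are those of the common log format.

```apache
# Sketch only: the "cachestatus" nickname and log file name are assumptions
# Common log format fields followed by the detailed cache decision
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{cache-status}e\"" cachestatus
CustomLog "logs/cache-status.log" cachestatus
```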
CacheEnable
Enable caching of specified URLs using a specified storage
manager
CacheEnable cache_type [url-string]
server config
virtual host
directory
A url-string of '/' applied to forward proxy content in 2.2 and
earlier.
The CacheEnable directive instructs
mod_cache to cache URLs at or below
url-string. The cache storage manager is specified with the
cache_type argument. The CacheEnable
directive can alternatively be placed inside either
Location or
LocationMatch sections to indicate
the content is cacheable.
cache_type disk
instructs
mod_cache to use the disk based storage manager
implemented by mod_cache_disk. cache_type
socache
instructs mod_cache to use the
shared object cache based storage manager implemented by
mod_cache_socache.
In the event that the URL space overlaps between different
CacheEnable directives (as in the example below),
each possible storage manager will be run until the first one that
actually processes the request. The order in which the storage managers are
run is determined by the order of the CacheEnable
directives in the configuration file. CacheEnable
directives within Location or
LocationMatch sections are processed
before globally defined CacheEnable directives.
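A sketch of an overlapping URL space is shown here, assuming both mod_cache_socache and mod_cache_disk are loaded and configured. Which storage manager actually handles a given request depends on which one processes it first, as described above.

```apache
# Sketch only: assumes CacheSocache and CacheRoot are configured elsewhere
# Both directives cover "/", so the URL space overlaps.
# socache is listed first and is therefore run first; disk is run next.
CacheEnable socache /
CacheEnable disk /
```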
When acting as a forward proxy server, url-string must
minimally begin with a protocol for which caching should be enabled.
# Cache content (normal handler only)
CacheQuickHandler off
<Location "/foo">
CacheEnable disk
</Location>
# Cache regex (normal handler only)
CacheQuickHandler off
<LocationMatch "foo$">
CacheEnable disk
</LocationMatch>
# Cache all but forward proxy URLs (normal or quick handler)
CacheEnable disk /
# Cache FTP-proxied URLs (normal or quick handler)
CacheEnable disk ftp://
# Cache forward proxy content from www.example.org (normal or quick handler)
CacheEnable disk http://www.example.org/
A hostname starting with a "*" matches all hostnames with
that suffix. A hostname starting with "." matches all
hostnames containing the domain components that follow.
# Match www.example.org, and fooexample.org
CacheEnable disk http://*example.org/
# Match www.example.org, but not fooexample.org
CacheEnable disk http://.example.org/
The no-cache
environment variable can be set to
disable caching on a finer grained set of resources in versions
2.2.12 and later.
Environment Variables in Apache
CacheDisable
Disable caching of specified URLs
CacheDisable url-string | on
server config
virtual host
directory
.htaccess
The CacheDisable directive instructs
mod_cache to not cache URLs at or below
url-string.
Example
CacheDisable /local_files
If used in a Location directive,
the path needs to be specified below the Location, or if the word "on"
is used, caching for the whole location will be disabled.
Example
<Location "/foo">
CacheDisable on
</Location>
The no-cache
environment variable can be set to
disable caching on a finer grained set of resources in versions
2.2.12 and later.
Environment Variables in Apache
CacheMaxExpire
The maximum time in seconds to cache a document
CacheMaxExpire seconds
CacheMaxExpire 86400 (one day)
server config
virtual host
directory
.htaccess
The CacheMaxExpire directive specifies the maximum number of
seconds for which cacheable HTTP documents will be retained without checking the origin
server. Thus, documents will be out of date at most this number of seconds. This maximum
value is enforced even if an expiry date was supplied with the document.
CacheMaxExpire 604800
CacheMinExpire
The minimum time in seconds to cache a document
CacheMinExpire seconds
CacheMinExpire 0
server config
virtual host
directory
.htaccess
The CacheMinExpire directive specifies the minimum number of
seconds for which cacheable HTTP documents will be retained without checking the origin
server. This is only used if no valid expire time was supplied with the document.
CacheMinExpire 3600
CacheDefaultExpire
The default duration to cache a document when no expiry date is specified.
CacheDefaultExpire seconds
CacheDefaultExpire 3600 (one hour)
server config
virtual host
directory
.htaccess
The CacheDefaultExpire directive specifies a default time,
in seconds, to cache a document if neither an expiry date nor a last-modified date is provided
with the document. The value specified with the CacheMaxExpire
directive does not override this setting.
CacheDefaultExpire 86400
CacheIgnoreNoLastMod
Ignore the fact that a response has no Last-Modified
header.
CacheIgnoreNoLastMod On|Off
CacheIgnoreNoLastMod Off
server config
virtual host
directory
.htaccess
Ordinarily, documents without a last-modified date are not cached.
Under some circumstances the last-modified date is removed (during
mod_include processing for example) or not provided
at all. The CacheIgnoreNoLastMod directive
provides a way to specify that documents without last-modified dates
should be considered for caching.
If neither a last-modified date nor an expiry date is provided with
the document then the value specified by the
CacheDefaultExpire directive will be used to
generate an expiration date.
CacheIgnoreNoLastMod On
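The interaction with CacheDefaultExpire can be sketched as follows. The 600 second value is an arbitrary illustration, not a recommendation.

```apache
# Sketch only: the 600 second value is an arbitrary illustration
# Consider responses for caching even when they carry no Last-Modified header
CacheIgnoreNoLastMod On
# With neither a last-modified date nor an expiry date, expire after 10 minutes
CacheDefaultExpire 600
```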
CacheIgnoreCacheControl
Ignore request to not serve cached content to client
CacheIgnoreCacheControl On|Off
CacheIgnoreCacheControl Off
server config
virtual host
Ordinarily, requests containing a Cache-Control: no-cache
or
Pragma: no-cache header value will not be served from the cache. The
CacheIgnoreCacheControl directive allows this
behavior to be overridden. CacheIgnoreCacheControl On
tells the server to attempt to serve the resource from the cache even
if the request contains no-cache header values.
CacheIgnoreCacheControl On
Warning:
This directive will allow serving from the cache even if the client has
requested that the document not be served from the cache. This might
result in stale content being served.
CacheStorePrivate
CacheStoreNoStore
CacheIgnoreQueryString
Ignore query string when caching
CacheIgnoreQueryString On|Off
CacheIgnoreQueryString Off
server config
virtual host
Ordinarily, requests with query string parameters are cached separately
for each unique query string. According to RFC 2616 section 13.9, this is done only
if an expiration time is specified. The
CacheIgnoreQueryString directive tells the cache to
cache requests even if no expiration time is specified, and to reply with
a cached reply even if the query string differs. From a caching point of
view the request is treated as if having no query string when this
directive is enabled.
CacheIgnoreQueryString On
CacheLastModifiedFactor
The factor used to compute an expiry date based on the
Last-Modified date.
CacheLastModifiedFactor float
CacheLastModifiedFactor 0.1
server config
virtual host
directory
.htaccess
In the event that a document does not provide an expiry date but does
provide a last-modified date, an expiry date can be calculated based on
the time since the document was last modified. The
CacheLastModifiedFactor directive specifies a
factor to be used in the generation of this expiry date
according to the following formula:
expiry-period = time-since-last-modified-date * factor
expiry-date = current-date + expiry-period
For example, if the document was last modified 10 hours ago, and
factor is 0.1 then the expiry-period will be set to
10*0.1 = 1 hour. If the current time is 3:00pm then the computed
expiry-date will be 3:00pm + 1 hour = 4:00pm.
If the expiry-period would be longer than that set by
CacheMaxExpire, then the latter takes
precedence.
CacheLastModifiedFactor 0.5
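To illustrate how the factor and CacheMaxExpire interact, here is a small worked sketch. The values are chosen only to show the cap being applied and are not recommendations.

```apache
# Sketch only: values chosen to illustrate the cap, not as recommendations
CacheLastModifiedFactor 0.5
CacheMaxExpire 86400

# Worked example for a document last modified 10 days (864000 s) ago
# that carries no expiry date:
#   expiry-period = 864000 * 0.5 = 432000 s (5 days)
#   432000 s is longer than CacheMaxExpire, so 86400 s (1 day) is used
```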
CacheIgnoreHeaders
Do not store the given HTTP header(s) in the cache.
CacheIgnoreHeaders header-string [header-string] ...
CacheIgnoreHeaders None
server config
virtual host
According to RFC 2616, hop-by-hop HTTP headers are not stored in
the cache. The following HTTP headers are hop-by-hop headers and thus
do not get stored in the cache in any case regardless of the
setting of CacheIgnoreHeaders:
Connection
Keep-Alive
Proxy-Authenticate
Proxy-Authorization
TE
Trailers
Transfer-Encoding
Upgrade
CacheIgnoreHeaders specifies additional HTTP
headers that should not be stored in the cache. For example, it makes
sense in some cases to prevent cookies from being stored in the cache.
CacheIgnoreHeaders takes a space separated list
of HTTP headers that should not be stored in the cache. If only hop-by-hop
headers should be excluded from the cache (the RFC 2616 compliant
behaviour), CacheIgnoreHeaders can be set to
None.
Example 1
CacheIgnoreHeaders Set-Cookie
Example 2
CacheIgnoreHeaders None
Warning:
If headers like Expires
which are needed for proper cache
management are not stored due to a
CacheIgnoreHeaders setting, the behaviour of
mod_cache is undefined.
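Because the directive takes a space separated list, several headers can be excluded at once. The sketch below is illustrative only, and the choice of headers is an assumption.

```apache
# Sketch only: the header choice is illustrative
# Neither cookie-setting header is stored with the cached entry
CacheIgnoreHeaders Set-Cookie Set-Cookie2
```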
CacheIgnoreURLSessionIdentifiers
Ignore defined session identifiers encoded in the URL when caching
CacheIgnoreURLSessionIdentifiers identifier [identifier] ...
CacheIgnoreURLSessionIdentifiers None
server config
virtual host
Sometimes applications encode the session identifier into the URL, as in the following
examples:
/someapplication/image.gif;jsessionid=123456789
/someapplication/image.gif?PHPSESSIONID=12345678
This causes cacheable resources to be stored separately for each session, which
is often not desired. CacheIgnoreURLSessionIdentifiers lets
you define a list of identifiers that are removed from the key that is used to identify
an entity in the cache, such that cacheable resources are not stored separately for
each session.
CacheIgnoreURLSessionIdentifiers None
clears the list of ignored
identifiers. Otherwise, each identifier is added to the list.
Example 1
CacheIgnoreURLSessionIdentifiers jsessionid
Example 2
CacheIgnoreURLSessionIdentifiers None
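Several identifiers can be listed together. As a sketch, the following covers both URL forms shown earlier in this section:

```apache
# Remove both identifiers from the cache key, so that URLs such as
#   /someapplication/image.gif;jsessionid=123456789
#   /someapplication/image.gif?PHPSESSIONID=12345678
# are not stored separately for each session
CacheIgnoreURLSessionIdentifiers jsessionid PHPSESSIONID
```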
CacheStoreExpired
Attempt to cache responses that the server reports as expired
CacheStoreExpired On|Off
CacheStoreExpired Off
server config
virtual host
directory
.htaccess
Since httpd 2.2.4, responses which have already expired are not
stored in the cache. The CacheStoreExpired
directive allows this behavior to be overridden.
CacheStoreExpired On
tells the server to attempt to cache the resource if it is stale.
Subsequent requests would trigger an If-Modified-Since request of
the origin server, and the response may be fulfilled from cache
if the backend resource has not changed.
CacheStoreExpired On
CacheStorePrivate
Attempt to cache responses that the server has marked as private
CacheStorePrivate On|Off
CacheStorePrivate Off
server config
virtual host
directory
.htaccess
Ordinarily, responses with Cache-Control: private
header values will not
be stored in the cache. The CacheStorePrivate
directive allows this behavior to be overridden.
CacheStorePrivate On
tells the server to attempt to cache the resource even if it contains
private header values.
CacheStorePrivate On
Warning:
This directive will allow caching even if the upstream server has
requested that the resource not be cached. This directive is only
ideal for a 'private' cache.
CacheIgnoreCacheControl
CacheStoreNoStore
CacheStoreNoStore
Attempt to cache requests or responses that have been marked as no-store.
CacheStoreNoStore On|Off
CacheStoreNoStore Off
server config
virtual host
directory
.htaccess
Ordinarily, requests or responses with Cache-Control: no-store
header
values will not be stored in the cache. The
CacheStoreNoStore directive allows this
behavior to be overridden. CacheStoreNoStore On
tells the server to attempt to cache the resource even if it contains
no-store header values.
CacheStoreNoStore On
Warning:
As described in RFC 2616, the no-store directive is intended to
"prevent the inadvertent release or retention of sensitive information
(for example, on backup tapes)." Enabling this option could store
sensitive information in the cache. You are hereby warned.
CacheIgnoreCacheControl
CacheStorePrivate
CacheLock
Enable the thundering herd lock.
CacheLock on|off
CacheLock off
server config
virtual host
The CacheLock directive enables the thundering herd lock
for the given URL space.
In a minimal configuration the following directive is all that is needed to
enable the thundering herd lock in the default run-time file directory.
# Enable cache lock
CacheLock on
Locks consist of empty files that only exist for stale URLs in flight, so this
is significantly less resource intensive than the traditional disk cache.
CacheLockPath
Set the lock path directory.
CacheLockPath directory
CacheLockPath mod_cache-lock
server config
virtual host
The CacheLockPath directive allows you to specify the
directory in which the locks are created. If directory is not an absolute
path, the location specified will be relative to the value of
DefaultRuntimeDir.
CacheLockMaxAge
Set the maximum possible age of a cache lock.
CacheLockMaxAge integer
CacheLockMaxAge 5
server config
virtual host
The CacheLockMaxAge directive specifies the maximum
age of any cache lock.
A lock older than this value in seconds will be ignored, and the next
incoming request will be given the opportunity to re-establish the lock.
This mechanism prevents a slow client from taking an excessively long time to refresh
an entity.
CacheQuickHandler
Run the cache from the quick handler.
CacheQuickHandler on|off
CacheQuickHandler on
server config
virtual host
Apache HTTP Server 2.3.3 and later
The CacheQuickHandler directive
controls the phase in which the cache is handled.
In the default enabled configuration, the cache operates within the quick
handler phase. This phase short circuits the majority of server processing,
and represents the most performant mode of operation for a typical server.
The cache bolts onto the front of the server, and the
majority of server processing is avoided.
When disabled, the cache operates as a normal handler, and is subject to
the full set of phases when handling a server request. While this mode is
slower than the default, it allows the cache to be used in cases where full
processing is required, such as when content is subject to authorization.
# Run cache as a normal handler
CacheQuickHandler off
It is also possible, when the quick handler is disabled, for the
administrator to choose the precise location within the filter chain where
caching is to be performed, by adding the CACHE filter to
the chain.
# Cache content before mod_include and mod_deflate
CacheQuickHandler off
AddOutputFilterByType CACHE;INCLUDES;DEFLATE text/html
If the CACHE filter is specified more than once, the last instance will
apply.
CacheHeader
Add an X-Cache header to the response.
CacheHeader on|off
CacheHeader off
server config
virtual host
directory
.htaccess
Available in Apache 2.3.9 and later
When the CacheHeader directive
is switched on, an X-Cache header will be added to the response
with the cache status of this response. If the normal handler is used, this
directive may appear within a Directory
or Location directive. If the quick
handler is used, this directive must appear within a server or virtual host
context, otherwise the setting will be ignored.
- HIT
- The entity was fresh, and was served from
cache.
- REVALIDATE
- The entity was stale, was successfully
revalidated and was served from cache.
- MISS
- The entity was fetched from the upstream
server and was not served from cache.
# Enable the X-Cache header
CacheHeader on
X-Cache: HIT from localhost
CacheDetailHeader
Add an X-Cache-Detail header to the response.
CacheDetailHeader on|off
CacheDetailHeader off
server config
virtual host
directory
.htaccess
Available in Apache 2.3.9 and later
When the CacheDetailHeader directive
is switched on, an X-Cache-Detail header will be added to the response
containing the detailed reason for a particular caching decision.
It can be useful during development of cached RESTful services to have additional
information about the caching decision written to the response headers, so as to
confirm whether Cache-Control
and other headers have been correctly
used by the service and client.
If the normal handler is used, this directive may appear within a
Directory or
Location directive. If the quick handler
is used, this directive must appear within a server or virtual host context, otherwise
the setting will be ignored.
# Enable the X-Cache-Detail header
CacheDetailHeader on
X-Cache-Detail: "conditional cache hit: entity refreshed" from localhost
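When the normal handler is in use, both diagnostic headers can be scoped to a single location. The following is a sketch only: the /api path is hypothetical, and a disk cache is assumed to be configured elsewhere.

```apache
# Sketch only: "/api" is a hypothetical path; assumes a disk cache is configured
CacheQuickHandler off
<Location "/api">
    CacheEnable disk
    # Report HIT, REVALIDATE or MISS in the X-Cache header
    CacheHeader on
    # Report the detailed caching decision in the X-Cache-Detail header
    CacheDetailHeader on
</Location>
```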
CacheKeyBaseURL
Override the base URL of reverse proxied cache keys.
CacheKeyBaseURL URL
server config
virtual host
Available in Apache 2.3.9 and later
When the CacheKeyBaseURL directive
is specified, the URL provided will be used as the base URL to calculate
the URL of the cache keys in the reverse proxy configuration. When not specified,
the scheme, hostname and port of the current virtual host are used to construct
the cache key. When a cluster of machines is present, and all cached entries
should be cached beneath the same cache key, a new base URL can be specified
with this directive.
# Override the base URL of the cache key.
CacheKeyBaseURL http://www.example.com/
Take care when setting this directive. If two separate virtual
hosts are accidentally given the same base URL, entries from one virtual host
will be served to the other.
CacheStaleOnError
Serve stale content in place of 5xx responses.
CacheStaleOnError on|off
CacheStaleOnError on
server config
virtual host
directory
.htaccess
Available in Apache 2.3.9 and later
When the CacheStaleOnError directive
is switched on, and when stale data is available in the cache, the cache will
respond to 5xx responses from the backend by returning the stale data instead of
the 5xx response. While the Cache-Control headers sent by clients will be respected,
and the raw 5xx responses returned to the client on request, the 5xx response so
returned to the client will not invalidate the content in the cache.
# Serve stale data on error.
CacheStaleOnError on