Warning - this is a first (fast) draft that needs further revision!
Several changes in 2.0 and above affect the internal request processing mechanics. Module authors need to be aware of these changes so they may take advantage of the optimizations and security enhancements.
The first major change is to the subrequest and redirect mechanisms. Apache HTTP Server 1.3 had a number of different code paths that attempted to optimize subrequest or redirect behavior. As patches were introduced to 2.0, these optimizations (and the server behavior) were quickly broken due to this duplication of code. All duplicate code has been folded back into ap_process_request_internal() to prevent the code from falling out of sync again.
This means that much of the existing code was 'unoptimized'. It is the Apache HTTP Project's first goal to create a robust and correct implementation of the HTTP server RFC. Additional goals include security, scalability and optimization. New methods were sought to optimize the server (beyond the performance of 1.3) without introducing fragile or insecure code.
All requests pass through ap_process_request_internal() in server/request.c, including subrequests and redirects. If a module doesn't pass generated requests through this code, the author is cautioned that the module may be broken by future changes to request processing.
To streamline requests, the module author can take advantage of the hooks offered to drop out of the request cycle early, or to bypass core hooks which are irrelevant (and costly in terms of CPU).
The request's parsed_uri path is unescaped, once and only once, at the beginning of internal request processing.
This step is bypassed if the proxyreq flag is set, or the parsed_uri.path element is unset. The module has no further control of this one-time unescape operation; either failing to unescape or unescaping the URL more than once has security repercussions.
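The danger of a second decoding pass can be seen in a small standalone sketch. The C below is not the server's own routine; unescape_once and hex_value are hypothetical names, and rejecting malformed escapes and encoded NUL bytes is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit.
 * The caller has already checked the character with isxdigit(). */
static int hex_value(unsigned char c)
{
    if (isdigit(c))
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical sketch: decode %XX escapes in place, making one pass only.
 * Returns 0 on success, or -1 on a malformed escape or an encoded NUL
 * (the buffer contents are unspecified after a failure).
 */
static int unescape_once(char *path)
{
    char *in = path;
    char *out = path;

    assert(path != NULL);
    while (*in != '\0') {
        if (*in != '%') {
            *out++ = *in++;          /* ordinary character: copy as-is */
            continue;
        }
        if (!isxdigit((unsigned char)in[1]) || !isxdigit((unsigned char)in[2]))
            return -1;               /* "%" not followed by two hex digits */
        int v = hex_value((unsigned char)in[1]) * 16
              + hex_value((unsigned char)in[2]);
        if (v == 0)
            return -1;               /* refuse an encoded NUL byte */
        *out++ = (char)v;
        in += 3;                     /* skip the whole %XX escape */
    }
    *out = '\0';
    return 0;
}
```

Applied to "/%252e%252e/", one pass yields "/%2e%2e/", which contains no parent element. A second pass over the same buffer would yield "/../", which is why the decoding has to happen exactly once.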
All /../ and /./ elements are removed by ap_getparents(). This helps to ensure the path is (nearly) absolute before the request processing continues.
This step cannot be bypassed.
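To illustrate what removing these elements involves, here is a minimal standalone sketch. It is not the code of ap_getparents(), whose handling of edge cases may differ; strip_dot_segments is a hypothetical name, and the choices to never climb above the root and to keep a trailing slash after a final dot element are assumptions of this sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: collapse "." and ".." segments of an absolute
 * path in place. The write position never passes the read position,
 * so the rewrite is safe within a single buffer.
 */
static void strip_dot_segments(char *path)
{
    assert(path != NULL && path[0] == '/');

    char *in = path;                  /* always points at a '/' or the end */
    char *out = path;                 /* end of the cleaned path so far */

    while (*in == '/') {
        char *seg = in + 1;                    /* segment after this slash */
        size_t len = strcspn(seg, "/");        /* length up to next slash */
        int is_last = (seg[len] == '\0');      /* nothing follows it */

        if (len == 1 && seg[0] == '.') {
            /* "." names the current directory: emit nothing */
            if (is_last)
                *out++ = '/';
        } else if (len == 2 && seg[0] == '.' && seg[1] == '.') {
            /* "..": drop the last emitted segment, never above the root */
            while (out > path && *--out != '/') {
                /* step back to the slash that started that segment */
            }
            if (is_last)
                *out++ = '/';
        } else {
            /* ordinary (possibly empty) segment: copy it down */
            *out++ = '/';
            memmove(out, seg, len);
            out += len;
        }
        in = seg + len;
    }
    *out = '\0';
}
```

With this sketch, "/a/./b/../c" becomes "/a/c", and "/../../etc/passwd" becomes "/etc/passwd" because the leading parent elements have nothing left to remove.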
Every request is subject to an ap_location_walk() call. This ensures that <Location> sections are consistently enforced for all requests.
Modules can determine the file name, or alter the given URI
in this step. For example,
If all modules DECLINE this phase, an error 500 is returned to the browser, and a "couldn't translate name" error is logged automatically.
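The behavior when every module declines can be modeled with a small standalone sketch. All names here (the SK_ constants, sketch_request, run_translate_name and the two example hooks) are hypothetical and exist only for illustration; the server's real hook machinery is not reproduced, and the error logging is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Status values chosen for this sketch only. */
enum {
    SK_OK = 0,
    SK_DECLINED = -1,
    SK_DONE = -2,
    SK_HTTP_INTERNAL_SERVER_ERROR = 500
};

/* Hypothetical, much reduced stand-in for a request record. */
typedef struct {
    const char *uri;
    const char *filename;
} sketch_request;

typedef int (*translate_fn)(sketch_request *r);

/* Example hook that is never interested in the request. */
static int decline_hook(sketch_request *r)
{
    (void)r;
    return SK_DECLINED;
}

/* Example hook that claims URIs under /icons/ (the target is made up). */
static int icons_hook(sketch_request *r)
{
    if (strncmp(r->uri, "/icons/", 7) != 0)
        return SK_DECLINED;
    r->filename = "/srv/icons";
    return SK_OK;
}

/*
 * Run the hooks in order. The first hook that answers with anything
 * other than SK_DECLINED ends the phase. If every hook declines, the
 * phase fails with a 500 status, as described above.
 */
static int run_translate_name(sketch_request *r,
                              const translate_fn *hooks, size_t n)
{
    assert(r != NULL);
    for (size_t i = 0; i < n; i++) {
        int rv = hooks[i](r);
        if (rv != SK_DECLINED)
            return rv;
    }
    return SK_HTTP_INTERNAL_SERVER_ERROR;
}
```

A request for "/icons/a.png" is claimed by the second hook and the phase returns SK_OK; a request for "/other" is declined by both hooks, so the phase returns the 500 status.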
After the file or correct URI was determined, the appropriate per-dir configurations are merged together. In the case of a TRACE request, the core handles the request and returns DONE.
If no module answers this hook with OK or DONE,
the core will run the request filename against the
Every request is hardened by a second ap_location_walk() call. This ensures that a translated request is still subjected to the configured <Location> sections. The request again borrows some or all of the processing from its previous location_walk above, so this step is almost always very efficient unless the translated URI mapped to a substantially different path or Virtual Host.
The main request then parses the client's headers. This prepares the remaining request processing steps to better serve the client's request.
Needs Documentation. Code is:
The modules have an opportunity to test the URI or filename
against the target resource, and set mime information for the
request. Both
If all modules DECLINE this phase, an error 500 is returned to the browser, and a "couldn't find types" error is logged automatically.
Many modules are 'trounced' by some phase above. The fixups phase is used by modules to 'reassert' their ownership or force the request's fields to their appropriate values. It isn't always the cleanest mechanism, but occasionally it's the only option.
This phase is not part of the processing in ap_process_request_internal(). Many modules prepare one or more subrequests prior to creating any content at all. After the core or a module calls ap_process_request_internal(), it then calls ap_invoke_handler() to generate the response.
Modules that transform the content in some way can insert their values and override existing filters, such that if the user configured a more advanced filter out-of-order, then the module can move its order as needed. There is no result code, so actions in this hook must be trusted to always succeed.
The module finally has a chance to serve the request in its
handler hook. Note that not every prepared request is sent to
the handler hook. Many modules, such as