I played around with this as a potential hack for CherryPy 3. It's WSGI middleware for adding almost-transparent "internal redirect" capabilities to any WSGI application.
My operating theory was that anyone writing a WSGI app that does not already have an internal-redirect feature was probably using HTTP redirects (302, 303, or 307) to do nearly the same thing. This middleware simply waits for a 307 response status and performs the redirection itself within the same request, without informing the user-agent.
This should be OK because 307 isn't normally cacheable anyway, and some versions of IE already skip the user confirmation that the spec requires, so the middleware just duplicates an existing browser bug. I could have used a custom HTTP code like 399, but if that ever leaked out to the UA (because someone forgot to enable the middleware) then the UA should fall back to treating it as "300 Multiple Choices", which didn't seem like a good fit. At least by using 307, the fallback should be appropriate, if not graceful.
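To make the target concrete, here is a minimal sketch (Python 3, unlike the listing in this post) of the kind of app the middleware is aimed at, plus a helper showing how a wrapper can spot the 307 by handing the app its own start_response. The names `app` and `find_redirect` and the paths `/old` and `/new` are made up for illustration; they are not part of CherryPy or of the middleware.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical app: /old answers with a 307 that points at /new."""
    if environ["PATH_INFO"] == "/old":
        start_response("307 Temporary Redirect",
                       [("Location", "http://127.0.0.1/new")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"new page"]


def find_redirect(wsgi_app, environ):
    """Call wsgi_app once; return the Location of a 307 response, else None."""
    captured = {}

    def capture(status, headers, exc_info=None):
        # Record what the app wanted to send instead of sending it.
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # WSGI requires a write() callable

    body = wsgi_app(environ, capture)
    try:
        for _block in body:
            pass  # drain the body so lazy apps get to call start_response
    finally:
        if hasattr(body, "close"):
            body.close()

    if captured["status"][:3] != "307":
        return None
    for name, value in captured["headers"]:
        if name.lower() == "location":
            return value
    return None


environ = {}
setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
environ["PATH_INFO"] = "/old"
print(find_redirect(app, environ))
```

Served without any middleware, the same app still behaves sensibly: the browser receives the 307 and follows it itself.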
Here's the code, which could probably use some improvements:
"""WSGI middleware which performs "internal" redirection."""
import StringIO
from urlparse import urlparse
class _Redirector(object):
    """Per-request state for one pass through the redirector."""

    def __init__(self, nextapp, recursive=False):
        self.nextapp = nextapp
        self.recursive = recursive
        self.location = None
        self.write_proxy = None
        self.status = None
        self.headers = None
        self.exc_info = None
        self.seen_paths = []

    def start_response(self, status, headers, exc_info=None):
        # WSGI apps may omit exc_info, so it has to default to None.
        if status[:3] == "307":
            for name, value in headers:
                if name.lower() == "location":
                    self.location = value
                    break
        self.status = status
        self.headers = headers
        self.exc_info = exc_info
        return self.write

    def write(self, data):
        # This is only here for silly apps which call write.
        if self.write_proxy is None:
            self.write_proxy = self.sr(self.status, self.headers, self.exc_info)
        self.write_proxy(data)
    def __call__(self, environ, start_response):
        self.sr = start_response
        nextenv = environ.copy()

        curpath = nextenv['PATH_INFO']
        if nextenv.get('QUERY_STRING'):
            curpath = curpath + "?" + nextenv['QUERY_STRING']
        self.seen_paths.append(curpath)

        while True:
            # Consume the response (in case it's a generator).
            response = [x for x in self.nextapp(nextenv, self.start_response)]

            if self.location is None:
                # No redirection required; complete the response normally
                # (unless write() has already started it).
                if self.write_proxy is None:
                    self.sr(self.status, self.headers, self.exc_info)
                return response

            # Start with a fresh copy of the environ and start altering it.
            nextenv = environ.copy()
            nextenv['REQUEST_METHOD'] = 'GET'
            nextenv['CONTENT_LENGTH'] = '0'
            nextenv['wsgi.input'] = StringIO.StringIO()
            nextenv['redirector.history'] = self.seen_paths[:]

            # "The [Location response-header] field value
            # consists of a single absolute URI."
            # Note that the netloc stored in SERVER_NAME keeps any ":port".
            (nextenv["wsgi.url_scheme"],
             nextenv["SERVER_NAME"],
             path, params,
             nextenv["QUERY_STRING"], frag) = urlparse(self.location)
            if frag:
                raise ValueError("Illegal #fragment in Location response "
                                 "header %r" % self.location)
            if params:
                path = path + ";" + params

            # Assume 'path' is already unquoted according to
            # http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2
            if path.lower().startswith(environ['SCRIPT_NAME'].lower()):
                nextenv["PATH_INFO"] = path[len(environ['SCRIPT_NAME']):]
            else:
                raise ValueError("Location response header %r does not "
                                 "match current SCRIPT_NAME %r"
                                 % (self.location, environ['SCRIPT_NAME']))

            # Update self.seen_paths and check for recursive calls
            # (the check is skipped when self.recursive is True).
            curpath = nextenv['PATH_INFO']
            if nextenv.get('QUERY_STRING'):
                curpath = curpath + "?" + nextenv['QUERY_STRING']
            if not self.recursive and curpath in self.seen_paths:
                raise RuntimeError("redirector visited the same URL twice: %r"
                                   % curpath)
            self.seen_paths.append(curpath)

            # Reset self for the next iteration
            self.location = None
            self.write_proxy = None
            self.status = None
            self.headers = None
            self.exc_info = None
def redirector(nextapp, recursive=False):
    """WSGI middleware which performs "internal" redirection.

    Whenever the next application sets a response status of 307 and
    provides a Location response header, this component will not pass
    that response on to the user-agent; instead, it parses the URI
    provided in the Location response header and calls the same
    application again using that URI. The following entries in the
    WSGI environ dict may be modified when redirecting: wsgi.url_scheme,
    SERVER_NAME, PATH_INFO, QUERY_STRING. REQUEST_METHOD is always
    set to 'GET', so any desired parameters must be supplied as
    query string arguments in the Location response header.
    The wsgi.input entry will always be reset to an empty StringIO,
    and CONTENT_LENGTH will be set to 0.

    If 'recursive' is False (the default), each new target URI will be
    checked to see if it has already been visited in the same request;
    if so, a RuntimeError is raised. If 'recursive' is True, no check
    is made and therefore no such errors are raised.
    """
    def redirect_wrapper(environ, start_response):
        ir = _Redirector(nextapp, recursive)
        return ir(environ, start_response)
    return redirect_wrapper
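The environ rewriting described in the docstring can be tried out in isolation. The following is a rough Python 3 sketch, not a drop-in part of the listing: `environ_updates` is a made-up helper name, and unlike the code above it splits host and port into SERVER_NAME and SERVER_PORT instead of storing the whole netloc, which is an assumption about what a downstream app would expect.

```python
from urllib.parse import urlsplit


def environ_updates(location, script_name=""):
    """Return the environ entries implied by an absolute Location URI."""
    # urlsplit (unlike urlparse) leaves any ";params" attached to the path.
    parts = urlsplit(location)
    if parts.fragment:
        raise ValueError("Illegal #fragment in Location response "
                         "header %r" % location)
    if not parts.path.lower().startswith(script_name.lower()):
        raise ValueError("Location response header %r does not match "
                         "current SCRIPT_NAME %r" % (location, script_name))

    updates = {
        "wsgi.url_scheme": parts.scheme,
        "SERVER_NAME": parts.hostname,
        "PATH_INFO": parts.path[len(script_name):],
        "QUERY_STRING": parts.query,
        # The redirected request is always a body-less GET.
        "REQUEST_METHOD": "GET",
        "CONTENT_LENGTH": "0",
    }
    if parts.port is not None:
        # Assumption: keep the port separate rather than inside SERVER_NAME.
        updates["SERVER_PORT"] = str(parts.port)
    return updates


updates = environ_updates("http://example.com:8080/app/page;v=1?x=1", "/app")
for key in sorted(updates):
    print(key, "=", updates[key])
```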
You might want to check out Ian Bicking's WSGIRemote: http://pythonpaste.org/wsgiremote/
I think it solves a similar problem.
I think you mean paste.recursive? http://pythonpaste.org/module-paste.recursive.html
That is similar; however, it requires the next application to import paste so it can raise a known exception (that the middleware then traps). In CherryPy, at least, it would take more code to pass such an exception out of the app and make sure all the right finalization code is run before the forward occurs; that's all sidestepped neatly by (ab)using 307 (which has to be supported regardless of whether or not you use the middleware).
I also considered informing the middleware via a custom environ entry, but that seemed contrary to WSGI's style, where the intent is that the server writes environ entries, middleware reads and/or changes them, and apps only read them.
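For comparison, the rejected environ-entry approach might look roughly like the Python 3 sketch below. Everything here is hypothetical: the key name `example.forward_to`, the app, and `environ_forwarder` are invented for illustration, and the sketch ignores apps that use write(). It does show the stylistic problem, though: the app has to write into environ.

```python
FORWARD_KEY = "example.forward_to"  # hypothetical key, not defined by WSGI


def app(environ, start_response):
    """Hypothetical app that requests a forward by writing into environ."""
    if environ["PATH_INFO"] == "/old":
        environ[FORWARD_KEY] = "/new"  # the app mutates environ here
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ignored"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"new page"]


def environ_forwarder(nextapp):
    """Middleware that calls nextapp again whenever FORWARD_KEY is set."""
    def wrapper(environ, start_response):
        env = dict(environ)
        seen = set()
        while True:
            captured = {}

            def hold(status, headers, exc_info=None):
                # Hold the response back until we know it is final.
                captured["args"] = (status, headers, exc_info)
                return lambda data: None  # write() is ignored in this sketch

            body = list(nextapp(env, hold))
            target = env.pop(FORWARD_KEY, None)
            if target is None:
                start_response(*captured["args"])
                return body
            if target in seen:
                raise RuntimeError("forwarder visited the same path twice: %r"
                                   % target)
            seen.add(target)
            env["PATH_INFO"] = target
    return wrapper


def show(status, headers, exc_info=None):
    print(status)


print(environ_forwarder(app)({"PATH_INFO": "/old"}, show))
```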
FYI, this middleware is not WSGI compliant, because it consumes the entire body of every response given to it, not merely those that are redirects. This is explicitly forbidden by the following section of PEP 333:
http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries
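One possible way around the buffering, sketched in Python 3 and not taken from the original post: pull only the first block from the application (enough for a lazy app to have called start_response), then either follow the 307 or pass the remaining blocks through one at a time. The names `streaming_redirector`, `max_hops`, and `demo_app` are hypothetical, the hop limit stands in for the seen-paths check, and the sketch assumes the app returns an iterable and never uses write().

```python
import io
from urllib.parse import urlsplit


def streaming_redirector(nextapp, max_hops=10):
    """Follow 307s internally; stream every other response block by block."""
    def wrapper(environ, start_response):
        env = dict(environ)
        for _hop in range(max_hops):
            captured = {}

            def hold(status, headers, exc_info=None):
                captured["args"] = (status, headers, exc_info)
                # Assumption: the app does not use write().
                return lambda data: None

            result = nextapp(env, hold)
            blocks = iter(result)
            # A single block is enough for start_response to have been called.
            first = next(blocks, b"")
            status, headers, _exc = captured["args"]
            location = dict((k.lower(), v) for k, v in headers).get("location")

            if status[:3] != "307" or location is None:
                # Not a redirect: pass each block on as soon as it arrives.
                start_response(*captured["args"])
                try:
                    yield first
                    for block in blocks:
                        yield block
                finally:
                    if hasattr(result, "close"):
                        result.close()
                return

            # Redirect: drop this response and re-enter the app with a GET.
            if hasattr(result, "close"):
                result.close()
            target = urlsplit(location)
            env = dict(environ)
            env["REQUEST_METHOD"] = "GET"
            env["CONTENT_LENGTH"] = "0"
            env["wsgi.input"] = io.BytesIO()
            env["PATH_INFO"] = target.path[len(environ.get("SCRIPT_NAME", "")):]
            env["QUERY_STRING"] = target.query
        raise RuntimeError("too many internal redirects")
    return wrapper


def demo_app(environ, start_response):
    """Hypothetical app: /old redirects to /new, which yields two blocks."""
    if environ["PATH_INFO"] == "/old":
        start_response("307 Temporary Redirect",
                       [("Location", "http://localhost/new")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"a", b"b"]
```

Because the wrapper is a generator, the real start_response is only called once the final, non-redirect status is known, and the two body blocks of `/new` reach the server as two separate blocks.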