11/13/06

10:57:51 pm, by fumanchu
Categories: Python, CherryPy, WSGI

Internal Redirect WSGI middleware

I played around with this as a potential hack for CherryPy 3. It's WSGI middleware for adding almost-transparent "internal redirect" capabilities to any WSGI application.

My operating theory was that anyone writing a WSGI app that does not already have an internal-redirect feature was probably using HTTP redirects (302, 303, or 307) to do nearly the same thing. This middleware simply waits for a 307 response status and performs the redirection itself within the same request, without informing the user-agent.

This should be OK because 307 isn't normally cacheable anyway, and some versions of IE already skip the user confirmation that the spec requires before re-sending a non-GET request, so it just duplicates an existing browser bug. I could have used a custom HTTP code like 399, but if that ever leaked out to the UA (because someone forgot to enable the middleware), the UA should fall back to treating it as "300 Multiple Choices", which didn't seem like a good fit. At least by using 307, the fallback should be appropriate, if not graceful.
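
To make the trigger condition concrete, here is a minimal sketch of what the middleware watches for: an app that answers with a 307 and a Location header, plus a check on the values handed to start_response. The names `app`, `find_internal_redirect`, and `call_once` are made up for illustration and are not part of the middleware, and the sketch is written for Python 3.

```python
def app(environ, start_response):
    """Hypothetical app: /old answers with a 307 pointing at /new."""
    if environ.get("PATH_INFO") == "/old":
        start_response("307 Temporary Redirect",
                       [("Location", "http://localhost/new")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"new page"]


def find_internal_redirect(status, headers):
    """Return the Location value for a 307 response, else None."""
    if status[:3] != "307":
        return None
    for name, value in headers:
        if name.lower() == "location":
            return value
    return None


def call_once(wsgi_app, path):
    """Call the app once and capture what it passes to start_response."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # write() callable; unused in this sketch

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_once(app, "/old")
print(find_internal_redirect(status, headers))
```

A real middleware would then call the app again with a rewritten environ instead of handing the 307 to the server.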

Here's the code, which could probably use some improvements:

"""WSGI middleware which performs "internal" redirection."""

import StringIO
from urlparse import urlparse


class _Redirector(object):

    def __init__(self, nextapp, recursive=False):
        self.nextapp = nextapp
        self.recursive = recursive

        self.location = None
        self.write_proxy = None
        self.status = None
        self.headers = None
        self.exc_info = None

        self.seen_paths = []

    def start_response(self, status, headers, exc_info=None):
        if status[:3] == "307":
            for name, value in headers:
                if name.lower() == "location":
                    self.location = value
                    break
        self.status = status
        self.headers = headers
        self.exc_info = exc_info
        return self.write

    def write(self, data):
        # This is only here for silly apps which call write.
        if self.write_proxy is None:
            self.write_proxy = self.sr(self.status, self.headers, self.exc_info)
        self.write_proxy(data)

    def __call__(self, environ, start_response):
        self.sr = start_response

        nextenv = environ.copy()
        curpath = nextenv['PATH_INFO']
        if nextenv.get('QUERY_STRING'):
            curpath = curpath + "?" + nextenv['QUERY_STRING']
        self.seen_paths.append(curpath)

        while True:
            # Consume the response (in case it's a generator).
            response = [x for x in self.nextapp(nextenv, self.start_response)]

            if self.location is None:
                # No redirection required; complete the response normally,
                # unless a write() call has already started it.
                if self.write_proxy is None:
                    self.sr(self.status, self.headers, self.exc_info)
                return response

            # Start with a fresh copy of the environ and start altering it.
            nextenv = environ.copy()
            nextenv['REQUEST_METHOD'] = 'GET'
            nextenv['CONTENT_LENGTH'] = '0'
            nextenv['wsgi.input'] = StringIO.StringIO()
            nextenv['redirector.history'] = self.seen_paths[:]

            # "The [Location response-header] field value
            # consists of a single absolute URI."
            (nextenv["wsgi.url_scheme"],
             nextenv["SERVER_NAME"],
             path, params,
             nextenv["QUERY_STRING"], frag) = urlparse(self.location)

            if frag:
                raise ValueError("Illegal #fragment in Location response "
                                 "header %r" % self.location)

            if params:
                path = path + ";" + params

            # Assume 'path' is already unquoted according to
            # http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2
            if path.lower().startswith(environ['SCRIPT_NAME'].lower()):
                nextenv["PATH_INFO"] = path[len(environ['SCRIPT_NAME']):]
            else:
                raise ValueError("Location response header %r does not "
                                 "match current SCRIPT_NAME %r"
                                 % (self.location, environ['SCRIPT_NAME']))

            # Update self.seen_paths and, unless 'recursive' is True,
            # refuse to visit the same URL twice.
            curpath = nextenv['PATH_INFO']
            if nextenv.get('QUERY_STRING'):
                curpath = curpath + "?" + nextenv['QUERY_STRING']
            if not self.recursive and curpath in self.seen_paths:
                raise RuntimeError("redirector visited the same URL twice: %r"
                                   % curpath)
            self.seen_paths.append(curpath)

            # Reset self for the next iteration
            self.location = None
            self.write_proxy = None
            self.status = None
            self.headers = None
            self.exc_info = None


def redirector(nextapp, recursive=False):
    """WSGI middleware which performs "internal" redirection.

    Whenever the next application sets a response status of 307 and
    provides a Location response header, this component will not pass
    that response on to the user-agent; instead, it parses the URI
    provided in the Location response header and calls the same
    application again using that URI. The following entries in the
    WSGI environ dict may be modified when redirecting: wsgi.url_scheme,
    SERVER_NAME, PATH_INFO, QUERY_STRING. REQUEST_METHOD is always
    set to 'GET', so any desired parameters must be supplied as
    query string arguments in the Location response header.
    The wsgi.input entry will always be reset to an empty StringIO,
    and CONTENT_LENGTH will be set to 0.

    If 'recursive' is False (the default), each new target URI will be
    checked to see if it has already been visited in the same request;
    if so, a RuntimeError is raised. If 'recursive' is True, no check
    is made and therefore no such errors are raised.
    """
    def redirect_wrapper(environ, start_response):
        ir = _Redirector(nextapp, recursive)
        return ir(environ, start_response)
    return redirect_wrapper
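
The environ rewriting is the least obvious step, so here it is restated as a standalone sketch. It assumes Python 3's `urllib.parse` rather than the Python 2 `urlparse` module used in the listing, and the helper name `location_to_environ` is invented for illustration; the middleware does this inline.

```python
from urllib.parse import urlparse


def location_to_environ(location, script_name):
    """Return the environ entries a redirect to `location` would overwrite.

    Hypothetical helper, not part of the middleware itself.
    """
    scheme, netloc, path, params, query, frag = urlparse(location)

    if frag:
        raise ValueError("Illegal #fragment in Location response "
                         "header %r" % location)

    # urlparse splits ";params" off the path; put them back.
    if params:
        path = path + ";" + params

    # The target must live under the same SCRIPT_NAME as the original request.
    if not path.lower().startswith(script_name.lower()):
        raise ValueError("Location response header %r does not "
                         "match current SCRIPT_NAME %r"
                         % (location, script_name))

    return {
        "wsgi.url_scheme": scheme,
        "SERVER_NAME": netloc,
        "PATH_INFO": path[len(script_name):],
        "QUERY_STRING": query,
        # The redirected request is always a body-less GET.
        "REQUEST_METHOD": "GET",
        "CONTENT_LENGTH": "0",
    }


updates = location_to_environ("http://example.com/app/new?x=1", "/app")
print(updates["PATH_INFO"], updates["QUERY_STRING"])
```

Note that, as in the listing, the whole network location (including any port) lands in SERVER_NAME, so a Location header with an explicit port may need extra handling.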