Category: WSGI

Pages: << 1 2

07/30/05

Permalink 01:58:46 pm, by fumanchu Email , 1199 words   English (US)
Categories: CherryPy, WSGI

Plugin madness

Glyph is talking about a complete refactoring of divmod, and mentions:

At every point in implementing this system we have known whether to fuse a component together because we'd built unnecessary additional complexity into previous systems, and where to use a plug-in architecture because we'd needed to inject ugly code into the middle of a monolithic routine.

As a result, where our architecture was heavily monolithic before, now it is almost entirely composed of plugins. It is so plugin-happy, in fact, that there is a database with Service plugins in it, which activate when the database is started from twistd; it contains its own configuration, including port-numbers, so nothing need live in a text configuration file.

Plugins are great because they facilitate customization: you can make small changes in system behavior with small changes in your code. An architecture that is "monolithic", to use Glyph's term, is one where small changes in system behavior require large changes in your code.

CherryPy 2.1 has a system of Filters (both built-in and user-provided), which act as plugins. As each HTTP request is processed, there are a few fixed points where the Request processor searches for registered Filter methods and gives up control to them. The Filter then must either return control to the Request processor, or raise a control-flow exception, like NotFound, RequestHandled, or HTTPRedirect. Here's the guts of the Request processor (the run method of the Request class) itself:

def run(self):
    """Process the Request."""
    try:
        try:
            applyFilters('onStartResource')

            try:
                self.processRequestHeaders()

                applyFilters('beforeRequestBody')
                if cherrypy.request.processRequestBody:
                    self.processRequestBody()

                applyFilters('beforeMain')
                if cherrypy.response.body is None:
                    main()

                applyFilters('beforeFinalize')
                finalize()
            except cherrypy.RequestHandled:
                pass
            except cherrypy.HTTPRedirect, inst:
                # For an HTTPRedirect, we don't go through the regular
                # mechanism: we return the redirect immediately
                inst.set_response()
                finalize()
        finally:
            applyFilters('onEndResource')
    except cherrypy.NotFound:
        cherrypy.response.status = 404
        handleError(sys.exc_info())
    except:
        handleError(sys.exc_info())

It's decent; that is, it's fairly clean and understandable IMO. But it's quite limited in two very important ways:

  1. There are only 7 points at which customization is possible [beforeErrorResponse and afterErrorResponse are found inside handleError]. Any filters which require the core to release control at multiple points have to do some fancy dancing to coordinate state between those applyFilters calls. Any filters which require additional control points are out of luck—their only recourse is to handle the remainder of the process themselves (probably with generous cut-and-paste from the CP core) and raise RequestHandled.

  2. Certain core processes are locked in time. For example, processRequestHeaders gets the "path" from the first line of the HTTP request; however, the Filters themselves are supposed to be dependent upon the current path! Therefore, for example, onStartResource filters must always be global ("/").

  3. (Slightly unrelated: the builtin "request" filters get run before any user-defined filters, and vice-versa for "response" filters. This needs a fix).

I tried to ameliorate some of these issues in the short term a) by making a Request class (which 2.0 didn't have); that might become subclassable someday, and b) by keeping a lot of logic out of the Request class, placing it instead in module-level global functions (which people can then call as needed).

A couple of weeks ago, I joked that maybe main (which calls the user's page handler) and finalize (which cleans up the response status, headers, and body) should themselves become filters, and it wasn't entirely a joke. When I look at CherryPy as-is, I see not one, but three separate API's.

The first, simplest one is for application developers, and includes:

  1. The cherrypy.root hierarchy.
  2. Page handler methods, including the spec for index and default methods, and the "exposed" attribute.
  3. Passing CGI params (etc) to page handler methods as keyword args.
  4. Expecting response content to be passed back to the core via "return" or "yield" statements.

The second API is quite different. It assumes a much higher level of competency, and is both more powerful and more complicated. I see it as mostly useful for those writing frameworks or libraries on top of CherryPy, although many "normal" app developers will end up using some of this interface. It includes:

  1. Filters. Creating, organizing, maintaining.
  2. Changing status, headers, or body via cherrypy.response.
  3. For that matter, almost anything involving cherrypy.request or .response: cookies and sessions, path inspection, HTTP-method dispatching, etc.

Finally, there's a third "API", which CherryPy supports quite well, but in a different fashion. There are a number of people who will run into, say, the filter limitations I outlined above, and will customize their copy of CherryPy to do what they want. One of my design goals has always been to make this easier by making the core insanely simple. There has been a lot of work done to keep the various components isolated, preferring a data-driven approach, centered around the cherrypy.request and .response objects, mostly; that is, your filter or site-specific customization can do whatever it likes, as long as it sets valid response.status, .headers, and .body before returning control to the HTTP server.

I don't see anything fundamentally wrong with having "different API's". It would be nice for some items from the "middle layer" to become more easily-accessible to the "shallow layer"; a lot of that can be done with clever wrapper classes, customized for specific situations. But I think it's quite all right to have a separation between a clean, simple, limited interface and a more powerful, but more complicated, interface behind that, to be used when needed.

However, I'd like to see the "lowest layer" unstick itself a bit more yet. That Request.run method, in particular, is far too frozen at the moment—I'd like to see it become the default processor, with an easy way to override some or all of it. I think that would free up CherryPy developers to better manage the current web-application space, which continues to change quickly as new ideas and technologies roll in. [Turning on my World Domination Mode for a moment, it might also allow a small, focused CherryPy core to become the backplane for several of the existing Python web frameworks, especially as more of them begin to support WSGI.]

Part of the reason there is a Filter specification at all is to shield CherryPy application deployers; they now have an interface for plugging in various components, that is simpler than, say, subclassing Request. For example, a deployer can decide to gzip their HTTP responses with a single line in their config file. In addition, the developer of that app need not be aware that this is being done. It's a "freebie" from his or her point of view.

But what if one could write one's own Request.run in such a way that that became easier than using config files? If that were possible, almost all of the overhead of the Filter architecture could be removed completely. In addition, developers and deployers could share total control over the request process, rather than the limited, dare-I-say "clunky" process we have now with filters. I think that with CherryPy 2.1, we're halfway there, and it won't take much work to make it happen in an elegant and powerful fashion.

06/04/05

Permalink 11:07:46 pm, by fumanchu Email , 127 words   English (US)
Categories: Python, CherryPy, WSGI

CherryPy WSGI is up and running

Update: 1) "lydon" is Oliver Graf. Thanks, Oliver! 2) the tests all pass now.

"lydon" contributed a recipe for using FastCGI with CherryPy's new WSGI interface. Thanks! (I notice he or she also used my brand new recipe for the Virtual Path Filter—nice to know when someone likes your little side projects. ;)

Peter Hunt has contributed a very nice WSGI server, as well. See the latest SVN trunk for the current version (here's a link to the Timeline.) There's still a bug in the WSGI server; the test suite isn't completing, because the server isn't shutting down when it should. I'm trying to track that down and fix it tonight.

05/26/05

Permalink 03:31:02 pm, by fumanchu Email , 542 words   English (US)
Categories: Python, WSGI

WSGI gateway for ASP (Microsoft IIS)

Update: I forgot to address buffering.

As I mentioned, I threw together an WSGI wrapper (gateway) for ASP. Here it is. Feedback welcome.

It blanks out SCRIPT_NAME to behave more like Apache. It also handles URL-rewriting, since that's pretty much the only sane way to use ASP with WSGI (or any framework, for that matter).


"""
WSGI wrapper for ASP.


Example Global.asa for a CherryPy app called "mcontrol":

<script language=Python runat=Server> 
def Application_OnStart():
    Application.Contents("multiprocess") = False
    Application.Contents("multithread") = True
    from mcontrol import chpy
</script>


Example handler.asp:

<%@Language=Python%>
<%
from wsgiref.asp_gateway import handler
from cherrypy.wsgiapp import wsgiApp

handler(Application, Request, Response).run(wsgiApp)
%>

"""

import sys
from wsgiref.handlers import BaseCGIHandler


class ASPInputWrapper(object):
    
    def __init__(self, Request):
        self.stream = Request.BinaryRead
        size = Request.ServerVariables('CONTENT_LENGTH')
        self.remainder = self.size = int(size)
    
    def read(self, size=-1):
        if size lt; 0:
            size = self.remainder
        content, size = self.stream(size)
        self.remainder -= size
        return content
    
    def readline(self):
        output = []
        while True:
            # Use an internal buffer instead? Still have to check for \n
            char = self.read(1)
            if not char:
                break
            output.append(char)
            if char in ('\n', '\r'):
                break
        return ''.join(output)
    
    def readlines(self, hint=-1):
        lines = []
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
        return lines
    
    def __iter__(self):
        line = self.readline()
        while line:
            yield line
            # Notice this won't prefetch the next line; it only
            # gets called if the generator is resumed.
            line = self.readline()


class handler(BaseCGIHandler):
    
    def __init__(self, Application, Request, Response, buffering=True):
        # If you set buffering to False, you must not "Enable Buffering" in
        # the current Virtual Directory, NOR in any of its parent containers
        # (directory, site, or server). IIS 5 and 6 buffer by default.
        # See http://support.microsoft.com/default.aspx?scid=kb;en-us;Q306805
        # and http://www.aspfaq.com/show.asp?id=2262
        Response.Buffer = buffering
        
        env = {}
        for name in Request.ServerVariables:
            try:
                # names and values are both probably unicode. coerce them.
                env[str(name)] = str(Request.ServerVariables(name))
            except UnicodeEncodeError, x:
                # There's a potential problem lurking here, since some ASP
                # server var's which are required by WSGI may be high ASCII.
                x.args += ((u"Server Variable '%s'" % name),)
                raise x
        
        multiprocess = str(Application.Contents("multiprocess"))
        multithread = str(Application.Contents("multithread"))
        
        # You will probably need *some* form of rewriter to use ASP
        # with WSGI, since ASP requires one physical .asp file
        # per requestable-URL; so far, we support one:
        
        # Handle URL rewriting done by ISAPI_Rewrite Lite.
        # http://www.isapirewrite.com/
        # Note that PATH_TRANSLATED is also rewritten, but we
        # don't make any provision for unmunging that.
        old_path = env.get("HTTP_X_REWRITE_URL", None)
        if old_path:
            # Tear off any params.
            env["PATH_INFO"] = old_path.split("?")[0]
        
        # ASP puts the same values in SCRIPT_NAME and PATH_INFO,
        # for some odd reason. Empty one of them.
        env["SCRIPT_NAME"] = ""
        
        BaseCGIHandler.__init__(self,
                                stdin=ASPInputWrapper(Request),
                                stdout=None,
                                stderr=sys.stderr,
                                environ=env,
                                multiprocess=multiprocess,
                                multithread=multithread
                                )
        
        self.Response = Response
        self._write = Response.Write
    
    def _flush(self):
        self.Response.Flush()
    
    def send_headers(self):
        self.cleanup_headers()
        self.headers_sent = True
        for key, val in self.headers.items():
            self.Response.AddHeader(key, val)

05/21/05

Permalink 12:02:07 pm, by fumanchu Email , 191 words   English (US)
Categories: Python, CherryPy, WSGI

CherryPy going WSGI...conservatively

As I announced, I've got a new core with both a wsgiapp and the original native httpserver. They both pass all tests.

[Remi] I don't have any problem with WSGI being the preferred interface. But in that case, we need to package with CherryPy a standalone WSGI HTTP server so that people can still run their sites without any external dependencies if they want to (and the server should support thread-pooling).

I'm about to write a WSGI server for CP, so that we can run the test suite against the wsgi handler without any dependency on wsgiref. I can either:

  1. Write a new WSGI server specifically for CherryPy, or
  2. Rewrite the existing native httpserver to have a WSGI interface.

If I did (2), then CP would no longer have a non-WSGI server. I think I'll avoid that for now--it would be better to grow the new interface, test it thoroughly through at least a minor version or two, and cut the old one later if we find nobody is using it.

I should have a patch ready by Monday. :)

05/19/05

Permalink 02:23:49 pm, by fumanchu Email , 117 words   English (US)
Categories: Python, CherryPy, WSGI

Abstracting the CherryPy webserver

CherryPy had a great IRC meeting today. I had several issues I wanted to discuss going into it—not only were they discussed, but I didn't even have to bring them up! :)

One important outcome was: I'm not the only one who wants the included HTTPServer to be less-strongly coupled to the rest of the framework. So I'll be working on that during the rest of the week; I told everyone I'd have a draft ready by Monday. I'll be asking lots of questions on cherrpy-devel, I'm sure. ;)

My biggest hope is that I can now eliminate the "write callable" hack which I just placed in the new wsgiapp module.

05/10/05

Permalink 11:18:48 pm, by fumanchu Email , 200 words   English (US)
Categories: Python, WSGI

Discarded WSGI wrappers

Last year I threw together a preliminary mod_python handler for WSGI (with much help from PJE, of course). Looks like Steven Armstrong has polished it up a bit. Good show!

Here's another adaptation. Anyone interested in consolidating them so a standard can be placed in wsgiref itself?


I've also been thinking about an ASP handler again. Now that I've moved my own Python ASP apps to using a URL rewriter, one of the two strikes against an ASP handler is gone.

The other strike was that the server doesn't have any facility for telling the application whether it's being run multithreaded or multiprocess. The mod_python wrapper dealt with this (in versions less than 3.1) by checking PythonOption directives for the necessary parameters; I think an ASP wrapper could do the same by requiring all ASP+WSGI apps to stuff that metadata into Application.Contents within a mandatory Global.asa. Ugly, but it would nicely round out the WSGI-server offerings.

<< 1 2

January 2018
Sun Mon Tue Wed Thu Fri Sat
 << <   > >>
  1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31      

Search

The requested Blog doesn't exist any more!

XML Feeds

free blog software