Plugin madness

07/30/05

Posted 01:58:46 pm by fumanchu
Categories: CherryPy, WSGI

Glyph is talking about a complete refactoring of divmod, and mentions:

"At every point in implementing this system we have known whether to fuse a component together because we'd built unnecessary additional complexity into previous systems, and where to use a plug-in architecture because we'd needed to inject ugly code into the middle of a monolithic routine.

"As a result, where our architecture was heavily monolithic before, now it is almost entirely composed of plugins. It is so plugin-happy, in fact, that there is a database with Service plugins in it, which activate when the database is started from twistd; it contains its own configuration, including port-numbers, so nothing need live in a text configuration file."

Plugins are great because they facilitate customization: you can make small changes in system behavior with small changes in your code. An architecture that is "monolithic", to use Glyph's term, is one where small changes in system behavior require large changes in your code.

CherryPy 2.1 has a system of Filters (both built-in and user-provided), which act as plugins. As each HTTP request is processed, there are a few fixed points where the Request processor searches for registered Filter methods and gives up control to them. The Filter then must either return control to the Request processor, or raise a control-flow exception, like NotFound, RequestHandled, or HTTPRedirect. Here's the guts of the Request processor (the run method of the Request class) itself:

def run(self):
    """Process the Request."""
    try:
        try:
            applyFilters('onStartResource')

            try:
                self.processRequestHeaders()

                applyFilters('beforeRequestBody')
                if cherrypy.request.processRequestBody:
                    self.processRequestBody()

                applyFilters('beforeMain')
                if cherrypy.response.body is None:
                    main()

                applyFilters('beforeFinalize')
                finalize()
            except cherrypy.RequestHandled:
                pass
            except cherrypy.HTTPRedirect, inst:
                # For an HTTPRedirect, we don't go through the regular
                # mechanism: we return the redirect immediately
                inst.set_response()
                finalize()
        finally:
            applyFilters('onEndResource')
    except cherrypy.NotFound:
        cherrypy.response.status = 404
        handleError(sys.exc_info())
    except:
        handleError(sys.exc_info())
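
To make the control flow concrete, here is a minimal, self-contained sketch of the "search for registered Filter methods and give up control" idea. It is not CherryPy's implementation: `apply_filters` is a stand-in for `applyFilters` that takes the filter list explicitly, the response is a plain dict, both filter classes are hypothetical, and it is written for current Python rather than the interpreter of 2005.

```python
class RequestHandled(Exception):
    """Raised by a filter to say the response is already complete."""


class LoggingFilter:
    """Hypothetical filter: implements only the hooks it cares about."""

    def __init__(self, log):
        self.log = log

    def onStartResource(self):
        self.log.append("start")

    def onEndResource(self):
        self.log.append("end")


class ShortCircuitFilter:
    """Hypothetical filter that answers the request itself."""

    def __init__(self, response):
        self.response = response

    def beforeMain(self):
        self.response["body"] = ["served by filter"]
        raise RequestHandled()


def apply_filters(filters, method_name):
    # Call the named hook on every filter that defines it, in order.
    for f in filters:
        method = getattr(f, method_name, None)
        if method is not None:
            method()


def run(filters, response, main):
    # A cut-down processor with three of the hook points from the post.
    try:
        apply_filters(filters, "onStartResource")
        try:
            apply_filters(filters, "beforeMain")
            if response["body"] is None:
                main(response)
        except RequestHandled:
            pass  # a filter took over; skip the page handler
    finally:
        apply_filters(filters, "onEndResource")
    return response
```

A filter that raises `RequestHandled` in `beforeMain` skips the page handler, but `onEndResource` still fires because of the `finally` clause, which mirrors the shape of the real method above.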

It's decent; that is, it's fairly clean and understandable IMO. But it's quite limited in two very important ways:

  1. There are only 7 points at which customization is possible [beforeErrorResponse and afterErrorResponse are found inside handleError]. Any filters which require the core to release control at multiple points have to do some fancy dancing to coordinate state between those applyFilters calls. Any filters which require additional control points are out of luck—their only recourse is to handle the remainder of the process themselves (probably with generous cut-and-paste from the CP core) and raise RequestHandled.

  2. Certain core processes are locked in time. For example, processRequestHeaders gets the "path" from the first line of the HTTP request; however, the Filters themselves are supposed to be dependent upon the current path! Therefore, for example, onStartResource filters must always be global ("/").

  3. (Slightly unrelated: the builtin "request" filters get run before any user-defined filters, and vice-versa for "response" filters. This needs a fix).
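
As an illustration of the "fancy dancing" in point 1, here is a sketch of a filter that needs two control points and has to carry state from one to the other itself. The `TimingFilter` class is hypothetical; the only things taken from the post are the hook names. It assumes one thread per request, so it parks its state in a `threading.local`.

```python
import threading
import time


class TimingFilter:
    """Hypothetical filter that measures how long each request takes."""

    def __init__(self):
        self._state = threading.local()  # per-thread, so per-request here
        self.durations = []

    def onStartResource(self):
        # First control point: remember when this request started.
        self._state.started = time.monotonic()

    def onEndResource(self):
        # Second control point: the core knows nothing about our state,
        # so we must cope with the first hook never having run.
        started = getattr(self._state, "started", None)
        if started is None:
            return
        self.durations.append(time.monotonic() - started)
        del self._state.started
```

The bookkeeping is small here, but every filter that spans more than one hook ends up reinventing something like it.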

I tried to ameliorate some of these issues in the short term a) by making a Request class (which 2.0 didn't have) that might become subclassable someday, and b) by keeping a lot of logic out of the Request class, placing it instead in module-level global functions (which people can then call as needed).

A couple of weeks ago, I joked that maybe main (which calls the user's page handler) and finalize (which cleans up the response status, headers, and body) should themselves become filters, and it wasn't entirely a joke. When I look at CherryPy as-is, I see not one, but three separate APIs.

The first, simplest one is for application developers, and includes:

  1. The cherrypy.root hierarchy.
  2. Page handler methods, including the spec for index and default methods, and the "exposed" attribute.
  3. Passing CGI params (etc) to page handler methods as keyword args.
  4. Expecting response content to be passed back to the core via "return" or "yield" statements.
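
The four items above can be sketched with a toy dispatcher. This is an illustration of the conventions only, not CherryPy's dispatch code: `expose`, `dispatch`, `Root`, and `Blog` are all names made up for the example, and the sketch ignores details a real server must handle.

```python
def expose(func):
    # Mark a handler as reachable from the web (the "exposed" attribute).
    func.exposed = True
    return func


class Blog:
    @expose
    def index(self):
        return "blog home"

    @expose
    def default(self, *path, **params):
        # Catch-all: receives the unmatched path segments.
        yield "no handler for "
        yield "/".join(path)


class Root:
    blog = Blog()

    @expose
    def index(self):
        return "home"

    @expose
    def greet(self, name="world"):
        return "Hello, %s!" % name

    def secret(self):  # not exposed, so never served
        return "never served"


def dispatch(root, path, params=None):
    # Walk the object tree one path segment at a time.
    rest = [p for p in path.split("/") if p]
    node = root
    while rest and not callable(node):
        child = getattr(node, rest[0], None)
        if child is None:
            break
        node, rest = child, rest[1:]
    if not callable(node):
        # Landed on an object: use index, or default if path is left over.
        node = getattr(node, "default" if rest else "index", None)
    if node is None or not getattr(node, "exposed", False):
        raise LookupError(path)
    body = node(*rest, **(params or {}))  # params arrive as keyword args
    return body if isinstance(body, str) else "".join(body)
```

With this in place, `dispatch(Root(), "/greet", {"name": "Glyph"})` calls `Root.greet(name="Glyph")`, and a handler may either `return` a string or `yield` pieces of one.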

The second API is quite different. It assumes a much higher level of competency, and is both more powerful and more complicated. I see it as mostly useful for those writing frameworks or libraries on top of CherryPy, although many "normal" app developers will end up using some of this interface. It includes:

  1. Filters. Creating, organizing, maintaining.
  2. Changing status, headers, or body via cherrypy.response.
  3. For that matter, almost anything involving cherrypy.request or .response: cookies and sessions, path inspection, HTTP-method dispatching, etc.

Finally, there's a third "API", which CherryPy supports quite well, but in a different fashion. There are a number of people who will run into, say, the filter limitations I outlined above, and will customize their copy of CherryPy to do what they want. One of my design goals has always been to make this easier by making the core insanely simple. There has been a lot of work done to keep the various components isolated, preferring a data-driven approach, centered around the cherrypy.request and .response objects, mostly; that is, your filter or site-specific customization can do whatever it likes, as long as it sets valid response.status, .headers, and .body before returning control to the HTTP server.

I don't see anything fundamentally wrong with having "different APIs". It would be nice for some items from the "middle layer" to become more easily accessible to the "shallow layer"; a lot of that can be done with clever wrapper classes, customized for specific situations. But I think it's quite all right to have a separation between a clean, simple, limited interface and a more powerful, but more complicated, interface behind that, to be used when needed.

However, I'd like to see the "lowest layer" unstick itself a bit more yet. That Request.run method, in particular, is far too frozen at the moment—I'd like to see it become the default processor, with an easy way to override some or all of it. I think that would free up CherryPy developers to better manage the current web-application space, which continues to change quickly as new ideas and technologies roll in. [Turning on my World Domination Mode for a moment, it might also allow a small, focused CherryPy core to become the backplane for several of the existing Python web frameworks, especially as more of them begin to support WSGI.]
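
One possible shape for a "default processor with an easy way to override some or all of it" is sketched below. This is only an assumption about what such a design might look like, not a proposal from the CherryPy codebase: the `Response` and `Request` classes, their method names, and both subclasses are hypothetical.

```python
class Response:
    def __init__(self):
        self.status = None
        self.headers = {}
        self.body = None


class Request:
    """Default processor: every stage is a method a subclass may replace."""

    def __init__(self, handler):
        self.handler = handler
        self.response = Response()

    def process_headers(self):
        pass  # placeholder stage

    def process_body(self):
        pass  # placeholder stage

    def main(self):
        # Call the page handler unless something already set a body.
        if self.response.body is None:
            self.response.body = [self.handler()]

    def finalize(self):
        if self.response.status is None:
            self.response.status = "200 OK"
        length = sum(len(chunk) for chunk in self.response.body)
        self.response.headers["Content-Length"] = str(length)

    def run(self):
        self.process_headers()
        self.process_body()
        self.main()
        self.finalize()
        return self.response


class UppercasingRequest(Request):
    """Overrides one stage and reuses the rest of the default run."""

    def main(self):
        super().main()
        self.response.body = [chunk.upper() for chunk in self.response.body]


class NoContentRequest(Request):
    """Overrides all of run; the page handler is never called."""

    def run(self):
        self.response.status = "204 No Content"
        self.response.body = []
        return self.response
```

The only contract either subclass has to honor is the data-driven one described earlier: leave a valid status, headers, and body on the response before returning.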

Part of the reason there is a Filter specification at all is to shield CherryPy application deployers; they now have an interface for plugging in various components, that is simpler than, say, subclassing Request. For example, a deployer can decide to gzip their HTTP responses with a single line in their config file. In addition, the developer of that app need not be aware that this is being done. It's a "freebie" from his or her point of view.
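
The "freebie" can be sketched in a few lines. Everything here is an assumption made for illustration: the `"gzip.on"` key is a made-up config entry (not CherryPy's actual setting name), `serve` is a toy stand-in for the core, and the sketch skips the Accept-Encoding check a real gzip filter needs.

```python
import gzip


def hello_handler():
    # The application developer's code: knows nothing about compression.
    return b"Hello, world! " * 50


def gzip_before_finalize(response):
    # Hypothetical filter body: compress whatever the handler produced.
    response["body"] = gzip.compress(response["body"])
    response["headers"]["Content-Encoding"] = "gzip"


def serve(handler, config):
    # Toy core: run the handler, then any filter the deployer enabled.
    response = {"headers": {}, "body": handler()}
    if config.get("gzip.on", False):  # the deployer's single config line
        gzip_before_finalize(response)
    response["headers"]["Content-Length"] = str(len(response["body"]))
    return response
```

Calling `serve(hello_handler, {"gzip.on": True})` returns a compressed body without any change to `hello_handler`, which is the separation between developer and deployer that the Filter specification is meant to provide.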

But what if one could write one's own Request.run in such a way that that became easier than using config files? If that were possible, almost all of the overhead of the Filter architecture could be removed completely. In addition, developers and deployers could share total control over the request process, rather than the limited, dare-I-say "clunky" process we have now with filters. I think that with CherryPy 2.1, we're halfway there, and it won't take much work to make it happen in an elegant and powerful fashion.

1 comment

Comment from: Sylvain Hellegouarch [Visitor] · http://www.defuze.org/oss/

Robert,

Very interesting entry... as usual you are far away ahead of all of us ;)

I think what you say is basically more or less a ground for a proposal (maybe specification) for a CP3? :)

- Sylvain

07/30/05 @ 14:09
