10/14/06

02:46:43 pm, by fumanchu
Categories: CherryPy

If you like CherryPy except for the dispatching...

...you should know that CherryPy 3 (soon to be released) includes a Routes dispatcher:

import cherrypy

class City:

    def __init__(self, name):
        self.name = name
        self.population = 10000

    def index(self, **kwargs):
        return "Welcome to %s, pop. %s" % (self.name, self.population)

    def update(self, **kwargs):
        self.population = kwargs['pop']
        return "OK"

# Map named routes to controller objects.
d = cherrypy._cprequest.RoutesDispatcher()
d.connect(name='hounslow', route='hounslow', controller=City('Hounslow'))
# GET /surbiton calls index(); POST /surbiton calls update().
d.connect(name='surbiton', route='surbiton', controller=City('Surbiton'),
          action='index', conditions=dict(method=['GET']))
d.mapper.connect('surbiton', controller='surbiton',
                 action='update', conditions=dict(method=['POST']))

# Use the Routes dispatcher for the whole app instead of the default one.
conf = {'/': {'request.dispatch': d}}
cherrypy.tree.mount(root=None, config=conf)
cherrypy.config.update({'environment': 'test_suite'})

You tell CherryPy you want to use Routes dispatching in your app config with "request.dispatch = <obj>". The astute reader will note this means:

  1. You can make your own dispatchers for Django-style, regex style, Quixote-style, etc. You can even modify the builtin Routes dispatcher to add HTTP method dispatch or what-have-you.
  2. You can use the default CherryPy tree-style dispatcher for most paths, and Routes (or any other style) for select subpaths.

To make your own dispatcher:

  1. Make it callable. If it's a class, give it a __call__ method that takes a path_info argument.
  2. When called, it should set cherrypy.request.handler to a callable that takes no arguments. This "handler" should be (or should call) the user's application code. The default CherryPy handler, for example, sends virtual path atoms as *args and GET/POST parameters as **kwargs, but if that's not what your dispatch style requires, do something else. It's completely customizable. You can even set request.handler to None if you don't want anything called at that point. Note also that HTTPRedirect and HTTPError (including NotFound) can be used as handlers; when called, they raise self.
  3. Set cherrypy.request.config. This should be a flat dictionary of all config entries (from both global and application config) which apply to the current request, based on the path_info argument above. The default CherryPy dispatcher does a lot of work to correctly allow config file entries to override _cp_config entries on the CherryPy object tree. But if your dispatch style doesn't use a tree, you don't need to do all that.
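
The three steps above can be sketched without CherryPy installed. In this minimal sketch, `request` is only a stand-in for cherrypy.request, and `RegexDispatcher`, its `connect` method, and the `city` handler are hypothetical names, not part of CherryPy:

```python
import re
from types import SimpleNamespace

# Stand-in for cherrypy.request (an assumption, so the sketch runs alone).
request = SimpleNamespace(handler=None, config={})


class RegexDispatcher:
    """Hypothetical regex-style dispatcher following the three steps above."""

    def __init__(self, global_config=None):
        self.routes = []                       # [(pattern, func, config)]
        self.global_config = dict(global_config or {})

    def connect(self, pattern, func, config=None):
        self.routes.append((re.compile(pattern), func, dict(config or {})))

    def __call__(self, path_info):             # step 1: callable, takes path_info
        for pattern, func, config in self.routes:
            match = pattern.match(path_info)
            if match:
                kwargs = match.groupdict()
                # step 2: the handler takes no arguments and calls user code
                request.handler = lambda: func(**kwargs)
                break
        else:
            request.handler = None             # nothing to call for this path
        # step 3: one flat dict of every config entry that applies
        merged = dict(self.global_config)
        if request.handler is not None:
            merged.update(config)
        request.config = merged


def city(name):
    return "Welcome to %s" % name


d = RegexDispatcher({"log.screen": False})
d.connect(r"^/city/(?P<name>\w+)$", city, {"tools.gzip.on": True})
d("/city/hounslow")
body = request.handler()
```

With a real CherryPy 3 install, an object like `d` would be the thing you assign to request.dispatch in the app config; the config keys used here are placeholders.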

Grab the latest trunk and start playing!

08/24/06

12:05:29 am, by fumanchu
Categories: Python, Dejavu, CherryPy

Upgrades to Python 2.5

I probably waited too long, but today I upgraded both CherryPy (3.0alpha/trunk) and Dejavu (1.5alpha/trunk) to Python 2.5. The moves were surprisingly easy:

CherryPy

There were three changes in all:

  1. When the WSGI server socket is closed, socket.accept now fails with a socket.error "Socket operation on non-socket". I "fixed" this by just ignoring the error.
  2. The output of Response.SimpleCookie now has no trailing semicolon as it did in Python 2.4. Just had to fix the test suite to be aware of that.
  3. Some attributes of unittest.TestCase moved from double-underscore names to single; webtest had a custom subclass of it. This was easy enough to fix: define a different method for 2.5 than 2.4 or less.

[P.S. I've noticed CP 3 is about 3% slower in 2.5 than 2.4, even with the zombie frames and other optimizations. Hmmm.]

Dejavu

Amazingly, even though Dejavu makes extensive use of bytecode hacks, there was only one real change! The "logic" module needed an upgrade to the "comparison" function, which produces an Expression via the types.CodeType() constructor. Apparently, function args are no longer included in co_names, and co_consts no longer includes a leading 'None' value (except when there are cell references?). Finally, co_flags for "normal" functions now includes CO_NESTED by default. These changes also forced some parallel upgrades to the test suite.
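
If you haven't poked at code objects before, the attributes mentioned above can be inspected directly. A small sketch (the `is_adult` function is made up, and the exact tuples differ between Python versions, so no output is shown):

```python
import inspect


def is_adult(person):
    return person.age >= 18


code = is_adult.__code__
print("co_varnames:", code.co_varnames)   # argument and local names
print("co_names:   ", code.co_names)      # global and attribute names
print("co_consts:  ", code.co_consts)     # constants stored on the code object
print("co_flags:   ", hex(code.co_flags))

# CO_NESTED is a single bit in co_flags, so test the bit rather than
# comparing the whole value.
is_nested = bool(code.co_flags & inspect.CO_NESTED)
print("CO_NESTED set:", is_nested)
```

Anything that builds code objects by hand has to get each of these tuples right for the interpreter version it runs on, which is why small changes like the ones above ripple into the test suite.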

While fixing the above, however, I noticed a long-standing bug in Dejavu's LambdaDecompiler. Python 2.4 used ROT_THREE before a STORE_SUBSCR, and this worked in Dejavu; but Python 2.5 uses ROT_TWO before STORE_SUBSCR, which showed me I had the stack-popping backwards in both functions. Bah. Fixed now.

Absolute imports

Both packages needed a good bit of work changing some relative import statements into absolute ones. Not really hard, just boring. ;)

Thanks to the Python core devs for a very smooth transition!

07/10/06

11:07:40 am, by fumanchu
Categories: CherryPy

CherryPy 3 optimization

Currently (rev 1193), a typical CherryPy request has a standard execution path, and a standard time to complete it:

0.008 _cpwsgi.py:51(_wsgi_callable)
    0.001 _cpwsgi.py:36(translate_headers)
    0.001 _cpengine.py:131(request)
        0.001 _cprequest.py:623(Response.__init__)
    0.006 _cprequest.py:116(run)
        0.000 _cprequest.py:230(process_request_line)
        0.001 _cprequest.py:265(process_headers)
        0.003 _cprequest.py:189(respond)
            0.001 _cprequest.py:294(get_resource)
                0.001 _cprequest.py:415(Dispatcher.__call__)
                    0.001 _cprequest.py:432(find_handler)
            0.001 _cprequest.py:326(tool_up)
            0.001 _cprequest.py:644(finalize)
        0.001 cherrypy\__init__.py:96(log_access)
            0.001 logging\__init__.py:1015(log)
                0.001 logging\__init__.py:1055(_log)

0.001 cherrypy\__init__.py:51(__getattr__)
0.001 :0(getattr)

That is, _cpwsgi._wsgi_callable() takes about 8 msec (on my box using the builtin timer). That number breaks down into 1 msec for translate_headers(), 1 msec for _cpengine.request(), and 6 msec for Request.run(). Etcetera. These are all of the calls which take 1 msec or more to complete.

It looks like moving to Python's builtin logging for the access log has added 1 msec to Request.run(). I think that's reasonable; we lose a millisecond but gain syslog and rotating log options.

Somebody please explain to me why _cpwsgi.translate_headers takes a millisecond to change 20 strings from "HTTP_HEADER_NAME" to "Header-Name". I've tried lots of rewritings of that to no avail; moving from "yield" to returning a list did nothing, nor did inlining it into _wsgi_callable.
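
For reference, the transformation in question looks roughly like this. It is a sketch of the idea, not the code in _cpwsgi.py; `translate_header_name` and the sample environ are hypothetical:

```python
def translate_header_name(key):
    """Turn a WSGI environ key like "HTTP_ACCEPT_ENCODING" into "Accept-Encoding"."""
    name = key[5:] if key.startswith("HTTP_") else key
    return "-".join(part.capitalize() for part in name.split("_"))


def translate_headers(environ):
    """Return a list of (header name, value) pairs found in a WSGI environ."""
    headers = []
    for key, value in environ.items():
        # Content-Type and Content-Length arrive without the HTTP_ prefix.
        if key.startswith("HTTP_") or key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            headers.append((translate_header_name(key), value))
    return headers


environ = {
    "HTTP_HOST": "localhost",
    "HTTP_ACCEPT_ENCODING": "gzip",
    "CONTENT_LENGTH": "0",
    "REQUEST_METHOD": "GET",
}
headers = translate_headers(environ)
```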

I tried making the default Dispatcher cache the results from find_handler. That is, cache[(app, path_info)] = func, vpath, request.config. I couldn't see any speedup on cache hits.

The next-to-last line above is interesting. 0.001 cherrypy\__init__.py:51(__getattr__) shows 1 msec being used for cherrypy.request and cherrypy.response. I've already done a lot of work to minimize this by looking them up once and binding to a local, for example, request = cherrypy.request, and then looking up further attributes using the local name. But perhaps there's more to be done.
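
To make the cost concrete, here is a sketch of a thread-local proxy in the same spirit. The names `serving`, `_ThreadLocalProxy`, and `Request` are stand-ins, not CherryPy's own code, and the timings will differ per machine:

```python
import threading
import timeit

serving = threading.local()            # per-thread storage (hypothetical name)


class Request:
    def __init__(self):
        self.path = "/"
        self.method = "GET"


class _ThreadLocalProxy:
    """Forwards every attribute lookup to the current thread's object."""

    def __init__(self, attrname):
        self.__dict__["_attrname"] = attrname

    def __getattr__(self, name):
        # Runs on every single access; this is where the time goes.
        return getattr(getattr(serving, self._attrname), name)


serving.request = Request()
request_proxy = _ThreadLocalProxy("request")


def via_proxy():
    # Two lookups, each of which goes through __getattr__.
    return request_proxy.path, request_proxy.method


def via_local():
    request = serving.request          # look the real object up once...
    return request.path, request.method   # ...then use plain attribute access


print("proxy:", timeit.timeit(via_proxy, number=20000))
print("local:", timeit.timeit(via_local, number=20000))
```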

The last line above shows 1 msec being used to call the builtin getattr() function. Seems we have a very object-oriented style. ;)

I'll keep looking for ways to get any of those 0.001's to read 0.000. Perhaps now that I've moved profiling to WSGI middleware, I can aggregate times and work with numbers that have a little more precision. ;)
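
Aggregating in middleware could look something like the sketch below. It uses a plain timer rather than the profiler, and `TimingMiddleware` and `hello_app` are hypothetical names; it also buffers the response body, which a real version might not want to do for streaming responses:

```python
import time


class TimingMiddleware:
    """Hypothetical WSGI middleware that accumulates time spent in an app."""

    def __init__(self, app):
        self.app = app
        self.calls = 0
        self.total = 0.0

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            result = self.app(environ, start_response)
            try:
                # Consume the body here so generating it is timed as well.
                return list(result)
            finally:
                # WSGI asks callers to close the iterable if it can be closed.
                if hasattr(result, "close"):
                    result.close()
        finally:
            self.total += time.perf_counter() - start
            self.calls += 1

    def average_msec(self):
        return 1000.0 * self.total / self.calls if self.calls else 0.0


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world!"]


timed = TimingMiddleware(hello_app)
for _ in range(1000):
    body = timed({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, lambda s, h: None)
print("%.4f msec/request over %d requests" % (timed.average_msec(), timed.calls))
```

Averaging over many requests is what buys the extra digits: a single request only ever shows up as 0.000 or 0.001.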

06/28/06

09:25:43 am, by admin
Categories: IT, Python, Dejavu

The Fourth Way

Ted Neward has written a good discussion of Object-Relational Mapper concerns. I'd like to respond to, and connect, a couple of seemingly unrelated points he makes:

...we typically end up with one of Query-By-Example (QBE), Query-By-API (QBA), or Query-By-Language (QBL) approaches.

A QBE approach states that you fill out an object template of the type of object you're looking for...
a "Query-By-API" approach, in which queries are constructed by Query objects...
a "Query-By-Language" approach, in which a new language, similar to SQL but "better" somehow, is written...

and

Several possible solutions present themselves...

5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework...bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects"...[such as] direct integration into traditional O-O languages, such as the LINQ project from Microsoft...

I propose (and have implemented in Dejavu) a fourth approach, distinct from QBE, QBA, and QBL. Rather than build a DSL on top of a programming language (as QBL does), use the language instead. Rather than change the programming language by introducing relational syntax (as LINQ does), use the language instead. In Dejavu, you write plain old Python functions which take an object and return True or False. Iterate over a collection of objects and the function works as a filter. Pass it to the storage backend and it is translated into SQL for you. Most commonly, you pass it to the library and it does both for you: it iterates over its in-memory cache of objects and merges in new objects queried from storage. Let's call it... "query". It's not "by" anything. It has an infinitesimal learning curve. LINQ is, in essence, shoehorning higher-order functions into its various target languages in a very limited domain. Why not use a programming language that has real HOFs?
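
Here is the in-memory half of that idea as a tiny sketch. The `Animal` class and the data are invented for illustration, and the translation to SQL is not shown:

```python
class Animal:
    def __init__(self, name, legs):
        self.name = name
        self.legs = legs


zoo = [Animal("emu", 2), Animal("lion", 4), Animal("gecko", 4)]

# The "query" is just a function from an object to True or False.
query = lambda a: a.legs == 4 and a.name.startswith("l")

# Used as a filter over objects that are already in memory.
matches = [a.name for a in zoo if query(a)]
print(matches)
```

The same `query` function is what would be handed to the storage layer; nothing about it is specific to lists.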

06/15/06

01:23:24 am, by admin
Categories: Python, CherryPy

How CherryPy processes a request

Inspired by James Bennett, here's a little treatise on how CherryPy processes a request. A couple of differences, though. First, Django is a "full-stack" web framework, with an ORM, built-in templating, etcetera, whereas CherryPy focuses on HTTP. Second, I'll be showing the process for CherryPy 2.2 (the current stable branch), but I'll try to point out along the way where CherryPy 3 (now in alpha) differs.

HTTP Server

Something must actually sit on a listening socket and receive requests from HTTP clients. CherryPy provides an HTTP server (_cpwsgiserver.py), or you can use Apache, lighttpd, or others.

Bridge from HTTP Server to CherryPy

The Web Server Gateway Interface spec came into being to connect various HTTP servers to various web frameworks (and gateways and middleware and...). If you want to use it to connect an HTTP server with CherryPy, feel free. CherryPy provides a "WSGI application callable" in _cpwsgi.py. Otherwise, you need a specific adapter at this stage to connect the two.

The CherryPy Engine

Whether you use WSGI or not for the Bridge, it calls Engine.request(), which creates the all-important objects cherrypy.request and cherrypy.response, returning the former. The Bridge then calls request.run(), passing it the incoming message stream.

The CherryPy Request

Several steps occur here to convert the incoming stream to more usable data structures, pass the request to the appropriate user code, and then convert outbound data. In between the standard processing steps, users can define extra code to be run via filters (CP 2.2) or hooks (CP 3). Here's how CherryPy 2 does it:

  1. Request.processRequestLine() analyzes the first line of the request, turning "GET /path/to/resource?key=val HTTP/1.1" into a request method, path, query string, and version.
  2. Any on_start_resource filters are run.
  3. Request.processHeaders() turns the incoming HTTP request headers into a dictionary, and separates Cookie information.
  4. Any before_request_body filters are run.
  5. Request.processBody() turns the incoming HTTP request body into a dictionary if possible, otherwise, it's passed onward as a file-like object.
  6. Any before_main filters are run.
  7. The user-supplied page handler is looked up (see below).
  8. The user-supplied page handler is invoked. Its return value, which can be a string, a list, a file, or a generator object, will be used for the response body.
  9. Any before_finalize filters will be run.
  10. Response.finalize() checks for HTTP correctness of the response, and transforms user-friendly data structures into HTTP-server-friendly structures.
  11. Any on_end_resource filters are run.

CherryPy 3 performs the same steps as above, but in the order: 1, 3, 7, 2, 4, 5, 6, 8, 9, 10, 11. That is, it determines which bit of user code will respond to the request much earlier in the process. This also means that internal redirects can "start over" much earlier. In addition, CP 3 can collect configuration data once (at the same time that it looks up the page handler); CP 2 recollected config data every time it was used.

Page handlers

As mentioned (steps 7 and 8, above), CherryPy users write "page handlers", functions which receive the request parameters as arguments, and return the response body. CherryPy makes clever use of threadlocals, so all other data a developer needs is available in the global cherrypy.request and cherrypy.response objects (the parameters are as well, but it's awfully convenient to receive them as arguments to the page handler, and to return the body rather than setting it).

The URL is mapped to a page handler by traversing a tree of such handlers, so that the handler for "/a/b/c" is most likely root.a.b.c(). I say "most likely", because you can also define index() handlers and default() handlers.
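
A much-simplified sketch of that traversal follows. It is not CherryPy's own lookup code: it skips checks the real thing performs (such as requiring handlers to be marked as exposed), and `find_handler`, `Root`, `Branch`, and `Leaf` are names made up for the example:

```python
class Leaf:
    def c(self):
        return "c"

    def default(self, *vpath):
        return "default got %r" % (vpath,)


class Branch:
    def __init__(self):
        self.b = Leaf()

    def index(self):
        return "a index"


class Root:
    def __init__(self):
        self.a = Branch()

    def index(self):
        return "root index"


def find_handler(root, path):
    """Return (handler, leftover path atoms) for a URL path."""
    atoms = [atom for atom in path.split("/") if atom]
    node = root
    fallback, leftover = None, []
    for i, atom in enumerate(atoms):
        # Remember the deepest default() seen so far as a fallback.
        default = getattr(node, "default", None)
        if callable(default):
            fallback, leftover = default, atoms[i:]
        child = getattr(node, atom, None)
        if child is None:
            return fallback, leftover
        node = child
    if callable(node):                 # "/a/b/c" -> root.a.b.c
        return node, []
    index = getattr(node, "index", None)
    if callable(index):                # "/a" -> root.a.index
        return index, []
    default = getattr(node, "default", None)
    if callable(default):
        return default, []
    return fallback, leftover


root = Root()
handler, rest = find_handler(root, "/a/b/c")
print(handler(*rest))
handler, rest = find_handler(root, "/a/b/x/y")
print(handler(*rest))
```

Leftover path atoms are passed to default() as positional arguments, which is what makes it a catch-all.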

The CherryPy Response

When the call to Request.run() returns, the Bridge uses the Response attributes status, header_list, and body to construct the outbound stream, and pass it to the HTTP server that made the request. CherryPy works hard to support both buffered and streaming output, so the body may be a generator object that is only iterated over at this point.

Exceptional circumstances

The page handler, or any of the filters/hooks, can decide that the response is complete, and that processing should be stopped. Most often, this is accomplished by raising an HTTPRedirect (3xx) exception, or an HTTPError (4xx or 5xx; NotFound (404) is so common it has its own subclass). Unanticipated errors are automatically converted into HTTPError(500). Users have some facility for modifying the actual error output with additional error filters/hooks.
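
The control flow can be sketched with stand-in exception classes. These share names with CherryPy's but are simplified for illustration; the default redirect status and the `respond` function are assumptions of the sketch:

```python
class HTTPError(Exception):
    def __init__(self, status=500, message=""):
        Exception.__init__(self, status, message)
        self.status = status
        self.message = message


class NotFound(HTTPError):
    def __init__(self, path=""):
        HTTPError.__init__(self, 404, "Nothing matches %r" % path)


class HTTPRedirect(Exception):
    def __init__(self, url, status=303):
        Exception.__init__(self, url, status)
        self.url = url
        self.status = status


def respond(handler):
    """Run a page handler and return (status, headers, body)."""
    try:
        return 200, {}, handler()
    except HTTPRedirect as exc:
        return exc.status, {"Location": exc.url}, ""
    except HTTPError as exc:
        return exc.status, {}, exc.message
    except Exception:
        # Unanticipated errors are converted into a 500.
        return 500, {}, "Internal Server Error"


def missing():
    raise NotFound("/nope")


def moved():
    raise HTTPRedirect("/new")


def broken():
    return 1 / 0


for page in (missing, moved, broken):
    print(page.__name__, respond(page))
```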

That's it!

05/09/06

11:49:50 pm, by fumanchu
Categories: CherryPy

One of the ways CherryPy 3 will rock

Looks like CherryPy 3 will be significantly faster than CP 2.2. Here are some quick benchmark (Apache ab) stats from my little Win2k laptop. The first three are from the same test (1000 requests, 14 byte response body, 10 server threads), for 10 to 50 client threads:

[Charts: req/sec x threads; msec/req x threads; kb/sec x threads]

These two are from a different test (1000 requests, 50 client threads, 10 server threads), for response sizes of 10 bytes, 100, 1K, 10K, 100K, and 100M:

[Charts: req/sec x bytes; kb/sec x bytes]

I believe the improvement comes from three areas. First, the lowercase_api flag and checks are no longer needed. Second, filters are no longer called just to see if they're turned on. Third, all of the configs and special attributes are now looked up once, inline with the page handler (i.e., controller method) lookup.

I can't wait to run the benchmark suite on a real server. :)

04/23/06

08:15:15 pm, by fumanchu
Categories: CherryPy

CherryPy 3 directions

I committed the first round of changes for CherryPy version 3 on Friday. It's nowhere near complete, but it hopefully can give hints about the future.

Before I dive into the meat, you should know I moved some things around:

  • _cphttptools is now called _cprequest
  • There's a new 'tools.py' module (see below).
  • All of the code in the /filters folder still exists, but it's all been moved into the /lib folder. The filters folder has been removed.
  • You can now call functions and instantiate objects in config files. For example: now = cherrypy.lib.httptools.HTTPDate()

Dispatchers

In CherryPy 2.2, you're able to replace the page-handler-dispatch mechanism by using a custom Request class; that is, you would subclass _cphttptools.Request and override the main or mapPathToObject methods. That can be tedious, since you can't specify the Request class on a per-request basis; the Request object has already been formed by the time the URL has been parsed.

In CP 3, there's a new _cprequest.dispatch function, and each Request object calls it. If you don't like the way CP looks up page handlers by default, you can declare your own dispatcher in the config:

dispatcher = my.custom.dispatcher.function

or

dispatcher = my.custom.DispatchClass(blah)

The only requirement is that the right-hand-side be a callable: it takes a "path" argument and should return a page handler (a callable). The default Dispatcher also sets request.virtual_path, so unless you're also setting request.execute_main to False you should probably do the same.

Filtering is now Hooking

I had a good long look at filters in CP 2.2. Despite their name, they don't really "filter" anything; nothing "passes through them". Some of them modify cherrypy.request attributes, but just as many of them don't. They're not implemented as filters; instead, they're "hooks".

A "hook" usually means a place where callbacks are called, and CherryPy filters have always been called from a pre-determined set of hooks (e.g. before_request_body). So I went ahead and changed the terminology throughout the codebase.

But there's a much bigger change than just the name. People have been pining for CP to release its grip over both filter declaration (which filters are available) and filter invocation (CP 2 calls all filter methods whether enabled or not). These issues have largely been solved in the current trunk by moving control out of the global cherrypy.filters module and into each Request object. Every Request object now possesses a "hooks" attribute, a _cprequest.HookMap object. The HookMap class has the following attributes and methods:

  • callbacks: a dict of the form {hookpoint: [callback, ...]}. The "hookpoint" is one of our old filter method names, like "before_finalize".
  • failsafe: a list of hook points that should run all their callbacks, even if some of those callbacks raise exceptions.
  • attach(point, callback, conf=None): allows you to attach a callback to be invoked by this request. Any code can do this, and can do it on the fly! See the new caching module for an example; if the request is served from cache before_main, then the logic which would cache the page handler output is never attached, and therefore never invoked.
  • run(point): runs all registered callbacks for the given hook point.
  • populate_from_config(): this is called automatically by the Request object, and searches for Tools which it can call to setup hooks. What's a Tool? Read on...
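
As a rough sketch of how such a HookMap might behave (this is not the code in _cprequest; how `conf` is applied and how failsafe errors are reported are assumptions, and populate_from_config is left out):

```python
import functools


class HookMap:
    """Sketch of the HookMap described above, not CherryPy's actual class."""

    def __init__(self, failsafe=()):
        self.callbacks = {}              # {hookpoint: [callback, ...]}
        self.failsafe = list(failsafe)

    def attach(self, point, callback, conf=None):
        # Assumption: conf entries become keyword arguments of the callback.
        if conf:
            callback = functools.partial(callback, **conf)
        self.callbacks.setdefault(point, []).append(callback)

    def run(self, point):
        failsafe = point in self.failsafe
        first_error = None
        for callback in self.callbacks.get(point, []):
            try:
                callback()
            except Exception as exc:
                if not failsafe:
                    raise
                if first_error is None:
                    first_error = exc    # keep going, report afterwards
        if first_error is not None:
            raise first_error


def failing():
    raise RuntimeError("boom")


log = []
hooks = HookMap(failsafe=["on_end_resource"])
hooks.attach("before_finalize", lambda: log.append("finalize"))
hooks.attach("on_end_resource", failing)
hooks.attach("on_end_resource", lambda tag: log.append(tag), conf={"tag": "cleanup"})

hooks.run("before_finalize")
try:
    hooks.run("on_end_resource")
except RuntimeError:
    log.append("error reported")
print(log)
```

Because the map lives on each request, attaching a callback halfway through one request has no effect on any other request.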

Tools

CherryPy has always included a number of extensions and libraries which help you design web applications more quickly. In addition, many people have designed their own extensions to CP, some as custom filters, some as decorators, some as base classes to be subclassed, some as WSGI middleware, custom Request objects, on*Start methods, etc., etc., etc.

I'd like to call all of these extensions "features" for the rest of this post. A "feature" in this sense is any function(s) or module which could be implemented in a variety of ways. If the feature should apply site-wide, you probably want to run it like a CP2-style filter, and perhaps declare its scope in the config dict/file. But if it only applies to a page handler or two, you might think a decorator would be more attractive syntax. Sometimes, you want to invoke the feature from inside the page handler, after you've inspected a certain header, or after a lookup has failed.

However, it often happens that implementing your feature to be used in one of these ways harms its use in another: if you make a lovely decorator out of your feature, chances are that you cannot just "plug it in" as a before_main handler and expect it to work. This was a big problem for CP-2; a lot of logic could be useful elsewhere, but wasn't available because it was "locked away" inside a filter or some other construct.

A Tool is my new term for "feature adapter". If you can write your feature as a normal Python function, with normal Python arguments instead of config.get calls, chances are it can be wrapped in a Tool in a single line of code:

cherrypy.tools.cool_stuff = cherrypy.tools.Tool('before_finalize', cool.stuff)

What does that line buy you?

  • Your function is registered in the CherryPy tool registry, so:
  • Your function can be called from the tools namespace: tools.cool_stuff(*args, **kwargs).
  • Your function can be used as a decorator via @tools.cool_stuff.wrap(*args, **kwargs). Any arguments passed to wrap() get passed to your function whenever it is called.
  • Your function can be used as a hook and managed in config. Remember the populate_from_config method (above)? It scans through the current config, finds any items that start with "tools.", and checks to see that "tools.cool_stuff.on" is True. If it is, it takes all other "tools.cool_stuff.*" config entries and passes them as named arguments to your cool_stuff function, at the hook point you requested.
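
To show the three uses side by side, here is a sketch of a feature adapter. It is not the Tool class in tools.py: the `setup` signature, the plain dict used for hooks, and running the feature before the page handler in the decorator case are all assumptions of the sketch:

```python
import functools


class Tool:
    """Sketch of a "feature adapter", not CherryPy's actual Tool class."""

    def __init__(self, point, func):
        self.point = point
        self.func = func

    def __call__(self, *args, **kwargs):
        # Direct use: tools.cool_stuff(*args, **kwargs)
        return self.func(*args, **kwargs)

    def wrap(self, *args, **kwargs):
        # Decorator use: run the feature, then the page handler.
        def decorator(handler):
            @functools.wraps(handler)
            def wrapper(*hargs, **hkwargs):
                self.func(*args, **kwargs)
                return handler(*hargs, **hkwargs)
            return wrapper
        return decorator

    def setup(self, hooks, config, name):
        # Hook use: collect "tools.<name>.*" entries from a flat config dict.
        prefix = "tools.%s." % name
        conf = {k[len(prefix):]: v for k, v in config.items()
                if k.startswith(prefix)}
        if conf.pop("on", False):
            hooks.setdefault(self.point, []).append(
                functools.partial(self.func, **conf))


seen = []


def cool_stuff(level=1):
    seen.append(level)


cool = Tool("before_finalize", cool_stuff)
cool(level=2)                              # called directly


@cool.wrap(level=3)                        # used as a decorator
def page():
    return "body"


result = page()

hooks = {}                                 # {hookpoint: [callback, ...]}
config = {"tools.cool_stuff.on": True, "tools.cool_stuff.level": 4}
cool.setup(hooks, config, "cool_stuff")    # managed in config
for callback in hooks["before_finalize"]:
    callback()
print(seen, result)
```

The point is that `cool_stuff` itself stays a normal function with normal arguments; only the adapter knows about decorators, hooks, and config.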

That is the "simple case", and there is sufficient room for very complex additions to that (grep for the setup method). If your feature needs to replace the page handler, for example (as caching, static, and xmlrpc do), there's a tools.MainTool class; when used as a decorator or a hook, it automatically skips the page handler for you if your function returns True (meaning "I've handled this request, thanks").

I plan to explore other Tool improvements in the near future:

  • Argument inspection is high on the list, so that decorators, etc get the same argspec as your original function. You might also be able to import tools and let your IDE auto-complete your config entries, which in my mind would cut down on reaching for manuals quite a bit. It would have to be optional, because IIRC Jython doesn't have an "inspect" module.
  • Other wrappers on the Tool class for...what? Base classes? WSGI middleware? custom Request objects? on*Start/Stop methods?
  • Look harder at the flags request.processRequestBody and request.execute_main. They're ugly. Devious thought: replace request.processRequestBody and request.main with default hooks.
  • "Tools" may not be the best name.
  • Other hook points are possible. Investigate using hooks in a more generic fashion.
  • Using a tool as a decorator effectively means that it is not overridable in config. This "feature lock" is something I've wanted for quite a while, but there may need to be some means of allowing config to override such features or their arguments. For example, a developer may want to insist that a "staticfilter" be in place, but not particularly care about the OS path to its resources.

There are other issues that need to be addressed in CherryPy 3, of course (separating the CP server and the HTTP server springs instantly to mind). But these changes should give us a good basis for consolidation of a lot of code, and the freedom to use all our beautiful library logic in whatever way is most appropriate to each application and installation. I look forward to all your ideas and improvements.

04/05/06

11:03:42 pm, by fumanchu
Categories: Dejavu

Single Table Inheritance certainly *sounds* evil

Jeff Shell (who needs to turn on comments) wrote:

...By my understanding of Single Table Inheritance, the flight leg, ground leg, and scheduled meeting legs data would all be mashed up in one table. If I were designing tables in an RDBMS, I would never design that way – and I’m no genius Relational designer.

I guess, after thinking about it, I would write the three legs as separate tables/objects and write one legs action to combine them (in Rails)?

That's how I'd do it in Dejavu (three different tables). But Dejavu allows you to recall objects from those three tables either individually or collectively, without having to write your own "combine action". If you recall the subclass, you get just that subclass. If you recall the superclass, you get objects from all subclasses together in the same result set (you also get objects from the superclass, although quite often it's abstract and there aren't any).
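
The recall behaviour can be illustrated with a toy in-memory store. This is not Dejavu's API; `Store`, `save`, `recall`, and the leg classes are hypothetical, with one list per class standing in for one table per class:

```python
class Store:
    """Hypothetical in-memory store with one "table" per class."""

    def __init__(self):
        self.tables = {}                 # {class: [objects]}

    def save(self, obj):
        self.tables.setdefault(type(obj), []).append(obj)

    def recall(self, cls):
        # Gather rows from the table for cls and from every subclass table.
        found = []
        for table_cls, rows in self.tables.items():
            if issubclass(table_cls, cls):
                found.extend(rows)
        return found


class Leg:
    pass


class FlightLeg(Leg):
    pass


class GroundLeg(Leg):
    pass


class MeetingLeg(Leg):
    pass


store = Store()
flight, ground, meeting = FlightLeg(), GroundLeg(), MeetingLeg()
for leg in (flight, ground, meeting):
    store.save(leg)

only_flights = store.recall(FlightLeg)   # just that subclass
all_legs = store.recall(Leg)             # every subclass together
print(len(only_flights), len(all_legs))
```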

I've never worked with Rails' inheritance, but I have been horrified to see mashup tables in plenty of databases. You know the ones: three columns common to all records, 28 columns that only apply to 50% of the rows, and 34 additional columns that only apply to the other 50%. Pick larger column-counts and smaller percentages if you're into mental masochism. ;)

04/03/06

03:56:49 pm, by fumanchu
Categories: Python

How to use "require group" with Trac and SSPI

You may have run into this problem if you've ever tried to use "Require group" and "SSPIOmitDomain" simultaneously with mod_auth_sspi. I encountered it while trying to set up a new Trac site, to which I only wanted to allow access by staff members.

The problem seems to be that, if you set "SSPIOmitDomain On", then no SSPI call is made to check credentials. This could be because the domain is "omitted" before the authentication is done (I cared on Friday night, but I don't now). Regardless, the group requirement seems to fall through the sspi handler, at which point Apache complains that no group file could be found.

Anyway, rather than patch mod_auth_sspi, there's an easy workaround: use a PythonFixupHandler to strip the domain, instead of using SSPIOmitDomain.

from mod_python import apache

def strip_domain(req):
    # Turn "DOMAIN\user" into "user"; names without a domain are untouched.
    if req.user:
        if "\\" in req.user:
            req.user = req.user.split("\\", 1)[1]
    return apache.OK
    ##else:
    ##    return apache.DECLINED

def lcase_user(req):
    # Lowercase the user name (used here instead of SSPIUsernameCase).
    if req.user:
        req.user = req.user.lower()
    return apache.OK
    ##else:
    ##    return apache.DECLINED

You should be able to use the same trick with "SSPIOfferBasic On". I threw in a lowercase function because I've also had problems with SSPIUsernameCase causing Apache to not start.

You can declare the fixup handler like so (Trac example):

<Location /incidents>
   SetHandler mod_python
   PythonHandler trac.web.modpython_frontend
   PythonOption TracUriRoot "/incidents"
   PythonOption TracEnv "C:/projects/incidents/trac"

   #NT Domain auth config
   AuthType SSPI
   AuthName "Amor Ministries"

   SSPIAuth On
   SSPIAuthoritative On
   SSPIDomain HQAMOR
   PythonFixupHandler fixupsspi::strip_domain fixupsspi::lcase_user

   Require group "HQAMOR\Amor Staff"
</Location>

Hope that helps someone. At the least, it should save you the trouble of porting mod_auth_sspi to a PythonAuthenHandler, as I tried to do first. Since there's limited access to the Apache connection objects in mod_python (e.g., you can't define connection cleanup functions), that's a whole 'nother barrel of fun...

03/19/06

11:27:57 pm, by fumanchu
Categories: Python, CherryPy

Python webapps no longer deadorex

Are we live, or are we deadorex?

I spent a few hours of my weekend working on getting a Read-Eval-Print Loop (sometimes called an "interactive interpreter") in a web browser. It was surprisingly easy to do so using Python's builtin code module and CherryPy. You can get it here: http://projects.amor.org/misc/wiki/HTTPREPL If anyone wants to contribute adapters for other web frameworks, I'd be happy to include them.
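
The core of the idea fits in a few lines with the code module. This sketch leaves out the HTTP layer entirely and is not the HTTPREPL source; `CapturingREPL` is a made-up name:

```python
import code
import contextlib
import io


class CapturingREPL:
    """Feed lines to an interactive console and capture what it prints."""

    def __init__(self):
        self.console = code.InteractiveConsole(locals={})

    def push(self, line):
        """Run one line; return (output text, whether more input is needed)."""
        out = io.StringIO()
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(out):
            more = self.console.push(line)
        return out.getvalue(), more


repl = CapturingREPL()
repl.push("x = 1 + 1")
printed, _ = repl.push("print(x)")          # output of an explicit print
echoed, _ = repl.push("x * 10")             # bare expressions are echoed
_, needs_more = repl.push("def f():")       # an unfinished block wants more
```

A web version would call something like `push` once per submitted line and send the captured text back as the response body, keeping one console per session.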

Anyway, now that you can build your application completely on the fly, we're one step closer to Smalltalk-style web nirvana. Maybe I should include a textarea option for larger chunks of code? Maybe an option to save the command history with the prompts stripped out? Hm...

Example HTTPREPL session