Category: CherryPy


01:06:38 am, by fumanchu, 676 words, English (US)
Categories: CherryPy

CherryPy now handles partial GETs

Partial GET requests are a handy way for a client to request a portion of a resource, rather than the entire resource. HTTP clients send a Range: bytes=start-stop request header, where start and stop are non-negative integers. The HTTP server can then send only those bytes (inclusive) in the response. Multiple byte ranges are also possible. CherryPy has had support for this since, well, earlier this morning (changeset 549, in the current svn trunk).

As Jon Udell noted a while back, Adobe Reader uses Range headers to fetch only the parts of a PDF it needs, if the server supports it. Here's an example .pdf request, and the server's response (many headers omitted for clarity):

GET /mail.pdf HTTP/1.1

200 OK
Accept-Ranges 'bytes'
Content-Length '6786140'
Content-Type 'application/pdf'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

On the first request, the server returns a normal 200 response, and begins outputting the file. However, it also outputs the "Accept-Ranges" response header. This tells the client that partial GET requests (using the Range header) will be honored. Therefore, the client tries it, jumping to the PDF's content catalog at the end of the file:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6633280-6634323,6633278-6633279,6634324-6636107,6669998-6672067,5710727-5712197,

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '158347'
Content-Type 'multipart/byteranges; boundary='

Now our server has returned a different status-code, "206 Partial Content". Since the client requested multiple byteranges, the response body is a multipart/byteranges entity. Each part inside that multipart body has its own Content-Type and Content-Range headers.

Since that worked, Adobe Reader proceeds to read more ranges:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6626812-6627960,6785258-6786139'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '2305'
Content-Type 'multipart/byteranges; boundary='

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6636108-6636167,194633-198372,198373-202575'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '8392'
Content-Type 'multipart/byteranges; boundary='
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

...and then appears to have finished, after taking some time to process those responses. However, when I scrolled to the last page in the PDF document, Adobe Reader made an additional request:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6668978-6669997,6672068-6672327,5698663-5699716,5723626-5724706,

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '36370'
Content-Type 'multipart/byteranges; boundary='
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

So it seems there's quite a bit of partial-retrieval going on, ultimately making the client more responsive from the user's point-of-view.
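
To picture what one of those multipart/byteranges bodies looks like on the wire, here is a minimal sketch. It is not CherryPy's implementation: the function name build_byteranges_body and the fixed boundary string are assumptions made for illustration, and the ranges use exclusive stops, like Python slices.

```python
def build_byteranges_body(data, ranges, content_type, boundary):
    """Assemble a multipart/byteranges body (illustrative sketch only).

    data: the full resource as bytes
    ranges: list of (start, stop) tuples, stop exclusive (slice-style)
    """
    total = len(data)
    parts = []
    for start, stop in ranges:
        # Each part carries its own Content-Type and Content-Range headers.
        # Content-Range uses inclusive byte positions, hence stop - 1.
        header = (
            "--%s\r\n"
            "Content-Type: %s\r\n"
            "Content-Range: bytes %d-%d/%d\r\n"
            "\r\n" % (boundary, content_type, start, stop - 1, total)
        )
        parts.append(header.encode("ascii") + data[start:stop] + b"\r\n")
    # A closing delimiter ends the multipart entity.
    parts.append(("--%s--\r\n" % boundary).encode("ascii"))
    return b"".join(parts)


body = build_byteranges_body(b"0123456789", [(0, 3), (7, 10)],
                             "text/plain", "EXAMPLE")
```

A real server would pick a boundary that cannot occur in the data and would stream the parts rather than join them in memory.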

How to take advantage of partial GET support in CherryPy

If you want to serve static files so that clients can request portions of them (including resumable downloads!), you only need to use the StaticFilter, which handles Range requests transparently. Here's the script I used to serve mail.pdf:

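import cherrypy

class Root: pass
cherrypy.root = Root()

# Assumption: the config.update()/server.start() wrapper around these
# settings follows the usual CherryPy 2.1 pattern.
cherrypy.config.update({
    'global': {
        'server.environment': 'production',
        'staticFilter.on': True,
        'staticFilter.dir': 'static',
    },
})
cherrypy.server.start()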

Yes, that's really all you need to make an HTTP static file server! The staticFilter.dir (where I saved mail.pdf) is relative to wherever you save the above script. If you'd like it relative to some other absolute path, set that in staticFilter.root.

If you'd like to respond to Range request headers, but you're not serving static files, you can still benefit from CherryPy's core. In cherrypy/_cphttptools, there is a get_ranges(content_length) function which you can use; it examines the current request's Range header, and returns a list of (start, stop) tuples (or just returns None if there's no header). For example, given the Range header:

Range: bytes=30000-40000

The call get_ranges(50000) will return [(30000, 40001)]. Note that we've incremented stop by 1, so that you can use it in a string-slicing operation (byte-ranges are inclusive, but Python's slices have exclusive upper-bounds).

Note also that you need to supply a content-length to get_ranges. It's perfectly valid for a client to request Range: bytes=-500, and expect to receive the last 500 bytes of the resource. So you need to specify the total length in order to do the subtraction.
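
If you're curious what that parsing involves, here's a rough sketch. It is not the code in CherryPy's core: parse_byte_ranges is a hypothetical stand-in that takes the header value as an argument instead of reading the current request, and it makes no attempt to handle malformed range specs.

```python
def parse_byte_ranges(header_value, content_length):
    """Parse a Range header value into slice-style (start, stop) tuples.

    Illustrative sketch only; returns None when there is no usable header.
    """
    if not header_value or not header_value.startswith("bytes="):
        return None
    result = []
    for spec in header_value[len("bytes="):].split(","):
        first, _, last = spec.strip().partition("-")
        if first:
            start = int(first)
            # "start-" means "from start to the end of the resource".
            stop = int(last) + 1 if last else content_length
        else:
            # "-N" means "the last N bytes", which needs the total length.
            start = max(content_length - int(last), 0)
            stop = content_length
        # Clamp to the resource and skip ranges that start past its end.
        stop = min(stop, content_length)
        if start < stop:
            result.append((start, stop))
    return result


data = b"x" * 50000                      # a stand-in 50000-byte resource
ranges = parse_byte_ranges("bytes=30000-40000", len(data))
start, stop = ranges[0]
chunk = data[start:stop]                 # stop is exclusive, so this slices cleanly
```

Because stop is already incremented, the tuple drops straight into a slice, and the length of the slice is stop - start.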

CherryPy doesn't yet handle the If-Range request header, so feel free to write that and contribute it. ;) ETag support would be nice, too.


11:08:42 am, by fumanchu, 120 words, English (US)
Categories: CherryPy

Is your code a novel?

Remco, a long-time friend of and contributor to CherryPy, started porting his first app to CP 2.1 today, and had this to say:

<remco>    btw, the code has been cleaned beautifully!!!
[fumanchu] well, thanks
<remco>    respect to all of you who contributed to it
[fumanchu] I tried to make the core process easy to read
<remco>    well, it's still a webserver core, so one has to keep focussed,
<remco>    but compared to 2.0 or prior to that : it reads like a novel! :D
[fumanchu] heh
[fumanchu] 2.0 was a collection of short stories
<remco>    and you can jot that on ur resume

Thanks, I just might do that. :)


12:05:47 pm, by fumanchu, 785 words, English (US)
Categories: Python, CherryPy

Code Coverage with CherryPy 2.1

CherryPy [1] helps with both the collection and the analysis of coverage data. Now, I'm a visual learner, so I'm going to skip right to the screenshot and explain it in detail afterward. This is a browser session with two frames: a menu frame on the left and a file frame on the right. Clicking on one of the filenames in the menu will show you that file, annotated with coverage data, in the right-hand frame. This stats-browser is included with CherryPy, and can be used for any application, not just CherryPy or CP apps.

[1] All of this is present in CherryPy 2.1 beta, revision 543. Get it via SVN.

[Screenshot: coverage stats browser session]

Collection of coverage statistics

You need to start by obtaining the coverage module, either the original from Gareth Rees, or Ned Batchelder's updated version. Drop it in site-packages.

Covering CherryPy

If you're collecting coverage statistics for CherryPy itself, just run the test suite with the --cover option. Coverage data will be collected in cherrypy/lib/coverage.cache. Example:

mp5:/usr/lib/python2.3/site-packages# python cherrypy/test/ --cover

Covering CherryPy applications

If you write a test suite for your own applications, build it on top of the tools present in cherrypy/test. Here's a minimal example:

import os, sys
localDir = os.path.dirname(__file__)
dbpath = os.path.join(localDir, "db")

from cherrypy.test import test

if __name__ == '__main__':
    # Place our current directory's parent (myapp/) at the beginning
    # of sys.path, so that all imports are from our current directory.
    curpath = os.path.normpath(os.path.join(os.getcwd(), localDir))
    sys.path.insert(0, os.path.normpath(os.path.join(curpath, '../../')))

    testList = ["test_directory"]  # add your other test modules here
    testConf = os.path.join(localDir, "test.conf")

By using the TestHarness from CherryPy's test suite, you automatically get access to the --cover command-line arg (and --profile and all the others, too, but that's for another day). Again, coverage data will be collected in cherrypy/lib/coverage.cache by default.

Covering Other Applications

You can use the stats-browser even if you don't use the CherryPy framework to develop your applications. Just use the coverage module as it was originally intended, with its -x option.

The coverage data, in this case, will be collected by default into a .coverage file. You need to tell the stats-server where this file is (see below). Note that successive manual calls to the coverage module will accumulate stats; the CherryPy test suite, in contrast, erases the data on each run.

Analysis of coverage statistics

Once you've got coverage data sitting around in a file somewhere, it's a snap to have CherryPy serve it in your browser. If you're covering the CherryPy test suite, or your own CP app using CP's TestHarness (see above), just execute:

mp5:/usr/lib/python2.3/site-packages# python cherrypy/lib/

Then, point your browser to http://localhost:8080, and you should see an image similar to the above.

By default, the server reads coverage data from cherrypy/lib/coverage.cache, the same file our collector wrote to by default. If you covered your own application and collected the data in another file, you can supply that path as a command-line arg:

# python cherrypy/lib/ /path/to/.coverage 8088

If you supply a second arg, as in this example, it will change the port for you (from the default of 8080).

You need to stop (Ctrl-C) and restart the server if you recollect coverage data.

The interface

Each file in the menu has coverage stats, and is a hyperlink; click on one, and the file frame will show you the file contents, annotated with coverage data. Lines that start with ">" were touched, and those that start with "!" were not.

Click the "Show %" button to show a "percent covered" figure for each file. This can take a long time if you have lots of files, so it's best to first restrict your view using the directory links. Each directory is a hyperlink; click on one to restrict the menu to that folder only. Percentages below the "threshold" value will be shown in red. The "Show %" feature isn't "sticky", by the way; that is, if you click on a different directory link, or refresh the page, the figures will disappear. That's a necessary evil due to the slowness of generating percentages for many files. Just hit the "Show %" button again as needed.
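
For the curious, the annotation and the "percent covered" figure boil down to something like the sketch below. This is not the stats-browser's actual code: annotate and percent_covered are hypothetical names, the 80 percent threshold is an arbitrary assumption, and for simplicity every line is marked, whereas a real coverage tool only considers executable statements.

```python
def annotate(source_lines, executed):
    """Prefix each line with '>' if its 1-based number is in executed, else '!'."""
    return ["%s %s" % (">" if number in executed else "!", text)
            for number, text in enumerate(source_lines, 1)]


def percent_covered(statements, executed):
    """Return the percentage of statement line numbers that were executed."""
    statements = set(statements)
    if not statements:
        return 100.0
    hit = len(statements & set(executed))
    return 100.0 * hit / len(statements)


source = ["def f(x):", "    if x:", "        return 1", "    return 0"]
executed = {1, 2, 4}               # pretend line 3 was never reached
annotated = annotate(source, executed)
percent = percent_covered({1, 2, 3, 4}, executed)
below_threshold = percent < 80.0   # such a figure would be shown in red
```

Computing the percentage means analyzing every statement in every file, which is why the browser only does it on demand.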

As you can see from the screenshot, I've got some more tests to write! Hope you find this tool as useful as I do. :)


05:17:39 pm, by fumanchu, 662 words, English (US)
Categories: CherryPy, WSGI

Funny how people only goggle over the baby

Simon Willison recently wrote a description of Django's request-handling mechanism. Here's a quick comparison with CherryPy:

When Django receives a request, the first thing it does is create an HttpRequest object (or subclass there-of) to represent that request.

CherryPy has a Request object, as well. However, it's purely an internal object; it doesn't get passed around to application code. One of the design points of CherryPy is that it allows you to write (at least a majority of) your code "like any other app"; this means that input arrives as "simple data" via function parameters, and you use the "return" statement to output data, not custom HTTP-framework objects. Point in favor of CP, IMO.

Once the object has been created, Django performs URL resolution. This is a process by which the URL specified in the request is used to select a view function to handle the creation of a response. A trivial Django application is simply one or more view functions and a configuration file that maps those functions to URLs.

Like almost every other web framework. ;) The only difference from CherryPy is that CP specifies the mapping in code, not config files. Another point to CP.

Having resolved the URL to a view, the view function is called with the request object as the first argument. Other keyword arguments may be passed as well depending on the URL configuration; see the documentation for details.

See above; CherryPy is flatter, and tends to pass data, not internal objects.

The view function is where the bulk of the work happens: it is here that database queries are made, templates loaded, HTML is generated and an HttpResponse object encapsulating the result is created. The view function returns this object, which is then passed back to the environment-specific code (mod_python or WSGI) which passes it back to the browser as an HTTP response.

Again, CherryPy is flatter, expecting you to return data, not objects. You can return a string, an iterable of strings, a file, or None, or yield any of those. Point.
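
To show what "return data, not objects" asks of a framework, here is a small sketch of how those return types could be folded into one list of chunks. It is only an illustration under my own assumptions, not CherryPy's code; body_chunks is a hypothetical helper, and it only deals with text, not bytes.

```python
import io


def body_chunks(result):
    """Normalize a handler's return value into a list of string chunks."""
    if result is None:
        return []
    if isinstance(result, str):
        return [result]
    if hasattr(result, "read"):
        # File-like object: read its whole content as one chunk.
        return [result.read()]
    # Any other iterable (list, tuple, generator): flatten each item in turn,
    # since a generator may itself yield strings, files, or None.
    chunks = []
    for item in result:
        chunks.extend(body_chunks(item))
    return chunks


def page():
    # A handler written "like any other app": it just yields data.
    yield "<h1>Hello</h1>"
    yield io.StringIO("<p>from a file-like object</p>")
    yield None


html = "".join(body_chunks(page()))
```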

This is all pretty straightforward stuff - but I skipped a couple of important details: exceptions and middleware. The view function doesn't have to return an HttpResponse; it can raise an exception instead, the most common varieties being Http404 (for file-not-found) or Http500 (for server error). In development servers these exceptions will be formatted and sent back to the browser, while in production mode they will be silently logged and a "friendly" error message displayed.

CherryPy also has user-raisable exceptions; however, they're not so low-level. Instead of Http404, you raise cherrypy.NotFound. Instead of Http3xx, you raise cherrypy.HTTPRedirect. I prefer CP's style, of course, but I don't think it's a clear "winner" over Httpxxx exceptions.

Middleware is even more interesting. Django provides three hooks in the above sequence where middleware classes can intervene, with the middleware classes to be used defined in the site's configuration file. This results in three types of middleware: request, view and response (although one middleware class can apply for more than one hook).

CherryPy has 7 such hooks; two are for errors, so let's call it 5 for a more-reasonable comparison. But see my previous post on why static hook points may not be the best approach. Still, 5 is better than 3 :). Point.

The bulk of the above code can be found in the call method of the ModPythonHandler class and the get_response method of the BaseHandler class.

That sounds like an unfortunate violation of the DRY principle. CherryPy isolates all of that nicely via the server.request() function. Are we keeping score yet?

As Django is not yet at a 1.0 release, the above is all subject to potential refactoring future change.

I can't wait to see Django 1.0! Until then, I'm going to take our adolescent web framework and go sulk in my room. ;)

11:01:16 am, by fumanchu, 239 words, English (US)
Categories: Python, Dejavu, CherryPy

It doesn't take much of a Python to swallow my brain

Lines of code in the four systems I hack on most often (and of which I have a more-or-less complete grasp):

>>> import LOC
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\cherrypy")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\dejavu")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\endue")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\mcontrol")

Something about my brain must naturally fit 7500-to-10000-line chunks of Python code. I certainly experience a strong drive to keep these systems from becoming more complicated, which I usually express via aggressive refactoring.

Some other packages (which I don't hack on) for comparison:

>>> LOC.LOC(r"C:\Python23\Lib\site-packages\colorstudy\SQLObject")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\paste")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\PIL")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\twisted")
>>> LOC.LOC(r"D:\download\Zope-2.8.1-final\Zope-2.8.1-final")

I think the sheer size of paste, twisted, and zope has actively kept me from wanting to dig into them further (but it's certainly not the only factor). Irrational, perhaps, but a natural human response to information overload.

Here's the LOC script if anyone wants to compare packages:

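import os, codecs, re

def LOC(root, pattern=r'^.*\.py$'):
    """Return the total line count of files under root whose names match pattern."""
    LOCs = []
    pattern = re.compile(pattern)
    for path, dirs, files in os.walk(root):
        for f in files:
            if pattern.match(f):
                # os.walk already yields paths that include root.
                mod = os.path.join(path, f)
                # codecs.open is assumed here, since codecs is imported above.
                lines = len(codecs.open(mod, "rb").readlines())
                LOCs.append(lines)
    return sum(LOCs)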


01:12:42 am, by fumanchu, 466 words, English (US)
Categories: Python, CherryPy

unittest's bad rap

Phillip J. Eby recently said:

unittest has gotten something of a bad rap, I think. Regardless of whether you like its basic testing facilities or not, it is an extremely good framework. In fact, I think it's one of the most beautiful frameworks in the Python standard library. Its functionality is cleanly separated into four roles, each of which can be filled by any object implementing the right interface: runners, loaders, cases, and results. Because of this exceptionally clean factoring, the basic framework is amazingly extensible.

I couldn't agree more, which is why I recently converted CherryPy's ad hoc test suite to one based on unittest. Although unittest doesn't fit every program out-of-the-box, its components are a breeze to subclass in order to make it fit your problem domain. CherryPy 2.1 now has a nice webtest module (which will only get better as it is extended), specifically designed for simultaneously testing both the client and server sides of an HTTP request. The webtest module:

  • Understands that full web-application test suites can have a lot of components and requests, and tries to keep its "no failures" output to a minimum.
  • Provides simplified page-request functions--all HTTP methods are available (like POST and PUT), and required request headers are set automatically if not provided manually.
  • Runs the page requests in the same process as the web-application server thread(s). This allows errors in the server to be trapped and then reported in the unittest thread (see the server_error function).
  • Automatically reloads all modules that have been imported by each test module.
  • Allows the person running the tests to see the response status, headers, and body, and also the requested URL, whenever an assertion fails. This helps test-first design tremendously.
  • Allows failed assertions to be ignored, so that the current test method may proceed with the remainder of the tests. Since many page requests are idempotent GET's, this can help debugging by collecting more failure information at once.
  • Provides easy regular-expression matching against the response body.
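
As a taste of how little it takes to swap out one of unittest's roles, here is a minimal sketch of a custom "results" object that stays silent when nothing fails. It is not taken from webtest; QuietResult and ExampleCase are hypothetical names, and the suite is run directly rather than through a runner class.

```python
import unittest


class QuietResult(unittest.TestResult):
    """A 'results' role that records outcomes but prints nothing on success."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.passed = 0

    def addSuccess(self, test):
        super().addSuccess(test)
        self.passed += 1


class ExampleCase(unittest.TestCase):            # the "cases" role
    def test_status_code(self):
        self.assertEqual("206 Partial Content"[:3], "206")

    def test_slice_is_exclusive(self):
        self.assertEqual(len(b"0123456789"[3:7]), 4)


loader = unittest.TestLoader()                    # the "loaders" role
suite = loader.loadTestsFromTestCase(ExampleCase)
result = QuietResult()
suite.run(result)                                 # stands in for a "runner"

# Keep "no failures" output to a single line.
summary = "%d tests, %d passed" % (result.testsRun, result.passed)
```

The same approach extends to the other roles: a custom loader can pick tests by module name, and a custom runner can decide when and how to print the summary.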

The webtest module is available, by the way, to be used in other frameworks or applications. There's nothing CherryPy-specific in that module; all of that is found in test\, which wraps webtest to fit the CP test suite. Anyone using webtest for their own framework or app could learn a thing or two from the wrappers there.

It'd be nice if some of PJE's mini-tutorial found its way into the docs for unittest; I said it was "a breeze to subclass", but only by reading most of the unittest source. Oh, and thanks, Phillip, for disallowing anonymous comments on your blog—that made me write up a more extensive post here on my own blog. But could you give me a trackback URL at least? ;)


01:57:26 pm, by fumanchu, 45 words, English (US)
Categories: CherryPy

New CherryPy Planet


There's a new Planet in the OSS solar system, for posts related to CherryPy! CherryPy is "a pythonic, object-oriented web-development framework", which also happens to be fast, WSGI-ready, and easily extendable. Check out version 2.1, now in beta; you won't be disappointed!


01:58:46 pm, by fumanchu, 1199 words, English (US)
Categories: CherryPy, WSGI

Plugin madness

Glyph is talking about a complete refactoring of divmod, and mentions:

At every point in implementing this system we have known whether to fuse a component together because we'd built unnecessary additional complexity into previous systems, and where to use a plug-in architecture because we'd needed to inject ugly code into the middle of a monolithic routine.

As a result, where our architecture was heavily monolithic before, now it is almost entirely composed of plugins. It is so plugin-happy, in fact, that there is a database with Service plugins in it, which activate when the database is started from twistd; it contains its own configuration, including port-numbers, so nothing need live in a text configuration file.

Plugins are great because they facilitate customization: you can make small changes in system behavior with small changes in your code. An architecture that is "monolithic", to use Glyph's term, is one where small changes in system behavior require large changes in your code.

CherryPy 2.1 has a system of Filters (both built-in and user-provided), which act as plugins. As each HTTP request is processed, there are a few fixed points where the Request processor searches for registered Filter methods and gives up control to them. The Filter then must either return control to the Request processor, or raise a control-flow exception, like NotFound, RequestHandled, or HTTPRedirect. Here's the guts of the Request processor (the run method of the Request class) itself:

def run(self):
    """Process the Request."""


                if cherrypy.request.processRequestBody:

                if cherrypy.response.body is None:

            except cherrypy.RequestHandled:
            except cherrypy.HTTPRedirect, inst:
                # For an HTTPRedirect, we don't go through the regular
                # mechanism: we return the redirect immediately
    except cherrypy.NotFound:
        cherrypy.response.status = 404

It's decent; that is, it's fairly clean and understandable IMO. But it's quite limited in two very important ways:

  1. There are only 7 points at which customization is possible [beforeErrorResponse and afterErrorResponse are found inside handleError]. Any filters which require the core to release control at multiple points have to do some fancy dancing to coordinate state between those applyFilters calls. Any filters which require additional control points are out of luck—their only recourse is to handle the remainder of the process themselves (probably with generous cut-and-paste from the CP core) and raise RequestHandled.

  2. Certain core processes are locked in time. For example, processRequestHeaders gets the "path" from the first line of the HTTP request; however, the Filters themselves are supposed to be dependent upon the current path! Therefore, for example, onStartResource filters must always be global ("/").

  3. (Slightly unrelated: the builtin "request" filters get run before any user-defined filters, and vice-versa for "response" filters. This needs a fix).
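
To make the fixed-hook-point design (and its limits) concrete, here is a stripped-down sketch of the idea. It is not the CherryPy core: the Response class, apply_filters, and the two example filters are hypothetical, there is no error handling, and the hook names other than onStartResource are assumptions on my part.

```python
class RequestHandled(Exception):
    """Raised by a filter to say the response is already complete."""


class Response:
    def __init__(self):
        self.status = None
        self.body = None


def apply_filters(filters, hook_name, response):
    """Call hook_name on every filter that defines it, in registration order."""
    for f in filters:
        method = getattr(f, hook_name, None)
        if method is not None:
            method(response)


def run(filters, handler):
    """Process one request, giving up control only at fixed hook points."""
    response = Response()
    try:
        try:
            apply_filters(filters, "onStartResource", response)
            apply_filters(filters, "beforeMain", response)
            if response.body is None:
                # No filter supplied a body, so call the page handler.
                response.body = handler()
            apply_filters(filters, "beforeFinalize", response)
        except RequestHandled:
            pass
        if response.status is None:
            response.status = 200
    finally:
        apply_filters(filters, "onEndResource", response)
    return response


class CacheFilter:
    """Short-circuits the page handler from one of the fixed hook points."""
    def beforeMain(self, response):
        response.body = "cached page"
        raise RequestHandled()


class UpperFilter:
    def beforeFinalize(self, response):
        response.body = response.body.upper()


plain = run([UpperFilter()], lambda: "hello")
cached = run([CacheFilter(), UpperFilter()], lambda: "hello")
```

Note how CacheFilter can only step in at a hook the processor already offers, and how raising RequestHandled skips everything after it, including UpperFilter; a filter that needs control at some other moment has nowhere to plug in.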

I tried to ameliorate some of these issues in the short term a) by making a Request class (which 2.0 didn't have); that might become subclassable someday, and b) by keeping a lot of logic out of the Request class, placing it instead in module-level global functions (which people can then call as needed).

A couple of weeks ago, I joked that maybe main (which calls the user's page handler) and finalize (which cleans up the response status, headers, and body) should themselves become filters, and it wasn't entirely a joke. When I look at CherryPy as-is, I see not one, but three separate API's.

The first, simplest one is for application developers, and includes:

  1. The cherrypy.root hierarchy.
  2. Page handler methods, including the spec for index and default methods, and the "exposed" attribute.
  3. Passing CGI params (etc) to page handler methods as keyword args.
  4. Expecting response content to be passed back to the core via "return" or "yield" statements.
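
Here is a small sketch of what that first API feels like from the application side, with a toy dispatcher so it runs without CherryPy installed. The resolve and serve functions are hypothetical stand-ins for the core, not its real code, and passing leftover path segments to default as positional arguments is an assumption of this sketch.

```python
def resolve(root, path):
    """Map a URL path onto the object tree; return (handler, leftover segments)."""
    node = root
    segments = [s for s in path.split("/") if s]
    # Descend through attributes for as long as the path keeps matching.
    while segments and hasattr(node, segments[0]):
        node = getattr(node, segments.pop(0))
    if callable(node):
        candidate = node                            # the path named a method
    elif segments:
        candidate = getattr(node, "default", None)  # unmatched segments remain
    else:
        candidate = getattr(node, "index", None)    # the path named an object
    # Only handlers explicitly marked as exposed may be served.
    if candidate is not None and getattr(candidate, "exposed", False):
        return candidate, segments
    return None, segments


class Blog:
    def index(self):
        return "blog home"
    index.exposed = True

    def default(self, year, month):
        # Unmatched path segments arrive as positional arguments.
        yield "posts for %s/%s" % (year, month)
    default.exposed = True


class Root:
    blog = Blog()

    def hello(self, name="world"):
        # Request parameters arrive as plain keyword arguments.
        return "Hello, %s!" % name
    hello.exposed = True

    def secret(self):
        return "not reachable"      # never marked exposed


def serve(root, path, **params):
    handler, extra = resolve(root, path)
    if handler is None:
        return "404 Not Found"
    result = handler(*extra, **params)
    return result if isinstance(result, str) else "".join(result)


greeting = serve(Root(), "/hello", name="CherryPy")
archive = serve(Root(), "/blog/2005/09")
```

Nothing in Blog or Root mentions HTTP at all, which is the point of the "shallow" API.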

The second API is quite different. It assumes a much higher level of competency, and is both more powerful and more complicated. I see it as mostly useful for those writing frameworks or libraries on top of CherryPy, although many "normal" app developers will end up using some of this interface. It includes:

  1. Filters. Creating, organizing, maintaining.
  2. Changing status, headers, or body via cherrypy.response.
  3. For that matter, almost anything involving cherrypy.request or .response: cookies and sessions, path inspection, HTTP-method dispatching, etc.

Finally, there's a third "API", which CherryPy supports quite well, but in a different fashion. There are a number of people who will run into, say, the filter limitations I outlined above, and will customize their copy of CherryPy to do what they want. One of my design goals has always been to make this easier by making the core insanely simple. There has been a lot of work done to keep the various components isolated, preferring a data-driven approach, centered around the cherrypy.request and .response objects, mostly; that is, your filter or site-specific customization can do whatever it likes, as long as it sets valid response.status, .headers, and .body before returning control to the HTTP server.

I don't see anything fundamentally wrong with having "different API's". It would be nice for some items from the "middle layer" to become more easily-accessible to the "shallow layer"; a lot of that can be done with clever wrapper classes, customized for specific situations. But I think it's quite all right to have a separation between a clean, simple, limited interface and a more powerful, but more complicated, interface behind that, to be used when needed.

However, I'd like to see the "lowest layer" unstick itself a bit more yet. The run method, in particular, is far too frozen at the moment; I'd like to see it become the default processor, with an easy way to override some or all of it. I think that would free up CherryPy developers to better manage the current web-application space, which continues to change quickly as new ideas and technologies roll in. [Turning on my World Domination Mode for a moment, it might also allow a small, focused CherryPy core to become the backplane for several of the existing Python web frameworks, especially as more of them begin to support WSGI.]

Part of the reason there is a Filter specification at all is to shield CherryPy application deployers; they now have an interface for plugging in various components, that is simpler than, say, subclassing Request. For example, a deployer can decide to gzip their HTTP responses with a single line in their config file. In addition, the developer of that app need not be aware that this is being done. It's a "freebie" from his or her point of view.

But what if one could write one's own request processor in such a way that doing so became easier than using config files? If that were possible, almost all of the overhead of the Filter architecture could be removed completely. In addition, developers and deployers could share total control over the request process, rather than the limited, dare-I-say "clunky" process we have now with filters. I think that with CherryPy 2.1, we're halfway there, and it won't take much work to make it happen in an elegant and powerful fashion.


01:03:08 pm, by fumanchu, 91 words, English (US)
Categories: IT, Python, CherryPy


I've really been enjoying Ryan Tomayko's new site. I've been on the simplicity bandwagon for about a year now, which coincides nicely with my Python learning curve ;). Check out lesscode if you're tired of overengineering.

Oh, and check out CherryPy if you're tired of overengineered web frameworks. I've worked pretty hard to make the upcoming 2.1 release as simple as possible, but no simpler.


11:07:46 pm, by fumanchu, 127 words, English (US)
Categories: Python, CherryPy, WSGI

CherryPy WSGI is up and running

Update: 1) "lydon" is Oliver Graf. Thanks, Oliver! 2) the tests all pass now.

"lydon" contributed a recipe for using FastCGI with CherryPy's new WSGI interface. Thanks! (I notice he or she also used my brand new recipe for the Virtual Path Filter—nice to know when someone likes your little side projects. ;)

Peter Hunt has contributed a very nice WSGI server, as well. See the latest SVN trunk for the current version (here's a link to the Timeline.) There's still a bug in the WSGI server; the test suite isn't completing, because the server isn't shutting down when it should. I'm trying to track that down and fix it tonight.