
10/06/05

Permalink 05:52:07 pm, by fumanchu Email , 359 words   English (US)
Categories: CherryPy

Eat less, exercise more

Dave Warnock muses:

I am not about to argue that [TurboGears and Subway] should merge, instead I feel they can improve most by making sure that they each stay thin putting all the improvements they can back into the components eg CherryPy (which they both have been doing) and into the deployment elements (setuptools and paste).

That's a good point, and I think it's at the heart of what CherryPy is trying to be: non-fattening. So there's definitely going to be some pushback from CherryPy itself trying to "stay thin".

I often say that CherryPy is not a "web framework"; it is an "HTTP framework". That is, it doesn't try to provide tools for every facet of web development. Instead, it concentrates on wrapping HTTP up in a Pythonic way.

IMO, working to stay thin is an important factor in getting CherryPy "more exercise": it gets used in more meta/mega-frameworks like Subway and TurboGears precisely because it hasn't gobbled up every good idea, just because it's web-related, or even just because it's Python. For example, CherryPy 2.1 is deprecating the Aspect module that was in 2.0, because it isn't related to the HTTP-focus of CherryPy.

David goes on:

Another project that is the next level down from these frameworks but that is also moving fast is Quixote, I feel the differences between Quixote and CherryPy are also becoming smaller (shown by the recent blog posts on Python Web Controllers). Whether they could ever merge is a different matter. Probably not possible (or even desireable) for the moment.

I would have to agree. There's a decisive difference in architectural style between CherryPy and Quixote. That doesn't mean there aren't components that are common to each, and there are certainly some which are unique to each which deserve to be ported! If the Quixote coders are willing to give up all the method names starting with set_ and get_ we're ready to have a conversation about merging. ;)

10/04/05

Permalink 04:09:43 pm, by admin Email , 65 words   English (US)
Categories: CherryPy

The medusa called autoreload

This is what I spent my weekend working on (among other things). It's the "autoreload" functionality in CherryPy. It was so complicated that it took me 15 minutes to understand it again, anytime I got distracted; having the diagram makes it quicker, at least. They say the human brain can handle about 7 things simultaneously, and this snake-pit takes about 4 at a minimum:

09/19/05

Permalink 11:52:49 am, by fumanchu Email , 193 words   English (US)
Categories: CherryPy

CherryPy 2.1 RC1 is out

The official mailing-list announcement is here.

The big change from 2.1 beta is the session filter—it's been completely rewritten.

Minor updates/fixes:

  • Server-side image maps (ISMAP) now supported.
  • More documentation in the official CherryPy book.
  • Improved coverage tool output.
  • Support for partial GET requests.
  • New HTTPError(status) exception, plus pretty HTML pages for 4xx-5xx responses (which are customizable).
  • Separate access and error logs.
  • % HEX HEX decoding now works for URLs, not just params.
  • New cptools.serveFile function.
  • New config entries which allow you to limit the size of request headers and body (to avoid denial-of-service attacks).
  • Tracebacks can now be inserted into the CherryPy log via "server.logTracebacks" config entry (True by default).
  • New expose() function/decorator, which allows you to alias any page handler method.
  • HTTPRedirect can now be raised in _cpOnError or error filter methods.
  • Other minor bugs in the beta were fixed.

Way to go, team. I'm pleased as punch to be a part of this powerful, Pythonic product. :)

09/18/05

Permalink 10:04:45 pm, by fumanchu Email , 119 words   English (US)
Categories: CherryPy

mod_python wrapper for CherryPy

I wrote a WSGI wrapper for mod_python a while back, but nobody's gotten it to work yet with Apache2 on Linux (at least, nobody in the CherryPy community). The current theory is that it's due to the differences in MPMs between Windows (mpm_winnt, which is one process with multiple threads) and Linux (worker, threadpool).

If you're stuck wanting CherryPy + mod_python on Linux, have a look at Jamie Turner's new mpcp.py, which skips the WSGI layer and directly connects CherryPy to mod_python. Then let everyone on #cherrypy know how it went. ;)

08/24/05

Permalink 01:06:38 am, by fumanchu Email , 676 words   English (US)
Categories: CherryPy

CherryPy now handles partial GETs

Partial GET requests are a handy way for a client to request a portion of a resource, rather than the entire resource. HTTP clients send a Range: bytes=start-stop request header, where start and stop are non-negative integers. The HTTP server can then send only those bytes (inclusive) in the response. Multiple byte ranges are also possible. CherryPy has had support for this since, well, earlier this morning (changeset 549, in the current svn trunk).

As Jon Udell noted a while back, Adobe Reader uses Range headers to fetch just the parts of a PDF it needs, if the server supports it. Here's an example .pdf request, and the server's response (many headers omitted for clarity):

GET /mail.pdf HTTP/1.1

200 OK
Accept-Ranges 'bytes'
Content-Length '6786140'
Content-Type 'application/pdf'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

On the first request, the server returns a normal 200 response, and begins outputting the file. However, it also outputs the "Accept-Ranges" response header. This tells the client that partial GET requests (using the Range header) will be honored. Therefore, the client tries it, jumping to the PDF's content catalog at the end of the file:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6633280-6634323,6633278-6633279,6634324-6636107,6669998-6672067,5710727-5712197,
       6676118-6678187,6112458-6113719,184998-189293,6113720-6126880,189294-194632,60480-184997'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '158347'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866199.113.1'

Now our server has returned a different status-code, "206 Partial Content". Since the client requested multiple byteranges, the response body is a multipart/byteranges entity. Each part inside that multipart body has its own Content-Type and Content-Range headers.

Since that worked, Adobe Reader proceeds to read more ranges:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6626812-6627960,6785258-6786139'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '2305'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866199.653.2'

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6636108-6636167,194633-198372,198373-202575'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '8392'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866499.354.3'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

...and then appears to have finished, after taking some time to process those responses. However, when I scrolled to the last page in the PDF document, Adobe Reader made an additional request:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6668978-6669997,6672068-6672327,5698663-5699716,5723626-5724706,
       5699717-5702732,6112097-6112457,6165630-6166884,5702733-5704994,
       5713308-5723625,5704995-5705748,5724707-5724832,6166885-6175196,5705749-5710726'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '36370'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866600.580.4'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

So it seems there's quite a bit of partial-retrieval going on, ultimately making the client more responsive from the user's point-of-view.
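For reference, here's a rough sketch of how a multipart/byteranges body like the ones in those 206 responses is put together. This is not CherryPy's code; the function name, its parameters, and the boundary string are made up for illustration. The (start, stop) pairs use an exclusive stop, like a Python slice.

```python
def multipart_byteranges(body, ranges, content_type, boundary):
    """Build a multipart/byteranges payload (hypothetical helper).

    body is the complete resource as bytes; ranges is a list of
    (start, stop) pairs with an exclusive stop.
    """
    total = len(body)
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for start, stop in ranges:
        parts.append(delimiter + b"\r\n")
        # Each part carries its own Content-Type and Content-Range headers.
        parts.append(("Content-Type: %s\r\n" % content_type).encode("ascii"))
        # Content-Range is inclusive on the wire, hence stop - 1.
        parts.append(("Content-Range: bytes %d-%d/%d\r\n\r\n"
                      % (start, stop - 1, total)).encode("ascii"))
        parts.append(body[start:stop])
        parts.append(b"\r\n")
    # The closing delimiter has a trailing "--".
    parts.append(delimiter + b"--\r\n")
    return b"".join(parts)


payload = multipart_byteranges(b"0123456789", [(0, 3), (7, 10)],
                               "text/plain", "EXAMPLE_BOUNDARY")
print(payload.decode("ascii"))
```

The Content-Type of the response as a whole is then multipart/byteranges with the same boundary, and its Content-Length is the length of this payload, not of the original file.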

How to take advantage of partial GET support in CherryPy

If you want to serve static files so that clients can request portions of them (including resumable downloads!), you only need to use the StaticFilter, which handles Range requests transparently. Here's the script I used to serve mail.pdf:

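import cherrypy

# An empty root object is enough; the StaticFilter answers the requests.
class Root: pass
cherrypy.root = Root()

cherrypy.config.update({
        'server.environment': 'production',
        # Serve files from the 'static' directory, relative to this script.
        'staticFilter.on': True,
        'staticFilter.dir': 'static',
})
cherrypy.server.start()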

Yes, that's really all you need to make an HTTP static file server! The staticFilter.dir (where I saved mail.pdf) is relative to wherever you save the above script. If you'd like it relative to some other absolute path, set that in staticFilter.root.

If you'd like to respond to Range request headers, but you're not serving static files, you can still benefit from CherryPy's core. In cherrypy/_cphttptools, there is a get_ranges(content_length) function which you can use; it examines the current request's Range header, and returns a list of (start, stop) tuples (or just returns None if there's no header). For example, given the Range header:

Range: bytes=30000-40000

The call get_ranges(50000) will return [(30000, 40001)]. Note that we've incremented stop by 1, so that you can use it in a string-slicing operation (byte-ranges are inclusive, but Python's slices have exclusive upper-bounds).

Note also that you need to supply a content-length to get_ranges. It's perfectly valid for a client to request Range: bytes=-500, and expect to receive the last 500 bytes of the resource. So you need to specify the total length in order to do the subtraction.
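To make the arithmetic concrete, here's a minimal sketch of that kind of parsing. It is not the code in cherrypy/_cphttptools: the function name and signature are hypothetical, it takes the header value as an argument instead of reading the current request, and it simply skips ranges it can't satisfy.

```python
def parse_byte_ranges(header_value, content_length):
    """Turn a Range header value into (start, stop) slice bounds.

    Hypothetical stand-in for illustration. Returns None when there is
    no usable "bytes" header; stop is exclusive, ready for slicing.
    Malformed numbers raise ValueError.
    """
    if not header_value:
        return None
    unit, _, spec = header_value.partition("=")
    if unit.strip().lower() != "bytes":
        return None

    result = []
    for item in spec.split(","):
        first, _, last = item.strip().partition("-")
        if first:
            # "start-stop" or open-ended "start-"
            start = int(first)
            stop = int(last) + 1 if last else content_length
            stop = min(stop, content_length)
        elif last:
            # Suffix form "-N": the last N bytes of the resource.
            start = max(content_length - int(last), 0)
            stop = content_length
        else:
            continue
        if start < stop:
            result.append((start, stop))
    return result


print(parse_byte_ranges("bytes=30000-40000", 50000))  # [(30000, 40001)]
print(parse_byte_ranges("bytes=-500", 50000))         # [(49500, 50000)]
```

With the tuples in hand, serving a single range is just body[start:stop].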

CherryPy doesn't yet handle the If-Range request header, so feel free to write that and contribute it. ;) ETag support would be nice, too.

08/20/05

Permalink 11:08:42 am, by fumanchu Email , 120 words   English (US)
Categories: CherryPy

Is your code a novel?

Remco, a long-time friend of and contributor to CherryPy, started porting his first app to CP 2.1 today, and had this to say:

<remco>    btw, the code has been cleaned beautifully!!!
[fumanchu] well, thanks
<remco>    respect to all of you who contributed to it
[fumanchu] I tried to make the core process easy to read
<remco>    well, it's still a webserver core, so one has to keep focussed,
<remco>    but compared to 2.0 or prior to that : it reads like a novel! :D
[fumanchu] heh
[fumanchu] 2.0 was a collection of short stories
<remco>    and you can jot that on ur resume

Thanks, I just might do that. :)

08/19/05

Permalink 12:05:47 pm, by fumanchu Email , 785 words   English (US)
Categories: Python, CherryPy

Code Coverage with CherryPy 2.1

CherryPy[1] helps with both the collection and the analysis of coverage data (for a good introduction to code coverage, see bullseye.com). Now, I'm a visual learner, so I'm going to skip right to the screenshot and explain it in detail afterward. This is a browser session with two frames: a menu frame on the left and a file frame on the right. Clicking on one of the filenames in the menu will show you that file, annotated with coverage data, in the right-hand frame. This stats-browser is included with CherryPy, and can be used for any application, not just CherryPy or CP apps.

[1] All of this is present in CherryPy 2.1 beta, revision 543. Get it via SVN.

[Screenshot: coverage stats browser session]

Collection of coverage statistics

You need to start by obtaining the coverage.py module, either the original from Gareth Rees, or Ned Batchelder's updated version. Drop it in site-packages.

Covering CherryPy

If you're collecting coverage statistics for CherryPy itself, just run the test suite with the --cover option. Coverage data will be collected in cherrypy/lib/coverage.cache. Example:

mp5:/usr/lib/python2.3/site-packages# python cherrypy/test/test.py --cover

Covering CherryPy applications

If you write a test suite for your own applications, build it on top of the tools present in cherrypy/test. Here's a minimal example:

import os, sys
localDir = os.path.dirname(__file__)
dbpath = os.path.join(localDir, "db")

from cherrypy.test import test

if __name__ == '__main__':
    # Place our current directory's parent (myapp/) at the beginning
    # of sys.path, so that all imports are from our current directory.
    curpath = os.path.normpath(os.path.join(os.getcwd(), localDir))
    sys.path.insert(0, os.path.normpath(os.path.join(curpath, '../../')))

    testList = ["test_directory",
                "test_inventory",
                "test_invoice",
                ]
    testConf = os.path.join(localDir, "test.conf")
    test.TestHarness(testList).run(testConf)

By using the TestHarness from CherryPy's test suite, you automatically get access to the --cover command-line arg (and --profile and all the others, too, but that's for another day). Again, coverage data will be collected in cherrypy/lib/coverage.cache by default.

Covering Other Applications

You can use the stats-browser even if you don't use the CherryPy framework to develop your applications. Just use coverage.py as it was originally intended:

coverage.py -x yourapp.py

The coverage data, in this case, will be collected by default into a .coverage file. You need to tell the stats-server where this file is (see below). Note that successive manual calls to coverage.py will accumulate stats; the CherryPy test suite, in contrast, erases the data on each run.

Analysis of coverage statistics

Once you've got coverage data sitting around in a file somewhere, it's a snap to have CherryPy serve it in your browser. If you're covering the CherryPy test suite, or your own CP app using CP's TestHarness (see above), just execute:

mp5:/usr/lib/python2.3/site-packages# python cherrypy/lib/covercp.py

Then, point your browser to http://localhost:8080, and you should see an image similar to the above.

By default, the server reads coverage data from cherrypy/lib/coverage.cache, the same file our collector wrote to by default. If you covered your own application and collected the data in another file, you can supply that path as a command-line arg:

# python cherrypy/lib/covercp.py /path/to/.coverage 8088

If you supply a second arg, as in this example, it will change the port for you (from the default of 8080).

You need to stop (Ctrl-C) and restart the server if you recollect coverage data.

The interface

Each file in the menu has coverage stats, and is a hyperlink; click on one, and the file frame will show you the file contents, annotated with coverage data. Lines that start with ">" were touched, and those that start with "!" were not.

Click the "Show %" button to show a "percent covered" figure for each file. This can take a long time if you have lots of files, so it's best to first restrict your view using the directory links. Each directory is a hyperlink; click on one to restrict the menu to that folder only. Percentages below the "threshold" value will be shown in red. The "Show %" feature isn't "sticky", by the way; that is, if you click on a different directory link, or refresh the page, the figures will disappear. That's a necessary evil due to the slowness of generating percentages for many files. Just hit the "Show %" button again as needed.
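If you're curious what the markers and the percentage boil down to, here's a small sketch. It isn't the covercp.py source; the function names are hypothetical, and leaving non-statement lines (blanks, comments) unmarked is my assumption. It takes the set of executable line numbers and the set of lines that actually ran.

```python
def annotate(source_lines, statements, executed):
    """Prefix each line with '>' (touched) or '!' (not touched).

    statements and executed are sets of 1-based line numbers. Lines that
    are not executable statements get a blank marker (an assumption).
    """
    annotated = []
    for number, text in enumerate(source_lines, start=1):
        if number not in statements:
            marker = " "
        elif number in executed:
            marker = ">"
        else:
            marker = "!"
        annotated.append(marker + " " + text)
    return annotated


def percent_covered(statements, executed):
    """Percentage of executable lines that were run at least once."""
    if not statements:
        return 100.0
    hit = len(set(statements) & set(executed))
    return 100.0 * hit / len(statements)


source = ["def f(x):", "    if x:", "        return 1", "    return 0"]
statements = {1, 2, 3, 4}
executed = {1, 2, 4}          # f() was only ever called with a false value
print("\n".join(annotate(source, statements, executed)))
print("%.0f%%" % percent_covered(statements, executed))
```

Computing the percentage means analyzing every file in the listing, which gives a feel for why "Show %" gets slow on a big directory.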

As you can see from the screenshot, I've got some more tests to write! Hope you find this tool as useful as I do. :)

08/16/05

Permalink 05:17:39 pm, by fumanchu Email , 662 words   English (US)
Categories: CherryPy, WSGI

Funny how people only goggle over the baby

Simon Willison recently wrote a description of Django's request-handling mechanism. Here's a quick comparison with CherryPy:

When Django receives a request, the first thing it does is create an HttpRequest object (or subclass there-of) to represent that request.

CherryPy has a Request object, as well. However, it's purely an internal object; it doesn't get passed around to application code. One of the design points of CherryPy is that it allows you to write (at least a majority of) your code "like any other app"; this means that input arrives as "simple data" via function parameters, and you use the "return" statement to output data, not custom HTTP-framework objects. Point in favor of CP, IMO.

Once the object has been created, Django performs URL resolution. This is a process by which the URL specified in the request is used to select a view function to handle the creation of a response. A trivial Django application is simply one or more view functions and a configuration file that maps those functions to URLs.

Like almost every other web framework. ;) The only difference from CherryPy is that CP specifies the mapping in code, not config files. Another point to CP.

Having resolved the URL to a view, the view function is called with the request object as the first argument. Other keyword arguments may be passed as well depending on the URL configuration; see the documentation for details.

See above; CherryPy is flatter, and tends to pass data, not internal objects.

The view function is where the bulk of the work happens: it is here that database queries are made, templates loaded, HTML is generated and an HttpResponse object encapsulating the result is created. The view function returns this object, which is then passed back to the environment-specific code (mod_python or WSGI) which passes it back to the browser as an HTTP response.

Again, CherryPy is flatter, expecting you to return data, not objects. You can return a string, an iterable of strings, a file, or None, or yield any of those. Point.
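Here's roughly what that looks like in practice. The class and method names are made up, and the sketch deliberately doesn't import cherrypy, to show that the handlers are ordinary methods you can call directly; the only CherryPy-specific touch is the exposed attribute that CP 2.x uses to mark a method as publishable.

```python
class Root:
    def greet(self, name="world"):
        # Request parameters arrive as plain keyword arguments...
        return "Hello, %s!" % name      # ...and the body is just a string.
    greet.exposed = True

    def countdown(self, start="3"):
        # Parameters arrive as strings; a handler may also yield
        # its output piece by piece.
        for n in range(int(start), 0, -1):
            yield "%d\n" % n
    countdown.exposed = True


# No request or response objects needed to exercise the handlers.
root = Root()
print(root.greet(name="CherryPy"))
print("".join(root.countdown("3")))
```

Because nothing here depends on framework objects, the same methods can be unit-tested or reused outside of a web request.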

This is all pretty straightforward stuff - but I skipped a couple of important details: exceptions and middleware. The view function doesn't have to return an HttpResponse; it can raise an exception instead, the most common varieties being Http404 (for file-not-found) or Http500 (for server error). In development servers these exceptions will be formatted and sent back to the browser, while in production mode they will be silently logged and a "friendly" error message displayed.

CherryPy also has user-raisable exceptions; however, they're not so low-level. Instead of Http404, you raise cherrypy.NotFound. Instead of Http3xx, you raise cherrypy.HTTPRedirect. I prefer CP's style, of course, but I don't think it's a clear "winner" over Httpxxx exceptions.

Middleware is even more interesting. Django provides three hooks in the above sequence where middleware classes can intervene, with the middleware classes to be used defined in the site's configuration file. This results in three types of middleware: request, view and response (although one middleware class can apply for more than one hook).

CherryPy has 7 such hooks; two are for errors, so let's call it 5 for a more-reasonable comparison. But see my previous post on why static hook points may not be the best approach. Still, 5 is better than 3 :). Point.

The bulk of the above code can be found in the __call__ method of the ModPythonHandler class and the get_response method of the BaseHandler class.

That sounds like an unfortunate violation of the DRY principle. CherryPy isolates all of that nicely via the server.request() function. Are we keeping score yet?

As Django is not yet at a 1.0 release, the above is all subject to potential refactoring future change.

I can't wait to see Django 1.0! Until then, I'm going to take our adolescent web framework and go sulk in my room. ;)

Permalink 11:01:16 am, by fumanchu Email , 239 words   English (US)
Categories: Python, Dejavu, CherryPy

It doesn't take much of a Python to swallow my brain

Lines of code in the four systems I hack on most often (and of which I have a more-or-less complete grasp):

>>> import LOC
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\cherrypy")
9709
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\dejavu")
8165
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\endue")
7914
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\mcontrol")
9468

Something about my brain must naturally fit 7500-to-10000-line chunks of Python code. I certainly experience a strong drive to keep these systems from becoming more complicated, which I usually express via aggressive refactoring.

Some other packages (which I don't hack on) for comparison:

>>> LOC.LOC(r"C:\Python23\Lib\site-packages\colorstudy\SQLObject")
10477
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\paste")
20660
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\PIL")
14177
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\twisted")
136318
>>> LOC.LOC(r"D:\download\Zope-2.8.1-final\Zope-2.8.1-final")
415978

I think the sheer size of paste, twisted, and zope has actively kept me from wanting to dig into them further (but it's certainly not the only factor). Irrational, perhaps, but a natural human response to information overload.

Here's the LOC script if anyone wants to compare packages:

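import os, re

def LOC(root, pattern=r'^.*\.py$'):
    """Return the total line count of all files under root whose names match pattern."""
    pattern = re.compile(pattern)
    total = 0
    for path, dirs, files in os.walk(root):
        for f in files:
            if pattern.match(f):
                # os.walk already prefixes path with root, so joining root
                # again would break when root is a relative path.
                mod = os.path.join(path, f)
                with open(mod, "rb") as source:
                    total += len(source.readlines())
    return total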

08/14/05

Permalink 01:12:42 am, by fumanchu Email , 466 words   English (US)
Categories: Python, CherryPy

unittest's bad rap

Phillip J. Eby recently said:

unittest has gotten something of a bad rap, I think. Regardless of whether you like its basic testing facilities or not, it is an extremely good framework. In fact, I think it's one of the most beautiful frameworks in the Python standard library. Its functionality is cleanly separated into four roles, each of which can be filled by any object implementing the right interface: runners, loaders, cases, and results. Because of this exceptionally clean factoring, the basic framework is amazingly extensible.

I couldn't agree more, which is why I recently converted CherryPy's ad hoc test suite to one based on unittest. Although unittest doesn't fit every program out-of-the-box, its components are a breeze to subclass in order to make it fit your problem domain. CherryPy 2.1 now has a nice webtest.py (which will only get better as it is extended), specifically designed for simultaneously testing both the client and server sides of an HTTP request. The webtest module:

  • Understands that full web-application test suites can have a lot of components and requests, and tries to keep its "no failures" output to a minimum.
  • Provides simplified page-request functions--all HTTP methods are available (like POST and PUT), and required request headers are set automatically if not provided manually.
  • Runs the page requests in the same process as the web-application server thread(s). This allows errors in the server to be trapped and then reported in the unittest thread (see the server_error function).
  • Automatically reloads all modules that have been imported by each test module.
  • Allows the person running the tests to see the response status, headers, and body, and also the requested URL, whenever an assertion fails. This helps test-first design tremendously.
  • Allows failed assertions to be ignored, so that the current test method may proceed with the remainder of the tests. Since many page requests are idempotent GETs, this can help debugging by collecting more failure information at once.
  • Provides easy regular-expression matching against the response body.

The webtest module is available, by the way, to be used in other frameworks or applications. There's nothing CherryPy-specific in that module; all of that is found in test\helper.py, which wraps webtest to fit the CP test suite. Anyone using webtest for their own framework or app could learn a thing or two from the wrappers there.

It'd be nice if some of PJE's mini-tutorial found its way into the docs for unittest; I said it was "a breeze to subclass", but only by reading most of the unittest source. Oh, and thanks, Phillip, for disallowing anonymous comments on your blog—that made me write up a more extensive post here on my own blog. But could you give me a trackback URL at least? ;)
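As a tiny illustration of the four roles PJE describes, here's a sketch that swaps in a custom result object and a hand-rolled runner. It has nothing to do with webtest.py itself; every name in it is invented for the example, and the test case is built inside a function only so that other test collectors don't pick it up by accident.

```python
import unittest


class QuietResult(unittest.TestResult):
    """Result role: remember what passed, print nothing."""

    def __init__(self):
        super().__init__()
        self.passed = []

    def addSuccess(self, test):
        super().addSuccess(test)
        self.passed.append(test.id())


def build_demo_case():
    """Case role: one passing test and one that fails on purpose."""
    class ArithmeticDemo(unittest.TestCase):
        def test_add(self):
            self.assertEqual(1 + 1, 2)

        def test_broken(self):
            self.assertEqual(1 + 1, 3)
    return ArithmeticDemo


def run_quietly(case_class):
    """Runner role: load the cases and run them against our own result."""
    suite = unittest.TestLoader().loadTestsFromTestCase(case_class)  # loader role
    result = QuietResult()
    suite.run(result)
    return result


result = run_quietly(build_demo_case())
print("%d passed, %d failed" % (len(result.passed), len(result.failures)))
# prints "1 passed, 1 failed"
```

Each piece only relies on the interface of the others, which is what makes swapping one of them out so painless.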
