CherryPy now handles partial GETs

08/24/05

01:06:38 am, by fumanchu
Categories: CherryPy

Partial GET requests are a handy way for a client to request a portion of a resource, rather than the entire resource. HTTP clients send a Range: bytes=start-stop request header, where start and stop are non-negative integers. The HTTP server can then send only those bytes (inclusive) in the response. Multiple byte ranges are also possible. CherryPy has had support for this since, well, earlier this morning (changeset 549, in the current svn trunk).
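
From the client's side, asking for a byte range is just a matter of adding one header. Here's a minimal sketch using Python 3's urllib; the build_range_request helper and the localhost URL are made up for illustration, and the open-ended "bytes=start-" form (everything from start onward) is the one a client would use to resume an interrupted download:

```python
import urllib.request

def build_range_request(url, start, stop=None):
    """Return a GET request for bytes start..stop (inclusive) of url.

    If stop is None the range is open-ended ("bytes=start-"), meaning
    "everything from start to the end of the resource".
    """
    if stop is None:
        value = "bytes=%d-" % start
    else:
        value = "bytes=%d-%d" % (start, stop)
    return urllib.request.Request(url, headers={"Range": value})

# Hypothetical URL; nothing goes over the wire until urlopen() is called.
req = build_range_request("http://localhost:8080/mail.pdf", 0, 499)
print(req.get_header("Range"))
```

Passing req to urllib.request.urlopen() would actually send it; a server that honors the header answers with 206 rather than 200.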

As Jon Udell noted a while back, Adobe Reader uses Range headers to fetch a PDF piecemeal, if the server supports it. Here's an example .pdf request, and the server's response (many headers omitted for clarity):

GET /mail.pdf HTTP/1.1

200 OK
Accept-Ranges 'bytes'
Content-Length '6786140'
Content-Type 'application/pdf'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

On the first request, the server returns a normal 200 response, and begins outputting the file. However, it also outputs the "Accept-Ranges" response header. This tells the client that partial GET requests (using the Range header) will be honored. Therefore, the client tries it, jumping to the PDF's content catalog at the end of the file:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6633280-6634323,6633278-6633279,6634324-6636107,6669998-6672067,5710727-5712197,
       6676118-6678187,6112458-6113719,184998-189293,6113720-6126880,189294-194632,60480-184997'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '158347'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866199.113.1'

Now our server has returned a different status-code, "206 Partial Content". Since the client requested multiple byteranges, the response body is a multipart/byteranges entity. Each part inside that multipart body has its own Content-Type and Content-Range headers.
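
To make the shape of that body concrete, here's a rough sketch of how a multipart/byteranges entity can be assembled. This is not CherryPy's actual code; the multipart_byteranges function, the sample data, and the boundary string are all invented for the example. It takes slice-style ranges (exclusive stop) and writes inclusive positions into each Content-Range header:

```python
def multipart_byteranges(data, ranges, content_type, boundary):
    """Build a multipart/byteranges body.

    data: the full resource, as bytes.
    ranges: list of (start, stop) pairs, stop exclusive (slice-style).
    """
    total = len(data)
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for start, stop in ranges:
        parts.append(delimiter + b"\r\n")
        parts.append(("Content-Type: %s\r\n" % content_type).encode("ascii"))
        # Content-Range positions are inclusive, hence stop - 1.
        parts.append(("Content-Range: bytes %d-%d/%d\r\n\r\n"
                      % (start, stop - 1, total)).encode("ascii"))
        parts.append(data[start:stop])
        parts.append(b"\r\n")
    # The closing delimiter has a trailing "--".
    parts.append(delimiter + b"--\r\n")
    return b"".join(parts)

body = multipart_byteranges(b"0123456789", [(0, 3), (7, 10)],
                            "text/plain", "EXAMPLE_BOUNDARY")
print(body.decode("ascii"))
```

The Content-Length of the 206 response is the length of this whole body, part headers and boundaries included, which is why it's larger than the sum of the requested ranges.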

Since that worked, Adobe Reader proceeds to read more ranges:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6626812-6627960,6785258-6786139'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '2305'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866199.653.2'

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6636108-6636167,194633-198372,198373-202575'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '8392'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866499.354.3'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

...and then appears to have finished, after taking some time to process those responses. However, when I scrolled to the last page in the PDF document, Adobe Reader made an additional request:

GET /mail.pdf HTTP/1.1
RANGE 'bytes=6668978-6669997,6672068-6672327,5698663-5699716,5723626-5724706,
       5699717-5702732,6112097-6112457,6165630-6166884,5702733-5704994,
       5713308-5723625,5704995-5705748,5724707-5724832,6166885-6175196,5705749-5710726'

206 Partial Content
Accept-Ranges 'bytes'
Content-Length '36370'
Content-Type 'multipart/byteranges; boundary=192.168.1.102.1.2840.1124866600.580.4'
Last-Modified 'Mon, 01 Dec 2003 18:13:02 GMT'

So it seems there's quite a bit of partial-retrieval going on, ultimately making the client more responsive from the user's point-of-view.

How to take advantage of partial GET support in CherryPy

If you want to serve static files so that clients can request portions of them (including resumable downloads!), you only need to use the StaticFilter, which handles Range requests transparently. Here's the script I used to serve mail.pdf:

import cherrypy

class Root: pass
cherrypy.root = Root()

cherrypy.config.update({
        'server.environment': 'production',
        'staticFilter.on': True,
        'staticFilter.dir': 'static',
})
cherrypy.server.start()

Yes, that's really all you need to make an HTTP static file server! The staticFilter.dir (where I saved mail.pdf) is relative to wherever you save the above script. If you'd like it relative to some other absolute path, set that in staticFilter.root.

If you'd like to respond to Range request headers, but you're not serving static files, you can still benefit from CherryPy's core. In cherrypy/_cphttptools, there is a get_ranges(content_length) function which you can use; it examines the current request's Range header, and returns a list of (start, stop) tuples (or just returns None if there's no header). For example, given the Range header:

Range: bytes=30000-40000

The call get_ranges(50000) will return [(30000, 40001)]. Note that we've incremented stop by 1, so that you can use it in a string-slicing operation (byte-ranges are inclusive, but Python's slices have exclusive upper-bounds).

Note also that you need to supply a content-length to get_ranges. It's perfectly valid for a client to request Range: bytes=-500, and expect to receive the last 500 bytes of the resource. So you need to specify the total length in order to do the subtraction.
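
If you're curious what that parsing involves, here's a stand-alone sketch in Python 3. It is not CherryPy's implementation: parse_byte_ranges is a hypothetical name, and unlike get_ranges it takes the header value as an argument instead of reading the current request. It does return the same kind of result, a list of slice-ready (start, stop) tuples, or None when there's no usable header:

```python
def parse_byte_ranges(header_value, content_length):
    """Parse a Range header value into slice-style (start, stop) pairs.

    Returns None if the header is missing or malformed. Ranges that
    begin past the end of the resource are skipped.
    """
    if not header_value or not header_value.startswith("bytes="):
        return None
    result = []
    try:
        for spec in header_value[len("bytes="):].split(","):
            first, sep, last = spec.strip().partition("-")
            if not sep:
                return None
            if first:
                start = int(first)
                if last and int(last) < start:
                    return None
                if start >= content_length:
                    continue
                # Inclusive stop becomes exclusive; clamp to the resource.
                stop = int(last) + 1 if last else content_length
                stop = min(stop, content_length)
            else:
                if not last:
                    return None
                # Suffix form: "-500" means the final 500 bytes.
                start = max(content_length - int(last), 0)
                stop = content_length
            result.append((start, stop))
    except ValueError:
        return None
    return result

print(parse_byte_ranges("bytes=30000-40000", 50000))
print(parse_byte_ranges("bytes=-500", 50000))
```

Each tuple can go straight into a slice, as in data[start:stop], with no off-by-one adjustment.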

CherryPy doesn't yet handle the If-Range request header, so feel free to write that and contribute it. ;) ETag support would be nice, too.

1 comment

Comment from: Greg Fuller [Visitor]

Very nice. Thanks!

08/25/05 @ 09:20
