help(CherryPy 3.0)

01/20/07

02:46:25 am, by fumanchu, 1378 words, English (US)
Categories: CherryPy

Abstract

  1. CherryPy just grew its first metaclass.
  2. CherryPy just grew its first stdlib monkeypatch.
  3. Because of 1 and 2, CherryPy is now a heck of a lot easier to learn and use.
  4. Points 1, 2, and 3 all apply to unreleased trunk code and are subject to change.

Intro

I've been a proper fool (and might still be). I've been telling everyone that CherryPy 3 is much easier to learn and use because it's been tailored to be help()-ful. What I meant was that you could open an interactive interpreter, type help(cherrypy.<thing>) and have at least some idea of what it does. I spent quite a bit of time honing the top-level namespace down to as few components as possible (and some of the component namespaces, too) in order to make help() easier to read.

This is harder to do than you might think. Unlike simple linear scripts or libraries, the most important objects when CherryPy is "live" don't exist at an interactive prompt. The Request, Response, and Session objects are all heavily dependent on the context of a real HTTP conversation. They're hard to create in a vacuum. And although there's one of each per thread while the system is running, they are implemented as thread local objects so that the CherryPy programmer can treat each of them as if there were only one: a global.

Reusing thread locals

Thread locals are a great invention, but they suffer from one serious drawback when used in a threaded framework: they allow anyone to add attributes to them. If the framework re-uses the same thread for multiple requests, it becomes difficult to reliably clean out all of those attributes between requests.

CherryPy's solution to that was to add a container in 2.1; instead of a separate thread local for each of the Request, Response, and Session objects, there is a single, hidden thread local called cherrypy._serving, and the Request, Response, and Session objects for each thread are attributes of the "serving" object. This makes cleanup easy: the code just calls cherrypy._serving.__dict__.clear() when the request ends. (Aside: this technique also allows the Request, Response, and Session types to be overridden.)
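
A minimal sketch of that container pattern, using only the standard library. The names serving and handle are illustrative stand-ins, not CherryPy's own code:

```python
import threading

# One hidden thread-local container; per-request objects hang off it.
serving = threading.local()


def handle(method):
    """Pretend to serve one request on the current thread."""
    serving.request = {"method": method}
    serving.leftover = "anything application code stuck on the container"
    try:
        return serving.request["method"]
    finally:
        # One call wipes every attribute this thread added, known or not.
        serving.__dict__.clear()


print(handle("GET"))
print(hasattr(serving, "leftover"))
```

Because the cleanup never needs to know which attributes were added, a reused thread starts its next request with an empty container.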

However, pushing those objects into a container means they're no longer so easy to reference. CherryPy code would become uglier and more difficult if, instead of:

cherrypy.request.method

...you had to write:

cherrypy._serving.request.method

So a _ThreadLocalProxy class was introduced to allow CherryPy code to keep writing the nicer, shorter syntax. In short, it passes __getattr__ (and other double-underscore methods) through to a wrapped object. So cherrypy.request became a proxy object to a wrapped Request object. Ditto for response and session.
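
Roughly, such a proxy might look like the sketch below. This is not the actual _ThreadLocalProxy source; the class body, the serving container, and the stand-in Request class are all assumptions made for illustration, and only attribute get/set are forwarded here:

```python
import threading

serving = threading.local()  # stand-in for cherrypy._serving


class Request:
    """Stand-in for the real Request class."""
    method = "GET"


class ThreadLocalProxy:
    """Forward attribute access to serving.<attrname> for this thread."""

    def __init__(self, attrname):
        # Bypass our own __setattr__ so the name lands on the proxy itself.
        object.__setattr__(self, "_attrname", attrname)

    def _target(self):
        return getattr(serving, self._attrname)

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails.
        return getattr(self._target(), name)

    def __setattr__(self, name, value):
        setattr(self._target(), name, value)


request = ThreadLocalProxy("request")

serving.request = Request()  # what the server would do for each request
request.method = "POST"      # written through to the wrapped object
print(request.method, serving.request.method)
```

In a thread where nothing has been assigned to serving.request, any access through this proxy raises AttributeError.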

That was fine for CherryPy 2, but one of the goals for version 3.0 is better IDE support. Most IDEs at least provide calltips for code completion, but there aren't usually any HTTP requests coming in as you're writing code! CP 2's thread local proxies didn't have a request object in the main thread (or any thread that wasn't started by the HTTP server), so typing cherrypy.request. couldn't result in a calltip as you coded. The solution for CherryPy 3 was to have the proxy's __getattr__ and friends wrap a default object if a live object could not be found. And the default objects' attributes are true defaults; if they're not overridden (in config or code), they won't change when the system goes live. This makes interactive exploration even easier; you can forget all about the threading and pretend you're looking at live, global objects.
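
One way to get that fallback is to put the defaults on a threading.local subclass as class attributes, so they show through whenever a thread has no live object of its own. This is a sketch under that assumption, with hypothetical names; it is not necessarily how CherryPy implements it:

```python
import threading


class Request:
    """Stand-in for the real Request class."""
    method = "GET"


class Serving(threading.local):
    # Class-level default: visible in any thread that has not set its own.
    request = Request()


serving = Serving()


class Proxy:
    """Forward attribute reads to serving.<attrname>, live or default."""

    def __init__(self, attrname):
        self._attrname = attrname

    def __getattr__(self, name):
        return getattr(getattr(serving, self._attrname), name)


request = Proxy("request")
print(request.method)       # no live request yet: the default answers

live = Request()
live.method = "POST"
serving.request = live      # the server installs a live object
print(request.method)

serving.__dict__.clear()    # end of request: the default shows through again
print(request.method)
```

The default object is shared by every thread, so it should be treated as read-only.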

help(proxy) isn't helpful

But there's another catch: one of the few problems with using a proxy object in pure Python is that it's no longer of the same type as the wrapped object. Unfortunately for us, Python's builtin help function uses pydoc, and pydoc calls type(obj) quite a bit.

You can certainly call help(cherrypy.request.run) and get the correct docstring, because "run" is an attribute of cherrypy.request, the proxy calls __getattr__ first, and then type() is called on the attribute, not the request object/proxy. But if you attempt help(cherrypy.request), you're in for some confusion, because the proxy implementation leaks out.

Or rather, it did leak out until just now. I took the plunge and CherryPy now monkeypatches pydoc, so that it "passes the help() call through the proxy". Monkeypatching the standard library is of course a huge no-no, but the alternative was to essentially copy and paste most of pydoc and distribute the result with CherryPy. Now, help(cherrypy.response) at least prints:

>>> help(cherrypy.response)
Help on Response in module cherrypy._cprequest object:

class Response(__builtin__.object)
 |  An HTTP Response, including status, headers, and body.
 |  
 |  Application developers should use Response.headers (a dict) to
 |  set or modify HTTP response headers. When the response is finalized,
 |  Response.headers is transformed into Response.header_list as
 |  (key, value) tuples.
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |  
 |  check_timeout(self)
 |      If now > self.time + self.timeout, set self.timed_out.
 |      
 |      This purposefully sets a flag, rather than raising an error,
 |      so that a monitor thread can interrupt the Response thread.
 |  
 |  collapse_body(self)
 |  
 |  finalize(self)
 |      Transform headers (and cookies) into self.header_list.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __dict__ = <dictproxy object>
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__ = <attribute '__weakref__' of 'Response' objects>
 |      list of weak references to the object (if defined)
 |  
 |  body = <cherrypy._cprequest.Body object>
 |      The body of the HTTP response (the response entity).
 |  
 |  cookie = <SimpleCookie: >
 |  
 |  header_list = []
 |  
 |  headers = {}
 |  
 |  status = ''
 |  
 |  stream = False
 |  
 |  time = None
 |  
 |  timed_out = False
 |  
 |  timeout = 300

Documenting data

But there's a further flaw with the above output of help(): none of the data members of the Response class are documented! A few of them are mentioned in the class docstring, to be sure, but hardly to a truly useful extent. The Request object is in an even poorer state, since it has so many more data members.

The solution for that issue is somewhat complicated, as well. It turns out that there are plenty of good documentation generators for Python code (that emit HTML or text; epydoc and pudge spring to mind), but no serious helpers for making help() more informative. This is a real shame; I would almost always rather have help() be truly helpful than go read a book or search online docs.

So I proposed a (small!) metaclass to help alleviate the problem for CherryPy. When you look at CherryPy source code, now, you might see something like this:

class Request(object):
    """An HTTP request."""

    __metaclass__ = cherrypy._AttributeDocstrings

    prev = None
    prev__doc = """
    The previous Request object (if any). This should be None
    unless we are processing an InternalRedirect."""

    # Conversation/connection attributes
    local = http.Host("localhost", 80)
    local__doc = \
        "An http.Host(ip, port, hostname) object for the server socket."

    remote = http.Host("localhost", 1111)
    remote__doc = \
        "An http.Host(ip, port, hostname) object for the client socket."

The _AttributeDocstrings metaclass does one thing: finds class members whose names look like <attrname>__doc, takes their str value, formats it, and folds it into the class docstring. Here's a snippet of the resulting help() output:

Help on Request in module cherrypy._cprequest object:

class Request(__builtin__.object)
 |  An HTTP request.
 |  
 |  local [= http.Host('localhost', 80, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the server socket.
 |  
 |  prev [= None]:
 |      The previous Request object (if any). This should be None
 |      unless we are processing an InternalRedirect.
 |  
 |  remote [= http.Host('localhost', 1111, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the client socket.
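
To make the mechanism concrete, here is a small sketch of a metaclass in the same spirit. It is not the cherrypy._AttributeDocstrings source: it uses Python 3's metaclass= syntax rather than __metaclass__, the Request class and its method docstring are stand-ins, and removing the <attrname>__doc members after folding them in is a choice made for this example:

```python
import inspect


class AttributeDocstrings(type):
    """Fold '<attr>__doc' class members into the class docstring."""

    def __new__(mcls, name, bases, namespace):
        namespace = dict(namespace)
        parts = [inspect.cleandoc(namespace.get("__doc__") or "")]
        for key in sorted(k for k in namespace if k.endswith("__doc")):
            attr = key[:-len("__doc")]
            text = inspect.cleandoc(namespace.pop(key))
            if attr in namespace:
                # Take the default straight from the class body.
                heading = "%s [= %r]:" % (attr, namespace[attr])
            else:
                heading = "%s:" % attr
            body = "\n".join("    " + line for line in text.splitlines())
            parts.append(heading + "\n" + body)
        namespace["__doc__"] = "\n\n".join(parts)
        return super().__new__(mcls, name, bases, namespace)


class Request(metaclass=AttributeDocstrings):
    """An HTTP request."""

    prev = None
    prev__doc = """
    The previous Request object (if any). This should be None
    unless we are processing an InternalRedirect."""

    method = "GET"
    method__doc = "The HTTP method of the request (illustrative text)."


print(Request.__doc__)
```

Since the name and the default are read from the class body itself, they cannot drift out of sync with the docstring.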

Christian's first question was, "why not just write it yourself by hand in the docstring?" Here's the long answer. The metaclass:

  1. Places the docstring nearer to the attribute declaration.
  2. Makes attribute docs more uniform ("name [= default]: doc").
  3. Automatically gets the attribute name right in the docstring.
  4. Automatically gets the default value right in the docstring.

I chose the naming convention because it allows the attribute name and the attribute__doc name to line up horizontally (it doesn't matter which comes first; I prefer to put the doc after the attribute). It also looks similar to the conventions in Python's C code, where doc variable names look like module_attribute__doc__ or sometimes just attribute_doc.

Code faster

Hopefully these two improvements, although more awkward than I like implementation-wise, will make using CherryPy much easier and faster. Feel free to help() us out by writing a few data member docstrings!

1 comment

Comment from: Arnar [Visitor]

Brilliant, I like it.

02/13/07 @ 05:35
