Categories: IT, Architecture, Linnaeus Award, Python, Cation, CherryPy, Dejavu, WHELPS, WSGI, Robotics and Engineering

Pages: << 1 2 3 4 5 6 7 8 9 10 11 ... 17 >>

01/20/07

Permalink 02:46:25 am, by fumanchu Email , 1378 words   English (US)
Categories: CherryPy

help(CherryPy 3.0)

Abstract

  1. CherryPy just grew its first metaclass.
  2. CherryPy just grew its first stdlib monkeypatch.
  3. Because of 1 and 2, CherryPy is now a heck of a lot easier to learn and use.
  4. Points 1, 2, and 3 all apply to unreleased trunk code and are subject to change.

Intro

I've been a proper fool (and might still be). I've been telling everyone that CherryPy 3 is much easier to learn and use because it's been tailored to be help()-ful. What I meant was that you could open an interactive interpreter, type help(cherrypy.<thing>) and have at least some idea of what it does. I spent quite a bit of time honing the top-level namespace down to as few components as possible (and some of the component namespaces, too) in order to make help() easier to read.

This is harder to do than you might think. Unlike simple linear scripts or libraries, the most important objects when CherryPy is "live" don't exist at an interactive prompt. The Request, Response, and Session objects are all heavily dependent on the context of a real HTTP conversation. They're hard to create in a vacuum. And although there's one of each per thread while the system is running, they are implemented as thread local objects so that the CherryPy programmer can treat each of them as if there were only one: a global.

Reusing thread locals

Thread locals are a great invention, but they suffer from one serious drawback when used in a threaded framework: they allow anyone to add attributes to them. If the framework re-uses the same thread for multiple requests, it becomes difficult to reliably clean out all of those attributes between requests.

CherryPy's solution to that was to add a container in 2.1; instead of a separate thread local for the Request, Response, and Session objects, there is a single, hidden thread local called cherrypy._serving, and the Request, Response, and Session objects for each thread are attributes of the "serving" object. This makes it easy for cleanup code: it just calls cherrypy._serving.__dict__.clear() when the request ends. (Aside: this technique also allows the Request, Response and Session types to be overridden).

However, pushing those objects into a container means they're no longer so easy to reference. CherryPy code would become uglier and more difficult if, instead of:

cherrypy.request.method

...you had to write:

cherrypy._serving.request.method

So a _ThreadLocalProxy class was introduced to allow CherryPy code to keep writing the nicer, shorter syntax. In short, it passes __getattr__ (and other double-underscore methods) through to a wrapped object. So cherrypy.request became a proxy object to a wrapped Request object. Ditto for response and session.

That was fine for CherryPy 2, but one of the goals for version 3.0 is better IDE support. Most IDE's at least provide calltips for code completion, but there aren't usually any HTTP requests coming in as you're writing code! CP 2's thread local proxies didn't have a request object in the main thread (or any thread that wasn't started by the HTTP server), so typing cherrypy.request. couldn't result in a calltip as you coded. The solution for CherryPy 3 was to have the proxy's __getattr__ and friends wrap a default object if a live object could not be found. And the default objects' attributes are true defaults; if they're not overridden (in config or code), they won't change when the system goes live. This makes interactive exploration even easier; you can forget all about the threading and pretend you're looking at live, global objects.

help(proxy) isn't helpful

But there's another catch: one of the few problems with using a proxy object in pure Python is that it's no longer of the same type as the wrapped object. Unfortunately for us, Python's builtin help function uses pydoc, and pydoc calls type(obj) quite a bit.

You can certainly call help(cherrypy.request.run) and get the correct docstring, because "run" is an attribute of cherrypy.request, the proxy calls __getattr__ first, and then type() is called on the attribute, not the request object/proxy. But if you attempt help(cherrypy.request), you're in for some confusion, because the proxy implementation leaks out.

Or rather, it did leak out until just now. I took the plunge and CherryPy now monkeypatches pydoc, so that it "passes the help() call through the proxy". Monkeypatching the standard library is of course a huge no-no, but the alternative was to essentially copy and paste most of pydoc and distribute the result with CherryPy. Now, help(cherrypy.response) at least prints:

>>> help(cherrypy.response)
Help on Response in module cherrypy._cprequest object:

class Response(__builtin__.object)
 |  An HTTP Response, including status, headers, and body.
 |  
 |  Application developers should use Response.headers (a dict) to
 |  set or modify HTTP response headers. When the response is finalized,
 |  Response.headers is transformed into Response.header_list as
 |  (key, value) tuples.
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |  
 |  check_timeout(self)
 |      If now > self.time + self.timeout, set self.timed_out.
 |      
 |      This purposefully sets a flag, rather than raising an error,
 |      so that a monitor thread can interrupt the Response thread.
 |  
 |  collapse_body(self)
 |  
 |  finalize(self)
 |      Transform headers (and cookies) into self.header_list.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __dict__ = <dictproxy object>
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__ = <attribute '__weakref__' of 'Response' objects>
 |      list of weak references to the object (if defined)
 |  
 |  body = <cherrypy._cprequest.Body object>
 |      The body of the HTTP response (the response entity).
 |  
 |  cookie = <SimpleCookie: >
 |  
 |  header_list = []
 |  
 |  headers = {}
 |  
 |  status = ''
 |  
 |  stream = False
 |  
 |  time = None
 |  
 |  timed_out = False
 |  
 |  timeout = 300

Documenting data

But there's a further flaw with the above output of help(); none of the data members of the Response class are documented! A few of them are mentioned in the class docstring, to be sure, but hardly to a truly useful extent. The Request object is an even poorer state, since it has so many more data members.

The solution for that issue is somewhat complicated, as well. It turns out that there are plenty of good documentation generators for Python code (that emit HTML or text; epydoc and pudge spring to mind), but no serious helpers for making help() more informative. This is a real shame; I would almost always rather have help() be truly helpful than go read a book or search online docs.

So I proposed a (small!) metaclass to help alleviate the problem for CherryPy. When you look at CherryPy source code, now, you might see something like this:

class Request(object):
    """An HTTP request."""

    __metaclass__ = cherrypy._AttributeDocstrings

    prev = None
    prev__doc = """
    The previous Request object (if any). This should be None
    unless we are processing an InternalRedirect."""

    # Conversation/connection attributes
    local = http.Host("localhost", 80)
    local__doc = \
        "An http.Host(ip, port, hostname) object for the server socket."

    remote = http.Host("localhost", 1111)
    remote__doc = \
        "An http.Host(ip, port, hostname) object for the client socket."

The _AttributeDocstrings metaclass does one thing: finds class members whose names look like <attrname>__doc, takes their str value, formats it, and folds it into the class docstring. Here's a snippet of the resulting help() output:

Help on Request in module cherrypy._cprequest object:

class Request(__builtin__.object)
 |  An HTTP request.
 |  
 |  local [= http.Host('localhost', 80, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the server socket.
 |  
 |  prev [= None]:
 |      The previous Request object (if any). This should be None
 |      unless we are processing an InternalRedirect.
 |  
 |  remote [= http.Host('localhost', 1111, 'localhost')]:
 |      An http.Host(ip, port, hostname) object for the client socket.

Christian's first question was, "why not just write it yourself by hand in the docstring?" Here's the long answer. The metaclass:

  1. Places the docstring nearer to the attribute declaration.
  2. Makes attribute docs more uniform ("name (default): doc").
  3. Automatically gets the attribute name right in the docstring.
  4. Automatically gets the default value right in the docstring.

I chose the naming convention because it allows the attribute name and the attribute__doc name to line up horizontally (it doesn't matter which comes first; I prefer to put the doc after the attribute). It also looks similar to the conventions in Python's C code, where doc variable names look like module_attribute__doc__ or sometimes just attribute_doc.

Code faster

Hopefully these two improvements, although more awkward than I like implementation-wise, will make using CherryPy much easier and faster. Feel free to help() us out by writing a few data member docstrings!

01/17/07

Permalink 05:12:12 pm, by fumanchu Email , 53 words   English (US)
Categories: Python

Selenium RC fixed for FF 2.0.0.1

The bug report is here. And, as stated in the original forum thread, you can get a nightly release from http://release.openqa.org/selenium-remote-control/nightly/ . Just replace your existing selenium-server.jar with the new one.

12/23/06

Permalink 06:05:52 pm, by fumanchu Email , 608 words   English (US)
Categories: CherryPy

CherryPy 3 has fastest WSGI server yet

A couple of months ago, in response to someone else's speed claims, I posted a comment that CherryPy's built in WSGI server could serve 1200 simple requests per second. The demo used Apache's "ab" tool to test ("-k -n 3000 -c %s"). In the last few days before the release of CherryPy 3.0 final, I've done some further optimization of cherrypy.wsgiserver, and now get 2000+ req/sec on my modest laptop.

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      3000 |      0 | 2170.79 |    0.461 | 358.18 |
     20 |      3000 |      0 | 2080.34 |    0.481 | 343.26 |
     30 |      3000 |      0 | 1920.31 |    0.521 | 316.85 |
     40 |      3000 |      0 | 2051.84 |    0.487 | 338.55 |
     50 |      3000 |      0 | 2051.84 |    0.487 | 338.55 |

The improvements are due to a variety of optimizations, including:

  • Replacing mimetools/rfc822.Message with custom code for reading headers.
  • Using socket.sendall instead of a socket fileobject for writes.
  • Generic hand-tuning of code loops.

I want to make it clear that the benchmark does not exercise any part of CherryPy other than the WSGI server. I used a very simple WSGI application (not the full CherryPy stack):

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type','text/plain'),
                        ('Content-Length','19')]
    start_response(status, response_headers)
    return ['My Own Hello World!']

The full stack of CherryPy includes the WSGI application side as well, and consequently takes more time. But that has risen from about 380 requests per second in October to:

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  536.86 |    1.863 |  85.36 |
     20 |      1000 |      0 |  509.47 |    1.963 |  81.01 |
     30 |      1000 |      0 |  499.28 |    2.003 |  79.39 |
     40 |      1000 |      0 |  491.90 |    2.033 |  78.21 |
     50 |      1000 |      0 |  504.32 |    1.983 |  80.19 |
Average |    1000.0 |    0.0 | 508.366 |    1.969 | 80.832 |

If you want to benchmark the full CherryPy stack on your own, just install CherryPy and run the script at cherrypy/test/benchmark.py.

Here's the other script for the "bare server" benchmarks:

import re
import sys
import threading
import time
from cherrypy import _cpmodpy

AB_PATH = ""
APACHE_PATH = "apache"
SCRIPT_NAME = ""
PORT = 8080


class ABSession:
    """A session of 'ab', the Apache HTTP server  benchmarking tool."""
    parse_patterns = [('complete_requests', 'Completed',
                       r'^Complete requests:\s*(\d+)'),
                      ('failed_requests', 'Failed',
                       r'^Failed requests:\s*(\d+)'),
                      ('requests_per_second', 'req/sec',
                       r'^Requests per second:\s*([0-9.]+)'),
                      ('time_per_request_concurrent', 'msec/req',
                       r'^Time per request:\s*([0-9.]+).*concurrent requests\)$'),
                      ('transfer_rate', 'KB/sec',
                       r'^Transfer rate:\s*([0-9.]+)'),
                      ]

    def __init__(self, path=SCRIPT_NAME + "/", requests=3000, concurrency=10):
        self.path = path
        self.requests = requests
        self.concurrency = concurrency

    def args(self):
        assert self.concurrency > 0
        assert self.requests > 0
        return ("-k -n %s -c %s <a href="http://localhost:%s%s"">http://localhost:%s%s"</a> %
                (self.requests, self.concurrency, PORT, self.path))

    def run(self):
        # Parse output of ab, setting attributes on self
        args = self.args()
        self.output = _cpmodpy.read_process(AB_PATH or "ab", args)
        for attr, name, pattern in self.parse_patterns:
            val = re.search(pattern, self.output, re.MULTILINE)
            if val:
                val = val.group(1)
                setattr(self, attr, val)
            else:
                setattr(self, attr, None)


safe_threads = (25, 50, 100, 200, 400)
if sys.platform in ("win32",):
    # For some reason, ab crashes with > 50 threads on my Win2k laptop.
    safe_threads = (10, 20, 30, 40, 50)


def thread_report(path=SCRIPT_NAME + "/", concurrency=safe_threads):
    sess = ABSession(path)
    attrs, names, patterns = zip(*sess.parse_patterns)
    rows = [('threads',) + names]
    for c in concurrency:
        sess.concurrency = c
        sess.run()
        rows.append([c] + [getattr(sess, attr) for attr in attrs])
    return rows

def print_report(rows):
    widths = []
    for i in range(len(rows[0])):
        lengths = [len(str(row[i])) for row in rows]
        widths.append(max(lengths))
    for row in rows:
        print
        for i, val in enumerate(row):
            print str(val).rjust(widths[i]), "|",
    print


if __name__ == '__main__':

    def simple_app(environ, start_response):
        """Simplest possible application object"""
        status = '200 OK'
        response_headers = [('Content-type','text/plain'),
                            ('Content-Length','19')]
        start_response(status, response_headers)
        return ['My Own Hello World!']

    from cherrypy import wsgiserver as w
    s = w.CherryPyWSGIServer(("localhost", PORT), simple_app)
    threading.Thread(target=s.start).start()
    try:
        time.sleep(1)
        print_report(thread_report())
    finally:
        s.stop()

11/13/06

Permalink 10:57:51 pm, by fumanchu Email , 745 words   English (US)
Categories: Python, CherryPy, WSGI

Internal Redirect WSGI middleware

I played around with this as a potential hack for CherryPy 3. It's WSGI middleware for adding almost-transparent "internal redirect" capabilities to any WSGI application.

My operating theory was that anyone writing a WSGI app that does not already have an internal-redirect feature was probably using HTTP redirects (302, 303, or 307) to do nearly the same thing. This middleware simply waits for a 307 response status and performs the redirection itself within the same request, without informing the user-agent.

This should be OK because 307 isn't normally cacheable anyway, and some versions of IE don't bother to ask the user as the spec requires already, so it just duplicates an existing browser bug. I could have used a custom HTTP code like 399, but if that ever leaked out to the UA (because someone forgot to enable the middleware) then the UA should fall back to "300 Multiple Choices", which didn't seem like a good fit. At least by using 307, the fallback should be appropriate, if not graceful.

Here's the code, which could probably use some improvements:

"""WSGI middleware which performs "internal" redirection."""

import StringIO


class _Redirector(object):

    def __init__(self, nextapp, recursive=False):
        self.nextapp = nextapp
        self.recursive = recursive

        self.location = None
        self.write_proxy = None
        self.status = None
        self.headers = None
        self.exc_info = None

        self.seen_paths = []

    def start_response(self, status, headers, exc_info):
        if status[:3] == "307":
            for name, value in headers:
                if name.lower() == "location":
                    self.location = value
                    break
        self.status = status
        self.headers = headers
        self.exc_info = exc_info
        return self.write

    def write(self, data):
        # This is only here for silly apps which call write.
        if self.write_proxy is None:
            self.write_proxy = self.sr(self.status, self.headers, self.exc_info)
        self.write_proxy(data)

    def __call__(self, environ, start_response):
        self.sr = start_response

        nextenv = environ.copy()
        curpath = nextenv['PATH_INFO']
        if nextenv.get('QUERY_STRING'):
            curpath = curpath + "?" + nextenv['QUERY_STRING']
        self.seen_paths.append(curpath)

        while True:
            # Consume the response (in case it's a generator).
            response = [x for x in self.nextapp(nextenv, self.start_response)]

            if self.location is None:
                # No redirection required; complete the response normally.
                self.sr(self.status, self.headers, self.exc_info)
                return response

            # Start with a fresh copy of the environ and start altering it.
            nextenv = environ.copy()
            nextenv['REQUEST_METHOD'] = 'GET'
            nextenv['CONTENT_LENGTH'] = '0'
            nextenv['wsgi.input'] = StringIO.StringIO()
            nextenv['redirector.history'] = self.seen_paths[:]

            # "The [Location response-header] field value
            # consists of a single absolute URI."
            (nextenv["wsgi.url_scheme"],
             nextenv["SERVER_NAME"],
             path, params,
             nextenv["QUERY_STRING"], frag) = urlparse(self.location)

            if frag:
                raise ValueError("Illegal #fragment in Location response "
                                 "header %r" % self.location)

            if params:
                path = path + ";" + params

            # Assume 'path' is already unquoted according to
            # <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2">http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2</a>
            if path.lower().startswith(environ['SCRIPT_NAME'].lower()):
                nextenv["PATH_INFO"] = path[len(environ['SCRIPT_NAME']):]
            else:
                raise ValueError("Location response header %r does not "
                                 "match current SCRIPT_NAME %r"
                                 % (self.location, environ['SCRIPT_NAME']))

            # Update self.seen_paths and check for recursive calls.
            curpath = nextenv['PATH_INFO']
            if nextenv.get('QUERY_STRING'):
                curpath = curpath + "?" + nextenv['QUERY_STRING']
            if curpath in self.seen_paths:
                raise RuntimeError("redirector visited the same URL twice: %r"
                                   % curpath)
            else:
                self.seen_paths.append(curpath)

            # Reset self for the next iteration
            self.location = None
            self.write_proxy = None
            self.status = None
            self.headers = None
            self.exc_info = None


def redirector(nextapp, recursive=False):
    """WSGI middleware which performs "internal" redirection.

    Whenever the next application sets a response status of 307 and
    provides a Location response header, this component will not pass
    that response on to the user-agent; instead, it parses the URI
    provided in the Location response header and calls the same
    application again using that URI. The following entries in the
    WSGI environ dict may be modified when redirecting: wsgi.url_scheme,
    SERVER_NAME, PATH_INFO, QUERY_STRING. REQUEST_METHOD is always
    set to 'GET', so any desired parameters must be supplied as
    query string arguments in the Location response header.
    The wsgi.input entry will always be reset to an empty StringIO,
    and CONTENT_LENGTH will be set to 0.

    If 'recursive' is False (the default), each new target URI will be
    checked to see if it has already been visited in the same request;
    if so, a RuntimeError is raised. If 'recursive' is True, no check
    is made and therefore no such errors are raised.
    """
    def redirect_wrapper(environ, start_response):
        ir = _Redirector(nextapp, recursive)
        return ir(environ, start_response)
    return redirect_wrapper

10/14/06

Permalink 02:46:43 pm, by fumanchu Email , 373 words   English (US)
Categories: CherryPy

If you like CherryPy except for the dispatching...

...you should know that CherryPy 3 (soon to be released) includes a Routes dispatcher:

class City:

    def __init__(self, name):
        self.name = name
        self.population = 10000

    def index(self, **kwargs):
        return "Welcome to %s, pop. %s" % (self.name, self.population)

    def update(self, **kwargs):
        self.population = kwargs['pop']
        return "OK"

d = cherrypy._cprequest.RoutesDispatcher()
d.connect(name='hounslow', route='hounslow', controller=City('Hounslow'))
d.connect(name='surbiton', route='surbiton', controller=City('Surbiton'),
          action='index', conditions=dict(method=['GET']))
d.mapper.connect('surbiton', controller='surbiton',
                 action='update', conditions=dict(method=['POST']))

conf = {'/': {'request.dispatch': d}}
cherrypy.tree.mount(root=None, config=conf)
cherrypy.config.update({'environment': 'test_suite'})

You tell CherryPy you want to use Routes dispatching in your app config with "request.dispatch = <obj>". The astute reader will note this means:

  1. You can make your own dispatchers for Django-style, regex style, Quixote-style, etc. You can even modify the builtin Routes dispatcher to add HTTP method dispatch or what-have-you.
  2. You can use the default CherryPy tree-style dispatcher for most paths, and Routes (or any other style) for select subpaths.

To make your own dispatcher:

  1. Make it callable. If it's a class, give it a __call__ method that takes a path_info argument.
  2. When called, it should set cherrypy.request.handler to a callable that takes no arguments. This "handler" should be (or should call) the user's application code. The default CherryPy handler, for example, sends virtual path atoms as *args and GET/POST parameters as **kwargs, but if that's not what your dispatch style requires, do something else. It's completely customizable. You can even set request.handler to None if you don't want anything called at that point. Note also that HTTPRedirect and HTTPError (including NotFound) can be used as handlers; when called, they raise self.
  3. Set cherrypy.request.config. This should be a flat dictionary of all config entries (from both global and application config) which apply to the current request, based on the path_info argument above. The default CherryPy dispatcher does a lot of work to correctly allow config file entries to override _cp_config entries on the CherryPy object tree. But if your dispatch style doesn't use a tree, you don't need to do all that.

Grab the latest trunk and start playing!

08/24/06

Permalink 12:05:29 am, by fumanchu Email , 320 words   English (US)
Categories: Python, Dejavu, CherryPy

Upgrades to Python 2.5

I probably waited too long, but today I upgraded both CherryPy (3.0alpha/trunk) and Dejavu (1.5alpha/trunk) to Python 2.5. The moves were surprisingly easy:

CherryPy

There were three changes in all:

  1. When the WSGI server socket is closed, socket.accept now fails with a socket.error "Socket operation on non-socket". I "fixed" this by just ignoring the error.
  2. The output of Response.SimpleCookie now has no trailing semicolon as it did in Python 2.4. Just had to fix the test suite to be aware of that.
  3. Some attributes of unittest.TestCase moved from double-underscore names to single; webtest had a custom subclass of it. This was easy enough to fix: define a different method for 2.5 than 2.4 or less.

[P.S. I've noticed CP 3 is about 3% slower in 2.5 than 2.4, even with the zombie frames and other optimizations. Hmmm.]

Dejavu

Amazingly, even though Dejavu makes extensive use of bytecode hacks, there was only one real change! The "logic" module needed an upgrade to the "comparison" function, which produces an Expression via the types.CodeType() constructor. Apparently, function args are no longer included in co_names, and co_consts no longer includes a leading 'None' value (except when there are cell references?). Finally, co_flags for "normal" functions now includes CO_NESTED by default. These changes also forced some parallel upgrades to the test suite.

While fixing the above, however, I noticed a long-standing bug in Dejavu's LambdaDecompiler. Python 2.4 used ROT_THREE before a STORE_SUBSCR, and this worked in Dejavu; but Python 2.5 uses ROT_TWO before STORE_SUBSCR, which showed me I had the stack-popping backwards in both functions. Bah. Fixed now.

Absolute imports

Both packages needed a good bit of work changing some relative import statements into absolute ones. Not really hard, just boring. ;)

Thanks to the Python core devs for a very smooth transition!

07/12/06

Permalink 08:59:58 am, by fumanchu Email , 120 words   English (US)
Categories: IT, General

Faces are not Proof, but they are Trust

cote (whose blog title is "People Over Process"!?) wrote:

There's a certain point, to be a cynical coder, where people just show up at meetings for face-time: to show that their involved. I'm not saying that these people don't have valuable work that could be done. Instead, their perceptions is that showing up at a meeting is the prime channel to prove that they're doing that valuable work and to do that work.

The perception is there for a reason. Face to face time, whether in meetings or the hallway or lunch, builds trust among humans. Lack of face time breaks down trust. Employ workarounds for this truth at your peril.

07/10/06

Permalink 11:07:40 am, by fumanchu Email , 361 words   English (US)
Categories: CherryPy

CherryPy 3 optimization

Currently (rev 1193), a typical CherryPy request has a standard execution path, and a standard time to complete it:

0.008 _cpwsgi.py:51(_wsgi_callable)
    0.001 _cpwsgi.py:36(translate_headers)
    0.001 _cpengine.py:131(request)
        0.001 _cprequest.py:623(Response.__init__)
    0.006 _cprequest.py:116(run)
        0.000 _cprequest.py:230(process_request_line)
        0.001 _cprequest.py:265(process_headers)
        0.003 _cprequest.py:189(respond)
            0.001 _cprequest.py:294(get_resource)
                0.001 _cprequest.py:415(Dispatcher.__call__)
                    0.001 _cprequest.py:432(find_handler)
            0.001 _cprequest.py:326(tool_up)
            0.001 _cprequest.py:644(finalize)
        0.001 cherrypy\__init__.py:96(log_access)
            0.001 logging\__init__.py:1015(log)
                0.001 logging\__init__.py:1055(_log)

0.001 cherrypy\__init__.py:51(__getattr__)
0.001 :0(getattr)

That is, _cpwsgi._wsgi_callable() takes about 8 msec (on my box using the builtin timer). That number breaks down into 1 msec for translate_headers(), 1 msec for _cpengine.request(), and 6 msec for Request.run(). Etcetera. These are all of the calls which take 1 msec or more to complete.

It looks like moving to Python's builtin logging for the access log has added 1 msec to Request.run(). I think that's reasonable; we lose a millisecond but gain syslog and rotating log options.

Somebody please explain to me why _cpwsgi.translate_headers takes a millisecond to change 20 strings from "HTTP_HEADER_NAME" to "Header-Name". I've tried lots of rewritings of that to no avail; moving from "yield" to returning a list did nothing, nor did inlining it into _wsgi_callable.

I tried making the default Dispatcher cache the results from find_handler. That is, cache[(app, path_info)] = func, vpath, request.config. I couldn't see any speedup on cache hits.

The next-to-last line above is interesting. 0.001 cherrypy\__init__.py:51(__getattr__) shows 1 msec being used for cherrypy.request and cherrypy.response. I've already done a lot of work to minimize this by looking them up once and binding to a local, for example, request = cherrypy.request, and then looking up further attributes using the local name. But perhaps there's more to be done.

The last line above shows 1 msec being used to call the builtin getattr() function. Seems we have a very object-oriented style. ;)

I'll keep looking for ways to get any of those 0.001's to read 0.000. Perhaps now that I've moved profiling to WSGI middleware, I can aggregate times and work with numbers that have a little more precision. ;)

06/28/06

Permalink 09:25:43 am, by admin Email , 365 words   English (US)
Categories: IT, Python, Dejavu

The Fourth Way

Ted Neward has written a good discussion of Object-Relational Mapper concerns. I'd like to react to, and associate, a couple of points he makes, seemingly unrelatedly:

...we typically end up with one of Query-By-Example (QBE), Query-By-API (QBA), or Query-By-Language (QBL) approaches.

A QBE approach states that you fill out an object template of the type of object you're looking for...
a "Query-By-API" approach, in which queries are constructed by Query objects...
a "Query-By-Language" approach, in which a new language, similar to SQL but "better" somehow, is written...

and

Several possible solutions present themselves...

5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework...bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects"...[such as] direct integration into traditional O-O languages, such as the LINQ project from Microsoft...

I propose (and implemented in Dejavu) a fourth approach from QBE, QBA, and QBL. Rather than build a DSL on top of a programming language (as QBL does), use the language instead. Rather than change the programming language by introducing relational syntax (as LINQ does), use the language instead. In Dejavu, you write plain old Python functions which take an object and return True or False. Iterate over a collection of objects and it works as a filter. Pass it to the storage backend and it is translated into SQL for you. Most commonly, you pass it to the library and it does both for you: iterates over its in-memory cache of objects and merges in new objects, queried from storage. Let's call it... "query". It's not "by" anything. It has an infinitesimal learning curve. LINQ is, in essence, shoehorning higher-order functions into its various target languages in a very limited domain. Why not use a programming language that has real HOF's?

06/15/06

<< 1 2 3 4 5 6 7 8 9 10 11 ... 17 >>

March 2017
Sun Mon Tue Wed Thu Fri Sat
 << <   > >>
      1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31  

Search

The requested Blog doesn't exist any more!

XML Feeds

free blog software