A couple of months ago, in response to someone else's speed claims, I posted a comment that CherryPy's built in WSGI server could serve 1200 simple requests per second. The demo used Apache's "ab" tool to test ("-k -n 3000 -c %s"). In the last few days before the release of CherryPy 3.0 final, I've done some further optimization of cherrypy.wsgiserver, and now get 2000+ req/sec on my modest laptop.
threads | Completed | Failed | req/sec | msec/req | KB/sec |
10 | 3000 | 0 | 2170.79 | 0.461 | 358.18 |
20 | 3000 | 0 | 2080.34 | 0.481 | 343.26 |
30 | 3000 | 0 | 1920.31 | 0.521 | 316.85 |
40 | 3000 | 0 | 2051.84 | 0.487 | 338.55 |
50 | 3000 | 0 | 2051.84 | 0.487 | 338.55 |
The improvements are due to a variety of optimizations.
I want to make it clear that the benchmark does not exercise any part of CherryPy other than the WSGI server. I used a very simple WSGI application (not the full CherryPy stack):
def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', '19')]
    start_response(status, response_headers)
    return ['My Own Hello World!']
The full stack of CherryPy includes the WSGI application side as well, and consequently takes more time. But that has risen from about 380 requests per second in October to:
Client Thread Report (1000 requests, 14 byte response body, 10 server threads):
threads | Completed | Failed | req/sec | msec/req | KB/sec |
10 | 1000 | 0 | 536.86 | 1.863 | 85.36 |
20 | 1000 | 0 | 509.47 | 1.963 | 81.01 |
30 | 1000 | 0 | 499.28 | 2.003 | 79.39 |
40 | 1000 | 0 | 491.90 | 2.033 | 78.21 |
50 | 1000 | 0 | 504.32 | 1.983 | 80.19 |
Average | 1000.0 | 0.0 | 508.366 | 1.969 | 80.832 |
If you want to benchmark the full CherryPy stack on your own, just install CherryPy and run the script at cherrypy/test/benchmark.py.
Here's the other script for the "bare server" benchmarks:
import re
import sys
import threading
import time

from cherrypy import _cpmodpy

AB_PATH = ""
APACHE_PATH = "apache"
SCRIPT_NAME = ""
PORT = 8080


class ABSession:
    """A session of 'ab', the Apache HTTP server benchmarking tool."""
    
    parse_patterns = [('complete_requests', 'Completed',
                       r'^Complete requests:\s*(\d+)'),
                      ('failed_requests', 'Failed',
                       r'^Failed requests:\s*(\d+)'),
                      ('requests_per_second', 'req/sec',
                       r'^Requests per second:\s*([0-9.]+)'),
                      ('time_per_request_concurrent', 'msec/req',
                       r'^Time per request:\s*([0-9.]+).*concurrent requests\)$'),
                      ('transfer_rate', 'KB/sec',
                       r'^Transfer rate:\s*([0-9.]+)'),
                      ]
    
    def __init__(self, path=SCRIPT_NAME + "/", requests=3000, concurrency=10):
        self.path = path
        self.requests = requests
        self.concurrency = concurrency
    
    def args(self):
        assert self.concurrency > 0
        assert self.requests > 0
        return ("-k -n %s -c %s http://localhost:%s%s" %
                (self.requests, self.concurrency, PORT, self.path))
    
    def run(self):
        # Parse output of ab, setting attributes on self
        args = self.args()
        self.output = _cpmodpy.read_process(AB_PATH or "ab", args)
        for attr, name, pattern in self.parse_patterns:
            val = re.search(pattern, self.output, re.MULTILINE)
            if val:
                val = val.group(1)
                setattr(self, attr, val)
            else:
                setattr(self, attr, None)


safe_threads = (25, 50, 100, 200, 400)
if sys.platform in ("win32",):
    # For some reason, ab crashes with > 50 threads on my Win2k laptop.
    safe_threads = (10, 20, 30, 40, 50)


def thread_report(path=SCRIPT_NAME + "/", concurrency=safe_threads):
    sess = ABSession(path)
    attrs, names, patterns = zip(*sess.parse_patterns)
    rows = [('threads',) + names]
    for c in concurrency:
        sess.concurrency = c
        sess.run()
        rows.append([c] + [getattr(sess, attr) for attr in attrs])
    return rows


def print_report(rows):
    widths = []
    for i in range(len(rows[0])):
        lengths = [len(str(row[i])) for row in rows]
        widths.append(max(lengths))
    for row in rows:
        print
        for i, val in enumerate(row):
            print str(val).rjust(widths[i]), "|",
    print


if __name__ == '__main__':
    def simple_app(environ, start_response):
        """Simplest possible application object"""
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', '19')]
        start_response(status, response_headers)
        return ['My Own Hello World!']
    
    from cherrypy import wsgiserver as w
    s = w.CherryPyWSGIServer(("localhost", PORT), simple_app)
    threading.Thread(target=s.start).start()
    try:
        time.sleep(1)
        print_report(thread_report())
    finally:
        s.stop()
I played around with this as a potential hack for CherryPy 3. It's WSGI middleware for adding almost-transparent "internal redirect" capabilities to any WSGI application.
My operating theory was that anyone writing a WSGI app that does not already have an internal-redirect feature was probably using HTTP redirects (302, 303, or 307) to do nearly the same thing. This middleware simply waits for a 307 response status and performs the redirection itself within the same request, without informing the user-agent.
This should be OK because 307 isn't normally cacheable anyway, and some versions of IE already skip asking the user before following a 307 (as the spec requires), so this middleware merely duplicates an existing browser bug. I could have used a custom HTTP code like 399, but if that ever leaked out to the UA (because someone forgot to enable the middleware), the UA would fall back to treating it as "300 Multiple Choices", which didn't seem like a good fit. At least by using 307, the fallback should be appropriate, if not graceful.
Here's the code, which could probably use some improvements:
"""WSGI middleware which performs "internal" redirection."""
import StringIO
class _Redirector(object):
def __init__(self, nextapp, recursive=False):
self.nextapp = nextapp
self.recursive = recursive
self.location = None
self.write_proxy = None
self.status = None
self.headers = None
self.exc_info = None
self.seen_paths = []
def start_response(self, status, headers, exc_info):
if status[:3] == "307":
for name, value in headers:
if name.lower() == "location":
self.location = value
break
self.status = status
self.headers = headers
self.exc_info = exc_info
return self.write
def write(self, data):
# This is only here for silly apps which call write.
if self.write_proxy is None:
self.write_proxy = self.sr(self.status, self.headers, self.exc_info)
self.write_proxy(data)
def __call__(self, environ, start_response):
self.sr = start_response
nextenv = environ.copy()
curpath = nextenv['PATH_INFO']
if nextenv.get('QUERY_STRING'):
curpath = curpath + "?" + nextenv['QUERY_STRING']
self.seen_paths.append(curpath)
while True:
# Consume the response (in case it's a generator).
response = [x for x in self.nextapp(nextenv, self.start_response)]
if self.location is None:
# No redirection required; complete the response normally.
self.sr(self.status, self.headers, self.exc_info)
return response
# Start with a fresh copy of the environ and start altering it.
nextenv = environ.copy()
nextenv['REQUEST_METHOD'] = 'GET'
nextenv['CONTENT_LENGTH'] = '0'
nextenv['wsgi.input'] = StringIO.StringIO()
nextenv['redirector.history'] = self.seen_paths[:]
# "The [Location response-header] field value
# consists of a single absolute URI."
(nextenv["wsgi.url_scheme"],
nextenv["SERVER_NAME"],
path, params,
nextenv["QUERY_STRING"], frag) = urlparse(self.location)
if frag:
raise ValueError("Illegal #fragment in Location response "
"header %r" % self.location)
if params:
path = path + ";" + params
# Assume 'path' is already unquoted according to
# <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2">http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2</a>
if path.lower().startswith(environ['SCRIPT_NAME'].lower()):
nextenv["PATH_INFO"] = path[len(environ['SCRIPT_NAME']):]
else:
raise ValueError("Location response header %r does not "
"match current SCRIPT_NAME %r"
% (self.location, environ['SCRIPT_NAME']))
# Update self.seen_paths and check for recursive calls.
curpath = nextenv['PATH_INFO']
if nextenv.get('QUERY_STRING'):
curpath = curpath + "?" + nextenv['QUERY_STRING']
if curpath in self.seen_paths:
raise RuntimeError("redirector visited the same URL twice: %r"
% curpath)
else:
self.seen_paths.append(curpath)
# Reset self for the next iteration
self.location = None
self.write_proxy = None
self.status = None
self.headers = None
self.exc_info = None
def redirector(nextapp, recursive=False):
"""WSGI middleware which performs "internal" redirection.
Whenever the next application sets a response status of 307 and
provides a Location response header, this component will not pass
that response on to the user-agent; instead, it parses the URI
provided in the Location response header and calls the same
application again using that URI. The following entries in the
WSGI environ dict may be modified when redirecting: wsgi.url_scheme,
SERVER_NAME, PATH_INFO, QUERY_STRING. REQUEST_METHOD is always
set to 'GET', so any desired parameters must be supplied as
query string arguments in the Location response header.
The wsgi.input entry will always be reset to an empty StringIO,
and CONTENT_LENGTH will be set to 0.
If 'recursive' is False (the default), each new target URI will be
checked to see if it has already been visited in the same request;
if so, a RuntimeError is raised. If 'recursive' is True, no check
is made and therefore no such errors are raised.
"""
def redirect_wrapper(environ, start_response):
ir = _Redirector(nextapp, recursive)
return ir(environ, start_response)
return redirect_wrapper
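To use it, you wrap an application that answers some requests with a 307 pointing at an internal URL. Here's a sketch of such an app (the app itself, its paths, and the commented wrapping line are illustrative, not part of the middleware):

```python
def app(environ, start_response):
    if environ['PATH_INFO'] == '/old':
        # Ask for an internal redirect to /new; the middleware
        # intercepts this 307 instead of sending it to the client.
        start_response('307 Temporary Redirect',
                       [('Location', 'http://localhost/new')])
        return []
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['Hello from %s' % environ['PATH_INFO']]

# wrapped = redirector(app)  # the factory defined above
```

With the wrapper in place, a request for /old would produce the /new response body, and the user-agent would never see the 307.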
...you should know that CherryPy 3 (soon to be released) includes a Routes dispatcher:
import cherrypy


class City:
    
    def __init__(self, name):
        self.name = name
        self.population = 10000
    
    def index(self, **kwargs):
        return "Welcome to %s, pop. %s" % (self.name, self.population)
    
    def update(self, **kwargs):
        self.population = kwargs['pop']
        return "OK"


d = cherrypy._cprequest.RoutesDispatcher()
d.connect(name='hounslow', route='hounslow', controller=City('Hounslow'))
d.connect(name='surbiton', route='surbiton', controller=City('Surbiton'),
          action='index', conditions=dict(method=['GET']))
d.mapper.connect('surbiton', controller='surbiton',
                 action='update', conditions=dict(method=['POST']))
conf = {'/': {'request.dispatch': d}}
cherrypy.tree.mount(root=None, config=conf)
cherrypy.config.update({'environment': 'test_suite'})
You tell CherryPy you want to use Routes dispatching in your app config with "request.dispatch = <obj>". The astute reader will note this means you can swap in any dispatcher you like.

To make your own dispatcher, write an object with a __call__ method that takes a path_info argument, find the right page handler for that path, and bind it to request.handler. The default dispatcher expects handlers to take path segments as *args and GET/POST parameters as **kwargs, but if that's not what your dispatch style requires, do something else. It's completely customizable. You can even set request.handler to None if you don't want anything called at that point. Note also that HTTPRedirect and HTTPError (including NotFound) can be used as handlers; when called, they raise self. The default CherryPy dispatcher does a lot of work to correctly allow config file entries to override _cp_config entries on the CherryPy object tree. But if your dispatch style doesn't use a tree, you don't need to do all that. Grab the latest trunk and start playing!
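As a sketch of how small a custom dispatcher can be, here's a flat dict-based one. The Request class is a stand-in for cherrypy.request, and all names are illustrative, not CherryPy's actual internals:

```python
class Request(object):
    """Stand-in for cherrypy.request, just for this sketch."""
    handler = None


class DictDispatcher(object):
    """Hypothetical dispatcher: a callable taking path_info which
    binds the matching page handler to request.handler."""
    
    def __init__(self, request, table):
        self.request = request
        self.table = table
    
    def __call__(self, path_info):
        # No tree, no _cp_config merging: just a flat lookup.
        # Unmatched paths leave handler as None (nothing is called).
        self.request.handler = self.table.get(path_info)


req = Request()
d = DictDispatcher(req, {'/hello': lambda: 'Hello world!'})
d('/hello')
```

Since the dispatch table is flat, none of the tree-walking or config-merging machinery is needed.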
I probably waited too long, but today I upgraded both CherryPy (3.0alpha/trunk) and Dejavu (1.5alpha/trunk) to Python 2.5. The moves were surprisingly easy:
There were three changes in all.
[P.S. I've noticed CP 3 is about 3% slower in 2.5 than 2.4, even with the zombie frames and other optimizations. Hmmm.]
Amazingly, even though Dejavu makes extensive use of bytecode hacks, there was only one real change! The "logic" module needed an upgrade to the "comparison" function, which produces an Expression via the types.CodeType() constructor. Apparently, function args are no longer included in co_names, and co_consts no longer includes a leading None value (except when there are cell references?). Finally, co_flags for "normal" functions now includes CO_NESTED by default. These changes also forced some parallel upgrades to the test suite.
While fixing the above, however, I noticed a long-standing bug in Dejavu's LambdaDecompiler. Python 2.4 used ROT_THREE before a STORE_SUBSCR, and this worked in Dejavu; but Python 2.5 uses ROT_TWO before STORE_SUBSCR, which showed me I had the stack-popping backwards in both functions. Bah. Fixed now.
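You can inspect the opcode sequence in question on your own interpreter with the dis module (the exact rotation opcode varies by Python version, and very recent interpreters drop the ROT_* family entirely):

```python
import dis


def store(d):
    # A subscript assignment compiles down to a STORE_SUBSCR,
    # preceded by a stack-rotation opcode on older interpreters.
    d['key'] = 'value'


# Print the opcode stream; look for STORE_SUBSCR and whatever
# rotation opcode (if any) your interpreter emits before it.
dis.dis(store)
```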
Both packages needed a good bit of work changing some relative import statements into absolute ones. Not really hard, just boring.
Thanks to the Python core devs for a very smooth transition!
cote (whose blog title is "People Over Process"!?) wrote:
There's a certain point, to be a cynical coder, where people just show up at meetings for face-time: to show that they're involved. I'm not saying that these people don't have valuable work that could be done. Instead, their perception is that showing up at a meeting is the prime channel to prove that they're doing that valuable work, and to do that work.
The perception is there for a reason. Face to face time, whether in meetings or the hallway or lunch, builds trust among humans. Lack of face time breaks down trust. Employ workarounds for this truth at your peril.
Currently (rev 1193), a typical CherryPy request has a standard execution path, and a standard time to complete it:
0.008 _cpwsgi.py:51(_wsgi_callable)
0.001 _cpwsgi.py:36(translate_headers)
0.001 _cpengine.py:131(request)
0.001 _cprequest.py:623(Response.__init__)
0.006 _cprequest.py:116(run)
0.000 _cprequest.py:230(process_request_line)
0.001 _cprequest.py:265(process_headers)
0.003 _cprequest.py:189(respond)
0.001 _cprequest.py:294(get_resource)
0.001 _cprequest.py:415(Dispatcher.__call__)
0.001 _cprequest.py:432(find_handler)
0.001 _cprequest.py:326(tool_up)
0.001 _cprequest.py:644(finalize)
0.001 cherrypy\__init__.py:96(log_access)
0.001 logging\__init__.py:1015(log)
0.001 logging\__init__.py:1055(_log)
0.001 cherrypy\__init__.py:51(__getattr__)
0.001 :0(getattr)
That is, _cpwsgi._wsgi_callable() takes about 8 msec (on my box using the builtin timer). That number breaks down into 1 msec for translate_headers(), 1 msec for _cpengine.request(), and 6 msec for Request.run(). Etcetera. These are all of the calls which take 1 msec or more to complete.
It looks like moving to Python's builtin logging for the access log has added 1 msec to Request.run(). I think that's reasonable; we lose a millisecond but gain syslog and rotating log options.
Somebody please explain to me why _cpwsgi.translate_headers takes a millisecond to change 20 strings from "HTTP_HEADER_NAME" to "Header-Name". I've tried lots of rewritings of that to no avail; moving from "yield" to returning a list did nothing, nor did inlining it into _wsgi_callable.
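For reference, the conversion being discussed looks roughly like this (a model of the transformation, not CherryPy's exact code):

```python
def translate_headers(environ):
    """Turn CGI-style 'HTTP_USER_AGENT' environ keys into
    HTTP-style 'User-Agent' header names."""
    for cgi_name, value in environ.items():
        if cgi_name.startswith('HTTP_'):
            # Drop the HTTP_ prefix, then Title-Case each word.
            name = '-'.join(part.capitalize()
                            for part in cgi_name[5:].split('_'))
            yield name, value
```

Per the numbers above, even this handful of string operations, run twenty times per request, shows up at the 1-msec resolution of the profiler.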
I tried making the default Dispatcher cache the results from find_handler; that is, cache[(app, path_info)] = func, vpath, request.config. I couldn't see any speedup on cache hits.
The next-to-last line above is interesting: 0.001 cherrypy\__init__.py:51(__getattr__) shows 1 msec being used for cherrypy.request and cherrypy.response. I've already done a lot of work to minimize this by looking them up once and binding to a local (for example, request = cherrypy.request) and then looking up further attributes using the local name. But perhaps there's more to be done.
The last line above shows 1 msec being used to call the builtin getattr()
function. Seems we have a very object-oriented style.
I'll keep looking for ways to get any of those 0.001's to read 0.000. Perhaps now that I've moved profiling to WSGI middleware, I can aggregate times and work with numbers that have a little more precision.
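The profiling-as-middleware idea can be sketched in a few lines; this is a simplified model (not CherryPy's actual profiler) that aggregates per-request wall time, so many cheap requests can be averaged into numbers finer than a single 1-msec reading:

```python
import time


def timed(app, samples):
    """Wrap a WSGI app, appending each request's wall time to samples."""
    def wrapper(environ, start_response):
        start = time.time()
        try:
            # Note: for a streaming body you'd also want to time
            # iteration over the returned iterable, not just this call.
            return app(environ, start_response)
        finally:
            samples.append(time.time() - start)
    return wrapper
```

Summing or averaging the samples list after a few thousand requests gives per-request times with far more useful precision than one noisy measurement.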
Ted Neward has written a good discussion of Object-Relational Mapper concerns. I'd like to react to, and associate, a couple of points he makes, seemingly unrelatedly:
...we typically end up with one of Query-By-Example (QBE), Query-By-API (QBA), or Query-By-Language (QBL) approaches.
A QBE approach states that you fill out an object template of the type of object you're looking for...
a "Query-By-API" approach, in which queries are constructed by Query objects...
a "Query-By-Language" approach, in which a new language, similar to SQL but "better" somehow, is written...
and
Several possible solutions present themselves...
5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework...bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects"...[such as] direct integration into traditional O-O languages, such as the LINQ project from Microsoft...
I propose (and implemented in Dejavu) a fourth approach, distinct from QBE, QBA, and QBL. Rather than build a DSL on top of a programming language (as QBL does), use the language itself. Rather than change the programming language by introducing relational syntax (as LINQ does), use the language itself. In Dejavu, you write plain old Python functions which take an object and return True or False. Iterate over a collection of objects and it works as a filter. Pass it to the storage backend and it is translated into SQL for you. Most commonly, you pass it to the library and it does both for you: it iterates over its in-memory cache of objects and merges in new objects queried from storage. Let's call it... "query". It's not "by" anything. It has an infinitesimal learning curve. LINQ is, in essence, shoehorning higher-order functions into its various target languages in a very limited domain. Why not use a programming language that has real HOFs?
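The idea can be shown in a few lines. This is a sketch of the concept, not Dejavu's actual API; the Invoice class and overdue predicate are made up for illustration:

```python
class Invoice(object):
    def __init__(self, total, paid):
        self.total = total
        self.paid = paid


# The "query" is just a plain Python function...
def overdue(inv):
    return inv.total > 100 and not inv.paid


invoices = [Invoice(50, False), Invoice(500, False), Invoice(500, True)]

# ...so in memory it works as an ordinary filter:
hits = [inv for inv in invoices if overdue(inv)]

# A storage backend, meanwhile, could decompile the very same
# function and emit something like:
#   SELECT * FROM invoice WHERE total > 100 AND NOT paid
```

The same callable serves both roles: no template objects, no query-builder API, no second language.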
I've decided it's easier to just ban blogspot.com comments. Sorry if that includes you.
I'm not surprised at all by this finding. But I secretly hope you are.
Inspired by James Bennett, here's a little treatise on how CherryPy processes a request. A couple of differences, though. First, Django is a "full-stack" web framework, with an ORM, built-in templating, etcetera, whereas CherryPy focuses on HTTP. Second, I'll be showing the process for CherryPy 2.2 (the current stable branch), but I'll try to point out along the way where CherryPy 3 (now in alpha) differs.
Something must actually sit on a listening socket and receive requests from HTTP clients. CherryPy provides an HTTP server (_cpwsgiserver.py), or you can use Apache, lighttpd, or others.
The Web Server Gateway Interface spec came into being to connect various HTTP servers to various web frameworks (and gateways and middleware and...). If you want to use it to connect an HTTP server with CherryPy, feel free. CherryPy provides a "WSGI application callable" in _cpwsgi.py. Otherwise, you need a specific adapter at this stage to connect the two.
Whether or not you use WSGI for the Bridge, it calls Engine.request(), which creates the all-important objects cherrypy.request and cherrypy.response, returning the former. The Bridge then calls request.run(), passing it the incoming message stream.
Several steps occur here to convert the incoming stream to more usable data structures, pass the request to the appropriate user code, and then convert outbound data. In between the standard processing steps, users can define extra code to be run via filters (CP 2.2) or hooks (CP 3). Here's how CherryPy 2 does it:
1. Request.processRequestLine() analyzes the first line of the request, turning "GET /path/to/resource?key=val HTTP/1.1" into a request method, path, query string, and version.
2. on_start_resource filters are run.
3. Request.processHeaders() turns the incoming HTTP request headers into a dictionary, and separates Cookie information.
4. before_request_body filters are run.
5. Request.processBody() turns the incoming HTTP request body into a dictionary if possible; otherwise, it's passed onward as a file-like object.
6. before_main filters are run.
7. The page handler for the requested URL is looked up.
8. The page handler is called with the request parameters, and returns the response body.
9. before_finalize filters are run.
10. Response.finalize() checks for HTTP correctness of the response, and transforms user-friendly data structures into HTTP-server-friendly structures.
11. on_end_resource filters are run.
CherryPy 3 performs the same steps as above, but in the order: 1, 3, 7, 2, 4, 5, 6, 8, 9, 10, 11. That is, it determines which bit of user code will respond to the request much earlier in the process. This also means that internal redirects can "start over" much earlier. In addition, CP 3 can collect configuration data once (at the same time that it looks up the page handler); CP 2 recollected config data every time it was used.
As mentioned (steps 7 and 8, above), CherryPy users write "page handlers": functions which receive the request parameters as arguments, and return the response body. CherryPy makes clever use of threadlocals, so all other data a developer needs is available in the global cherrypy.request and cherrypy.response objects (the parameters are as well, but it's awfully convenient to receive them as arguments to the page handler, and to return the body rather than setting it).
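The threadlocal trick itself is plain Python; here's a minimal sketch (names are illustrative, not CherryPy's internals) showing how every worker thread can read the same global name yet see only its own request:

```python
import threading

# Each thread gets its own independent attribute namespace.
_serving = threading.local()


def set_request(req):
    """Bind the current thread's request."""
    _serving.request = req


def current_request():
    """Fetch whatever this thread bound, regardless of other threads."""
    return _serving.request
```

This is why a module-level cherrypy.request can be safe even with many concurrent server threads.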
The URL is mapped to a page handler by traversing a tree of such handlers, so that the handler for "/a/b/c" is most likely root.a.b.c(). I say "most likely" because you can also define index() handlers and default() handlers.
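The traversal can be modeled in a few lines of plain Python. This is a simplified sketch of tree dispatch, not CherryPy's actual code (which also checks exposure, handles config, etc.):

```python
def find_handler(root, path):
    """Walk attributes for each path segment; fall back to index()
    for directory-style paths and default() for unmatched tails."""
    node = root
    segments = [s for s in path.split('/') if s]
    for i, seg in enumerate(segments):
        child = getattr(node, seg, None)
        if child is None:
            # default() receives the unmatched segments as arguments.
            return getattr(node, 'default', None), segments[i:]
        node = child
    if callable(node):
        return node, []
    # Path ended on a container object: try its index() handler.
    return getattr(node, 'index', None), []


class B(object):
    def c(self):
        return 'c'
    def index(self):
        return 'index'
    def default(self, *vpath):
        return 'default:' + '/'.join(vpath)


class A(object):
    b = B()


class Root(object):
    a = A()
```

So "/a/b/c" reaches root.a.b.c(), "/a/b/" falls through to index(), and "/a/b/x/y" lands in default() with the leftover segments.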
When the call to Request.run() returns, the Bridge uses the Response attributes status, header_list, and body to construct the outbound stream, and passes it to the HTTP server that made the request. CherryPy works hard to support both buffered and streaming output, so the body may be a generator object that is only iterated over at this point.
The page handler, or any of the filters/hooks, can decide that the response is complete and that processing should be stopped. Most often, this is accomplished by raising an HTTPRedirect (3xx) exception or an HTTPError (4xx or 5xx; NotFound (404) is so common it has its own subclass). Unanticipated errors are automatically converted into HTTPError(500). Users have some facility for modifying the actual error output with additional error filters/hooks.
That's it!