The correct answer is: "nobody knows". But here are some ideas I've been kicking around the ol' cranium lately...
[09:32] *** now talking in #cherrypy
[10:22] <Lawouach> where to start
[10:22] <Lawouach> what's your basic idea toward 3.0?
[10:22] <@fumanchu> oh, I have so many ;)
[10:22] <Lawouach> lol
[10:22] <Lawouach> say big general ones :)
[10:22] <Lawouach> not details per se
[10:23] <@fumanchu> 1) make CP have a kick-butt,
non-CP-specific toolkit (lib/httptools), that is SO
good that Quixote, Django, et al can't *help* but
decide to use it instead of their own server processes
[10:24] <@fumanchu> even if they don't like the way CP
maps handlers to URL's, for example
[10:24] <@fumanchu> they should be able to build a
server with the behavior they like out of lib/httptools
[10:25] <Lawouach> we want to be lib that rule them all :)
[10:25] <@fumanchu> yup
[10:26] <Lawouach> i agree as long as we don't become a
framework on our own, but i already know it's not what
you intend :)
[10:26] <@fumanchu> right
[10:26] <@fumanchu> it's an anti-framework approach
[10:26] <@fumanchu> we make writing-a-web-framework
into a weekend's work
[10:27] <@fumanchu> take some from column A; try all of column B
[10:27] <Lawouach> do you want to stay very low-level
(aka HTTP wrapper level) or make it a bit higher level
and provide functions such as the best_match() we were
talking about last week?
[10:27] <@fumanchu> best_match would be fine as long
as it doesn't depend upon cherrypy
[10:28] <Lawouach> right, this was a bad example
[10:28] <Lawouach> but basically where httptools should stop?
[10:28] <@fumanchu> I think that can be open-ended
[10:28] <Lawouach> i think we should keep the level
you've been doing till now
[10:29] <@fumanchu> 2) then, by pulling a ton of code
out of _cphttptools (putting it in lib/httptools instead),
I want to see if we can get the Request and Response
objects down to a tiny size
[10:34] <@fumanchu> the trunk version of _cphttptools
is already 60% of its 2.1 size
[10:35] <Lawouach> right. hmmm
[10:37] <@fumanchu> and a *lot* of what's left is very OO
[10:38] <@fumanchu> so, one idea I'm toying with: allow
developers to use their own subclasses of Request
and Response
[10:40] <@fumanchu> if we make it super-easy to use custom
Request subclasses, then they will want to start
overriding Request.run
[10:40] <@fumanchu> take out the filter logic, and
Request.run becomes:
    def _run(self, requestLine, headers, rfile):
        self.headers = list(headers)
        self.headerMap = httptools.HeaderMap()
        self.simpleCookie = Cookie.SimpleCookie()
        self.rfile = rfile
        self.processRequestLine(requestLine)
        try:
            self.processHeaders()
            self.processBody()
            self.main()
            cherrypy.response.finalize()
        except cherrypy.RequestHandled:
            pass
        except (cherrypy.HTTPRedirect, cherrypy.HTTPError), inst:
            inst.set_response()
            cherrypy.response.finalize()
[10:40] <Lawouach> regarding the subclassing of request
and response, i know that it could interest very
much the guys behind itools
[10:40] <@fumanchu> yes
[10:40] <@fumanchu> and Ben Bangert (routes)
[10:41] <@fumanchu> anyway, if Request.run is *that* simple,
then who needs filters?
[10:41] <@fumanchu> just code them procedurally into your
Request.run method
[10:43] <@fumanchu> looking over the filters that are built in...
[10:44] <@fumanchu> I think that half could be done just as
easily as lib/httptools functions
[10:44] <@fumanchu> and half could be "always on"
[10:44] <@fumanchu> (if we continue to improve them, like
encodingfilter, to meet the HTP spec)
[10:44] <@fumanchu> HTTP
[10:44] <Lawouach> that's my white cheap :) (i don't think this
expression exists so i make it up!)
[10:45] <Lawouach> i really want CP to be HTTP conditionally compliant
at least :)
[10:45] <Lawouach> and maybe in CP 4.0 to be unconditionally compliant!
[10:45] <Lawouach> :p
[10:45] <@fumanchu> I completely agree
[10:46] <@fumanchu> anyway, I want to stress that I'm still playing
with these ideas
[10:46] <@fumanchu> nothing's set in stone
[10:47] <Lawouach> since you've been proposing them a while back,
i've been a great fan of them
[10:47] <@fumanchu> and trying to implement them will turn up
lots of problems, I'm sure
[10:47] <@fumanchu> oh, well thanks
[10:47] <Lawouach> that's why i don't have so many different
things to bring for cp 3.0
[10:51] <@fumanchu> one of the nice things about these ideas
for 3.0 is that the bulk of the work can be done within
the 2.x branch
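The "just code them procedurally into your Request.run method" idea from the chat can be sketched roughly as follows. This is only an illustration under stated assumptions, not CherryPy's actual API: `BaseRequest`, its `main()` and `finalize()` hooks, and the `GzipRequest` subclass are hypothetical stand-ins, and the gzip step merely plays the role a filter would otherwise play. It is written for a current Python rather than the Python 2 of the chat.

```python
import gzip
import io


class BaseRequest(object):
    """Hypothetical stand-in for a framework Request; not CherryPy's class."""

    def __init__(self, headers, handler):
        # Lower-case the header names so lookups are case-insensitive.
        self.headers = dict((k.lower(), v) for k, v in headers)
        self.handler = handler
        self.response_headers = {}
        self.body = b""

    def main(self):
        # Call the page handler and keep its output as the response body.
        self.body = self.handler()

    def finalize(self):
        self.response_headers["Content-Length"] = str(len(self.body))

    def run(self):
        self.main()
        self.finalize()
        return self.response_headers, self.body


class GzipRequest(BaseRequest):
    """Subclass that compresses inline instead of registering a filter."""

    def run(self):
        self.main()
        # What a gzip filter would have done, as a plain procedural step.
        # Naive check: a real implementation would parse q-values.
        if "gzip" in self.headers.get("accept-encoding", ""):
            buf = io.BytesIO()
            with gzip.GzipFile(fileobj=buf, mode="wb") as zfile:
                zfile.write(self.body)
            self.body = buf.getvalue()
            self.response_headers["Content-Encoding"] = "gzip"
        self.finalize()
        return self.response_headers, self.body


def hello():
    # Hypothetical page handler used only for this illustration.
    return b"Hello, world! " * 50
```

Calling `GzipRequest([("Accept-Encoding", "gzip")], hello).run()` returns the headers and a compressed body; with no Accept-Encoding header the body passes through untouched.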
Dear lazyweb,
After 6 hours, I am utterly stumped. I've got an application built with a popular Python web application server, via mod_python, and keep seeing data bleed from one request to the next. That is, if I:
OK
The requested URL /jjj.css was not found on this server.
Apache/2.0.55 (Win32) mod_ssl/2.0.55 OpenSSL/0.9.8a mod_python/3.2.2b Python/2.4.2 mod_auth_sspi/1.0.2 Server at skipper.amorhq.net Port 443
HTTP/1.1 404 Not Found
Date: Tue, 22 Nov 2005 01:57:37 GMT
Server: Apache/2.0.55 (Win32) mod_ssl/2.0.55 OpenSSL/0.9.8a mod_python/3.2.2b Python/2.4.2 mod_auth_sspi/1.0.2
Content-Length: 371
Keep-Alive: timeout=15, max=94
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

Not Found
The requested URL /mmm.css was not found on this server.
Apache/2.0.55 (Win32) mod_ssl/2.0.55 OpenSSL/0.9.8a mod_python/3.2.2b Python/2.4.2 mod_auth_sspi/1.0.2 Server at skipper.amorhq.net Port 443
The body of the response to request #2 is present in the response to request #3, and so are response #3's own headers, sitting in the middle of the body! Frightening.
This happens reliably with both Firefox and IE. It happens whether I use HTTPS or not. It happens whether I use authentication or not. It happens when I strip the modpython gateway-for-WSGI I wrote down to 80 lines.
It stops happening when I use CherryPy's builtin WSGI server, so I don't think any part of CP is to blame, which leaves a bug in mod_python or Apache2. I'm particularly inclined to blame them because, although CherryPy and Apache itself log both the missing responses as 404, Ethereal shows me that the actual third response, as received by the client, has a 200 response code!
So I'm stumped. Any solutions, pointers, or flights of debugging fantasy accepted.
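One cheap way to watch for this kind of bleed while testing is to scan each captured response body for anything that looks like an HTTP status line. The following is a minimal sketch, assuming the bodies have already been captured as text; `embedded_status_codes` is a hypothetical helper name, not part of any library.

```python
import re

# Matches text that looks like an HTTP/1.x status line, e.g. "HTTP/1.1 404".
STATUS_LINE = re.compile(r"HTTP/1\.[01] (\d{3})\b")


def embedded_status_codes(body):
    """Return the status codes of any status lines found inside a body.

    A well-formed HTML body should normally yield an empty list; a body
    that contains another response's headers will not.
    """
    return [int(match.group(1)) for match in STATUS_LINE.finditer(body)]
```

Run against the capture above, this reports the stray 404 status line in the body; a clean "Not Found" page reports nothing.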
James Robertson (among others) has been following the Sony rootkit fiasco, and comments:
Some of the management meetings at Sony must have been utterly fascinating over the last few days, as they slowly worked their way around to doing the right thing.
I can't help but wonder how my own company's management would respond to a similar challenge. My guess is that we would have a similar set of reactions. That is, we would choose the following reactions, in order:
I imagine that each step was instituted by a progressively-more-senior level of management. It's hard to imagine a company with more than 5 employees doing things any differently; there are simply too many such challenges (and too many "if's")—a company which discussed, implemented, and guaranteed a full fix for all of them would quickly smother itself in bureaucracy and second-guessing. In other words, my hunch is that Sony's error was probably systemic (the result of being a large company) and not moral.
Perhaps some issues, like Sony's rootkit issue, should side-step the above sequence and jump their response straight to 4th gear. I'd be interested to hear anyone's logic for deciding which issues need that and which ones don't.
These blogs have been getting hammered by some unknown process; Apache and MySQL start taking up all my RAM. It's mostly "unknown" because I can't be bothered to fix it at the moment—too much else going on.
Update Nov 6, 2005: Finally got it to work with Apache2-prefork on Unix (it only worked on mpm_winnt until now).
Update Oct 25, 2005: I was having a problem setting up a new install of my CherryPy application, using this recipe. It turned out that I didn't have the right interpreter_name in my PythonImport directive:
PythonImport module interpreter_name
Therefore, the CherryPy server started in a different interpreter than the one being used for the requests. The interpreter_name must exactly match the value of req.interpreter, and is case-sensitive. I've updated the code with comments to that effect (just to have it all in one place).
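A quick way to catch this mismatch is to compare the two names from inside a handler. Here is a small sketch; `interpreter_mismatch` and `FakeRequest` are hypothetical names for illustration, and the only thing assumed about the real request object is the `req.interpreter` attribute mentioned above.

```python
def interpreter_mismatch(req, expected_name):
    """Return a warning string if the names differ, else None.

    The comparison is exact and case-sensitive, matching the behaviour
    described for PythonImport's interpreter_name.
    """
    actual = req.interpreter
    if actual != expected_name:
        return ("PythonImport interpreter_name %r does not match "
                "req.interpreter %r" % (expected_name, actual))
    return None


class FakeRequest(object):
    """Stand-in for a mod_python request, for trying this outside Apache."""

    def __init__(self, interpreter):
        self.interpreter = interpreter
```

Inside a real handler the returned message could be passed to `req.log_error`, which the wrapper below already uses for its error stream.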
Update Aug 11, 2005: I was having a problem serving .css and .js pages. CherryPy's standalone WSGI server did fine, but mod_python did not. I finally tracked it down to the fact that I was both setting apache's req.status and returning req.status from the handler. Funky. It worked when I chose to simply return the value, and not set it.
Update June 5, 2005:
Pages take forever to terminate when returning a status of 200--apache.OK must be returned instead in that case.
Added code for hotshot profiling.
Added code for using paste.lint
Last week I mentioned I was writing a more complete WSGI wrapper for mod_python. Here it is. Feedback welcome. Phil Eby told me he'd like a mod_python wrapper for inclusion in wsgiref; he should feel free to use this one however he sees fit.
"""
WSGI wrapper for mod_python. Requires Python 2.2 or greater.

Example httpd.conf section for a CherryPy app called "mcontrol":

<Directory D:\htdocs\mcontrol>
    SetHandler python-program
    PythonHandler wsgiref.modpy_wrapper::handler
    PythonOption application cherrypy.wsgiapp::wsgiApp
    PythonOption import mcontrol.cherry::startup
</Directory>
"""

import sys

from mod_python import apache
from wsgiref.handlers import BaseCGIHandler

class InputWrapper(object):

    def __init__(self, req):
        self.req = req

    def close(self):
        pass

    def read(self, size=-1):
        return self.req.read(size)

    def readline(self):
        return self.req.readline()

    def readlines(self, hint=-1):
        return self.req.readlines(hint)

    def __iter__(self):
        line = self.readline()
        while line:
            yield line
            # Notice this won't prefetch the next line; it only
            # gets called if the generator is resumed.
            line = self.readline()

class ErrorWrapper(object):

    def __init__(self, req):
        self.req = req

    def flush(self):
        pass

    def write(self, msg):
        self.req.log_error(msg)

    def writelines(self, seq):
        self.write(''.join(seq))

bad_value = ("You must provide a PythonOption '%s', either 'on' or 'off', "
             "when running a version of mod_python < 3.1")


class Handler(BaseCGIHandler):

    def __init__(self, req):
        options = req.get_options()

        # Threading and forking
        try:
            q = apache.mpm_query
        except AttributeError:
            threaded = options.get('multithread', '').lower()
            if threaded == 'on':
                threaded = True
            elif threaded == 'off':
                threaded = False
            else:
                raise ValueError(bad_value % "multithread")

            forked = options.get('multiprocess', '').lower()
            if forked == 'on':
                forked = True
            elif forked == 'off':
                forked = False
            else:
                raise ValueError(bad_value % "multiprocess")
        else:
            threaded = q(apache.AP_MPMQ_IS_THREADED)
            forked = q(apache.AP_MPMQ_IS_FORKED)

        env = dict(apache.build_cgi_env(req))

        if req.headers_in.has_key("authorization"):
            env["HTTP_AUTHORIZATION"] = req.headers_in["authorization"]

        BaseCGIHandler.__init__(self,
                                stdin=InputWrapper(req),
                                stdout=None,
                                stderr=ErrorWrapper(req),
                                environ=env,
                                multiprocess=forked,
                                multithread=threaded
                                )
        self.request = req
        self._write = req.write

    def _flush(self):
        pass

    def send_headers(self):
        self.cleanup_headers()
        self.headers_sent = True

        # Can't just return 200 or the page will hang until timeout
        s = int(self.status[:3])
        if s == 200:
            self.finalstatus = apache.OK
        else:
            self.finalstatus = s

        # the headers.Headers class doesn't have an iteritems method...
        for key, val in self.headers.items():
            if key.lower() == 'content-length':
                if val is not None:
                    self.request.set_content_length(int(val))
            elif key.lower() == 'content-type':
                self.request.content_type = val
            else:
                self.request.headers_out[key] = val

_counter = 0

def profile(req):
    # Call this function instead of handler
    # to get profiling data for each call.
    import hotshot, os.path

    ppath = os.path.dirname(__file__)
    if not os.path.exists(ppath):
        os.makedirs(ppath)

    global _counter
    _counter += 1
    ppath = os.path.join(ppath, "cp_%s.prof" % _counter)

    prof = hotshot.Profile(ppath)
    result = prof.runcall(handler, req)
    prof.close()
    return result

def handler(req):
    config = req.get_config()
    debug = int(config.get("PythonDebug", 0))

    options = req.get_options()

    # Because PythonImport cannot be specified per Directory or Location,
    # take any 'import' PythonOption's and import them. If a function name
    # in that module is provided (after the "::"), it will be called with
    # the request as an argument. The module and function, if any, should
    # be re-entrant (i.e., handle multiple threads), and, since they will
    # be called per request, must be designed to run setup code only on the
    # first request (a global 'first_request' flag is usually enough).
    import_opt = options.get('import')
    if import_opt:
        atoms = import_opt.split('::', 1)
        modname = atoms.pop(0)
        module = __import__(modname, globals(), locals(), [''])
        if atoms:
            func = getattr(module, atoms[0])
            func(req)

    # Import the wsgi 'application' callable and pass it to Handler.run
    modname, objname = options['application'].split('::', 1)
    module = __import__(modname, globals(), locals(), [''])
    app = getattr(module, objname)

    h = Handler(req)
    ## from paste import lint
    ## app = lint.middleware(app)
    h.run(app)

    # finalstatus was set in Handler.send_headers()
    return h.finalstatus
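
For anyone trying the wrapper without CherryPy, here is a minimal sketch of the two things the `application` and `import` PythonOptions point at: a WSGI callable, and a startup function that tolerates being called on every request. The module name `hello_wsgi` and the names `startup`, `application`, and `_first_request` are assumptions for illustration only. It is written for a current Python; the `with` statement is not available on the Python 2.4 used above, where a try/finally around `acquire()` and `release()` would do the same job.

```python
import threading

# Hypothetical module, e.g. saved as hello_wsgi.py and configured with:
#   PythonOption application hello_wsgi::application
#   PythonOption import hello_wsgi::startup

_startup_lock = threading.Lock()
_first_request = True


def startup(req):
    """Run one-time setup; safe to call on every request from any thread.

    Returns True only for the call that actually performed the setup.
    """
    global _first_request
    with _startup_lock:
        if _first_request:
            # One-time setup (loading config, opening connections, ...)
            # would go here, while the lock is still held.
            _first_request = False
            return True
    return False


def application(environ, start_response):
    """Minimal WSGI application returning a plain-text body."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Holding the lock while the setup runs means a second thread cannot start serving before the first has finished initializing, which is the situation the comment in `handler` warns about.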