08/11/05

Permalink 01:57:26 pm, by fumanchu Email , 45 words   English (US)
Categories: CherryPy

New CherryPy Planet

Link: http://www.defuze.org/oss/cherryplanet/

There's a new Planet in the OSS solar system, for posts related to CherryPy! CherryPy is "a pythonic, object-oriented web-development framework", which also happens to be fast, WSGI-ready, and easily extendable. Check out version 2.1, now in beta; you won't be disappointed!

Permalink 10:34:19 am, by fumanchu Email , 1132 words   English (US)
Categories: Python, Dejavu

Where Dejavu fits in the ORM cosmos

Florent Guillaume has written a good survey of his personal ORM options for Zope3. I thought I'd take the opportunity to discuss Dejavu, my own pure-Python ORM, in relation to his analysis.

SQLObject is a pure python mapper, and it is used in Zope 3 through sqlos. It provides declarative mapping for your classes ... any instance ... will actually be stored in SQL behind your back. A unique id is generated for each [object] and also stored in SQL. There are various facilities to provide relations between tables, and map them to lists in the python objects.

SQLStorage is an Archetypes storage that uses SQL as a backend. You can write an Archetypes schema ... SQLStorage relies on the Archetypes UIDs to uniquely identify objects. A Relation field can be used to have relations to other objects. Note that Archetypes objects stored through SQLStorage still have a presence in the ZODB, which means it's not a solution if you totally want to get rid of Data.fs bloat.

...both SQLObject and Archetypes require you to specify in your code that you will use SQL for some objects. Things are not "transparent" from the programmer's point of view.

Now that I haven't been actively hacking on Dejavu for a few months, I've had the opportunity to sit back and think more clearly about what I like in it. One of the things I like most is that the decision about what storage system to use is not made by the application programmer; instead, it's made by the deployer(s) of an app. Therefore:

  • Programmers can remain blissfully unaware of SQL. They should still understand something about storage in the abstract—things like indexing, size hints, and relations—but the Dejavu API completely replaces SQL, custom file drivers, caching mechanisms, and any other storage APIs, transparently.
  • Deployers can make their own decisions about storage mechanisms, on a per-class level. Even if those decisions are religious or political in nature. ;) But if they're not...
  • Deployers can test their dataset using various storage systems to find out which is fastest (or against other metrics).
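As a rough illustration of that split between programmer and deployer, here is a minimal sketch in present-day Python 3. Every name in it (MemoryStore, StorageRouter, assign) is made up for the example and is not Dejavu's API; the only point is that the application classes never mention where they are stored.

```python
# Hypothetical sketch: none of these names come from Dejavu itself.

class MemoryStore(object):
    """Toy storage backend that keeps saved objects in a dict."""

    def __init__(self):
        self.rows = {}

    def save(self, unit_id, data):
        self.rows[unit_id] = dict(data)

    def load(self, unit_id):
        return dict(self.rows[unit_id])


class StorageRouter(object):
    """Routes each class to whichever store the deployer configured."""

    def __init__(self, default_store):
        self.default_store = default_store
        self.stores = {}

    def assign(self, class_name, store):
        self.stores[class_name] = store

    def store_for(self, obj):
        return self.stores.get(type(obj).__name__, self.default_store)

    def save(self, unit_id, obj):
        self.store_for(obj).save(unit_id, vars(obj))


# Application code: no mention of any storage system.
class Knight(object):
    def __init__(self, name):
        self.name = name


class Castle(object):
    def __init__(self, name):
        self.name = name


# Deployment-time decisions, made per class and kept out of the classes above.
default = MemoryStore()
archive = MemoryStore()
router = StorageRouter(default)
router.assign("Castle", archive)

router.save(1, Knight("Galahad"))
router.save(1, Castle("Anthrax"))
```

Swapping `archive` for a different backend changes one line of deployment code and nothing in Knight or Castle.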

Objects that you want to persist do have to follow a pretty complicated interface standard, and by far the easiest way to guarantee that is by subclassing dejavu.Unit. So Dejavu does "require you to specify in your code that you will [persist] some objects". I'm not sure whether Florent was defining "transparent" in terms of declaring "SQL" specifically, or declaring "persistence" in general.

Ape (Adaptable Persistence Engine) is a framework to do object-relational mapping at a lower level than the above two solutions, because it works at the ZODB Storage level...the downside is that the structure of the tables in the SQL database is chosen by the framework.

Dejavu doesn't do that. Object properties (and table names) are declared in Python code, just like SQLObject or SQLStorage (although more readably, IMO). But Dejavu doesn't add any properties which you don't explicitly specify; only the "ID" property is present by default. Dejavu does expose a "create_storage" method for turning your in-Python schema into empty database tables (for use when your database isn't already populated).

However Ape's default SQL mapper already tries hard to provide data mapping in a natural way; for instance all properties are made available in a natural manner, object titles or containment relationship are also naturally expressed.

If I understand that paragraph correctly, it's saying that the persistence mechanism doesn't conflict with normal Pythonic code. This is a great strength of Dejavu: objects are still objects, and their properties are gotten and set in a natural way, with reasonably-transparent coercion if necessary:

>>> class Knight(Unit):
...     Name = UnitProperty(unicode)
...
>>> galahad = Knight()
>>> galahad.Name = "Galahad the Chaste"
>>> galahad.Name
u'Galahad the Chaste'
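For readers wondering how that kind of coercion can work, here is a minimal sketch using Python's descriptor protocol. It is written for present-day Python 3 (it relies on `__set_name__`), and CoercedProperty is a hypothetical stand-in, not Dejavu's UnitProperty implementation.

```python
class CoercedProperty(object):
    """Descriptor that coerces assigned values to a declared type."""

    def __init__(self, type_):
        self.type_ = type_
        self.name = None

    def __set_name__(self, owner, name):
        # Called by Python when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # Coerce anything that is not already of the declared type.
        if value is not None and not isinstance(value, self.type_):
            value = self.type_(value)
        instance.__dict__[self.name] = value


class Knight(object):
    Name = CoercedProperty(str)
    Quests = CoercedProperty(int)


galahad = Knight()
galahad.Name = "Galahad the Chaste"
galahad.Quests = "3"   # stored as the int 3
```

The object still looks like a plain object to its callers; the type declaration lives on the class, and the conversion happens on assignment.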

Hornet is an SQL bridge (alpha software for now) that also works at the ZODB layer. In contrast to Ape, it is much more geared toward existing datasets, or regular SQL access to tables. It requires you to define schemas in code too, but once this is done object access is totally transparent, as in Ape.

Dejavu sounds closer to Hornet than to Ape; one of the original reasons I wrote Dejavu was to get transparent access to a third-party database over which I had no schema control. You need to define the schema in code, but there's no reason that couldn't be automated for most storage systems (databases are the easy part ;) ).

To me Hornet and Ape are promising because I believe they integrate at the right level. Ape is better because it doesn't require explicit schema declaration, and that's very important when you have flexible objects where the users add new fields on document instances (which happens all the time in CPS using FlexibleTypeInformation-based content objects).

There was a time in the development of Dejavu that I wanted to support not only manual schema discovery, but actually discovery-on-the-fly at runtime. I think I only dropped it because I didn't have an explicit need for it; my hunch is that the framework could still support it easily. One of the best things about Dejavu is that it's only being used in production today at a couple of sites; at this stage of development, a good coder could easily step in and hack it into what they want it to be. :)

...I want to store blobs in the filesystem. This can be done at the application level by various products such as CPS DiskFileField, Archetypes ExternalStorage, chrism's Blob product, and others. It can also be done transparently at the ZODB storage level using Ape, and that's a much simpler way to do it.

That could be done with Dejavu with a tiny amount of work; subclass an existing StorageManager and special-case the BLOB fields. You could probably even use one of the above products to do it within Dejavu.
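A minimal sketch of that idea, in present-day Python 3: subclass an existing storage class and divert byte-string values to files. DictStorageManager and BlobOnDiskStorageManager are hypothetical names and do not reflect Dejavu's real StorageManager interface.

```python
import os
import tempfile


class DictStorageManager(object):
    """Stand-in for an existing storage manager (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def save(self, unit_id, data):
        self.rows[unit_id] = dict(data)

    def load(self, unit_id):
        return dict(self.rows[unit_id])


class BlobOnDiskStorageManager(DictStorageManager):
    """Writes bytes values to files and keeps only their paths in the row."""

    def __init__(self, blob_dir):
        DictStorageManager.__init__(self)
        self.blob_dir = blob_dir

    def save(self, unit_id, data):
        row = {}
        for key, value in data.items():
            if isinstance(value, bytes):
                path = os.path.join(self.blob_dir,
                                    "%s_%s.blob" % (unit_id, key))
                with open(path, "wb") as f:
                    f.write(value)
                row[key] = ("blob", path)
            else:
                row[key] = value
        DictStorageManager.save(self, unit_id, row)

    def load(self, unit_id):
        row = DictStorageManager.load(self, unit_id)
        for key, value in row.items():
            # Replace each blob marker with the file's contents.
            if isinstance(value, tuple) and value[:1] == ("blob",):
                with open(value[1], "rb") as f:
                    row[key] = f.read()
        return row


store = BlobOnDiskStorageManager(tempfile.mkdtemp())
store.save(7, {"Title": "Grail map", "Scan": b"\x89PNG..."})
```

Callers of `load` get the bytes back as if they had been stored inline; only the subclass knows about the filesystem.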

...I plan on using the most flexible (and underused) framework available, which is Ape. I'll write various classifiers, and mappers so that a typical CMF or CPS site can be mapped naturally to SQL. I'll also replace the catalog by an implementation that does SQL queries. This will not be a simple endeavour, but I feel this is the only way to get what I truly need in the end, without sacrifying any flexibility.

Wow, that's a lot of work. Care to leverage the 1000 man-hours I put into Dejavu instead?

If Ape proves to hard to work with (because it imposes its own framework of mapping), I'll go the Hornet way of writing a storage directly, with flexible enough policies for the mapping to SQL or the filesystem (or LDAP for that matter).

I wouldn't mind having a Dejavu StorageManager for LDAP... ;)

07/30/05

Permalink 01:58:46 pm, by fumanchu Email , 1199 words   English (US)
Categories: CherryPy, WSGI

Plugin madness

Glyph is talking about a complete refactoring of divmod, and mentions:

At every point in implementing this system we have known whether to fuse a component together because we'd built unnecessary additional complexity into previous systems, and where to use a plug-in architecture because we'd needed to inject ugly code into the middle of a monolithic routine.

As a result, where our architecture was heavily monolithic before, now it is almost entirely composed of plugins. It is so plugin-happy, in fact, that there is a database with Service plugins in it, which activate when the database is started from twistd; it contains its own configuration, including port-numbers, so nothing need live in a text configuration file.

Plugins are great because they facilitate customization: you can make small changes in system behavior with small changes in your code. An architecture that is "monolithic", to use Glyph's term, is one where small changes in system behavior require large changes in your code.

CherryPy 2.1 has a system of Filters (both built-in and user-provided), which act as plugins. As each HTTP request is processed, there are a few fixed points where the Request processor searches for registered Filter methods and gives up control to them. The Filter then must either return control to the Request processor, or raise a control-flow exception, like NotFound, RequestHandled, or HTTPRedirect. Here's the guts of the Request processor (the run method of the Request class) itself:

def run(self):
    """Process the Request."""
    try:
        try:
            applyFilters('onStartResource')

            try:
                self.processRequestHeaders()

                applyFilters('beforeRequestBody')
                if cherrypy.request.processRequestBody:
                    self.processRequestBody()

                applyFilters('beforeMain')
                if cherrypy.response.body is None:
                    main()

                applyFilters('beforeFinalize')
                finalize()
            except cherrypy.RequestHandled:
                pass
            except cherrypy.HTTPRedirect, inst:
                # For an HTTPRedirect, we don't go through the regular
                # mechanism: we return the redirect immediately
                inst.set_response()
                finalize()
        finally:
            applyFilters('onEndResource')
    except cherrypy.NotFound:
        cherrypy.response.status = 404
        handleError(sys.exc_info())
    except:
        handleError(sys.exc_info())

It's decent; that is, it's fairly clean and understandable IMO. But it's quite limited in two very important ways:

  1. There are only 7 points at which customization is possible [beforeErrorResponse and afterErrorResponse are found inside handleError]. Any filters which require the core to release control at multiple points have to do some fancy dancing to coordinate state between those applyFilters calls. Any filters which require additional control points are out of luck—their only recourse is to handle the remainder of the process themselves (probably with generous cut-and-paste from the CP core) and raise RequestHandled.

  2. Certain core processes are locked in time. For example, processRequestHeaders gets the "path" from the first line of the HTTP request; however, the Filters themselves are supposed to be dependent upon the current path! Therefore, for example, onStartResource filters must always be global ("/").

  3. (Slightly unrelated: the builtin "request" filters get run before any user-defined filters, and vice-versa for "response" filters. This needs a fix).
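To make the fixed-control-point design concrete, here is a stripped-down sketch of hook dispatch in present-day Python 3. It borrows the hook names from the run method above, but apply_filters, the state dict, and both filter classes are inventions for the example, not CherryPy code.

```python
class RequestHandled(Exception):
    """Raised by a filter that has produced the whole response itself."""


class GreetingFilter(object):
    def beforeMain(self, state):
        state["log"].append("greeting.beforeMain")

    def onEndResource(self, state):
        state["log"].append("greeting.onEndResource")


class CachedFilter(object):
    def beforeMain(self, state):
        state["body"] = "cached page"
        raise RequestHandled()


def apply_filters(filters, method_name, state):
    """Call method_name on every filter that defines it, in order."""
    for f in filters:
        method = getattr(f, method_name, None)
        if method is not None:
            method(state)


def run(filters, handler):
    """Process one request, releasing control only at the named hooks."""
    state = {"body": None, "log": []}
    try:
        try:
            apply_filters(filters, "onStartResource", state)
            apply_filters(filters, "beforeMain", state)
            if state["body"] is None:
                state["body"] = handler()
            apply_filters(filters, "beforeFinalize", state)
        except RequestHandled:
            pass
    finally:
        apply_filters(filters, "onEndResource", state)
    return state
```

A filter that needs to act anywhere other than one of these named points has no way in, which is the first limitation in the list above.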

I tried to ameliorate some of these issues in the short term a) by making a Request class (which 2.0 didn't have); that might become subclassable someday, and b) by keeping a lot of logic out of the Request class, placing it instead in module-level global functions (which people can then call as needed).

A couple of weeks ago, I joked that maybe main (which calls the user's page handler) and finalize (which cleans up the response status, headers, and body) should themselves become filters, and it wasn't entirely a joke. When I look at CherryPy as-is, I see not one, but three separate API's.

The first, simplest one is for application developers, and includes:

  1. The cherrypy.root hierarchy.
  2. Page handler methods, including the spec for index and default methods, and the "exposed" attribute.
  3. Passing CGI params (etc) to page handler methods as keyword args.
  4. Expecting response content to be passed back to the core via "return" or "yield" statements.
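The four items above can be sketched without CherryPy at all. The dispatch function below is a simplified imitation written in present-day Python 3, not CherryPy's actual lookup code; it only shows an object tree, the "exposed" attribute, index and default methods, and params passed as keyword args.

```python
class Root(object):
    def index(self):
        return "home"
    index.exposed = True

    def greet(self, name="world"):
        return "hello, %s" % name
    greet.exposed = True

    def secret(self):
        # Not marked exposed, so it must not be reachable by URL.
        return "not reachable"


class Blog(object):
    def index(self):
        return "blog index"
    index.exposed = True

    def default(self, *path, **params):
        return "post: " + "/".join(path)
    default.exposed = True


root = Root()
root.blog = Blog()


def dispatch(root, path, params=None):
    """Map a URL path to an exposed handler (simplified imitation)."""
    params = params or {}
    parts = [p for p in path.split("/") if p]
    node = root
    fallback = None  # deepest exposed "default" handler seen, plus leftovers
    for i, part in enumerate(parts):
        default = getattr(node, "default", None)
        if getattr(default, "exposed", False):
            fallback = (default, parts[i:])
        node = getattr(node, part, None)
        if node is None:
            break
    if node is not None:
        handler = node if callable(node) else getattr(node, "index", None)
        if getattr(handler, "exposed", False):
            return handler(**params)
    if fallback is not None:
        handler, rest = fallback
        return handler(*rest, **params)
    raise LookupError(path)
```

For example, `dispatch(root, "/greet", {"name": "Bob"})` calls `root.greet(name="Bob")`, while `/blog/2005/08` falls through to Blog.default with the leftover path segments.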

The second API is quite different. It assumes a much higher level of competency, and is both more powerful and more complicated. I see it as mostly useful for those writing frameworks or libraries on top of CherryPy, although many "normal" app developers will end up using some of this interface. It includes:

  1. Filters. Creating, organizing, maintaining.
  2. Changing status, headers, or body via cherrypy.response.
  3. For that matter, almost anything involving cherrypy.request or .response: cookies and sessions, path inspection, HTTP-method dispatching, etc.

Finally, there's a third "API", which CherryPy supports quite well, but in a different fashion. There are a number of people who will run into, say, the filter limitations I outlined above, and will customize their copy of CherryPy to do what they want. One of my design goals has always been to make this easier by making the core insanely simple. There has been a lot of work done to keep the various components isolated, preferring a data-driven approach, centered around the cherrypy.request and .response objects, mostly; that is, your filter or site-specific customization can do whatever it likes, as long as it sets valid response.status, .headers, and .body before returning control to the HTTP server.

I don't see anything fundamentally wrong with having "different API's". It would be nice for some items from the "middle layer" to become more easily-accessible to the "shallow layer"; a lot of that can be done with clever wrapper classes, customized for specific situations. But I think it's quite all right to have a separation between a clean, simple, limited interface and a more powerful, but more complicated, interface behind that, to be used when needed.

However, I'd like to see the "lowest layer" unstick itself a bit more yet. That Request.run method, in particular, is far too frozen at the moment—I'd like to see it become the default processor, with an easy way to override some or all of it. I think that would free up CherryPy developers to better manage the current web-application space, which continues to change quickly as new ideas and technologies roll in. [Turning on my World Domination Mode for a moment, it might also allow a small, focused CherryPy core to become the backplane for several of the existing Python web frameworks, especially as more of them begin to support WSGI.]
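One way to picture "the default processor, with an easy way to override some or all of it" is a class whose stages are ordinary methods. This is only a sketch in present-day Python 3 under that assumption; the method names are hypothetical and it is not a proposal for CherryPy's real Request class.

```python
class Request(object):
    """Default processor: each stage is a method a subclass may replace."""

    def __init__(self, handler):
        self.handler = handler
        self.status = None
        self.headers = {}
        self.body = None

    def run(self):
        try:
            self.process_headers()
            self.main()
            self.finalize()
        except Exception:
            self.handle_error()
        return self.status, self.headers, self.body

    def process_headers(self):
        pass

    def main(self):
        if self.body is None:
            self.body = self.handler()

    def finalize(self):
        self.status = self.status or "200 OK"
        self.headers["Content-Length"] = str(len(self.body))

    def handle_error(self):
        self.status = "500 Internal Server Error"
        self.body = "error"
        self.headers["Content-Length"] = str(len(self.body))


class UppercaseRequest(Request):
    """Site-specific processor that overrides a single stage."""

    def main(self):
        Request.main(self)
        self.body = self.body.upper()
```

A site that needs an extra control point overrides `run` itself; one that only needs to tweak a stage overrides that stage and leaves the rest alone.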

Part of the reason there is a Filter specification at all is to shield CherryPy application deployers; they now have an interface for plugging in various components, that is simpler than, say, subclassing Request. For example, a deployer can decide to gzip their HTTP responses with a single line in their config file. In addition, the developer of that app need not be aware that this is being done. It's a "freebie" from his or her point of view.

But what if one could write one's own Request.run in such a way that that became easier than using config files? If that were possible, almost all of the overhead of the Filter architecture could be removed completely. In addition, developers and deployers could share total control over the request process, rather than the limited, dare-I-say "clunky" process we have now with filters. I think that with CherryPy 2.1, we're halfway there, and it won't take much work to make it happen in an elegant and powerful fashion.

07/21/05

Permalink 12:00:11 pm, by admin Email , 517 words   English (US)
Categories: IT, Python

Are we on the downhill side yet?

Ryan Tomayko hits one out of the park with his post, Motherhood and Apple Pie. It is the best summary I have read of the state of affairs in software development today, and the competing directions which Sanity and the Vendors are taking.

I found myself thinking, however, that I've been programming professionally now for, what, six years? And it was only with my rather recent move to Python that I really started to dig into "protocols and formats such as HTTP, URIs, MIME, HTML, and even XML (sometimes), and architectures such as REST", or fully understand (and use!) "MVC, ORM, frameworks, test- and domain-driven development". And, not to toot my own horn in any way, but I'm a pretty smart guy; what I mean is, I'm not your average programmer. My guess is that the "average programmers", and their managers, follow the vendors rather blindly because:

  1. they haven't been introduced to some of these more abstract concepts yet, either because they're inexperienced, or are operating on too small a scale, or
  2. they're not smart enough to "get it" when they are introduced to such things.

Even the best programmers spend some time in category 1, as they take time to learn each new concept. I certainly have, and continue to do so. But my hunch is that we've seen a large number of professional programmers in category 2 due to the dot-com glut. Programming, and especially software design, takes a certain set of traits: a good memory, a knack for system organization, the ability to focus, to get into the "flow". But the blitz of advertising in the nineties to "get a high-paying job in computers" has resulted in a majority of programmers who don't have the personality to reach for something better. Quite the opposite, in fact—in my experience, there are some personalities (and lifestyles) that favor having solutions pushed to you, rather than being researched and selected ("pulled") by a more informed process. It's a mistake to believe that only the latter involves reason and judgment; it's simply a different set of factors steering that judgment. But it seems software design is one of those industries which benefits from more people pulling, and less pushing/being pushed to.

My hope is that we are on the downhill side of that glut: witness the recent slide in Computer Science degrees (and careers). The "throw programmers at the problem/product" strategy worked well during the dot-com boom, but doesn't last during the leaner years. I think we'll start to see a return to Sanity, on average, which will result in a swing back toward better design and tools. IMO the current buzz around LAMP stacks, Ruby on Rails, "less is more", DSL's, etc are indicative of that. But I admit I may be blinded by my own learning process, and am projecting what I learn into "what the industry is learning".

Hmmm.

07/10/05

Permalink 01:03:08 pm, by fumanchu Email , 91 words   English (US)
Categories: IT, Python, CherryPy

lesscode.org

Link: http://lesscode.org/

I've really been enjoying Ryan Tomayko's new lesscode.org site. I've been on the simplicity bandwagon for about a year now, which coincides nicely with my Python learning curve ;). Check out lesscode if you're tired of overengineering.

Oh, and check out CherryPy if you're tired of overengineered web frameworks. I've worked pretty hard to make the upcoming 2.1 release as simple as possible, but no simpler.

06/04/05

Permalink 11:07:46 pm, by fumanchu Email , 127 words   English (US)
Categories: Python, CherryPy, WSGI

CherryPy WSGI is up and running

Update: 1) "lydon" is Oliver Graf. Thanks, Oliver! 2) the tests all pass now.

"lydon" contributed a recipe for using FastCGI with CherryPy's new WSGI interface. Thanks! (I notice he or she also used my brand new recipe for the Virtual Path Filter—nice to know when someone likes your little side projects. ;)

Peter Hunt has contributed a very nice WSGI server, as well. See the latest SVN trunk for the current version (here's a link to the Timeline.) There's still a bug in the WSGI server; the test suite isn't completing, because the server isn't shutting down when it should. I'm trying to track that down and fix it tonight.

05/26/05

Permalink 03:31:02 pm, by fumanchu Email , 542 words   English (US)
Categories: Python, WSGI

WSGI gateway for ASP (Microsoft IIS)

Update: I forgot to address buffering.

As I mentioned, I threw together a WSGI wrapper (gateway) for ASP. Here it is. Feedback welcome.

It blanks out SCRIPT_NAME to behave more like Apache. It also handles URL-rewriting, since that's pretty much the only sane way to use ASP with WSGI (or any framework, for that matter).


"""
WSGI wrapper for ASP.


Example Global.asa for a CherryPy app called "mcontrol":

<script language=Python runat=Server> 
def Application_OnStart():
    Application.Contents("multiprocess") = False
    Application.Contents("multithread") = True
    from mcontrol import chpy
</script>


Example handler.asp:

<%@Language=Python%>
<%
from wsgiref.asp_gateway import handler
from cherrypy.wsgiapp import wsgiApp

handler(Application, Request, Response).run(wsgiApp)
%>

"""

import sys
from wsgiref.handlers import BaseCGIHandler


class ASPInputWrapper(object):
    
    def __init__(self, Request):
        self.stream = Request.BinaryRead
        size = Request.ServerVariables('CONTENT_LENGTH')
        self.remainder = self.size = int(size)
    
    def read(self, size=-1):
        if size < 0:
            size = self.remainder
        content, size = self.stream(size)
        self.remainder -= size
        return content
    
    def readline(self):
        output = []
        while True:
            # Use an internal buffer instead? Still have to check for \n
            char = self.read(1)
            if not char:
                break
            output.append(char)
            if char in ('\n', '\r'):
                break
        return ''.join(output)
    
    def readlines(self, hint=-1):
        lines = []
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
        return lines
    
    def __iter__(self):
        line = self.readline()
        while line:
            yield line
            # Notice this won't prefetch the next line; it only
            # gets called if the generator is resumed.
            line = self.readline()


class handler(BaseCGIHandler):
    
    def __init__(self, Application, Request, Response, buffering=True):
        # If you set buffering to False, you must not "Enable Buffering" in
        # the current Virtual Directory, NOR in any of its parent containers
        # (directory, site, or server). IIS 5 and 6 buffer by default.
        # See http://support.microsoft.com/default.aspx?scid=kb;en-us;Q306805
        # and http://www.aspfaq.com/show.asp?id=2262
        Response.Buffer = buffering
        
        env = {}
        for name in Request.ServerVariables:
            try:
                # names and values are both probably unicode. coerce them.
                env[str(name)] = str(Request.ServerVariables(name))
            except UnicodeEncodeError, x:
                # There's a potential problem lurking here, since some ASP
                # server var's which are required by WSGI may be high ASCII.
                x.args += ((u"Server Variable '%s'" % name),)
                raise x
        
        # Pass real booleans: str(False) is "False", which is truthy.
        multiprocess = bool(Application.Contents("multiprocess"))
        multithread = bool(Application.Contents("multithread"))
        
        # You will probably need *some* form of rewriter to use ASP
        # with WSGI, since ASP requires one physical .asp file
        # per requestable-URL; so far, we support one:
        
        # Handle URL rewriting done by ISAPI_Rewrite Lite.
        # http://www.isapirewrite.com/
        # Note that PATH_TRANSLATED is also rewritten, but we
        # don't make any provision for unmunging that.
        old_path = env.get("HTTP_X_REWRITE_URL", None)
        if old_path:
            # Tear off any params.
            env["PATH_INFO"] = old_path.split("?")[0]
        
        # ASP puts the same values in SCRIPT_NAME and PATH_INFO,
        # for some odd reason. Empty one of them.
        env["SCRIPT_NAME"] = ""
        
        BaseCGIHandler.__init__(self,
                                stdin=ASPInputWrapper(Request),
                                stdout=None,
                                stderr=sys.stderr,
                                environ=env,
                                multiprocess=multiprocess,
                                multithread=multithread
                                )
        
        self.Response = Response
        self._write = Response.Write
    
    def _flush(self):
        self.Response.Flush()
    
    def send_headers(self):
        self.cleanup_headers()
        self.headers_sent = True
        for key, val in self.headers.items():
            self.Response.AddHeader(key, val)

05/21/05

Permalink 12:02:07 pm, by fumanchu Email , 191 words   English (US)
Categories: Python, CherryPy, WSGI

CherryPy going WSGI...conservatively

As I announced, I've got a new core with both a wsgiapp and the original native httpserver. They both pass all tests.

[Remi] I don't have any problem with WSGI being the preferred interface. But in that case, we need to package with CherryPy a standalone WSGI HTTP server so that people can still run their sites without any external dependencies if they want to (and the server should support thread-pooling).

I'm about to write a WSGI server for CP, so that we can run the test suite against the wsgi handler without any dependency on wsgiref. I can either:

  1. Write a new WSGI server specifically for CherryPy, or
  2. Rewrite the existing native httpserver to have a WSGI interface.

If I did (2), then CP would no longer have a non-WSGI server. I think I'll avoid that for now--it would be better to grow the new interface, test it thoroughly through at least a minor version or two, and cut the old one later if we find nobody is using it.
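For anyone who hasn't looked at the interface yet, here is about the smallest WSGI application there is, plus a helper that drives it in-process the way a test suite might. It is written for present-day Python 3, where wsgiref ships in the standard library; app, call_app, and serve are names invented for the example and have nothing to do with the CherryPy server being planned.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny WSGI application."""
    text = "You asked for %s" % environ.get("PATH_INFO", "/")
    body = text.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(application, path):
    """Drive a WSGI app in-process, with no socket involved."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}
    written = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        # WSGI requires start_response to return a write callable.
        return written.append

    chunks = list(application(environ, start_response))
    return captured["status"], captured["headers"], b"".join(written + chunks)


def serve(application, port=8080):
    """Serve over HTTP with the reference server (blocks until interrupted)."""
    make_server("localhost", port, application).serve_forever()
```

Being able to call the application directly, as call_app does, is what lets a test suite run against the WSGI handler without depending on any particular server.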

I should have a patch ready by Monday. :)

05/19/05

Permalink 02:23:49 pm, by fumanchu Email , 117 words   English (US)
Categories: Python, CherryPy, WSGI

Abstracting the CherryPy webserver

CherryPy had a great IRC meeting today. I had several issues I wanted to discuss going into it—not only were they discussed, but I didn't even have to bring them up! :)

One important outcome was: I'm not the only one who wants the included HTTPServer to be less-strongly coupled to the rest of the framework. So I'll be working on that during the rest of the week; I told everyone I'd have a draft ready by Monday. I'll be asking lots of questions on cherrypy-devel, I'm sure. ;)

My biggest hope is that I can now eliminate the "write callable" hack which I just placed in the new wsgiapp module.

05/18/05

Permalink 02:13:21 pm, by fumanchu Email , 45 words   English (US)
Categories: IT, Python

Content-type: text/x-json

Update: it really should be "text/x-json", not "text/json".

Funny that I haven't seen anyone recommend this before. It sure would make XMLHttpRequest parsing a snap, on both ends.

I think I'll try using that in my CherryPy JSON library.
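A minimal sketch of what serving that content type could look like, using the json module from present-day Python 3's standard library (which did not exist when this was posted). json_response is a hypothetical helper, not part of the CherryPy JSON library mentioned above.

```python
import json


def json_response(data):
    """Serialize data and pair it with headers naming the JSON content type."""
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    headers = [("Content-Type", "text/x-json"),
               ("Content-Length", str(len(body)))]
    return headers, body


headers, body = json_response({"knight": "Galahad", "quests": 3})
```

A client that sees the Content-Type header knows it can hand the body straight to a JSON parser instead of guessing from the payload.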
