Category: Dejavu



Permalink 11:03:42 pm, by fumanchu Email , 225 words   English (US)
Categories: Dejavu

Single Table Inheritance certainly *sounds* evil

Jeff Shell (who needs to turn on comments) wrote:

...By my understanding of Single Table Inheritance, the flight leg, ground leg, and scheduled meeting legs data would all be mashed up in one table. If I were designing tables in an RDBMS, I would never design that way – and I’m no genius Relational designer.

I guess, after thinking about it, I would write the three legs as separate tables/objects and write one legs action to combine them (in Rails)?

That's how I'd do it in Dejavu (three different tables). But Dejavu allows you to recall objects from those three tables either individually or collectively, without having to write your own "combine action". If you recall the subclass, you get just that subclass. If you recall the superclass, you get objects from all subclasses together in the same result set (you also get objects from the superclass, although quite often it's abstract and there aren't any).
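Here's a rough, self-contained sketch of that recall behaviour, using a toy in-memory store rather than Dejavu itself. The names Leg, FlightLeg, GroundLeg, MeetingLeg, ToyStore, and all_subclasses are made up for illustration, and the code is written for Python 3; it only shows the idea of one table per class with a superclass recall that gathers every subclass table.

```python
# Toy in-memory stand-in for the recall behaviour described above.
# All names here are hypothetical; this is not Dejavu's API.

class Leg:
    """Common superclass; often abstract in practice."""
    def __init__(self, name):
        self.name = name

class FlightLeg(Leg):
    pass

class GroundLeg(Leg):
    pass

class MeetingLeg(Leg):
    pass


def all_subclasses(cls):
    """Yield cls and every class derived from it, depth-first."""
    yield cls
    for sub in cls.__subclasses__():
        yield from all_subclasses(sub)


class ToyStore:
    """Keeps one 'table' (a plain list) per concrete class."""
    def __init__(self):
        self.tables = {}

    def memorize(self, unit):
        self.tables.setdefault(type(unit), []).append(unit)

    def recall(self, cls):
        """Return units of cls plus units of all of its subclasses."""
        found = []
        for klass in all_subclasses(cls):
            found.extend(self.tables.get(klass, []))
        return found


store = ToyStore()
store.memorize(FlightLeg("SAN-MEX"))
store.memorize(GroundLeg("airport shuttle"))
store.memorize(MeetingLeg("site briefing"))

flights = store.recall(FlightLeg)   # only the FlightLeg table
every_leg = store.recall(Leg)       # all three tables, one result list
```

The three kinds of leg never share a table, yet asking for Leg still returns them together.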

I've never worked with Rails' inheritance, but I have been horrified to see mashup tables in plenty of databases. You know the ones: three columns common to all records, 28 columns that only apply to 50% of the rows, and 34 additional columns that only apply to the other 50%. Pick larger column-counts and smaller percentages if you're into mental masochism. ;)


Permalink 12:30:17 am, by admin Email , 256 words   English (US)
Categories: Dejavu

Dejavu 1.4.0 (Python ORM) release

Dejavu 1.4.0

I'm extremely pleased to announce the release of Dejavu 1.4.0, a pure-Python Object-Relational Mapping library. Dejavu allows you to create, query, and manage persistent data using your existing knowledge of Python programming.


  • Data queries are expressed using pure Python; no SQL, no operator hacks, and no need to wrap code in strings.
  • Data may be transparently stored in PostgreSQL, MySQL, SQLite, Access, or SQL Server databases, as well as in flat files (using shelve), and caching proxies. You can create and combine custom storage systems for your own integration and performance needs.
  • Easy associations between Unit classes.
  • Full thread-safety for reliable use in web applications and other concurrent environments.
  • Views, sorting, cross-tabs, and other analysis tools.
  • Unit Collections, plus Engines and Rules to form powerful end-user query and reporting interfaces.

What's new in 1.4

  • Full LEFT, RIGHT, and INNER JOIN support (okay, one operator hack here).
  • Optimized and introspectable To-One and To-Many associations.
  • Arbitrary names for Unit.ID's (primary keys).
  • Support for multiple ID's (primary keys).
  • A new Schema class and other upgrade-management tools.
  • Default values for UnitProperties.
  • New logging hooks to help debug SQL and other storage issues.
  • Fixes to support Python 2.4 bytecode and other changes.
  • Inheritance support (all subclasses are recalled).
  • Vastly-improved test suite.
  • New recur module included, with a threaded Worker class.
  • Better support for update triggers.

Dejavu is in the Public Domain, and you may use it anywhere with no obligation whatsoever.

User documentation and a full Trac site are available at:


Permalink 11:10:35 pm, by fumanchu Email , 213 words   English (US)
Categories: Python, General, Dejavu, CherryPy

We're hiring, by the way

The job posting is pretty tame: we need a Python web developer. But I thought I'd add my personal point-of-view, and say that we really mean "developer" and not just "coder". You'd be responsible for producing working web apps, but that involves a lot of design work and architectural decision-making.

You would also be expected to contribute to the CherryPy HTTP framework and to Dejavu (my Python ORM), since I'm a core dev on both those projects and use them heavily already. In other words, if you have or want exposure to the full stack of modern web development challenges, this is the job for you. You'll be a full member of an IT team of 3 serving an energetic staff of 50.

You'll also get something that's hard to find in most programming jobs: warm fuzzies. We build homes for the poor in Mexico, simultaneously "building" the church in Mexico, the U.S., Canada, and elsewhere. We are not on the cutting-edge of world missions--we are defining that edge. If you've been thinking about "doing more for Jesus", but would rather write code than dig ditches in Uganda, give us a call (619-662-1200 ext 11).


Permalink 11:37:08 pm, by fumanchu Email , 226 words   English (US)
Categories: Dejavu

Dejavu 1.4 now in beta

After more than a year since 1.3 was released, I'm just about ready to officially release Dejavu 1.4! In addition to bugfixes, there are some major new features:

  • Sandbox.recall now returns a list (use xrecall to get an iterator).
  • Associations are now aware of whether they are to-one or to-many.
  • logic.Expressions can now take multiple positional arguments (so you can test multiple Units at once).
  • Improved multirecall, including full support for INNER and OUTER JOINs for all Storage Managers. Since the signatures for recall and multirecall now align, the "multirecall" name has been dropped; just call Sandbox.recall(classes, expr) whether you're querying a single class or multiple ones.
  • Units may now have arbitrary identifiers (primary keys).
  • Unit Properties have a new "default" attribute.
  • Simple inheritance is now supported; recalling one class will also recall its subclasses.
  • New Sandbox "magic recaller" methods, like inv = box.Invoice(13).
  • New Sandbox.view method, to retrieve persisted data without creating full Units.
  • A new Schema class to help manage changes to your model, and helper methods to sync database schemas.
  • New logging support.
  • A new test runner.
  • Python 2.4 fixes for codewalk, the test suite, and fixedpoint.

As you can see, a year's worth of work. ;) Feel free to kick the tires on all the new stuff. I should bless a release candidate in early January.


Permalink 02:08:17 pm, by fumanchu Email , 338 words   English (US)
Categories: Dejavu

Dejavu is adding schema versioning

I just dumped a first crack at a Schema class on the trunk. Test code is here (search for 'schema'), docs are here. I haven't written anything like this before, so if anyone has recommendations or warnings about the direction it's heading, now is the time to speak up (before 1.4 is officially released ;) )!

Basic design: there's a dejavu.Schema class which your app can subclass. Whenever you need to change the underlying database (or other persistence mechanism) schema of your app, you write a new upgrade_to_X method, where X is an incrementing version number. Each such method contains the commands which will upgrade an installation from (X - 1) to X.

At runtime, you call MySchema.upgrade(), and each deployment will run any upgrade_to_X methods that it hasn't yet run, in order. The "currently deployed version" number is stored in a magic DeployedVersion Unit.

The upgrade_to_X methods can choose to stay database-neutral and just use the (new) arena.add_property, drop_property, and rename_property methods. But because each Schema is application-specific, you can also write optimized instructions for your known StorageManagers. For example, say you need to change an int property to a string. The "database-neutral" way would be to have additional Arena methods for such tasks. Some of those methods may be added in the future, but nothing's stopping you now from writing non-portable SQL statements if you know your app is only deployed on, say, Postgres (but you should probably assert that before you execute the SQL statements).
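To make the upgrade_to_X idea concrete, here is a minimal sketch in plain Python 3. ToySchema, its deployed_version attribute, and the log list are hypothetical stand-ins (the real class stores the version in a DeployedVersion Unit, and its internals may differ); the sketch also assumes version numbers are contiguous, starting at 1.

```python
# Hypothetical sketch of the upgrade_to_X pattern; not Dejavu's Schema class.

class ToySchema:
    PREFIX = "upgrade_to_"

    def __init__(self):
        # Stand-in for the persisted "currently deployed version" record.
        self.deployed_version = 0
        self.log = []

    def latest_version(self):
        """Highest X for which an upgrade_to_X method exists."""
        versions = [
            int(name[len(self.PREFIX):])
            for name in dir(self)
            if name.startswith(self.PREFIX)
            and name[len(self.PREFIX):].isdigit()
        ]
        return max(versions, default=0)

    def upgrade(self):
        """Run each pending upgrade step once, in ascending order."""
        first = self.deployed_version + 1
        for version in range(first, self.latest_version() + 1):
            getattr(self, "%s%d" % (self.PREFIX, version))()
            self.deployed_version = version


class MySchema(ToySchema):
    def upgrade_to_1(self):
        self.log.append("add property Invoice.Notes")

    def upgrade_to_2(self):
        self.log.append("rename property Invoice.Notes to Invoice.Memo")


schema = MySchema()
schema.upgrade()   # runs steps 1 and 2
schema.upgrade()   # nothing pending, so nothing runs
```

Because the deployed version is recorded after each step, a deployment that is already at version 1 would only run upgrade_to_2.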

Anyway, I'd be interested to hear from anyone else who has written database-versioning tools. Save me from a pitfall if you can. :) Have fun with the new Schema class and let's see if there are a couple of other common methods (like add_column) that should go into the Arena and the StorageManagers.


Permalink 03:48:36 pm, by fumanchu Email , 232 words   English (US)
Categories: Dejavu

Dejavu has a new home

Dejavu, my pure-Python ORM, has a new Trac home. As always, it's open to the public to download, use, or develop. It could use another contributor or two, and the snazzy new Trac front-end should help with that immensely.

Just to remind you about the parts of Dejavu I think are cool:

  1. It completely hides the storage layer, so your code need never know you're using a database. It supports MS Access, MS SQL Server, PostgreSQL, MySQL, SQLite, and shelve, all transparently.
  2. Simple queries use simple syntax: Panda = sandbox.unit(zoo.Animal, Species='Ailuropoda', Name='Mei'). More powerful queries use the logic.Expression object and pure Python (no code strings!):
    red_filter = logic.Expression(lambda x: x.color == 'red')
    RedAnimals = sandbox.recall(zoo.Animal, red_filter)
  3. Dejavu has been designed from the ground up to be used in a multithreaded environment.
  4. New database adapters are easy to develop using the `db` module. The PostgreSQL module, for example, is 100 lines of code.
  5. Model-layer Units are clean and easy, using modern, built-in Python types. They are also extendable to custom property classes and custom types:
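    import datetime
    import fixedpoint
    from dejavu import Unit, UnitProperty

    # TripGroupProperty and NotifyProperty are custom property classes
    # defined elsewhere in the application.
    class MissionTrip(Unit):
        """A Mission Trip experience."""
        DirectoryID = TripGroupProperty(int, index=True)
        # The type for the two date properties is assumed to be datetime.date.
        FirstDate = NotifyProperty(datetime.date)
        GroupName = UnitProperty()
        LastDate = NotifyProperty(datetime.date)
        AmountDue = UnitProperty(fixedpoint.FixedPoint)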
  6. Persistent query-engines and analysis tools are included.


Permalink 11:01:16 am, by fumanchu Email , 239 words   English (US)
Categories: Python, Dejavu, CherryPy

It doesn't take much of a Python to swallow my brain

Lines of code in the four systems I hack on most often (and of which I have a more-or-less complete grasp):

>>> import LOC
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\cherrypy")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\dejavu")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\endue")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\mcontrol")

Something about my brain must naturally fit 7500-to-10000-line chunks of Python code. I certainly experience a strong drive to keep these systems from becoming more complicated, which I usually express via aggressive refactoring.

Some other packages (which I don't hack on) for comparison:

>>> LOC.LOC(r"C:\Python23\Lib\site-packages\colorstudy\SQLObject")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\paste")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\PIL")
>>> LOC.LOC(r"C:\Python23\Lib\site-packages\twisted")
>>> LOC.LOC(r"D:\download\Zope-2.8.1-final\Zope-2.8.1-final")

I think the sheer size of paste, twisted, and zope has actively kept me from wanting to dig into them further (but it's certainly not the only factor). Irrational, perhaps, but a natural human response to information overload.

Here's the LOC script if anyone wants to compare packages:

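import os, codecs, re

def LOC(root, pattern=r'^.*\.py$'):
    """Return the total line count of files under root whose names match pattern."""
    LOCs = []
    pattern = re.compile(pattern)
    for path, dirs, files in os.walk(root):
        for f in files:
            if pattern.match(f):
                # os.walk already yields paths that include root.
                mod = os.path.join(path, f)
                # Reading via codecs.open is assumed from the imports above.
                fh = codecs.open(mod, "rb")
                try:
                    LOCs.append(len(fh.readlines()))
                finally:
                    fh.close()
    return sum(LOCs)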


Permalink 10:34:19 am, by fumanchu Email , 1132 words   English (US)
Categories: Python, Dejavu

Where Dejavu fits in the ORM cosmos

Florent Guillaume has written a good survey of his personal ORM options for Zope3. I thought I'd take the opportunity to discuss Dejavu, my own pure-Python ORM, in relation to his analysis.

SQLObject is a pure python mapper, and it is used in Zope 3 through sqlos. It provides declarative mapping for your classes ... any instance ... will actually be stored in SQL behind your back. A unique id is generated for each [object] and also stored in SQL. There are various facilities to provide relations between tables, and map them to lists in the python objects.

SQLStorage is an Archetypes storage that uses SQL as a backend. You can write an Archetypes schema ... SQLStorage relies on the Archetypes UIDs to uniquely identify objects. A Relation field can be used to have relations to other objects. Note that Archetypes objects stored through SQLStorage still have a presence in the ZODB, which means it's not a solution if you totally want to get rid of Data.fs bloat.

...both SQLObject and Archetypes require you to specify in your code that you will use SQL for some objects. Things are not "transparent" from the programmer's point of view.

Now that I haven't been actively hacking on Dejavu for a few months, I've had the opportunity to sit back and think more clearly about what I like in it. One of the things I like most is that the decision about what storage system to use is not made by the application programmer; instead, it's made by the deployer(s) of an app. Therefore:

  • Programmers can remain blissfully unaware of SQL. They should still understand something about storage in the abstract—things like indexing, size hints, and relations—but the Dejavu API completely replaces SQL, custom file drivers, caching mechanisms, and any other storage APIs, transparently.
  • Deployers can make their own decisions about storage mechanisms, on a per-class level. Even if those decisions are religious or political in nature. ;) But if they're not...
  • Deployers can test their dataset using various storage systems to find out which is fastest (or against other metrics).
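As a rough illustration of that split between programmer and deployer, the sketch below routes each class to whichever backend a deployment-time mapping names. MemoryStore, Router, Invoice, and AuditEntry are all hypothetical, the backends are just in-memory lists, and none of this is Dejavu's actual API (Python 3).

```python
# Hypothetical sketch: application code talks to one router object, and a
# deployer-supplied mapping decides which backend holds each class.

class MemoryStore:
    """Trivial backend; a real deployment might use a database or shelve."""
    def __init__(self, label):
        self.label = label
        self.rows = []

    def save(self, obj):
        self.rows.append(obj)

    def load_all(self):
        return list(self.rows)


class Router:
    """Dispatches save/load calls to the backend configured for a class."""
    def __init__(self, default, per_class=None):
        self.default = default
        self.per_class = dict(per_class or {})

    def backend_for(self, cls):
        return self.per_class.get(cls, self.default)

    def save(self, obj):
        self.backend_for(type(obj)).save(obj)

    def load_all(self, cls):
        return self.backend_for(cls).load_all()


class Invoice:
    def __init__(self, number):
        self.number = number

class AuditEntry:
    def __init__(self, text):
        self.text = text


# Deployment-time decision, kept out of the application code below.
fast = MemoryStore("cache")
durable = MemoryStore("database")
router = Router(default=durable, per_class={AuditEntry: fast})

# Application code: no mention of which backend is in use.
router.save(Invoice(13))
router.save(AuditEntry("invoice 13 created"))
```

Swapping the per_class mapping is enough to try the same dataset against a different backend; the application code stays untouched.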

Objects that you want to persist do have to follow a pretty complicated interface standard, and by far the easiest way to guarantee that is by subclassing dejavu.Unit. So Dejavu does "require you to specify in your code that you will [persist] some objects". I'm not sure whether Florent was defining "transparent" in terms of declaring "SQL" specifically, or declaring "persistence" in general.

Ape (Adaptable Persistence Engine) is a framework to do object-relational mapping at a lower level than the above two solutions, because it works at the ZODB Storage level...the downside is that the structure of the tables in the SQL database is chosen by the framework.

Dejavu doesn't do that. Object properties (and table names) are declared in Python code, just like SQLObject or SQLStorage (although more readably, IMO). But Dejavu doesn't add any properties which you don't explicitly specify; only the "ID" property is present by default. Dejavu does expose a "create_storage" method for turning your in-Python schema into empty database tables (for use when your database isn't already populated).

However Ape's default SQL mapper already tries hard to provide data mapping in a natural way; for instance all properties are made available in a natural manner, object titles or containment relationship are also naturally expressed.

If I understand that paragraph correctly, it's saying that the persistence mechanism doesn't conflict with normal Pythonic code. This is a great strength of Dejavu: objects are still objects, and their properties are gotten and set in a natural way, with reasonably-transparent coercion if necessary:

>>> class Knight(Unit):
...     Name = UnitProperty(unicode)
...
>>> galahad = Knight()
>>> galahad.Name = "Galahad the Chaste"
>>> galahad.Name
u'Galahad the Chaste'
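For anyone curious how that kind of coercion can work in general, here is a minimal sketch using a Python 3 descriptor. CoercingProperty is a made-up name and this is not how UnitProperty is actually implemented; it only shows that a class-level property object can convert values on assignment while the attribute still reads like a plain attribute.

```python
class CoercingProperty:
    """Descriptor that coerces assigned values to a declared type."""
    def __init__(self, type_=str):
        self.type_ = type_

    def __set_name__(self, owner, name):
        # Remember the attribute name this property was bound to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # Leave None alone; convert anything else that has the wrong type.
        if value is not None and not isinstance(value, self.type_):
            value = self.type_(value)
        instance.__dict__[self.name] = value


class Knight:
    Name = CoercingProperty(str)
    Quests = CoercingProperty(int)


galahad = Knight()
galahad.Name = "Galahad the Chaste"
galahad.Quests = "3"        # stored as the int 3, not the string "3"
```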

Hornet is an SQL bridge (alpha software for now) that also works at the ZODB layer. In contrast to Ape, it is much more geared toward existing datasets, or regular SQL access to tables. It requires you to define schemas in code too, but once this is done object access is totally transparent, as in Ape.

Dejavu sounds closer to Hornet than to Ape; one of the original reasons I wrote Dejavu was to get transparent access to a third-party database over which I had no schema control. You need to define the schema in code, but there's no reason that couldn't be automated for most storage systems (databases are the easy part ;) ).

To me Hornet and Ape are promising because I believe they integrate at the right level. Ape is better because it doesn't require explicit schema declaration, and that's very important when you have flexible objects where the users add new fields on document instances (which happens all the time in CPS using FlexibleTypeInformation-based content objects).

There was a time in the development of Dejavu that I wanted to support not only manual schema discovery, but actually discovery-on-the-fly at runtime. I think I only dropped it because I didn't have an explicit need for it; my hunch is that the framework could still support it easily. One of the best things about Dejavu is that it's only being used in production today at a couple of sites; at this stage of development, a good coder could easily step in and hack it into what they want it to be. :)

...I want to store blobs in the filesystem. This can be done at the application level by various products such as CPS DiskFileField, Archetypes ExternalStorage, chrism's Blob product, and others. It can also be done transparently at the ZODB storage level using Ape, and that's a much simpler way to do it.

That could be done with Dejavu with a tiny amount of work; subclass an existing StorageManager and special-case the BLOB fields. You could probably even use one of the above products to do it within Dejavu.
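A toy version of that subclass-and-special-case approach might look like the following. DictStorage and BlobOnDiskStorage are hypothetical classes invented for this sketch (real StorageManagers have a different interface), and treating every bytes value as a BLOB is an assumption made to keep the example short (Python 3).

```python
import os
import tempfile

class DictStorage:
    """Hypothetical base manager: keeps each unit's fields in a dict."""
    def __init__(self):
        self.records = {}

    def save(self, key, fields):
        self.records[key] = dict(fields)

    def load(self, key):
        return dict(self.records[key])


class BlobOnDiskStorage(DictStorage):
    """Writes bytes-valued fields to files and stores only their paths."""
    def __init__(self, blob_dir):
        super().__init__()
        self.blob_dir = blob_dir

    def save(self, key, fields):
        stored = {}
        for name, value in fields.items():
            if isinstance(value, bytes):
                path = os.path.join(self.blob_dir, "%s.%s.blob" % (key, name))
                with open(path, "wb") as fh:
                    fh.write(value)
                stored[name] = ("blob", path)
            else:
                stored[name] = ("plain", value)
        super().save(key, stored)

    def load(self, key):
        fields = {}
        for name, (kind, value) in super().load(key).items():
            if kind == "blob":
                # Read the file back so callers never see the path.
                with open(value, "rb") as fh:
                    value = fh.read()
            fields[name] = value
        return fields


blob_dir = tempfile.mkdtemp()
storage = BlobOnDiskStorage(blob_dir)
storage.save("photo-1", {"Caption": "House build", "Image": b"\x89PNG..."})
restored = storage.load("photo-1")
```

Callers save and load the same dict of fields either way; only the subclass knows the image bytes live on disk.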

...I plan on using the most flexible (and underused) framework available, which is Ape. I'll write various classifiers, and mappers so that a typical CMF or CPS site can be mapped naturally to SQL. I'll also replace the catalog by an implementation that does SQL queries. This will not be a simple endeavour, but I feel this is the only way to get what I truly need in the end, without sacrificing any flexibility.

Wow, that's a lot of work. Care to leverage the 1000 man-hours I put into Dejavu instead?

If Ape proves too hard to work with (because it imposes its own framework of mapping), I'll go the Hornet way of writing a storage directly, with flexible enough policies for the mapping to SQL or the filesystem (or LDAP for that matter).

I wouldn't mind having a Dejavu StorageManager for LDAP... ;)


Permalink 10:48:30 am, by fumanchu Email , 376 words   English (US)
Categories: IT, Dejavu

Outgrowing databases

Daniel H. Steinberg gives a summary of Adam Bosworth's recent keynote at the MySQL User's Conference 2005:

If you build an open source stack that delivers globally available information, how do you massively distribute it and cause it to scale? Bosworth said you need to limit your queries to those that can be easily implemented by everybody and those that can be handled by a single machine. This requires that your queries run at the item level. This might feel odd to those used to dealing with databases, as this means you are not likely to perform joins, aggregations, or subqueries. There is plenty of SQL that cannot be supported.

This is one of the design artifacts of Dejavu: if your domain model requires complicated joins, unions, or subqueries, it's better to refactor your model than to fight with data aggregation queries. Dejavu forces you to do so, in fact, because I didn't care to provide an object-to-SQL translation for such queries.

Refactoring may be painful, but it is necessary for growth in almost any application. Better to do it up front than to be lured into fragile schemas, only to be forced to refactor or die later on.

Working backwards through the article:

Bosworth predicts that RSS 2.0 and Atom will be the lingua franca that will be used to consume all data from everywhere.

I've been thinking lately about writing a generic RSS interface for Dejavu. A simple object-property reader/writer would be a cake-walk; security would be the tough bit to design.

Imagine if you can query any data that is available anywhere in the world. Bosworth said that what this requires is a single, simple, open wire format for items. The format needs to be simple for any P programmer to deliver and any JavaScript programmer to consume.

Hm! Exactly where I've been going with Lyrica—I'm working hard to push as much as possible into Javascript on the client. The viewer will certainly be Javascript-only, with an option to traverse local files, so that the server connection isn't necessary once you've built your slideshow.

Permalink 10:09:39 am, by fumanchu Email , 286 words   English (US)
Categories: IT, Python, Dejavu, Cation

Designing from the outside in


...extracting a re-usable framework after the fact struck me as interesting because that's really what's happened with Leonardo. Two years ago, I wrote a little wiki-like script in Python in order to enable editing of content on from a browser. I then decided to expand it just over a year ago to include a blog. Now, as more features are being requested, an underlying web framework is emerging that could very well be useful outside of running a wiki or blog.

The same thing happened with Cation (a web framework) and Dejavu (an ORM). I was tasked with rewriting our core business app—two years ago, it was a procedural CGI app written in Visual Basic 4! When I rewrote the whole thing in Python, I started by isolating Cation+Dejavu into their own layer. After about six months I then separated Dejavu from Cation. In addition, I made a middle business-objects layer called "EnDue", which the final app, "Mission Control" is built on. There's also a wiki-like app called "Junct" which I built on top of Cation and Dejavu. So the tree currently looks like this:

[Cation]  [Dejavu]
    \        /
     \      /
[Mission Control]

I've also got "test apps" for Dejavu and EnDue (well, I'm still writing the one for EnDue...I think I'll model the business of the beard-and-stone salesman from "Life of Brian" ;) Any good names for such a business?).

Anyway, the real point I want to make (and have made before) is that I'll probably replace Cation with another web framework sometime this year...but I wouldn't have known which existing framework to pick if I hadn't written my own first.
