Update: I stumbled onto Mike Spille's blog, which talks a bit more (and better) about middleware versus libraries.
Ian Bicking recently promoted the idea of a WSGI reference library, to possibly include the following components (among others):
- Sessions middleware
- Logging middleware/library (I assume he meant request logging)
- Error reporting middleware/library
- Test frameworks
- A file application (handling If-Modified-Since, etc)
- A proxy application
- Libraries for parsing query strings and all that.
- URL parsers.
- And maybe a few of the more boring servers, like the CGI server, which will otherwise be homeless (or widely repeated).
Not being the most careful reader in the world, I was thrown by the phrase, "...collaborating on a ... library of WSGI middleware"; I read the list as if he meant each piece would be a middleware component! Of course he did not intend that. Many of the items in the list are WSGI applications, which sit at the end of the software stack.
Some of the items in the list are, in fact, paraware; that is, they parallel the main application. Traditional programming libraries/toolkits are a common example of paraware. They provide functionality through input and output hooks, which the main application calls and consumes directly:
    mylib.set_value(3)
    mylib.munge(obj)
    result = mylib.get('f')
Middleware, on the other hand, handles/munges a content stream, and sits between at least two other components in a software stack. Middleware is a nasty thing in many environments, because each middleware component must manage I/O of all shared objects, in two directions (both its caller and the next component in the stack). In Python, however (and specifically WSGI), the shared objects are all on the same heap, and can all be passed by reference.
I see problems with writing most of these components as middleware. WSGI has a shot at being ubiquitous because it enforces a set of interfaces and a data model; this same enforcement, however, can also be a liability, since WSGI is not yet ubiquitous. As a developer of a web framework, I have a dilemma: I need to provide the same functionality whether my users use WSGI or not. This means I need to write such components as libraries (so they can be used as paraware) and then wrap them with WSGI boilerplate (so they can be used as middleware). This leads to serious code smell. WSGI's callback structure is complicated enough without me introducing library-code wrappers. Perhaps what we need are generic pieces of WSGI middleware which you can init with a callback from your library code. Hmmm.
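To make that last idea concrete, here is a minimal sketch of what such a generic middleware might look like: a WSGI wrapper whose only job is to hand the response off to a plain library callback. All of the names here (CallbackMiddleware, demo_app, add_header) are invented for illustration; none of them are part of WSGI or any real library.

```python
class CallbackMiddleware:
    """Wrap a WSGI app; pass each (status, headers) pair through a callback.

    The callback is ordinary library code with no WSGI knowledge; since
    everything lives on the same heap, the shared objects are passed by
    reference rather than serialized in both directions.
    """

    def __init__(self, app, callback):
        self.app = app
        self.callback = callback

    def __call__(self, environ, start_response):
        def wrapped_start(status, headers, exc_info=None):
            # Let plain library code munge the response on its way out.
            status, headers = self.callback(status, headers)
            return start_response(status, headers, exc_info)
        return self.app(environ, wrapped_start)


def demo_app(environ, start_response):
    """A trivial WSGI application, standing in for the real app."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


def add_header(status, headers):
    """Plain library code: no WSGI boilerplate, just data in and out."""
    return status, headers + [('X-Processed', 'yes')]


app = CallbackMiddleware(demo_app, add_header)
```

The library function stays usable as paraware on its own, while the one generic wrapper supplies the WSGI plumbing for everyone.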
Potential components from Cation
I've been meaning for a while now to investigate breaking my Cation app framework down into a set of libraries (instead of the monolithic framework it is today). You can see from the dearth of recent checkins that I haven't done any of it yet. Many of those libraries could be added to a WSGI library (some are already on Ian's list). Here are the ones I'd be most interested in writing:
Top-level error trapping, logging, and pretty printing
I'd like to do this myself because Cation keeps a list of application developers (usernames), and shows full tracebacks in the browser to developers. Ordinary users get a "pretty" error message, and the full traceback goes into the log only. I'm pretty sure a standard library version wouldn't do that. Integrating the usernames into the error handling logic leads me to want to provide this as paraware, since middleware components are usually not expected to interoperate.
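A sketch of that policy (the names and structure are invented for illustration, not Cation's actual API): developers get the full traceback in the browser, everyone else gets a pretty message, and the traceback always goes to the log.

```python
import traceback

# Hypothetical registry of application-developer usernames.
DEVELOPERS = {'rbrewer'}
error_log = []


def handle_error(username, exc):
    """Log the full traceback; show it only to registered developers."""
    tb = ''.join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    error_log.append(tb)
    if username in DEVELOPERS:
        return tb  # full traceback in the browser
    return "Sorry, something went wrong."  # pretty message for everyone else
```

Because the username check needs application context, this fits better as a library call the app invokes than as a context-free middleware layer.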
Timed, threaded Worker classes for getting things done on a schedule, possibly recurring
This isn't WSGI-specific, and shouldn't be a candidate for WSGI. But it's something I'd like to rewrite in more of a library style, instead of a framework.
Centrally registered and managed requests
For example, this would assist a WSGI application in fulfilling a request to shut down--each active web request (thread) could be sent a shutdown message from outside the application and terminate gracefully.
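One way such a registry might look, assuming one request per thread and that each request polls its shutdown flag between units of work (the class and its API are hypothetical):

```python
import threading


class RequestRegistry:
    """Track active request threads so they can be asked to stop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._requests = {}  # thread name -> threading.Event

    def register(self):
        """Called at the start of a request; returns its shutdown flag."""
        event = threading.Event()
        with self._lock:
            self._requests[threading.current_thread().name] = event
        return event

    def unregister(self):
        """Called when a request finishes normally."""
        with self._lock:
            self._requests.pop(threading.current_thread().name, None)

    def shutdown(self):
        """Ask every active request to finish gracefully."""
        with self._lock:
            for event in self._requests.values():
                event.set()
```

Each request checks `event.is_set()` at convenient points and winds itself down when asked, rather than being killed from outside.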
Data type coercion (both inbound and outbound), including encoding
Since HTML form values are always received as strings, a standard (but overridable) way to convert them into Python values would be helpful. In the other direction, values need to be coerced to strings, put in the encoding of the server (or of the page), and often quoted safely. Again, this would probably need enough customizability that it would be a poor candidate for middleware, but a good candidate for a set of library calls.
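A rough sketch of what such library calls might look like. The coercer table and the choice of `html.escape` for safe quoting are my assumptions here, not an established interface:

```python
import html

# Per-field converters for inbound form strings; anything unlisted
# stays a plain string. This table is an illustrative assumption.
inbound = {
    'age': int,
    'price': float,
    'active': lambda s: s.lower() == 'true',
}


def coerce_in(field, raw):
    """Convert an HTML form string to a Python value."""
    return inbound.get(field, str)(raw)


def coerce_out(value, encoding='utf-8'):
    """Convert a Python value to a safely-quoted string in the page encoding."""
    return html.escape(str(value)).encode(encoding)
```

The table is the customization point: a framework user overrides or extends it per form, which is exactly the kind of context a generic middleware layer can't easily carry.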
Classic middleware, meeting a need orthogonal to the actual content delivery, and not needing customization or context.
Something that might on occasion need to be specialized, but ultimately a commodity for 90+% of cases. The standard implementation would be nothing more than a pretty interface over simple (but secure) file management.
That's enough for the next year or so. Pity I have so many other projects to work on simultaneously.
Ned nails it. If someone, somewhere has been using your software in production for a while (a year? years?), your core functionality is 1.0. Branch and fork all you want from there, but please let us, your users, know it's worth a try.
I particularly don't understand why this trend seems to apply to development libraries more than anything else. Library users are developers--give them some credit. They'll find and fix the broken bits on their own.
In an attempt to automate our backups (my PFY was doing manual DVD burning every day), we bought a Dell Poweredge SC420 with a pair of 250G SATA drives and no OS. It'll go either to our disaster hotsite, or a colo. The only thing the box will be doing is rsync over ssh. Our various *nix and Windows servers (via cwrsync) will connect on a schedule and back up various top-level directories. Testing on a different server over the 'Net showed typical backup times of 5 to 10 minutes for one such directory, depending on how much had changed overnight. We expect to have backup traffic for about 1 hour each night, up to a max of 4 hours (rarely). Email will, of course, be the killer. We may go back to CDs/DVDs for that.
At any rate, here are the steps I went through to set up the Dell server:
- Setup (F2): make the box power on when power is lost and restored.
- Configure SATA by hitting Ctrl-A at boot, no RAID.
- Insert sarge debian-installer CD and boot from it. I can't praise the Debian team enough--the debian-installer is fantastic! Thanks to Greg Folkert for recommending it!
- At the FIRST prompt (F1 for help or Enter to boot), type "linux26" and hit Enter (this selects the 2.6 kernel).
- SATA drives should be auto-recognized. Partition them ext3 and mount to /data1, /data2.
- Finish install normally.
- The CD ejects and the box reboots. When the Real Time Clock freezes (twice) during boot, hit Ctrl-C to kill the hung process and continue booting. This is an ACPI problem, which we'll fix in a minute.
- Continue setting up Debian as prompted. linux.csua.berkeley.edu is a nice http mirror for me. Oh, and run the ssh server in protocol 2 only.
- apt-get remove exim4, which won't work. But it's fun to try. Anyone know why it isn't removed (still shows up in ps after removal/reboot)?
- Turn ACPI off, or the Real Time Clock will freeze on every boot:
    vi /boot/grub/menu.lst

Change the lines starting with "kernel": append the text " acpi=off".
- Mount the SATA drives:
    cd /
    mkdir /data1
    mkdir /data2
    vi /etc/fstab

Add the lines:

    /dev/sda1 /data1 ext3 defaults,errors=remount-ro 0 2
    /dev/sdb1 /data2 ext3 defaults,errors=remount-ro 0 2
- Change the order in which discover and checkfs are called:
    cd /etc/rcS.d
    mv S36discover S29discover
- Setup rsync:

    apt-get install rsync
    vi /etc/default/rsync    (set RSYNC_ENABLE=true)
    vi /etc/rsyncd.conf      (see online docs)
    /etc/init.d/rsync start
- Setup ssh:

    vi /etc/ssh/sshd_config

Set the following options:

    PermitRootLogin no
    RSAAuthentication no
- Make an rsync user.
Another Scott on IWETHEY asked me to expand on why I chose b2evolution for the blog software here, especially in relation to this post. I'm awfully bad at recording my decision-making processes, but I'll try.
Lots of blogs I examined failed one of our requirements outright, or at least offended my sensibilities:
- Wordpress: Multiple blogs aren't built into the core. There's a separate Wordpress-mu project, but it seems to be still in serious beta, with only one developer actively working on it.
- Blosxom: Perl. Bleah. If I used it every day, maybe. But the blog is something I want to work on (writing plugins, etc.) once or twice a year. PHP is something I can pick up quickly; in fact, I wrote my first working plugin for b2evolution in an hour, never having even looked at PHP code before in my life. By the end of the afternoon, I had a patch ready (against CVS-HEAD) for applying plugins to comments (not just posts), with enough confidence to mail it off to the project leads.
- Serendipity: no multiblog as far as I could see. By "multiblog", I mean multiple authors on one install, each with their own blog (each with their own feed(s)). We have 40 people on staff at Amor, each with their own financial supporters with whom they wish to communicate. Those who support me don't necessarily have any interest in reading what my co-workers are writing (but if they want to read what everyone at Amor is writing, b2evolution gives them that opportunity out of the box, as well).
- Textpattern: editing is done with Textile, and the whole editing process is very HTML-centric. If I were the only author, it might fly. But I have a wide range of authors, from complete Luddites to, well, me. HTML is something to hide from many of them. For b2evolution, on the other hand, I quickly found and applied a plugin to use Markdown. Users can also choose GreyMatter, BB code (a la phpBB), Textile, or Texturize, all included in the default install.
- Nucleus: Same parent as b2evolution. I honestly can't remember why I chose b2evolution over Nucleus, except for a vague feeling that Nucleus was written by developers for developers, instead of for users. Oh, yeah, and the thumbnails didn't work, at least not quickly and easily enough. The whole "media library" idiom is nice for developers, but some of my users would never be able to add a picture to a post, a task they will want to do quite often.
Meh. That's enough for now. b2evo has had its own quirks, but the problems have been surmountable with a minimum of effort. I think it will serve us well enough.
Welcome to my blog. I'm Robert Brewer, the Manager of Information Systems for Amor Ministries. "FuManChu" has been my radio handle since I started working in Ciudad Juarez in 1993, building homes for the poor.
After five years in the field, I moved into IT. During the past six years, I've built and rebuilt our core business-process application. The first version was a client-server app in VB4; that moved to a webapp within a year or two. In 2003-2004, I rewrote the entire thing in Python, including a standalone application framework (Cation) and Object-Relational Mapper (Dejavu).
I also managed everything else (networking, mail, file servers, backups, desktops, antivirus, you name it) until I got a PFY (an assistant), Ryan, in early 2001. Now I tend to advise and design top-level architecture in those areas, and let Ryan do the hard work.
Don't be surprised if I discuss topics other than computers, however! I enjoy almost anything done well. Drop me a line or a comment if you are able.
Yes, this really is the first post.
My company, Amor Ministries, has been talking about staff websites since...well, forever. We've toyed with various ideas and CMS systems. I even wrote one myself for HTML editing called Tibia. However, nothing seemed to solidify--the burden on the authors is usually too great.
Last week, I floated the concept of blogs past Alon, our Development Team leader, and for the first time, his desires and the available tech started to gel very nicely. We made a short laundry list of desirable features for blog software, and the result will be built here.
I went with b2evolution because:
- It is the most multi-blog friendly.
- It's free and open-source. Free is always nice, of course, but the modifications we wanted pretty much forced an OSS solution.
- It seemed the cleanest of the few I downloaded and evaluated (including Nucleus, Wordpress, and Serendipity). "Clean" in both on-screen appearance, and the codebase (again, to make our mods easier).
- Feeds are built in and on by default.
The first features I'm looking to write (and contribute back to b2evo, if they'll answer my email):
- Profanity filter. A simple one--no boiling of any oceans. This is already done, btw.
- A tool for staff to collect recent posts into a printed newsletter.