
03/20/13

Permalink 11:01:52 pm, by fumanchu Email , 18 words   English (US)
Categories: IT, Python, Architecture

PyData 2013 Slides

The presentation deck from my talk at PyData 2013 is up! Thanks to everyone for their interest and feedback.

02/12/13

Permalink 10:22:04 am, by fumanchu Email , 108 words   English (US)
Categories: IT

Addictive to check out

From 37 Signals, about their Basecamp iPhone app launch:

Our top priority was fast access to news. You’ll find the app makes it addictive to check in and feel the pulse of your projects throughout the day. You can quickly bounce in and out of projects. Project screens on the phone show the latest news first rather than static project contents.

Cool. As a manager, that's exactly what I want: to feel the pulse.

As an architect and designer and developer, I want the opposite. Now, can someone make an app that makes it addictive to get in the flow instead of to be interrupted all the time?

07/17/12

Permalink 10:27:34 am, by fumanchu Email , 75 words   English (US)
Categories: General

There's got to be a name for this

...you know, the difference between "what is the least we can do to alleviate our current pain?" versus "where do we want to be and how do we get there?" I see this distinction again and again. I've seen both called "strategy" and that can't be good. I would say the former is a product of "management" and the latter of "leadership", but that distinguishes the attitudes or processes, not the results. Lazyweb? Little help?

02/24/11

Permalink 02:58:42 pm, by fumanchu Email , 1214 words   English (US)
Categories: Python, CherryPy

Wow. Does isinstance blow up with ABCs?

Python 2.6.1. Here's a call to "isinstance(value, basestring)":

--[  (_cprequest:782)
--]  (_cprequest:782)  0.044ms

versus "isinstance(value, io.IOBase)":

--[  (_cprequest:791)
----> __instancecheck__ (abc:117)
----. __instancecheck__ (abc:120)
------[  (abc:120)
------]  (abc:120)  0.046ms
----. __instancecheck__ (abc:121)
----. __instancecheck__ (abc:123)
----. __instancecheck__ (abc:124)
----. __instancecheck__ (abc:125)
----. __instancecheck__ (abc:126)
----. __instancecheck__ (abc:127)
----. __instancecheck__ (abc:130)
------> __subclasscheck__ (abc:134)
------. __subclasscheck__ (abc:137)
------. __subclasscheck__ (abc:140)
------. __subclasscheck__ (abc:144)
------. __subclasscheck__ (abc:147)
--------[ ABCMeta.__subclasshook__ (abc:147)
--------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
------. __subclasscheck__ (abc:148)
------. __subclasscheck__ (abc:156)
--------[  (abc:156)
--------]  (abc:156)  0.043ms
------. __subclasscheck__ (abc:160)
------. __subclasscheck__ (abc:165)
--------[ ABCMeta.__subclasses__ (abc:165)
--------] ABCMeta.__subclasses__ (abc:165)  0.045ms
------. __subclasscheck__ (abc:166)
--------[  (abc:166)
----------> __subclasscheck__ (abc:134)
----------. __subclasscheck__ (abc:137)
----------. __subclasscheck__ (abc:140)
----------. __subclasscheck__ (abc:144)
----------. __subclasscheck__ (abc:147)
------------[ ABCMeta.__subclasshook__ (abc:147)
------------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
----------. __subclasscheck__ (abc:148)
----------. __subclasscheck__ (abc:156)
------------[  (abc:156)
------------]  (abc:156)  0.046ms
----------. __subclasscheck__ (abc:160)
----------. __subclasscheck__ (abc:165)
------------[ ABCMeta.__subclasses__ (abc:165)
------------] ABCMeta.__subclasses__ (abc:165)  0.043ms
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.042ms
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.043ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 1.690ms
------------]  (abc:166)  1.887ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:170)
------------[ set.add (abc:170)
------------] set.add (abc:170)  0.042ms
----------. __subclasscheck__ (abc:171)
----------< __subclasscheck__ (abc:171): False 3.745ms
--------]  (abc:166)  3.952ms
------. __subclasscheck__ (abc:165)
------. __subclasscheck__ (abc:166)
--------[  (abc:166)
----------> __subclasscheck__ (abc:134)
----------. __subclasscheck__ (abc:137)
----------. __subclasscheck__ (abc:140)
----------. __subclasscheck__ (abc:144)
----------. __subclasscheck__ (abc:147)
------------[ ABCMeta.__subclasshook__ (abc:147)
------------] ABCMeta.__subclasshook__ (abc:147)  0.044ms
----------. __subclasscheck__ (abc:148)
----------. __subclasscheck__ (abc:156)
------------[  (abc:156)
------------]  (abc:156)  0.044ms
----------. __subclasscheck__ (abc:160)
----------. __subclasscheck__ (abc:165)
------------[ ABCMeta.__subclasses__ (abc:165)
------------] ABCMeta.__subclasses__ (abc:165)  0.045ms
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.042ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.043ms
--------------. __subclasscheck__ (abc:166)
----------------[  (abc:166)
------------------> __subclasscheck__ (abc:134)
------------------. __subclasscheck__ (abc:137)
------------------. __subclasscheck__ (abc:140)
------------------. __subclasscheck__ (abc:144)
------------------. __subclasscheck__ (abc:147)
--------------------[ ABCMeta.__subclasshook__ (abc:147)
--------------------] ABCMeta.__subclasshook__ (abc:147)  0.044ms
------------------. __subclasscheck__ (abc:148)
------------------. __subclasscheck__ (abc:156)
--------------------[  (abc:156)
--------------------]  (abc:156)  0.049ms
------------------. __subclasscheck__ (abc:160)
------------------. __subclasscheck__ (abc:165)
--------------------[ ABCMeta.__subclasses__ (abc:165)
--------------------] ABCMeta.__subclasses__ (abc:165)  0.044ms
------------------. __subclasscheck__ (abc:166)
--------------------[  (abc:166)
----------------------> __subclasscheck__ (abc:134)
----------------------. __subclasscheck__ (abc:137)
----------------------. __subclasscheck__ (abc:140)
----------------------. __subclasscheck__ (abc:144)
----------------------. __subclasscheck__ (abc:147)
------------------------[ ABCMeta.__subclasshook__ (abc:147)
------------------------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
----------------------. __subclasscheck__ (abc:148)
----------------------. __subclasscheck__ (abc:156)
------------------------[  (abc:156)
------------------------]  (abc:156)  0.042ms
----------------------. __subclasscheck__ (abc:160)
----------------------. __subclasscheck__ (abc:165)
------------------------[ ABCMeta.__subclasses__ (abc:165)
------------------------] ABCMeta.__subclasses__ (abc:165)  0.042ms
----------------------. __subclasscheck__ (abc:170)
------------------------[ set.add (abc:170)
------------------------] set.add (abc:170)  0.042ms
----------------------. __subclasscheck__ (abc:171)
----------------------< __subclasscheck__ (abc:171): False 1.574ms
--------------------]  (abc:166)  1.772ms
------------------. __subclasscheck__ (abc:165)
------------------. __subclasscheck__ (abc:170)
--------------------[ set.add (abc:170)
--------------------] set.add (abc:170)  0.042ms
------------------. __subclasscheck__ (abc:171)
------------------< __subclasscheck__ (abc:171): False 4.394ms
----------------]  (abc:166)  4.592ms
--------------. __subclasscheck__ (abc:165)
--------------. __subclasscheck__ (abc:166)
----------------[  (abc:166)
------------------> __subclasscheck__ (abc:134)
------------------. __subclasscheck__ (abc:137)
------------------. __subclasscheck__ (abc:140)
------------------. __subclasscheck__ (abc:144)
------------------. __subclasscheck__ (abc:147)
--------------------[ ABCMeta.__subclasshook__ (abc:147)
--------------------] ABCMeta.__subclasshook__ (abc:147)  0.042ms
------------------. __subclasscheck__ (abc:148)
------------------. __subclasscheck__ (abc:156)
--------------------[  (abc:156)
--------------------]  (abc:156)  0.044ms
------------------. __subclasscheck__ (abc:160)
------------------. __subclasscheck__ (abc:165)
--------------------[ ABCMeta.__subclasses__ (abc:165)
--------------------] ABCMeta.__subclasses__ (abc:165)  0.044ms
------------------. __subclasscheck__ (abc:166)
--------------------[  (abc:166)
----------------------> __subclasscheck__ (abc:134)
----------------------. __subclasscheck__ (abc:137)
----------------------. __subclasscheck__ (abc:140)
----------------------. __subclasscheck__ (abc:144)
----------------------. __subclasscheck__ (abc:145)
----------------------< __subclasscheck__ (abc:145): False 0.350ms
--------------------]  (abc:166)  0.553ms
------------------. __subclasscheck__ (abc:165)
------------------. __subclasscheck__ (abc:170)
--------------------[ set.add (abc:170)
--------------------] set.add (abc:170)  0.043ms
------------------. __subclasscheck__ (abc:171)
------------------< __subclasscheck__ (abc:171): False 2.682ms
----------------]  (abc:166)  2.876ms
--------------. __subclasscheck__ (abc:165)
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.042ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 9.633ms
------------]  (abc:166)  9.855ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.042ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.043ms
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.042ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 1.562ms
------------]  (abc:166)  1.755ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.042ms
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.043ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 1.569ms
------------]  (abc:166)  1.772ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.042ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.042ms
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.042ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 1.647ms
------------]  (abc:166)  1.842ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:170)
------------[ set.add (abc:170)
------------] set.add (abc:170)  0.043ms
----------. __subclasscheck__ (abc:171)
----------< __subclasscheck__ (abc:171): False 18.252ms
--------]  (abc:166)  18.443ms
------. __subclasscheck__ (abc:165)
------. __subclasscheck__ (abc:166)
--------[  (abc:166)
----------> __subclasscheck__ (abc:134)
----------. __subclasscheck__ (abc:137)
----------. __subclasscheck__ (abc:140)
----------. __subclasscheck__ (abc:144)
----------. __subclasscheck__ (abc:147)
------------[ ABCMeta.__subclasshook__ (abc:147)
------------] ABCMeta.__subclasshook__ (abc:147)  0.043ms
----------. __subclasscheck__ (abc:148)
----------. __subclasscheck__ (abc:156)
------------[  (abc:156)
------------]  (abc:156)  0.044ms
----------. __subclasscheck__ (abc:160)
----------. __subclasscheck__ (abc:165)
------------[ ABCMeta.__subclasses__ (abc:165)
------------] ABCMeta.__subclasses__ (abc:165)  0.044ms
----------. __subclasscheck__ (abc:166)
------------[  (abc:166)
--------------> __subclasscheck__ (abc:134)
--------------. __subclasscheck__ (abc:137)
--------------. __subclasscheck__ (abc:140)
--------------. __subclasscheck__ (abc:144)
--------------. __subclasscheck__ (abc:147)
----------------[ ABCMeta.__subclasshook__ (abc:147)
----------------] ABCMeta.__subclasshook__ (abc:147)  0.044ms
--------------. __subclasscheck__ (abc:148)
--------------. __subclasscheck__ (abc:156)
----------------[  (abc:156)
----------------]  (abc:156)  0.043ms
--------------. __subclasscheck__ (abc:160)
--------------. __subclasscheck__ (abc:165)
----------------[ ABCMeta.__subclasses__ (abc:165)
----------------] ABCMeta.__subclasses__ (abc:165)  0.044ms
--------------. __subclasscheck__ (abc:166)
----------------[  (abc:166)
------------------> __subclasscheck__ (abc:134)
------------------. __subclasscheck__ (abc:137)
------------------. __subclasscheck__ (abc:140)
------------------. __subclasscheck__ (abc:144)
------------------. __subclasscheck__ (abc:147)
--------------------[ ABCMeta.__subclasshook__ (abc:147)
--------------------] ABCMeta.__subclasshook__ (abc:147)  0.045ms
------------------. __subclasscheck__ (abc:148)
------------------. __subclasscheck__ (abc:156)
--------------------[  (abc:156)
--------------------]  (abc:156)  0.043ms
------------------. __subclasscheck__ (abc:160)
------------------. __subclasscheck__ (abc:165)
--------------------[ ABCMeta.__subclasses__ (abc:165)
--------------------] ABCMeta.__subclasses__ (abc:165)  0.043ms
------------------. __subclasscheck__ (abc:170)
--------------------[ set.add (abc:170)
--------------------] set.add (abc:170)  0.043ms
------------------. __subclasscheck__ (abc:171)
------------------< __subclasscheck__ (abc:171): False 1.624ms
----------------]  (abc:166)  1.867ms
--------------. __subclasscheck__ (abc:165)
--------------. __subclasscheck__ (abc:170)
----------------[ set.add (abc:170)
----------------] set.add (abc:170)  0.041ms
--------------. __subclasscheck__ (abc:171)
--------------< __subclasscheck__ (abc:171): False 3.866ms
------------]  (abc:166)  4.063ms
----------. __subclasscheck__ (abc:165)
----------. __subclasscheck__ (abc:170)
------------[ set.add (abc:170)
------------] set.add (abc:170)  0.043ms
----------. __subclasscheck__ (abc:171)
----------< __subclasscheck__ (abc:171): False 5.968ms
--------]  (abc:166)  6.159ms
------. __subclasscheck__ (abc:165)
------. __subclasscheck__ (abc:170)
--------[ set.add (abc:170)
--------] set.add (abc:170)  0.042ms
------. __subclasscheck__ (abc:171)
------< __subclasscheck__ (abc:171): False 31.110ms
----< __instancecheck__ (abc:130): False 32.160ms
--]  (_cprequest:791)  32.350ms
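
If you want to compare the two checks without a line tracer attached, here is a minimal sketch using timeit. It assumes Python 3, where basestring no longer exists, so str stands in for it; the test value and the repeat count are arbitrary choices of mine, and the numbers it prints will not match the 2.6.1 trace above.

```python
import io
import timeit

value = 42  # arbitrary value that is neither a string nor a file (assumption)


def time_check(cls, number=100000):
    """Return total seconds for `number` calls of isinstance(value, cls)."""
    return timeit.timeit(lambda: isinstance(value, cls), number=number)


concrete = time_check(str)        # ordinary type check
abstract = time_check(io.IOBase)  # io.IOBase uses the ABCMeta metaclass

print("str:       %.4fs" % concrete)
print("io.IOBase: %.4fs" % abstract)
```

Both checks return False for this value; only the cost of getting there differs.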

11/19/10

Permalink 01:08:45 am, by fumanchu Email , 1007 words   English (US)
Categories: Python, CherryPy

logging.statistics

Statistics about program operation are an invaluable monitoring and debugging tool. How many requests are being handled per second, how much of various resources are in use, how long we've been up. Unfortunately, the gathering and reporting of these critical values is usually ad-hoc. It would be nice if we had 1) a centralized place for gathering statistical performance data, 2) a system for extrapolating that data into more useful information, and 3) a method of serving that information to both human investigators and monitoring software. I've got a proposal. Let's examine each of those points in more detail.

Data Gathering

Just as Python's logging module provides a common importable for gathering and sending messages, statistics need a similar mechanism, and one that does not require each package which wishes to collect stats to import a third-party module. Therefore, we choose to re-use the logging module by adding a statistics object to it.

That logging.statistics object is a nested dict:

import logging
if not hasattr(logging, 'statistics'): logging.statistics = {}

It is not a custom class, because that would 1) require apps to import a third-party module in order to participate, 2) inhibit innovation in extrapolation approaches and in reporting tools, and 3) be slow. There are, however, some specifications regarding the structure of the dict.

    {
   +----"SQLAlchemy": {
   |        "Inserts": 4389745,
   |        "Inserts per Second":
   |            lambda s: s["Inserts"] / (time() - s["Start"]),
   |  C +---"Table Statistics": {
   |  o |        "widgets": {-----------+
 N |  l |            "Rows": 1.3M,      | Record
 a |  l |            "Inserts": 400,    |
 m |  e |        },---------------------+
 e |  c |        "froobles": {
 s |  t |            "Rows": 7845,
 p |  i |            "Inserts": 0,
 a |  o |        },
 c |  n +---},
 e |        "Slow Queries":
   |            [{"Query": "SELECT * FROM widgets;",
   |              "Processing Time": 47.840923343,
   |              },
   |             ],
   +----},
    }

The logging.statistics dict has strictly 4 levels. The topmost level is nothing more than a set of names to introduce modularity. If SQLAlchemy wanted to participate, it might populate the item logging.statistics['SQLAlchemy'], whose value would be a second-layer dict we call a "namespace". Namespaces help multiple emitters to avoid collisions over key names, and make reports easier to read, to boot. The maintainers of SQLAlchemy should feel free to use more than one namespace if needed (such as 'SQLAlchemy ORM').

Each namespace, then, is a dict of named statistical values, such as 'Requests/sec' or 'Uptime'. You should choose names which will look good on a report: spaces and capitalization are just fine.

In addition to scalars, values in a namespace MAY be a (third-layer) dict, or a list, called a "collection". For example, the CherryPy StatsTool keeps track of what each worker thread is doing (or has most recently done) in a 'Worker Threads' collection, where each key is a thread ID; each value in the subdict MUST be a fourth dict (whew!) of statistical data about each thread. We call each subdict in the collection a "record". Similarly, the StatsTool also keeps a list of slow queries, where each record contains data about each slow query, in order.

Values in a namespace or record may also be functions, which brings us to:

Extrapolation

def extrapolate_statistics(scope):
    """Return an extrapolated copy of the given scope."""
    c = {}
    for k, v in scope.items():
        if isinstance(v, dict):
            v = extrapolate_statistics(v)
        elif isinstance(v, (list, tuple)):
            v = [extrapolate_statistics(record) for record in v]
        elif callable(v):
            v = v(scope)
        c[k] = v
    return c

The collection of statistical data needs to be fast, as close to unnoticeable as possible to the host program. That requires us to minimize I/O, for example, but in Python it also means we need to minimize function calls. So when you are designing your namespace and record values, try to insert the most basic scalar values you already have on hand.

When it comes time to report on the gathered data, however, we usually have much more freedom in what we can calculate. Therefore, whenever reporting tools fetch the contents of logging.statistics for reporting, they first call extrapolate_statistics (passing the whole statistics dict as the only argument). This makes a deep copy of the statistics dict so that the reporting tool can both iterate over it and even change it without harming the original. But it also expands any functions in the dict by calling them. For example, you might have a 'Current Time' entry in the namespace with the value "lambda scope: time.time()". The "scope" parameter is the current namespace dict (or record, if we're currently expanding one of those instead), allowing you access to existing static entries. If you're truly evil, you can even modify more than one entry at a time.

However, don't try to calculate an entry and then use its value in further extrapolations; the order in which the functions are called is not guaranteed. This can lead to a certain amount of duplicated work (or a redesign of your schema), but that's better than complicating the spec.
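
To make that concrete, here is a small sketch that runs extrapolate_statistics (repeated from above so the snippet stands alone) over a made-up statistics dict. The "Demo App" namespace and every entry name in it are hypothetical; note that each function reads only static entries in its own scope, never another function's result.

```python
def extrapolate_statistics(scope):
    """Return an extrapolated copy of the given scope."""
    c = {}
    for k, v in scope.items():
        if isinstance(v, dict):
            v = extrapolate_statistics(v)
        elif isinstance(v, (list, tuple)):
            v = [extrapolate_statistics(record) for record in v]
        elif callable(v):
            v = v(scope)
        c[k] = v
    return c


# Hypothetical namespace, collection, and record, for illustration only.
stats = {
    "Demo App": {
        "Requests": 120,
        "Seconds Up": 60.0,
        # Called with the namespace dict as `s`.
        "Requests/Second": lambda s: s["Requests"] / s["Seconds Up"],
        "Worker Threads": {
            "thread-1": {
                "Bytes Read": 2048,
                "Requests": 4,
                # Called with the record dict as `s`.
                "Bytes/Request": lambda s: s["Bytes Read"] / s["Requests"],
            },
        },
    },
}

report = extrapolate_statistics(stats)
print(report["Demo App"]["Requests/Second"])
print(report["Demo App"]["Worker Threads"]["thread-1"]["Bytes/Request"])
```

The original dict is left alone: stats["Demo App"]["Requests/Second"] is still a function after the call, so the next report recalculates it.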

After the whole thing has been extrapolated, it's time for:

Reporting

A reporting tool would grab the logging.statistics dict, extrapolate it all, and then transform it to (for example) HTML for easy viewing, or JSON for processing by Nagios etc (and because JSON will be a popular output format, you should seriously consider using Python's time module for datetimes and arithmetic, not the datetime module). Each namespace might get its own header and attribute table, plus an extra table for each collection. This is NOT part of the statistics specification; other tools can format how they like.
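
Here is a quick sketch of why the time module is the easier choice for JSON output. The report dict is a hypothetical, already-extrapolated namespace; the only claim is about the standard json encoder, which accepts floats but rejects datetime objects unless you give it extra help.

```python
import datetime
import json
import time

# A hypothetical, already-extrapolated namespace.
report = {"My Stuff": {"Start Time": time.time(), "Important Events": 3}}

# Floats from the time module serialize with no extra work.
as_json = json.dumps(report, indent=2, sort_keys=True)
print(as_json)

# A datetime object in the same slot is rejected by the default encoder.
report_dt = {"My Stuff": {"Start Time": datetime.datetime.now()}}
try:
    json.dumps(report_dt)
    datetime_ok = True
except TypeError:
    datetime_ok = False
print("datetime serializable by default:", datetime_ok)
```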

Turning Collection Off

It is recommended each namespace have an "Enabled" item which, if False, stops collection (but not reporting) of statistical data. Applications SHOULD provide controls to pause and resume collection by setting these entries to False or True, if present.
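
One way an application might offer that control is sketched below. The set_collection helper, the "Pause Demo" namespace, and the "Events" counter are all names I made up for the example; the spec only asks for the "Enabled" item itself.

```python
import logging

if not hasattr(logging, "statistics"):
    logging.statistics = {}


def set_collection(enabled):
    """Set 'Enabled' on every namespace that already has that item."""
    for namespace in logging.statistics.values():
        if isinstance(namespace, dict) and "Enabled" in namespace:
            namespace["Enabled"] = enabled


# Hypothetical namespace for the demo.
demo = logging.statistics.setdefault("Pause Demo", {})
demo.update({"Enabled": True, "Events": 0})


def record_event():
    # Collect only while the namespace is enabled.
    if demo.get("Enabled", False):
        demo["Events"] += 1


record_event()         # counted
set_collection(False)  # pause collection everywhere
record_event()         # skipped
set_collection(True)   # resume
record_event()         # counted
print(demo["Events"])
```

Reporting keeps working while collection is paused, because the dict and its last-known values are still there.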

Usage

    import logging
    import time
    # Initialize the repository
    if not hasattr(logging, 'statistics'): logging.statistics = {}
    # Initialize my namespace
    mystats = logging.statistics.setdefault('My Stuff', {})
    # Initialize my namespace's scalars and collections
    mystats.update({
        'Enabled': True,
        'Start Time': time.time(),
        'Important Events': 0,
        'Events/Second': lambda s: (
            (s['Important Events'] / (time.time() - s['Start Time']))),
        })
    ...
    for event in events:
        ...
        # Collect stats
        if mystats.get('Enabled', False):
            mystats['Important Events'] += 1

09/22/10

Permalink 03:36:05 pm, by fumanchu Email , 1233 words   English (US)
Categories: IT, Python

A replacement for sessions

I'm tired of sessions. They lock for too long, reducing concurrency, and in my current case, don't fail gracefully when a request takes longer than the session timeout.

Problem: Session locks

Session implementations typically lock very near the beginning of a request, and unlock near the end of a request. They tend to do this even if the current request handler does no writing to the session. Why so aggressive? Because the typical test case trotted out for sessions is that of a page hit counter: session.counter += 1. What if the user opens two tabs pointing at the same page at once? The count might be off by one!

But if you don't do any counting, what's the benefit of such aggressive, synchronous locking? What we could really use is a system that used atomic commits instead of large, pessimistic locks.

Problem: Session timeouts

Sessions are often used for sites with thousands, even millions, of users. When any one of those users walks away from their computer, the servers usually try to free up resources by expiring any such inactive sessions. But lots of my admin-y sites have a few dozen users, not thousands. I'm just not that concerned with expiration of session state. I'm a little bit concerned, still, with cookies, so I still want to expire auth tokens, but there's no need to aggressively expire user data. Yet my current apps are so aggressive at expiring data that we frequently get errors in production where request A locked the session, and while it was processing a large job, request B locked the session because A was taking too long. B finishes normally, but then A chokes because it had the session lock forcibly taken away from it. Not fun.

What we could really use is a system that allows tokens to expire, or be reused concurrently, without forcing user data to expire or other, concurrent processes to choke.

Problem: Session conflation

Sessions are used for more than one kind of data. In my current apps, they're used to store:

  1. Cookie tokens. In fact, the session id is the cookie name.
  2. Common user information, like user id, name, and permissions, and
  3. Workflow state, such as when a user builds up an action over multiple pages using multiple forms.

The problem is that each of these three kinds of data has a different lifecycle. The session id tends to get recreated often as sessions and cookies time out (taking all of the rest of the data with it). The user info tends to change very rarely, being nearly read-only, but is often read on every page request (for example, to display the user's name in a corner, or to apply the user's timezone to time output). Workflow data, in contrast, persists for a few seconds or minutes as the user completes a particular task, and is then discardable at the end of the process; it never needs concurrency isolation, because the user is working synchronously through a single task.

Sessions traditionally lump all of these together into a single bag of attributes, and place the entire bag under a single large lock. What we could really use is a solution that had finer-grained control over locking for each kind of data, even for each kind of info or workflow!

Solution: Slates

We can achieve all of the above by abandoning sessions. Let's face it: sessions were cool when they were invented but they're showing their age. And rather than try to patch them up and keep calling them "sessions", I'm inventing something new: "slates".

I'm implementing slates in MongoDB, but you don't have to in order to get the benefits of slates. All you need is some sort of storage that uses atomic commits, and that allows you to partition such that you have a moderate number of "collections" (one for each user, plus a special "_auth" collection), and a moderate number of "documents" (one for each use case) in each collection. Let's look at an example:


$ mongo
MongoDB shell version: 1.6.2
connecting to: 127.0.0.1/test
> use slates
switched to db slates
> show collections
_auth
admin
> db.admin.find()
{ "_id" : "user", "userid" : 999, "readonly" : false,
  "timezone" : null, "panels" : [
    [1, "pollingpoint"],
    [2, "unsampled"],
    [4, "test_redirect"],
    [6, "test_redirect_manual"]
], "staff" : true }
{ "_id" : "new_id_set", "name" : "My set",
  "ids" : [ 84095, 3943, 39845, 112, 9458, ... ] }

As you can see, there is a collection for the username "admin". It contains 2 documents.

User info

The first returned document is what I called "user info" above: things most pages want to know about the logged-in user. They're read for almost every request but changed hardly ever, and when they're read, it's very near the beginning of the request. Here's the Python code I use to grab the whole document:

request.user = Slate(username).user

...which is API sugar for:

request.user = pool.slates[username].find_one('user') or {}

Most pages perform this quick read and never write it back.
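
For the curious, here is one way that kind of sugar could be built. This is a sketch, not my production class: FakeCollection is an in-memory stand-in for a per-user MongoDB collection, and this Slate takes the store as an explicit argument instead of using a global pool. Both are assumptions made so the snippet runs on its own.

```python
class FakeCollection(object):
    """In-memory stand-in for one per-user collection (not MongoDB)."""

    def __init__(self):
        self.docs = {}

    def find_one(self, doc_id):
        # Return a copy of the document, or None if it is absent.
        doc = self.docs.get(doc_id)
        return dict(doc) if doc is not None else None


class Slate(object):
    """Hypothetical wrapper: attribute access reads one whole document."""

    def __init__(self, store, username):
        self._collection = store.setdefault(username, FakeCollection())

    def __getattr__(self, name):
        # Only reached for attributes that normal lookup did not find.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._collection.find_one(name) or {}


store = {}
store.setdefault("admin", FakeCollection()).docs["user"] = {
    "userid": 999, "readonly": False, "staff": True}

user = Slate(store, "admin").user            # the 'user' document
missing = Slate(store, "admin").new_id_set   # absent document gives {}
print(user, missing)
```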

Workflow data

The second document returned above is workflow data for a domain-specific process I called 'new_id_set': the user uploads a large number of ids in a CSV file and gives them a name. If there are problems with a few of the ids, we want to ask the user whether to discard the conflicts or continue anyway. But we don't want to go making records in our Postgres database tables until the numbers are confirmed, and it's prohibitive to have the client upload the same file again after confirmation. So we need a temporary place to stick this data while the user is in the middle of the activity.

Slates to the rescue! Unlike sessions, which tend to dump all their data into a single big bag, when we use slates we store our data in multiple 'bags'. That means that our user can upload their ids, be prompted for confirmation, go elsewhere to investigate the conflicts further, and come back and confirm the ids. The time they spend investigating incurs no performance penalty, because those pages don't load and re-save the 'new_id_set' slate--only the pages directly concerned with that particular slate do. Once the user has confirmed the upload, the slate is deleted.
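
The lifecycle of that workflow slate can be sketched roughly as follows. The three helper functions and the plain dict standing in for the per-user collection are assumptions for illustration; a real storage layer would replace each one with a single atomic operation on one document.

```python
import copy

# In-memory stand-in for the per-user collections (an assumption).
slates = {"admin": {}}


def save_slate(username, slate_id, data):
    """Replace one document; every other document is untouched."""
    slates[username][slate_id] = copy.deepcopy(data)


def load_slate(username, slate_id):
    """Return a copy of one document, or {} if it does not exist."""
    return copy.deepcopy(slates[username].get(slate_id) or {})


def delete_slate(username, slate_id):
    slates[username].pop(slate_id, None)


# 1. Upload page: park the parsed ids until the user confirms.
save_slate("admin", "new_id_set",
           {"name": "My set", "ids": [84095, 3943, 112]})

# 2. Unrelated pages read only the 'user' slate; 'new_id_set' is never loaded.
user = load_slate("admin", "user")

# 3. Confirmation page: load the pending set, act on it, then discard it.
pending = load_slate("admin", "new_id_set")
confirmed_ids = pending["ids"]
delete_slate("admin", "new_id_set")
print(confirmed_ids, slates["admin"])
```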

Auth tokens

Most of the use cases for slates fit nicely into "user slates"; that is, a collection that is identified by the user's username. But when you receive an auth token in a cookie, how do you match it to a username so you can look up the slate?

The answer is to create a special, global slate which I named "_auth" in my implementation. You can name it whatever you like. This collection contains a map from tokens to usernames:


> db._auth.find()
{ "_id" : "abcdef09345", "token" : "94ee8f572",
  "username" : "admin",
  "expires" : "Wed Sep 22 2010 13:39:51 GMT-0700 (PDT)"}

When a user visits a page, their token is searched for in the "_auth" collection, the username is retrieved, and that value is stored for the request. Typically, their "user info" slate is then retrieved. Finally, if they are visiting a page that participates in a slate-based workflow, that slate is retrieved (and saved if any changes are made).
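
That lookup chain might look something like the sketch below. The dicts are in-memory stand-ins for the "_auth" collection and the user slates, the auth dict is keyed directly by token, and 'expires' is stored as a Unix timestamp instead of the date string shown above; all three are simplifying assumptions, as is the user_for_token name.

```python
import time

# In-memory stand-ins; keyed by token, with a numeric 'expires' (assumptions).
auth = {"94ee8f572": {"username": "admin", "expires": time.time() + 3600}}
slates = {"admin": {"user": {"userid": 999, "staff": True}}}


def user_for_token(token, now=None):
    """Return (username, user_info), or (None, {}) for a bad or expired token."""
    now = time.time() if now is None else now
    entry = auth.get(token)
    if entry is None or entry["expires"] <= now:
        return None, {}
    username = entry["username"]
    # Expiring the token never touches the user's slates.
    return username, slates.get(username, {}).get("user") or {}


print(user_for_token("94ee8f572"))
print(user_for_token("no-such-token"))
```

An expired token only sends the user back to log in; the 'user' slate and any workflow slates are still there when they return.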

Conclusion

Slates provide finer-grained locking than sessions in order to meet the varying needs of auth tokens, user info, and workflow data. They lock for much shorter durations, over smaller scopes, and take advantage of the native atomicity of the storage layer (MongoDB, in my case) allowing much more parallelism between requests.

09/01/10

Permalink 02:26:27 pm, by fumanchu Email , 130 words   English (US)
Categories: IT

Shoji Catalog Protocol version 2

I've updated the Shoji Catalog Protocol to draft version 02. See http://www.aminus.org/rbre/shoji/shoji-draft-02.txt

The only significant change is that shojiCatalogs, shojiFragments, and shojiViews elements now use an object instead of an array for their IRIs. That is, instead of:

{"element": "shoji:catalog",
 "self": "http://example.org/users",
 "catalogs": ["bills", "sellers", "sellers{?sold_count}"]
}

one would now write something like:

{"element": "shoji:catalog",
 "self": "http://example.org/users",
 "catalogs": {"bills": "bills",
              "sellers": "sellers",
              "sellers by sold count": "sellers{?sold_count}"
              }
}

This allows clients to bind to a more meaningful name across varying documents rather than a potentially opaque and varying URI. In this way, the names function somewhat like link relation types (e.g. the "rel" attributes in HTML, or the relation types in Link headers).
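
On the client side, binding by name could be as simple as the sketch below. The catalog_ref helper is a hypothetical name of mine, and the snippet stops at returning the reference as written in the document; resolving it against "self" and expanding the template are left out.

```python
import json

# The draft-02 form of the example document.
doc = json.loads("""
{"element": "shoji:catalog",
 "self": "http://example.org/users",
 "catalogs": {"bills": "bills",
              "sellers": "sellers",
              "sellers by sold count": "sellers{?sold_count}"}
}
""")


def catalog_ref(document, name):
    """Return the reference bound to `name`, exactly as the document gives it."""
    return document["catalogs"][name]


# The client depends on the name, not on the shape of the URI behind it.
template = catalog_ref(doc, "sellers by sold count")
print(template)
```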

03/11/10

Permalink 08:29:40 pm, by fumanchu Email , 11 words   English (US)
Categories: IT, Python, CherryPy

Zen of CherryPy video

My PyCon 2010 talk video is up. Enjoy: The Zen of CherryPy

10/17/09

Permalink 10:52:56 am, by fumanchu Email , 6 words   English (US)
Categories: General

On Optima

Link: http://www.youtube.com/watch?v=sdGfh8a98jw

"Optimum" is for machines, not organisms.

09/20/09

Permalink 09:25:39 pm, by fumanchu Email , 136 words   English (US)
Categories: General

Dear Lionsgate,

I like your movies. Forbidden Kingdom was great, except for the whiny kid. I just bought it on DVD, even though it's been out for months, and I wanted to clear up why. It's not laziness. It's not the economy; I'm doing fine.

No, I held off buying it because, for a long time, all your copies on the shelf at Fry's and Best Buy included a "bonus digital copy". I can't support that. First, because the initial "D" in "DVD" does not stand for "Analog", but more importantly, because my laptop plays DVD's just fine, repeatedly, without any limited-viewing "bonuses" needed. You must be fooling somebody with that, if the great P.T. Barnum has any say, but not me. Please stop doing stupid things like that so I can give you more money sooner.
