
12/30/07

Permalink 09:05:32 pm, by fumanchu Email , 1036 words   English (US)
Categories: IT

Memcached set invalidation

One of the saving graces of Memcached is its use of a stable, static address space; that is, client X and client Y can each manipulate datum D as long as they both know its key. But because the space of addresses is static and flat (not hierarchical), it also tends to be huge, and sparse. This can make it difficult to perform set operations, such as invalidation of a class of entries.

For example, let's design a Data Access component which sits in front of a database and accepts requests using an RPC-style interface. It fetches results from the cache where possible; otherwise, it reads from the database and writes the result to the cache before returning it to the caller. Assume we have multiple Data Access Layer (DAL) servers, multiple Memcached servers, and one or more database servers.

A common set-invalidation scenario involves the caching of lists of items. Let's suppose a web server requests get_items_in_cart(cart_id=13, limit=10, offset=30). The DAL server might translate this into a cache key such as get_items_in_cart::13:10:30. So far so good. But that's just a read; when we add writes to the picture, cache coherency starts to become a problem.
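Concretely, the read path might look something like this (a minimal sketch using the python-memcached client; the server addresses and db_fetch_items are placeholders, and the key format is just the one described above):

import memcache

mc = memcache.Client(['10.0.0.1:11211', '10.0.0.2:11211'])

def get_items_in_cart(cart_id, limit=10, offset=0):
    key = 'get_items_in_cart::%d:%d:%d' % (cart_id, limit, offset)
    items = mc.get(key)
    if items is None:
        # Cache miss: read from the database, then populate the cache.
        items = db_fetch_items(cart_id, limit, offset)   # hypothetical DB helper
        mc.set(key, items, time=300)                     # expire in 300 seconds
    return items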

When a webserver asks a DAL component to add an item to cart 13, we need to invalidate (or recalculate) get_items_in_cart::13:10:30. However, it should be readily apparent that we need to invalidate not just a single key, but potentially a whole class of keys, get_items_in_cart::id:limit:offset, and that that set of keys could be very large. Let's conservatively guess:

>>> max_items = 1000
>>> sum([(max_items / n) for n in range(5, 21)])
1507

That is, if we restrict the 'limit' argument to the interval [5, 20], and assume a maximum of 1000 items per cart, we end up with over 1500 potential cache keys per cart for a single function! And if we have a million carts...? There are several ways we could attack this problem:

  1. Restrict the interface. Don't parameterize functions unless absolutely necessary. This would mean, rather than expose get_items_in_cart(cart_id, limit, offset), which has 3 arguments, we might instead always assume a page size of 10 and expose get_paged_items_in_cart(cart_id, page), which would result in only 100 potential cache keys per cart, not 1500. We might even go further and expose get_items_in_cart(cart_id) (1 cache key per cart) and let the caller do its own paging. This makes cache invalidation cheaper (fewer keys to manage) at the cost of API flexibility and extensibility. So your programmers will scream, possibly silently, when the product team decides they want to change the number of items on a page.
  2. Use an address registry. This approach introduces a separate directory service; when the result of get_items_in_cart(cart_id, limit, offset) is cached, the address is registered in a central list. When a class of keys needs to be invalidated, that central registry can then iterate over all seen keys for each class quickly and stably (or return the list so the caller can do it). This can reduce performance with the extra network traffic, is not very scalable, and tacks on a new reliability issue (the directory is a single point of failure, which costs even more performance if made redundant). Also, your system architects will probably leave you threatening notes.
  3. Broadcast address hits. This is like #2, but broadcasts address registration to all DAL servers instead of using a central registry. Network traffic is much increased, and reliably updating all participating nodes can be quite a chore. Your network admins will most likely start randomly unplugging your office's LAN and phones, and then surreptitiously plug them back in when you knock on their door.
  4. Hash the addresses. If the consumers of our API tend to call only a few combinations of arguments for our cart function, then although we have a large potential space of cache keys, the set of actual keys might be very small, maybe only 2 or 3 keys! The classic means of reducing a large, sparse array to a small, denser one is hashing. That is, rather than using the key get_items_in_cart::13:10:30, we might hash it to get_items_in_cart::13/x, where 'x' is a number from 1 to 4. If we could come up with a reliable hash, we could then invalidate 4 keys, not 1500. This approach can become very complex, and results in cache keys which aren't very transparent (so getting other components to interact with the system is harder). But it doesn't require any extra network calls or new components; each DAL server does its own hash calculations. Your VP of Engineering may stick you with the bill for all those new engineers with Math Ph.D.'s, though.
  5. Rely on timeouts. If you don't mind web servers serving stale data for get_items_in_cart::13:10:30 for, say, 2 minutes, then tell memcached to expire that entry after 2 minutes, and don't bother trying to invalidate it on write. This is by far the simplest solution when you can swing it. But your system operators or QA team might call you first when they have to wait for one too many timeouts during a 3 A.M. debugging session.
  6. Add a version column to the carts table, and use it when forming cache keys. When you update the row, update the version. This can increase the number of database hits if not done properly, but in this example, your callers most likely have already fetched the entire relevant carts row, and might be able to pass the version in the DAL call (sketched below). However, don't be surprised when your DBAs tell you the HR DB crashed and they lost your timesheets for the past year.
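Here's a rough sketch of that version-key approach, reusing the read path and mc client from the earlier sketch (db_fetch_items, db_add_item, and db_bump_cart_version are hypothetical helpers; assume the caller already has the cart row, version included):

def get_items_in_cart(cart_id, cart_version, limit=10, offset=0):
    # The cart's version is baked into the key, so bumping the version on
    # any write implicitly invalidates every cached page for that cart.
    key = 'get_items_in_cart::%d.v%d:%d:%d' % (cart_id, cart_version, limit, offset)
    items = mc.get(key)
    if items is None:
        items = db_fetch_items(cart_id, limit, offset)
        mc.set(key, items, time=300)
    return items

def add_item_to_cart(cart_id, item):
    db_add_item(cart_id, item)
    db_bump_cart_version(cart_id)   # e.g. UPDATE carts SET version = version + 1 WHERE id = ...

Note that nothing ever deletes the entries built from the old version; they simply stop being read and age out of memcached on their own.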

That's all I can think of at the moment. I think I've got some political goodwill among our other programmers at the moment (in that they don't wish me any specific harm, yet), so I may go with solution #1 for now on my current project. But I've also recently been playing architect, so maybe I'll pick solution #2 and just throw away any threatening notes I leave for myself.

12/24/07

Permalink 07:32:22 pm, by fumanchu Email , 70 words   English (US)
Categories: IT, General

Complexity: the mismatch between needs, does, and wants

ocean has rapidly become my favorite blogger. I even find myself reading nearly every one of his quick links. Here's a gem from a few days ago:

This mismatch between what a person wants, what a tool does, and what a person needs turns out to be very important. It's so important that it has a special name: complexity.

11/12/07

Permalink 11:30:19 am, by fumanchu Email , 1791 words   English (US)
Categories: IT, Python, Dejavu

Memcached indices

It seems lots of people are using memcached to cache both a set of objects (each with their own key), and also various lists of those objects using different keys. For example, a retailer might cache Item objects, but also want to cache the list of Items in a given Category. The SQL before memcached might look like this:

SELECT * FROM Item WHERE CategoryID = 5;

...whereas with memcached mixed in, the I/O tends to look like this (against an empty cache):

get Item:CategoryID=5
END
SELECT ID FROM Item WHERE CategoryID = 5;
set Item:CategoryID=5 1 300 19
1111,2222,3333,4444
STORED
get Item:1111
END
get Item:2222
END
get Item:3333
END
get Item:4444
END
SELECT * FROM Item WHERE ID IN (1111, 2222, 3333, 4444)
set Item:1111 1 300 58
STORED
set Item:2222 1 300 58
STORED
set Item:3333 1 300 54
STORED
set Item:4444 1 300 80
STORED

That is, fetch the list of ID's from the cache; if not present, fetch it from the DB and store it in the cache (the "300" in the above examples means, "expire in 300 seconds"). Then iterate over the list of ID's and try to fetch each whole object from cache; if any miss, ask the DB for them (in as few operations as possible) and store them in the cache.
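In code, the pattern behind that I/O looks roughly like this (a sketch where mc is a python-memcached memcache.Client and the sql_fetch_* helpers are hypothetical):

def get_items_by_category(category_id):
    list_key = 'Item:CategoryID=%d' % category_id
    ids = mc.get(list_key)
    if ids is None:
        ids = sql_fetch_ids(category_id)        # SELECT ID FROM Item WHERE CategoryID = ...
        mc.set(list_key, ids, time=300)         # expire in 300 seconds

    items, missing = {}, []
    for i in ids:
        obj = mc.get('Item:%d' % i)             # (get_multi would batch these round trips)
        if obj is None:
            missing.append(i)
        else:
            items[i] = obj
    if missing:
        for obj in sql_fetch_items(missing):    # SELECT * FROM Item WHERE ID IN (...)
            items[obj.ID] = obj
            mc.set('Item:%d' % obj.ID, obj, time=300)
    return [items[i] for i in ids]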

Once both the objects and the list-of-id's are cached, subsequent calls to a hypothetical get_items_by_category function should look like this:

get Item:CategoryID=5
sending key Item:CategoryID=5
END
get Item:1111
sending key Item:1111
END
get Item:2222
sending key Item:2222
END
get Item:3333
sending key Item:3333
END
get Item:4444
sending key Item:4444
END

But what happens when you move Item 3333 from Category 5 to Category 6? There are three possibilities:

  1. Your timeouts are reasonable. In this case, you're OK with some ambiguity; specifically, you're OK that all clients will continue to see Item 3333 in the old Category list for up to 300 seconds. If you ask the Item object directly for its Category, you'll get "6", but if you ask for all Items in Category 5, it'll still show up in the "wrong" category. It might even show up in both Category lists, if you refetch Item:CategoryID=6 from the DB before the cached Item:CategoryID=5 list expires.
  2. Your timeouts are unreasonable. You thought 300 seconds was a good expiration time because that's the minimum you could get away with to sufficiently reduce load on your poor single-point-of-failure DB. In this case, you're not comfortable with the ambiguity, so you start adding invalidation calls to clean out the cached lists sooner. Lots of them. Spread out all over your application.
  3. Your timeouts are 0, meaning "never expire". This is like #2, but means your clients will probably never hit the DB on their own--you MUST explicitly add code to update both the cached objects and the cached lists whenever an object changes, or expect an unusable site.

If you're happy with option 1, great! The rest of this discussion probably isn't for you. I'm going to explore three solutions (only one of which I'm happy with) for cases 2 and 3.

Database Indices

So you're not happy with the expiration time of your cached lists, so you've built all that invalidation code. What you may not realize is that you've just reinvented (badly) something databases have had for decades: indices.

An index is usually implemented with a B+-tree, most of the details of which are unimportant for us. What is important is that 1) an index covers a subset of the columns in the table (often a single column), which from now on I'm going to call the index criteria, and 2) each distinct combination of values for the index criteria has its own leaf node in the tree, which contains/points to a list of rows in the table that match that combination. What a mouthful. Another way to say it is that the leaf nodes in the tree look like this for an index over Item.CategoryID:

(2,): [9650, 2304, 22, 50888]
(3,): [323, 3000, 243246, 87346, 6563, 8679]
(5,): [1111, 2222, 3333, 4444]
(6,): [18]

When you ask the database for all Items with a CategoryID of 5, the database traverses the "CategoryID" index tree, finds the leaf node for "5", grabs the list stored at that node, then iterates over the list and yields each full object mentioned therein. This is called an "index scan":

# EXPLAIN SELECT * FROM Items WHERE CategoryID = 5;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Index Scan using items_categoryid on Items  (cost=0.00..29.49 rows=609 width=24)
   Index Cond: (CategoryID = 5)

Sound familiar? It's exactly what we're doing by hand with memcached.

Memcached Indices

Okay, it's not exactly like our memcached example. There are some striking differences.

First, a database index is sparse, in the sense that it doesn't contain leaf nodes for every potential index criteria value, just the concrete values in the table. Our memcached indexing is even sparser: so far it only contains leaf nodes (lists of ID's) for each index scan we've run. If we've only asked for Items in Categories 2 and 5, memcached will only contain nodes for Item:CategoryID=2 and Item:CategoryID=5.

Second, a database index is a full tree, with a root node. What we've done so far in memcached is only store leaf nodes. This will bite us in a moment.

Third, a database index is naturally transactional. When you move Item 3333 from Category 5 to 6, you might execute the SQL statement "UPDATE Item SET CategoryID = 6 WHERE ID = 3333;". The database, being the sole arbiter of Truth, can lock the row, read the old value, lock the index leaf nodes, remove the row from the old leaf node and add it to the new one, write the new value to the table, and unlock, all within a fully-logged transaction (although it can be a lot more complicated than that with ranged index locks, log schemes, and page reallocation schemes). Our memcached implementation so far can't do any of that.

Combining the above differences, we get...a real mess. Specifically, we have to find a way to do all of those index updates.

One way would be to invalidate everything. Call flush_all and be done with it. This can work (poorly) if you've partitioned your data well over multiple memcached servers, but Woe Unto Thee if you're storing, say, cached HTML on the same node.

Another, narrower, solution would be to try to delete all known cached lists for the given criteria. One way to do that would be to attempt to maintain the whole index tree in memcached, not just the leaf nodes. This turns out to be a thorny problem because of the transitory nature of memcached data--what happens when you lose an intermediate node in the index tree? or the root node? You'll fail to update the now-unreachable leaf nodes, but clients will still fetch them and get the stale results.

An even narrower solution would be to try to update just two index leaf nodes for each updated object. For example, if we move Item 3333 from Category 5 to 6, we could try to remove the Item from leaf node "5" and add it to leaf node "6". This can work, but requires keeping the old values around for all indexed columns in your application layer, which a lot of modern DALs and ORMs don't do by default.

I was stuck at this point for a couple of days, until I had an epiphany:

Add eagerly, remove lazily

Recall if you will what I said above about the transactional nature of database indices. The DB can remove the row from the old index node(s) and add it to the new index node(s) in one step. We already know we can't do that with memcached: "get" and "set" are each atomic, but a get-then-set sequence is not (although check-and-set can work around this; see 'cas' in protocol.txt).

Transactions exist to maintain data integrity; to move The Data from one valid state to another valid state. But don't confuse the technique with the phenomena it is trying to prevent. In the case of indices, we use transactions on indices to avoid two things: 1) reading an object that does not meet the index criteria for the given node, and 2) failing to read an object that does meet the criteria. Databases avoid both scenarios by adding and removing rows from the index atomically.

When dealing with index nodes in memcached, however, the best approach is to separate the two phenomena by adding eagerly and removing lazily (a rough sketch follows the list):

  • Add eagerly: when an object (row) is flushed from the session, add it to any indices for that class (table). That is, whenever you "UPDATE Item SET CategoryID = 6 WHERE ID = 3333", you cache the object (you were doing that already anyway), and you also append the ID to the list stored at Item:CategoryID=6.
  • Remove lazily: when you UPDATE you do not try to remove the item from the old index. Instead, you let the next client that reads Item:CategoryID=5 do that when it iterates over the objects in the list. If any objects no longer meet the index node criteria (CategoryID=5), they are removed from the list, and the client sets the revised index node back into the cache.
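A minimal sketch of both halves, continuing the get_items_by_category example from above (mc is the same memcache.Client; read_through stands in for the cache-miss path sketched earlier, and cas-based race handling plus re-fetching of evicted objects are glossed over):

def on_item_saved(item):
    # Add eagerly: after the INSERT/UPDATE, cache the object itself and
    # append its ID to the index node for its *new* CategoryID.
    mc.set('Item:%d' % item.ID, item, time=300)
    index_key = 'Item:CategoryID=%d' % item.CategoryID
    ids = mc.get(index_key)
    if ids is not None and item.ID not in ids:
        ids.append(item.ID)
        mc.set(index_key, ids, time=300)

def get_items_by_category(category_id):
    index_key = 'Item:CategoryID=%d' % category_id
    ids = mc.get(index_key)
    if ids is None:
        return read_through(category_id)   # the cache-miss path from the earlier sketch
    # Remove lazily: drop any object that no longer matches this node's
    # criteria, then write the pruned list back into the cache.
    items = [mc.get('Item:%d' % i) for i in ids]
    fresh = [obj for obj in items if obj is not None and obj.CategoryID == category_id]
    if len(fresh) != len(ids):
        mc.set(index_key, [obj.ID for obj in fresh], time=300)
    return fresh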

Shortcomings

There are several:

  1. A client might recall the index node but not the objects in its list. For example, a web page might just need hyperlinks to each object, which it tries to generate without recalling the entire set of objects. This technique succeeds in that it doesn't fetch more data than the app asked for, but fails if stale objects are still listed in the index node. The only solutions I can see are to either disallow this sort of index read or to fall back on sane timeouts again.
  2. This doesn't address order, limit, or offset explicitly. Simple order queries should be no problem, but if you want the top N items you must either include the 'order by' columns in your index, or read in all items in the index, filter out the stale objects, sort them, and perform the limit/offset yourself.
  3. Functionally decomposing SQL (or whatever QL your ORM might use) into potential index criteria is not a simple task. If you're using a single table.column = value expression, it maps well, but using joins, boolean operators (like and, or, not) or other arithmetic operators makes it difficult. But then, the major databases have the same issues.
  4. Whatever you all leave in the comments.

Conclusion

I think this is worth a shot. I'm adding it to Dejavu in a branch, but it's complicated enough that it may not be done for a bit. Comments and suggestions on the approach welcome.

11/03/07

Permalink 05:33:51 pm, by fumanchu Email , 169 words   English (US)
Categories: Python

pickle.dumps not suitable for hashing

For reasons I don't have time to fully explore, once in a great while pickle.dumps(obj) doesn't produce consistent strings on successive runs. Here's the one that bit me today:

((2, 'Tiger River'), "(I2\nS'Tiger River'\np1\ntp2\n.", '23ca69094eb994abc75cdec989d22398')
((2, 'Tiger River'), "(I2\nS'Tiger River'\ntp1\n.", '2b40ffb53a0be2c4cfe4f99b24d64842')

The first item in each tuple is the object being pickled. The second is the result of pickle.dumps(obj). Note that the same object is pickled to two different strings on distinct runs. No idea why.

This is important enough to blog about because the third item in each tuple above is an md5 hash of the pickle, which is how I discovered this--my attempts to recover memcached objects using keys from md5.new(pickle.dumps(object)).hexdigest() failed because the pickles differed. So until someone with more time and brains than I comments on why, I recommend you don't use pickle.dumps to create md5 seeds.
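If all you need is a stable cache key (not a round-trippable serialization), one workaround is to hash a canonical string you build yourself rather than a pickle. A sketch, assuming the key parts are simple builtins (ints, strings, tuples of them) whose repr() is deterministic:

import md5

def cache_key(*args):
    # repr() of simple builtin values is stable across runs,
    # unlike pickle.dumps.
    return md5.new(repr(args)).hexdigest()

cache_key(2, 'Tiger River')   # same digest on every run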

10/02/07

Permalink 12:55:58 am, by fumanchu Email , 159 words   English (US)
Categories: CherryPy, WSGI

CherryPy 3 request_queue_size

Well, that was instructive. Leaving server.request_queue_size at the default 5:

C:\Python24\Lib\site-packages>python cherrypy\test\benchmark.py
Starting CherryPy app server...
Started in 1.10800004005 seconds

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  736.81 |    1.357 | 119.36 |
     20 |      1000 |      0 |  436.07 |    2.293 |  70.64 |
     30 |      1000 |      0 |  348.38 |    2.870 |  56.44 |
     40 |      1000 |      0 |  233.10 |    4.290 |  37.76 |
     50 |      1000 |      0 |  296.77 |    3.370 |  48.08 |
Average |    1000.0 |    0.0 | 410.226 |    2.836 | 66.456 |

Client Thread Report (1000 requests, 14 bytes via staticdir, 10 server threads):

threads | Completed | Failed | req/sec | msec/req | KB/sec |
     10 |      1000 |      0 |  421.73 |    2.371 |  87.30 |
     20 |      1000 |      0 |  374.87 |    2.668 |  77.60 |
     30 |      1000 |      0 |  306.71 |    3.260 |  63.49 |
     40 |      1000 |      0 |  240.08 |    4.165 |  49.70 |
     50 |      1000 |      0 |  170.03 |    5.881 |  35.20 |
Average |    1000.0 |    0.0 | 302.684 |    3.669 | 62.658 |

Size Report (1000 requests, 50 client threads, 10 server threads):

    bytes | Completed | Failed | req/sec | msec/req |   KB/sec |
       10 |      1000 |      0 |  187.98 |    5.320 |    29.70 |
      100 |      1000 |      0 |  207.45 |    4.820 |    51.45 |
     1000 |      1000 |      0 |  186.89 |    5.351 |   210.81 |
    10000 |      1000 |      0 |  228.12 |    4.384 |  2262.07 |
   100000 |      1000 |      0 |  245.60 |    4.072 | 24022.01 |
100000000 |      1000 |     10 |   20.83 |   48.001 | 20358.12 |

Upping server.request_queue_size to 128:

C:\Python24\Lib\site-packages>python cherrypy\test\benchmark.py
Starting CherryPy app server...
Started in 1.10700011253 seconds

Client Thread Report (1000 requests, 14 byte response body, 10 server threads):

threads | Completed | Failed | req/sec | msec/req |  KB/sec |
     10 |      1000 |      0 |  745.38 |    1.342 |  120.75 |
     20 |      1000 |      0 |  772.32 |    1.295 |  125.12 |
     30 |      1000 |      0 |  654.11 |    1.529 |  105.97 |
     40 |      1000 |      0 |  929.02 |    1.076 |  150.50 |
     50 |      1000 |      0 |  641.03 |    1.560 |  103.85 |
Average |    1000.0 |    0.0 | 748.372 |   1.3604 | 121.238 |

Client Thread Report (1000 requests, 14 bytes via staticdir, 10 server threads):

threads | Completed | Failed | req/sec | msec/req |  KB/sec |
     10 |      1000 |      0 |  547.89 |    1.825 |  113.41 |
     20 |      1000 |      0 |  588.10 |    1.700 |  121.74 |
     30 |      1000 |      0 |  704.42 |    1.420 |  145.82 |
     40 |      1000 |      0 |  547.89 |    1.825 |  113.41 |
     50 |      1000 |      0 |  516.96 |    1.934 |  107.01 |
Average |    1000.0 |    0.0 | 581.052 |   1.7408 | 120.278 |

Size Report (1000 requests, 50 client threads, 10 server threads):

    bytes | Completed | Failed | req/sec | msec/req |   KB/sec |
       10 |      1000 |      0 |  622.35 |    1.607 |    98.33 |
      100 |      1000 |      0 |  604.74 |    1.654 |   149.37 |
     1000 |      1000 |      0 |  667.74 |    1.498 |   752.54 |
    10000 |      1000 |      0 |  890.31 |    1.123 |  8837.25 |
   100000 |      1000 |      0 |  728.44 |    1.373 | 71247.09 |
100000000 |      1000 |    202 |   12.81 |   78.094 |     None |
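For reference, request_queue_size is the listen() backlog for the server socket, and it's a global server setting. Using the server.request_queue_size key named above, the change would look something like this in code (or the equivalent line in a config file):

import cherrypy
cherrypy.config.update({'server.request_queue_size': 128})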

08/23/07

Permalink 09:11:05 am, by fumanchu Email , 83 words   English (US)
Categories: Python

Simplify, maaaaan

Glyf said:

I've experienced quite a bit of confusion when talking to other python programmers about this stuff, but I'm not sure if it was my awkward explanation of Twisted's plugin system or some inherent issue in Python's path management.

If the implementation is hard to explain, it's a bad idea.

And if you would allow me to comment on your blog without becoming a LiveJournal user, I'd really appreciate it.

08/18/07

Permalink 07:00:00 pm, by fumanchu Email , 2186 words   English (US)
Categories: Python, Dejavu

Storm, SQLAlchemy, and Geniusql

As Mike Bayer mentioned, I recently did some cross-ORM comparisons. The charts below all show "number of requests" on the X axis and "number of seconds to complete those requests" on the Y axis.

Update: Mike found some problems in my methodology and posted a very thorough "revisiting" post. Definitely read that if you read this.

Methodology

I used trunk versions of Geniusql (1.0) and SQLAlchemy (0.4, without using the orm subpackage), and the latest stable distro of Storm (0.10). All test runs were done on my Win2k laptop, which has a 2GHz CPU and 512MB of RAM. PostgreSQL 8.0.0rc1 was the DB engine, and psycopg2 was the driver.

Scripts and a CSV are here: http://www.aminus.org/rbre/python/ormcompare/ I'd be happy to be notified I did something wrong in the SQLAlchemy or Storm code (especially if you can patch it to do better ;) ).

Results and Analysis

As with all benchmarks, take this with a heavy grain of salt. The numbers say far more about the values and designs of each ORM than they do about the performance of particular subsystems in each ORM. To wit:

Step 2: Insert

Inserting single objects: [benchmark chart for step 2: insert]

SQLAlchemy is doing something that Geniusql and Storm aren't doing on every insert. A trace of SQLAlchemy and Geniusql shows Geniusql making 13 Python function calls for a single simple insert, whereas SQLAlchemy makes over 400! What could it be doing?

Apparently, SQLAlchemy is spending a lot of time generating bound parameters. That is, it generates the SQL INSERT INTO "Animal" ("ID", "Name", "Species", "Legs") VALUES (:ID, :Name, :Species, :Legs), which Geniusql doesn't do. Presumably it does this for speed reasons? If so, the benchmark results seem to argue that it's a losing battle; the auto-generation of optimized code is costing more than the savings gained from the optimization.

It also spends far too long fetching the autoincremented ID's, using 73 Python function calls; Geniusql manages it in 4: fetch, log, a __getattr__, and a pull method to coerce the new ID to the proper Python type. But then, SQLAlchemy is also populating the new object with other (non-autoincrementing) default values, in Python, which Geniusql doesn't do (it relegates that to its ORM layer, Dejavu). I'd bet SQLAlchemy would be better off just fetching the new row (which the DB has populated with default values already anyway).

Here's a trace of Geniusql retrieving the autoincremented ID:

> _grab_new_ids (postgres:520)
  > fetch (psycopg:94)
    > log (objects:794)
    < log (objects:795) 0.047ms
    > __getattr__ (conns:23)
    < __getattr__ (conns:24): <built-in method cursor of psycopg2._psycopg.connection object at 0x00ADB020> 0.052ms
  < fetch (psycopg:120): ([(13L,)], (('last_value', 20, 2, 8, None, None, None),)) 1.232ms
< _grab_new_ids (postgres:527): {'ID': 13L} 1.601ms

and here's an SQLAlchemy trace:

> get_column_default (postgres:616)
  > __get__ (util:191)
  < __get__ (util:195): True 0.062ms
  > get_column_default (base:1530)
    > traverse_single (sql:893)
      > visit_sequence (postgres:633)
        > <lambda> (base:1528)
        < <lambda> (base:1528): <sqlalchemy.databases.postgres.PGDialect object at 0x00A23A30> 0.049ms
        > format_sequence (ansisql:1026)
          > __generic_obj_format (ansisql:1004)
            > _get_case_sensitive (schema:110)
            < _get_case_sensitive (schema:119): True 0.058ms
          < __generic_obj_format (ansisql:1010): 'animal_id_seq' 0.289ms
        < format_sequence (ansisql:1027): 'animal_id_seq' 0.462ms
        > execute (base:712)
          > _execute_text (base:722)
            > __distill_params (base:728)
            < __distill_params (base:735) 0.080ms
            > __create_execution_context (base:761)
              > <lambda> (base:932)
              < <lambda> (base:932): <sqlalchemy.databases.postgres.PGDialect object at 0x00A23A30> 0.051ms
              > create_execution_context (postgres:231)
                > __init__ (default:146)
                  > __encode_param_keys (default:184)
                    > supports_unicode_statements (default:50)
                    < supports_unicode_statements (default:52): False 0.049ms
                    > proc (default:189)
                    < proc (default:193) 0.065ms
                  < __encode_param_keys (default:198) 0.794ms
                  > supports_unicode_statements (default:50)
                  < supports_unicode_statements (default:52): False 0.048ms
                  > create_cursor (postgres:183)
                    > _is_server_side (postgres:180)
                    < _is_server_side (postgres:181): False 0.051ms
                    > _get_connection (base:532)
                    < _get_connection (base:534): <sqlalchemy.pool._ConnectionFairy object at 0x00BDA950> 0.059ms
                    > cursor (pool:308)
                      > __init__ (pool:391)
                      < __init__ (pool:393) 0.061ms
                    < cursor (pool:311): <sqlalchemy.pool._CursorFairy object at 0x00BDAE90> 0.271ms
                  < create_cursor (postgres:190): <sqlalchemy.pool._CursorFairy object at 0x00BDAE90> 0.832ms
                < __init__ (default:174) 2.301ms
              < create_execution_context (postgres:232): <sqlalchemy.databases.postgres.PGExecutionContext object at 0x00BDAD70> 2.487ms
            < __create_execution_context (base:762): <sqlalchemy.databases.postgres.PGExecutionContext object at 0x00BDAD70> 2.860ms
            > __execute_raw (base:764)
              > is_info_enabled (logging:71)
                > isEnabledFor (logging/__init__:1139)
                  > getEffectiveLevel (logging/__init__:1125)
                  < getEffectiveLevel (logging/__init__:1135): 30 0.129ms
                < isEnabledFor (logging/__init__:1145): False 0.334ms
              < is_info_enabled (logging:72): False 0.530ms
              > __execute (base:774)
                > do_execute (default:103)
                  > __getattr__ (pool:404)
                  < __getattr__ (pool:405): <built-in method execute of psycopg2._psycopg.cursor object at 0x00B94978> 0.049ms
                < do_execute (default:104) 2.892ms
              < __execute (base:781) 3.186ms
              > _autocommit (base:685)
                > in_transaction (base:617)
                < in_transaction (base:618): False 0.049ms
                > match (sre:126)
                  > _compile (sre:213)
                  < _compile (sre:218): <_sre.SRE_Pattern object at 0x00BDC020> 0.087ms
                < match (sre:129) 0.270ms
              < _autocommit (base:689) 0.608ms
            < __execute_raw (base:772) 4.783ms
            > result (default:233)
              > get_result_proxy (postgres:192)
                > _is_server_side (postgres:180)
                < _is_server_side (postgres:181): False 0.051ms
                > __init__ (base:1126)
                  > <lambda> (default:176)
                    > <lambda> (default:180)
                      > _branch (base:538)
                        > __init__ (base:524)
                        < __init__ (base:530) 0.114ms
                      < _branch (base:547): <sqlalchemy.engine.base.Connection object at 0x00BD1350> 0.298ms
                    < <lambda> (default:180): <sqlalchemy.engine.base.Connection object at 0x00BD1350> 0.475ms
                    > <lambda> (base:549)
                    < <lambda> (base:549): Engine(postgres://postgres:djvpg@localhost/geniusql_bench) 0.048ms
                  < <lambda> (default:176): Engine(postgres://postgres:djvpg@localhost/geniusql_bench) 0.888ms
                  > is_debug_enabled (logging:68)
                    > isEnabledFor (logging/__init__:1139)
                      > getEffectiveLevel (logging/__init__:1125)
                      < getEffectiveLevel (logging/__init__:1135): 30 0.130ms
                    < isEnabledFor (logging/__init__:1145): False 0.340ms
                  < is_debug_enabled (logging:69): False 0.541ms
                  > is_select (default:219)
                    > match (sre:126)
                      > _compile (sre:213)
                      < _compile (sre:218): <_sre.SRE_Pattern object at 0x00BAB7F0> 0.086ms
                    < match (sre:129): <_sre.SRE_Match object at 0x00BC49F8> 0.261ms
                  < is_select (default:222): True 0.431ms
                  > _init_metadata (base:1151)
                    > _create_key_cache (base:1182)
                      > __init__ (util:51)
                      < __init__ (util:52) 0.049ms
                    < _create_key_cache (base:1204): {} 0.321ms
                    > __getattr__ (pool:404)
                    < __getattr__ (pool:405): (('nextval', 20, 2, 8, None, None, None),) 0.053ms
                    > dbapi_type_map (default:30)
                    < dbapi_type_map (default:34): {} 0.048ms
                    > decode_result_columnname (default:25)
                      > decode (utf_8:15)
                      < decode (utf_8:16): (u'nextval', 7) 0.053ms
                    < decode_result_columnname (default:28): u'nextval' 0.234ms
                    > dialect_impl (types:49)
                    < dialect_impl (types:51): NullType() 0.060ms
                  < _init_metadata (base:1179) 1.758ms
                < __init__ (base:1135) 4.304ms
              < get_result_proxy (postgres:196): <sqlalchemy.engine.base.ResultProxy object at 0x00BD1E30> 4.737ms
            < result (default:234): <sqlalchemy.engine.base.ResultProxy object at 0x00BD1E30> 4.917ms
          < _execute_text (base:726): <sqlalchemy.engine.base.ResultProxy object at 0x00BD1E30> 13.304ms
        < execute (base:715): <sqlalchemy.engine.base.ResultProxy object at 0x00BD1E30> 13.544ms
        > scalar (base:1318)
          > _fetchone_impl (base:1284)
            > __getattr__ (pool:404)
            < __getattr__ (pool:405): <built-in method fetchone of psycopg2._psycopg.cursor object at 0x00B94978> 0.051ms
          < _fetchone_impl (base:1285): (13L,) 0.250ms
          > _process_row (base:1291)
            > __init__ (base:1431)
            < __init__ (base:1436) 0.072ms
          < _process_row (base:1292): (13L,) 0.308ms
          > __getitem__ (base:1462)
            > _get_col (base:1280)
              > __getitem__ (util:53)
              < __getitem__ (util:55): (NullType(), NullType(), 0) 0.063ms
              > convert_result_value (types:193)
              < convert_result_value (types:194): 13L 0.048ms
            < _get_col (base:1282): 13L 0.503ms
          < __getitem__ (base:1467): 13L 0.694ms
          > close (base:1206)
            > close (pool:398)
            < close (pool:400) 0.063ms
            > <lambda> (base:1140)
              > <lambda> (default:182)
              < <lambda> (default:182): <sqlalchemy.engine.base.Connection object at 0x00BDACF0> 0.048ms
            < <lambda> (base:1140): <sqlalchemy.engine.base.Connection object at 0x00BDACF0> 0.229ms
            > <lambda> (base:552)
            < <lambda> (base:552): False 0.050ms
          < close (base:1219) 0.796ms
        < scalar (base:1327): 13L 2.721ms
      < visit_sequence (postgres:635): 13L 17.369ms
    < traverse_single (sql:896): 13L 17.584ms
  < get_column_default (base:1532): 13L 17.773ms
< get_column_default (postgres:631): 13L 18.196ms

Step 3: Properties

Retrieving rows/objects by ID: [benchmark chart for step 3: properties]

Again, Storm and Geniusql are close but SQLAlchemy is not. There must be some things that SQLAlchemy is doing that the others are not.

Geniusql uses the pattern: parse expression, write SQL, fetch, iterate over result rows (and coerce inbound values as they are yielded).

Storm does it a little differently. The first time you call store.find (at least, I'm assuming it's only the first time), Storm pre-constructs SQL for you, including INSERT INTO, autoincrement ID getters, SELECT by ID, etc. It's the actual find().one() call that does the traditional work. The only significant difference from Geniusql in this step is that Storm's inbound type adapters seem to be slower (by a factor of 10 or so).

There's no single thing I can point to that SQLAlchemy does in this case that the others don't; it just seems to take more steps in general for each phase. The correlations seem to be taking the most time; SQLAlchemy takes about 5 times longer to generate the SQL than Geniusql does. And, like in the insert step, I'm not sure the bind params are helping SQLAlchemy. But then, Storm does parameter binding and manages to stay fast.

Step 4: Expressions

Retrieving rows/objects with various WHERE clauses: [benchmark chart for step 4: expressions]

Oh, dear. It looks like Storm needs some work in this department. Those pre-calculated fetch-by-ID SQL statements are handy, but can't apply to every situation. But let's look at SQLAlchemy first:

SQLAlchemy does a lot of traversal of tree-like structures. Some of this could/should be cached. For example, Geniusql stores the quoted names of each column, so only a single attribute lookup is needed for each one in a SELECT clause; fetching all the qnames and assembling them into the final SQL takes about 1 ms in my trace log. SQLAlchemy, on the other hand, makes about 24 Python function calls (4.5 ms) for each column. The actual fetch-iterate-and-coerce step is practically the same for both libraries.

Storm takes about 3.4ms to generate each column name in a SELECT clause. It also does some odd things with inbound coercion: it gets a decimal.Decimal back for a column that was declared to be Float (?! Zoo.Admission), and coerces that plus the unicode Zoo.Name, but doesn't run a coercion function at all for the date fields. Guess it knows/trusts psycopg will return valid date objects. Ah, so then it wraps up the decimal scalar value in a storm.variables.FloatVariable object. Are you allowed to use binary floating-point "all the way down" in Storm?

At any rate, it takes Storm about 60ms in my trace log to return the first object for each row, and then about 20ms per row/object thereafter. Geniusql takes about 20ms for the first row, and under 2ms for each row thereafter. It's a good bet that that's the bottleneck, and is most likely due to slow inbound adapters.

Step 5: Crazy stuff

Retrieving rows with certain fields only (projection), use of builtin functions (today/now), and use of distinct: [benchmark chart for step 5: aggregates]

This one's a little crazy for several reasons. First, it tests some dissimilar things:

  • The 'Legs' attribute of the 'Animal' table is returned for all rows.
  • The 'Species' and 'Lifespan' columns are then retrieved for all rows.
  • The 'Zoo.Founded' date attribute is compared using an SQL function like TODAY or NOW.
  • A DISTINCT query is run on the Animal table.

Second, I couldn't find a way to get Storm to write TODAY into the SQL, so that test cheats a bit and just sticks datetime.datetime.today into the initial expression. I'm pretty convinced that's why the Storm times are so low on this test.

But let's break down the trace log times for the first test, fetching SELECT Legs FROM Animal. Here's SQLAlchemy:

view = select([Animal.c.Legs]).execute().fetchall()
legs = [x[0] for x in view]

[Animal.c.Legs] =  1.3 ms
select()        =  2.9 ms
execute()       = 63.6 ms
fetchall()      = 15.5 ms
listcomp        = 10.4 ms

Here's the Geniusql equivalent:

view = db.select((Animal, ['Legs']))
legs = [x[0] for x in view]

select()        = 13.2 ms
+- fetch()      =     1.9 ms
listcomp        =  4.2 ms

And the Storm timings:

legs = getall(Animal.Legs)

getall()        = 13.8 ms
+- execute()    =    10.0 ms
+- get_all()    =     2.7 ms

I just noticed I forgot the listcomp in the Storm test. Meh.

The third query is more complicated than the ones we've seen so far, so it takes up most of the time in this entire step. Geniusql takes 74.455ms to write the SQL and fetch the data. SQLAlchemy takes 17 ms just parsing all of the overridden operators in the expression (Storm overrides operators, too, but typically does each one in 1/3 the time). Then select() is 3 ms, execute is 115 ms (!), and fetchall is 11 more. I've been thinking lately of adding compound SELECTs to Geniusql, but if that means a doubling of execution time for SELECTs without any correlated subqueries, perhaps I'll wait a while. Just kidding--I'm sure it can be done more simply; if SQLAlchemy could short-circuit the traversal of ClauseVisitors in the common case (one table, no compounded SELECTs), it might improve dramatically.

Step 6: Editing

Updating rows/objects: [benchmark chart for step 6: editing]

No surprises here. SQLAlchemy parses the WHERE clause of the UPDATE statements faster than Geniusql, but here come those correlated subquery traversals and bind params again to slow the whole thing down.

I just noticed something else: Geniusql uses the following sequence of events to grab a connection:

> get (conns:241)
    > id (conns:271)
    < id (conns:273): 0.049ms
< get (conns:269): 0.280ms

Here's the SQLAlchemy system:

> contextual_connect (base:1036)
  > __init__ (base:524)
    > raw_connection (base:1086)
      > connect (pool:157)
        > __init__ (pool:262)
          > __init__ (pool:256)
          < __init__ (pool:257) 0.055ms
          > get (pool:171)
            > do_get (pool:533)
              > get (queue:124)
                > acquire (threading:93)
                  > currentThread (threading:672)
                  < currentThread (threading:674): <_MainThread(MainThread, started)> 0.058ms
                  > _note (threading:44)
                  < _note (threading:45) 0.050ms
                < acquire (threading:109): True 0.518ms
                > _empty (queue:181)
                < _empty (queue:182): False 0.049ms
                > _get (queue:193)
                < _get (queue:194): <sqlalchemy.pool._ConnectionRecord object at 0x00C0F1F0> 0.050ms
                > notify (threading:236)
                  > _is_owned (threading:143)
                    > currentThread (threading:672)
                    < currentThread (threading:674): <_MainThread(MainThread, started)> 0.054ms
                  < _is_owned (threading:144): True 0.232ms
                  > _note (threading:44)
                  < _note (threading:45) 0.047ms
                < notify (threading:243) 0.649ms
                > release (threading:111)
                  > currentThread (threading:672)
                  < currentThread (threading:674): <_MainThread(MainThread, started)> 0.055ms
                  > _note (threading:44)
                  < _note (threading:45) 0.048ms
                < release (threading:119) 0.487ms
              < get (queue:157): <sqlalchemy.pool._ConnectionRecord object at 0x00C0F1F0> 2.532ms
            < do_get (pool:536): <sqlalchemy.pool._ConnectionRecord object at 0x00C0F1F0> 2.743ms
          < get (pool:172): <sqlalchemy.pool._ConnectionRecord object at 0x00C0F1F0> 2.914ms
          > get_connection (pool:219)
          < get_connection (pool:234): <connection object at 0x00B94E58;
              dsn: 'dbname=geniusql_bench host=localhost user=postgres password=*****', closed: 0> 0.079ms
          > __get__ (logging:86)
            > _get_instance_name (logging:56)
            < _get_instance_name (logging:60): 'sqlalchemy.pool.QueuePool.0x..50' 0.061ms
            > getLogger (logging/__init__:1225)
              > getLogger (logging/__init__:832)
                > _acquireLock (logging/__init__:171)
                  > acquire (threading:93)
                    > currentThread (threading:672)
                    < currentThread (threading:674): <_MainThread(MainThread, started)> 0.056ms
                    > _note (threading:44)
                    < _note (threading:45) 0.048ms
                  < acquire (threading:109): True 0.496ms
                < _acquireLock (logging/__init__:181) 0.687ms
                > _releaseLock (logging/__init__:183)
                  > release (threading:111)
                    > currentThread (threading:672)
                    < currentThread (threading:674): <_MainThread(MainThread, started)> 0.056ms
                    > _note (threading:44)
                    < _note (threading:45) 0.048ms
                  < release (threading:119) 0.481ms
                < _releaseLock (logging/__init__:188) 0.658ms
              < getLogger (logging/__init__:862): <logging.Logger instance at 0x00B96F58> 1.775ms
            < getLogger (logging/__init__:1232): <logging.Logger instance at 0x00B96F58> 1.994ms
            > getEffectiveLevel (logging/__init__:1125)
            < getEffectiveLevel (logging/__init__:1135): 30 0.134ms
          < __get__ (logging:90): False 2.731ms
        < __init__ (pool:273) 6.473ms
        > checkout (pool:319)
        < checkout (pool:325): <sqlalchemy.pool._ConnectionFairy object at 0x00C0FF50> 0.092ms
      < connect (pool:159): <sqlalchemy.pool._ConnectionFairy object at 0x00C0FF50> 6.878ms
    < raw_connection (base:1089): <sqlalchemy.pool._ConnectionFairy object at 0x00C0FF50> 7.059ms
  < __init__ (base:530) 7.308ms
< contextual_connect (base:1042): <sqlalchemy.engine.base.Connection object at 0x00C0F910> 7.486ms

Houston, we have a problem. Don't log if you don't have to (especially when using the Python stdlib's extremely-nested logging module, which is slow even when turned off), and use a map of conns to threads instead of going to the Queue for every request. Geniusql manages this by using weakref wrappers on the conns, so when a thread no longer references its mapped connection, it's returned to the pool. Storm gets around this by not having a pool, or even threadsafe Store objects--each Store has its own connection, so it "maps conns to threads" by limiting you to one Store+conn per thread.
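Here's a minimal sketch of that kind of thread-to-connection mapping (not Geniusql's actual code; connect is whatever factory returns a raw DB-API connection, and locking plus error handling are omitted):

import thread
import weakref

class _ConnWrapper(object):
    """What callers hold; dropping it releases the underlying connection."""
    def __init__(self, raw):
        self.raw = raw
    def cursor(self):
        return self.raw.cursor()

class ThreadMappedPool(object):
    def __init__(self, connect):
        self.connect = connect   # factory returning a raw DB-API connection
        self.idle = []           # raw connections nobody currently owns
        self.live = {}           # thread id -> weakref to that thread's wrapper

    def get(self):
        tid = thread.get_ident()
        wrapper = None
        ref = self.live.get(tid)
        if ref is not None:
            wrapper = ref()
        if wrapper is None:
            if self.idle:
                raw = self.idle.pop()
            else:
                raw = self.connect()
            wrapper = _ConnWrapper(raw)
            # When the thread drops its last reference to the wrapper, the
            # weakref callback quietly returns the raw connection to the pool.
            self.live[tid] = weakref.ref(
                wrapper, lambda _, tid=tid, raw=raw: self._reclaim(tid, raw))
        return wrapper

    def _reclaim(self, tid, raw):
        self.live.pop(tid, None)
        self.idle.append(raw)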

That's enough analysis for now. :) Hope you enjoyed the ride.

08/10/07

Permalink 08:18:51 pm, by fumanchu Email , 528 words   English (US)
Categories: Dejavu

Looks like I reinvented the wheel again

I just followed a link from LtU to a paper on Safe Query Objects. The content of the paper was immediately familiar to me, since Geniusql does exactly what is described therein: use native language (Python) semantics to construct remote queries.

Notes while I read:

Despite their usefulness, call level interfaces have a number of significant problems. First, the embedded database programs are not checked until they are passed to the CLI at runtime. As a result, the syntax and types of database programs are not checked statically, but instead result in runtime errors. This is true despite the fact that the database structure is almost always static and known when the client program is compiled.

Geniusql deals with this by obtaining and checking a complete model of the database types. For example, an object with a unicode attribute would be mapped to a Table object with (say) an instance of the geniusql.providers.postgres.TEXT class.

Second, programs that use call level interfaces are difficult to write and maintain. There are important classes of queries that must be constructed dynamically at runtime. Manipulating programs as strings is complex and error-prone due to the complex rules for nesting expressions, quoting constants, and the interplay between embedded and host languages.

Which is why Geniusql eschews strings in favor of Python lambdas.

Concepts that are relatively straightforward to express in a language, like query parameters, are awkward to specify and invoke via an API. Query results are represented as untyped objects that are accessed by string names. There are many subtle but unchecked dependencies between the query being executed and the code that decodes its results.

Geniusql determines both the database type and the Python type of each result column, and provides adapters between them (which you can override if needed).

Finally, call level interfaces make it difficult to reuse queries; doing so involves complex manipulation of programs at runtime, while avoiding name conflicts and ensuring consistency of the resulting query.

True, but reuse can manifest itself at many levels. Quite frankly, I've found the gains from reuse of DB query plans to often be completely offset by the cost of libraries constructing and maintaining the requisite prepared statements. Geniusql caches the generated SQL for each AST instead.

In its simplest form, a safe query object is just an object containing a boolean method that can be used to filter a collection of candidate objects. ...syntax and types are checked at compile time: if manager is misspelled, a compile time error is produced.

Geniusql uses Python lambdas in a similar fashion. Of course, Python is dynamically typed, so we only get syntax checks at compile-time, but that's a huge benefit all by itself. And, because we're using Python, we can skip all the metaprogramming in the Java solution. Instead of writing:

class SalaryLimit instantiates RemoteQueryJDO
                  extends SafeQuery<Employee>
{
    double limit; /* parameter */
    SalaryLimit(double limit) {
        this.limit = limit;
    }
    boolean filter(Employee employee) {
        return employee.salary > limit;
    }
}

we can get it all done in one line:

Employee.select(lambda e: e.salary > limit)

07/24/07

Permalink 10:03:20 am, by fumanchu Email , 393 words   English (US)
Categories: CherryPy, WSGI

Please don't use wsgiapp

Gordon Tillman has a wiki page up on how to mix Django content into a CherryPy site. It's easy and probably works, but please don't do it anymore.

We're officially going to deprecate the wsgiapp Tool because 1) it doesn't conform to the WSGI spec (and cannot be fixed to do so), and 2) there's a better way to mix content in a CherryPy site: tree.graft.

The tree.graft(app, script_name) method is the proper way to add Django or other WSGI content to an existing CherryPy site. Instead of nesting the two frameworks, we branch instead. To take Gordon's example, instead of:

class DjangoApp(object):
    _cp_config = {
        'tools.wsgiapp.on': True,
        'tools.wsgiapp.app': AdminMediaHandler(WSGIHandler()),
    }
...
cherrypy.tree.mount(DjangoApp(), '/')

You should always write this instead:

cherrypy.tree.graft(AdminMediaHandler(WSGIHandler()), '/')

Look, if you nest the one inside the other, CherryPy's going to do an awful lot of HTTP request parsing that is going to be completely redundant, since Django's going to do it again anyway. And this code is not very fast. Your site is going to crawl. That's strike one for nesting.

Strike two is the "always on" nature of nesting as opposed to branching. When you write your request/response cycle like an onion, every component which could possibly play a part in the request has to be called, even if just to reply "I'm not involved in this one". Given the slowness of Python function calls, this is rarely a good thing. If you thought your site was crawling before... This was a major design flaw of CherryPy 2, and is a major reason CherryPy 3 is 3x faster: the old Filters were called all the time, even if you didn't need them; the new Tools are only called when they're applicable.

Strike three against the nested approach is that it's always easier to traverse a tree of siblings than it is to traverse a nested set; programmers, for some reason, like to hide information from you, including how their site components go together. The branched version will be much easier to reason about, statically analyze, and write inspection tools for.

So please, use tree.graft, and stop using the wsgiapp Tool in CherryPy 3. We're going to formally deprecate it soon.

07/17/07

Permalink 08:00:30 am, by fumanchu Email , 38 words   English (US)
Categories: IT, Python

URL's are files, again, apparently

It doesn't try to reinvent web files as classes. URLs retain their meaning as file locations.

Best example of Not Getting It I've seen in a while. Somebody needs to re-read Fielding.
