| « Complexity: the mismatch between needs, does, and wants | The Dresden Codak Gallery » |
It seems lots of people are using memcached to cache both a set of objects (each with their own key), and also various lists of those objects using different keys. For example, a retailer might cache Item objects, but also want to cache the list of Items in a given Category. The SQL before memcached might look like this:
SELECT * FROM Item WHERE CategoryID = 5;
..whereas with memcached mixed in, the I/O tends to look like this (against an empty cache):
get Item:CategoryID=5
END
SELECT ID FROM Item WHERE CategoryID = 5;
set Item:CategoryID=5 1 300 19
1111,2222,3333,4444
STORED
get Item:1111
END
get Item:2222
END
get Item:3333
END
get Item:4444
END
SELECT * FROM Item WHERE ID IN (1111, 2222, 3333, 4444)
set Item:1111 1 300 58
STORED
set Item:2222 1 300 58
STORED
set Item:3333 1 300 54
STORED
set Item:4444 1 300 80
STORED
That is, fetch the list of ID's from the cache; if not present, fetch it from the DB and store it in the cache (the "300" in the above examples means, "expire in 300 seconds"). Then iterate over the list of ID's and try to fetch each whole object from cache; if any miss, ask the DB for them (in as few operations as possible) and store them in the cache.
Once both the objects and the list-of-id's are cached, subsequent calls to a hypothetical get_items_by_category function should look like this:
get Item:CategoryID=5
sending key Item:CategoryID=5
END
get Item:1111
sending key Item:1111
END
get Item:2222
sending key Item:2222
END
get Item:3333
sending key Item:3333
END
get Item:4444
sending key Item:4444
END
But what happens when you move Item 3333 from Category 5 to Category 6? There are three possibilities:
Item:Category=6 from the DB before the cached Item:Category=5 list expires.If you're happy with option 1, great! The rest of this discussion probably isn't for you. I'm going to explore three solutions (only one of which I'm happy with) for cases 2 and 3.
So you're not happy with the expiration time of your cached lists, so you've built all that invalidation code. What you may not realize is that you've just reinvented (badly) something databases have had for decades: indices.
An index is usually implemented with a B+-tree, most of the details of which are unimportant for us. What is important is that 1) an index covers a subset of the columns in the table (often a single column), which from now on I'm going to call the index criteria, and 2) each distinct combination of values for the index criteria has its own leaf node in the tree, which contains/points to a list of rows in the table that match that combination. What a mouthful. Another way to say it is that the leaf nodes in the tree look like this for an index over Item.CategoryID:
(2,): [9650, 2304, 22, 50888]
(3,): [323, 3000, 243246, 87346, 6563, 8679]
(5,): [1111, 2222, 3333, 4444]
(6,): [18]
When you ask the database for all Items with a CategoryID of 5, the database traverses the "CategoryID" index tree, finds the leaf node for "5", grabs the list stored at that node, then iterates over the list and yields each full object mentioned therein. This is called an "index scan":
# EXPLAIN SELECT * FROM Items WHERE CategoryID = 5;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Index Scan using items_categoryid on Items (cost=0.00..29.49 rows=609 width=24)
Index Cond: (CategoryID = 5)
Sound familiar? It's exactly what we're doing by hand with memcached.
Okay, it's not exactly like our memcached example. There are some striking differences.
First, a database index is sparse, in the sense that it doesn't contain leaf nodes for every potential index criteria value, just the concrete values in the table. Our memcached indexing is even sparser: so far it only contains leaf nodes (lists of ID's) for each index scan we've run. If we've only asked for Items in Categories 2 and 5, memcached will only contain nodes for Item:CategoryID=2 and Item:CategoryID=5.
Second, a database index is a full tree, with a root node. What we've done so far in memcached is only store leaf nodes. This will bite us in a moment.
Third, a database index is naturally transactional. When you move Item 3333 from Category 5 to 6, you might execute the SQL statement "UPDATE Item SET CategoryID = 6 WHERE ID = 3333;". The database, being the sole arbiter of Truth, can lock the row, read the old value, lock the index leaf nodes, remove the row from the old leaf node and add it to the new one, write the new value to the table, and unlock, all within a fully-logged transaction (although it can be a lot more complicated than that with ranged index locks, log schemes, and page reallocation schemes). Our memcached implementation so far can't do any of that.
Combining the above differences, we get...a real mess. Specifically, we have to find a way to do all of those index updates.
One way would be to invalidate everything. Call flush_all and be done with it. This can work (poorly) if you've partitioned your data well over multiple memcached servers, but Woe Unto Thee if you're storing, say, cached HTML on the same node.
Another, narrower, solution would be to try to delete all known cached lists for the given criteria. One way to do that would be to attempt to maintain the whole index tree in memcached, not just the leaf nodes. This turns out to be a thorny problem because of the transitory nature of memcached data--what happens when you lose an intermediate node in the index tree? or the root node? You'll fail to update the now-unreachable leaf nodes, but clients will still fetch them and get the stale results.
An even narrower solution would be to try to update just two index leaf nodes for each updated object. For example, if we move Item 3333 from Category 5 to 6, we could try to remove the Item from leaf node "5" and add it to leaf node "6". This can work, but requires keeping the old values around for all indexed columns in your application layer, which a lot of modern DALs and ORMs don't do by default.
I was stuck at this point for a couple of days, until I had an epiphany:
Recall if you will what I said above about the transactional nature of database indices. The DB can remove the row from the old index node(s) and add it to the new index node(s) in one step. We already know we can't do that with memcached, since the "get" and "set" operations aren't atomic anyway (although check-and-set can work around this; see 'cas' in protocol.txt.)
Transactions exist to maintain data integrity; to move The Data from one valid state to another valid state. But don't confuse the technique for the phenomena that technique is trying to prevent. In the case of indices, we use transactions on indices to avoid both 1) reading an object that does not meet the index criteria for the given node, or 2) failing to read an object that does meet the criteria. Databases avoid both scenarios by adding and removing rows from the index atomically.
When dealing with index nodes in memcached, however, the best approach is to separate the two phenomena by adding eagerly and removing lazily:
UPDATE Item SET CategoryID = 6 WHERE ID = 3333", you also cache the object (you were doing that already anyway), but you also append the ID to the list stored at Item:CategoryID=6.Item:CategoryID=5 do that when it iterates over the objects in the list. If any objects no longer meet the index node criteria (CategoryID=5), they are removed from the list, and the client sets the revised index node back into the cache.There are several:
table.column = value expression, it maps well, but using joins, boolean operators (like and, or, not) or other arithmetic operators makes it difficult. But then, the major databases have the same issues.I think this is worth a shot. I'm adding it to Dejavu in a branch, but it's complicated enough that it may not be done for a bit. Comments and suggestions on the approach welcome.
hi, could you tell me how to config cherrypy to list all the files in a dir? like http://download.cherrypy.org/
I just can't find it in the FAQ, this should be very simple problem, do not know where the magic is