These slides and notes were originally written to accompany a three hour Redis tutorial I gave at the NoSQL Europe conference on the 22nd of April 2010.
Please post any comments, feedback or corrections to the accompanying blog entry.
You can find me online at simonwillison.net and @simonw.
Redis is currently one of my favourite open source projects. I like it because it scales down as well as up - it's equally useful for tiny projects running on a single VPS as it is for huge services running across dozens of servers. At either end of the scale it lets you build things that would be a lot harder (and more expensive) using traditional tools.
Because Redis does so much, it's hard to precisely categorise it. It feels substantially different to the other software that tends to be packaged under the NoSQL brand.
It's like a good pen-knife or multitool - it solves a bunch of different but related problems, all in a convenient pocket sized wrapper.
(Photo by herzogbr)
Performance is Redis's killer feature. I get 80,000+ operations a second on my laptop - proper server hardware will get even more. It handles writes even faster than it handles reads, which makes it useful for a whole bunch of interesting problems. Because it's so fast, a whole class of features which most resource-constrained web applications would never even consider suddenly become feasible.
The performance is principally down to Redis keeping the entire dataset in memory, and only periodically syncing to disk. You can trade performance for improved durability in various ways.
Salvatore is fantastic. He's incredibly responsive to feature suggestions and bug reports, a constant presence on the mailing list, churns out an astonishing amount of high quality code and displays excellent taste in picking which features go in to the product. I requested a feature on Twitter once (SRANDMEMBER) and he checked in an implementation less than 12 hours later!
Up until March, he was working on Redis in addition to consulting and his own startup. He's since been hired by VMware, which means he'll be working full time on the Redis project.
I'll try to keep the workshop focused around practical applications of Redis - for each feature, I'll show examples of some of the problems it can solve.
I suggest installing Redis directly from source control since some of the features I'll be discussing haven't yet made it in to a stable release. If you don't have git installed you can grab a zip file of the latest trunk directly from GitHub.
http://try.redis-db.com/ lets you try out Redis interactively directly from your browser.
UPDATE: The site has now been updated to a version which includes most of the recent commands, including the hash operations.
redis-cli is definitely the best way of experimenting with Redis. Run without arguments, it connects to the local Redis server and provides an interactive prompt - with arguments, you can send a single command directly.
Redis is at heart a key-value store, so it makes sense to start by talking about keys. Keys should not contain whitespace - versions of Redis prior to 1.2 had trouble with this, and even now it's not guaranteed that any edge-case bugs have been ironed out. A common convention is to use obj-type:id:field, though now that Redis supports hashes as values this convention is likely to become less important.
Binary-safe strings means that values are essentially byte strings, and can contain any byte (including the null byte). This is possible because the Redis protocol uses Pascal-style strings, prefixing any string with its length in bytes.
pubsub channels are a new addition to Redis and are semantically different from the other types, existing in their own namespace (they shouldn't really be considered as a type of value that gets assigned to a key).
The TYPE command is interesting: it tells you the type of a Redis key, be it a string, list, set, zset, hash or none for a key that has not yet been set. The type of a key is set by the first command to operate on it - if it's a set command, the key becomes a set. Once a key's type has been specified, the set of commands that will work against that key becomes limited. These restrictions can be lifted by emptying the key, either by removing all of its members (if it's a list or set) or by using the DEL or RENAME commands.
Any Redis command that ends with an NX (standing for Not Exists) such as RENAMENX will fail if the key it is attempting to write to already exists. These commands are atomic, and can be used to create locking primitives. Lots more on that later.
Redis can work as a cache (similar to memcached) by using the EXPIRE and EXPIREAT commands. These set a key to expire after a certain number of seconds or at a specific time specified as a unix timestamp. Be warned though: EXPIRE has some interesting limitations, explained later. The TTL command tells you how many seconds remain before a key expires.
These are the most basic Redis commands, extremely similar to those provided by memcached. MGET performs a multi-get, returning values for a list of keys (or nil for any of those keys with no value).
INCR and friends cause a key value to be interpreted as an integer, and will then atomically increment or decrement that value. If you call INCR against a non-existent key it will be set to 1. The new value of the counter is always returned.
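Pulling those commands together, here's a quick sketch from Python using the redis-py client (the key names are invented for the example):

    import redis

    r = redis.Redis()  # connects to localhost:6379, database 0

    r.set('greeting', 'hello')              # SET
    print(r.get('greeting'))                # GET -> b'hello'
    print(r.mget(['greeting', 'missing']))  # MGET -> [b'hello', None]

    r.incr('hits')        # INCR - creates the key and sets it to 1 if missing
    r.incrby('hits', 10)  # INCRBY
    r.decr('hits')        # DECR - returns the new value, here 10

    r.expire('greeting', 30)  # EXPIRE - key disappears after 30 seconds
    print(r.ttl('greeting'))  # TTL -> seconds remaining, e.g. 30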
APPEND and SUBSTR are brand new - so new they have yet to be covered in the official documentation. They hint at an interesting future direction for Redis where bytestrings will no longer be immutable, and can be modified in-place by multiple clients.
We've covered only a tiny fraction of Redis's capabilities, but already there are a bunch of useful things we can do with it.
Server backed sessions (where the browser is given a random cookie value which is then associated with a larger chunk of serialized data on the server) are a very poor fit for relational databases. They are often created for every visitor, even those who stumble in from Google and then leave, never to return again. They then hang around for weeks taking up valuable database space. They are never queried by anything other than their primary key.
Database sessions also force an additional SQL statement to be executed for every page view to read that user's session, even if only to update a "Logged in as X" bar at the top of the page.
A fast key-value store like Redis is a much better fit for session data. The per-page overhead is far, far smaller than a round-trip to a regular database.
The implementation is simple - just a key/value pair per session. It's important to store the date the session was created so that sessions can be correctly expired. This can be done using a separate key, but a better solution is to use the EXPIRE command. Watch out though: writing to a key with an EXPIRE set on it (to update the session, for example) removes the expiry time entirely! You need to re-call EXPIRE after every write to the key. This is probably a good idea anyway, since it ensures user sessions expire N days after the last action by the user, not N days after they were first created.
This isn't real code - it's a weird pseudo-code I invented, kind of a cross between Perl, Python, Ruby and PHP. Don't get hung up on the syntax.
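For something more concrete, here's a rough sketch of the session pattern using the redis-py Python client - the key names, the seven day expiry and the JSON serialization are all just illustrative choices:

    import json, os, redis

    r = redis.Redis()
    SESSION_TTL = 7 * 24 * 60 * 60  # seven days, in seconds

    def create_session(data):
        session_id = os.urandom(16).hex()  # random cookie value
        key = 'session:' + session_id
        r.set(key, json.dumps(data))
        r.expire(key, SESSION_TTL)
        return session_id

    def save_session(session_id, data):
        key = 'session:' + session_id
        r.set(key, json.dumps(data))
        r.expire(key, SESSION_TTL)  # re-set the expiry after every write

    def load_session(session_id):
        raw = r.get('session:' + session_id)
        return json.loads(raw) if raw else None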
If you want to store many different records of the same type, it makes sense to use a numeric ID for each record. Since Redis counters are atomic, a smart way of doing that is to use the next value from a global counter. This is the Redis equivalent of an auto-incrementing primary key.
If an object has multiple properties, you can store it in multiple related keys. Again, the new hash type that was recently added to Redis will likely make this pattern obsolete.
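A sketch of that pattern (the key names here are made up), using a global counter to allocate IDs and one key per field:

    import redis

    r = redis.Redis()

    def create_user(username, password_hash):
        user_id = r.incr('users:next-id')  # atomic auto-increment
        r.set('users:%d:username' % user_id, username)
        r.set('users:%d:password' % user_id, password_hash)
        r.set('users:by-username:%s' % username, user_id)  # reverse lookup
        return user_id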
Fast, atomically incremented counters are a great fit for offering real-time statistics - things like the bit.ly live click-through dashboard. Don't be afraid to use lots of different counters - in this example, we have a separate counter for each URL (counting total hits), one for each day (for total clicks by day) and then one for each URL/day combination. A link may end up with dozens of counters tracking its traffic over time, but since each one is just an integer the total space taken is negligible. It's also really easy to shard this pattern across multiple Redis nodes.
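A sketch of what a single click might record (again, invented key names):

    import datetime, redis

    r = redis.Redis()

    def record_click(url_id):
        today = datetime.date.today().isoformat()  # e.g. '2010-04-22'
        r.incr('clicks:%s' % url_id)                # total clicks for this URL
        r.incr('clicks:by-day:%s' % today)          # total clicks today, all URLs
        r.incr('clicks:%s:%s' % (url_id, today))    # clicks for this URL today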
Salvatore was working on a real-time stats package when he created Redis, so many Redis features support this kind of activity. Sorted sets are a particularly good fit.
APPEND might work for logging, but other Redis primitives such as lists and sorted sets are a better fit there.
If you have a long-running process, it's nice to tell the user what's going on. Ajax polling is a simple way of doing this, but hitting a database every few seconds is expensive. Hitting Redis every few seconds (even for thousands of concurrent users) is no problem at all.
If you're sharding your data you'll need a central lookup service for quickly determining which shard is being used for a specific user's data. A replicated Redis cluster is a great solution here - GitHub use exactly that to manage sharding their many repositories between different backend file servers.
Any transient data used by your application is also a good fit for Redis. CSRF tokens (to prove a POST submission came from a form you served up, and not a form on a malicious third party site) need to be stored for a short while, as does handshake data for various security protocols. A relational database is wasted on this kind of short-lived data.
I discuss this in more detail in Why I like Redis.
Redis doesn't include direct support for locking - at least not yet. Instead, it provides a variety of atomic primitives which can be used to implement application-level distributed locks.
Let's say you're happy for Redis to accumulate counts, but you want to periodically copy those counts across to some other persistent store - write them to MySQL every 10 minutes, for example.
The GETSET command can return the current value of a counter while simultaneously resetting that counter to 0. You can be absolutely sure you'll never miss an increment.
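A sketch of the flush step, run from cron every ten minutes or so - persist_to_mysql is a hypothetical stand-in for whatever your persistent store is:

    import redis

    r = redis.Redis()

    def flush_counter(url_id):
        # Atomically read the current count and reset it to 0 -
        # any increments that arrive in between are never lost.
        count = int(r.getset('clicks:%s' % url_id, 0) or 0)
        if count:
            persist_to_mysql(url_id, count)  # hypothetical - your real store goes here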
This example is best explained by the Redis SETNX manual page.
Another common pattern: given a natural key, assign an integer ID within Redis - but make sure two different IDs aren't assigned to the same natural key in the case of a race condition. SETNX is ideal for this.
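A sketch of that pattern with redis-py (key names invented); if two clients race, the loser throws away its freshly allocated ID and uses the winner's instead:

    import redis

    r = redis.Redis()

    def get_or_create_id(natural_key):
        key = 'ids:%s' % natural_key
        existing = r.get(key)
        if existing is not None:
            return int(existing)
        new_id = r.incr('ids:next-id')
        if r.setnx(key, new_id):   # only succeeds if no-one beat us to it
            return new_id
        return int(r.get(key))     # lost the race - use the ID the winner assigned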
These commands all operate on the database itself, rather than individual keys.
Redis defaults to containing 16 databases, each a completely separate namespace that can contain its own keys. By default, all operations affect database 0 - but the SELECT command can be used to switch to one of the others by number.
There have been a few feature requests for named aliases for different databases, but so far only numeric access is supported.
The MOVE command can be used to atomically move keys from one database to another.
FLUSHDB deletes all keys in the current database; FLUSHALL deletes all keys in every database.
DBSIZE simply tells you how many keys are in the current database. INFO is a lot more interesting, dumping out a bunch of useful statistics about the status of the server - if you write any automatic monitoring and graphing scripts they'll be based around this command.
MONITOR is actually part of the Redis replication system. If you telnet directly to Redis and type monitor, you'll see a live dump of all commands executing against the database. This is really useful for debugging.
Compiling redis-tools gives you a couple of extra commands. redis-stat works a bit like top, showing you a continuous snapshot of the activity of your server. redis-load fills your database with random data, so don't run it unless you really mean to.
Redis is primarily an in-memory database. Unlike transactional relational databases, there's no guarantee that data has actually been written to disk when a Redis command returns. With the default Redis settings, a server crash is very likely to lose some data.
There are a number of ways in which you can improve durability, generally by trading off performance.
By default, Redis serializes its entire state to disk every few minutes - timing varies depending on how many keys have changed. You can control this behaviour using the redis.conf file, which has very clear comments.
The SAVE command can be used to trigger a blocking save - since this will cause your server to temporarily block all incoming commands, I'm not sure why you would want to do this (I imagine it's a debugging tool). The far more useful BGSAVE command causes the same behaviour as the automatic periodic saves - the process is forked and the new process writes everything atomically to disk, leaving the old process to continue serving requests.
The safer alternative to snapshots, though less performant: Redis can be configured to write every command to an append-only log file, which can then be replayed to recover the server's state in the event of a crash. Again you get to tune the trade-off between durability and performance by setting how often Redis should ensure the file has been fsynced to disk.
Append only files can get long - if you constantly update just a single key the in-memory representation will be tiny but the log file could be thousands of lines long. The BGREWRITEAOF command addresses this by rewriting the log file to contain the smallest number of commands needed to recreate the current state. This is created as a temporary file and atomically renamed to replace the existing file once it has been fully written.
Last year, Nat and I went to the Royal Geographical Society's Explore conference. One of the sessions was about surviving in the desert with camels, and the point was made that you should always take three camels in to the desert with you. That way, if one of them breaks its leg and another one runs off you can still make it out safely.
When the Guardian hosted UK ScaleCamp a few months later, the same principle came up again - this time referring to data. If your data is on one machine (or in just one data centre), you have a high risk of losing it no matter how well specced that individual machine is. If it's being served from two machines you're still at risk - if one of them dies, the other one is dramatically more likely to die shortly afterwards since it now has to handle twice the load.
(Photo by Michael and Donna)
If you really care about your data, it should be on three machines - or even better, in three different data centres.
Thankfully, Redis supports trivial replication. A Redis node can mirror any other node that it can see over a network. You can configure a slave in redis.conf, but you can also configure one "live" by sending it the SLAVEOF command.
I mentioned earlier that it's easy to be tripped up by the semantics of the EXPIRE command. This is why: for both replication and the append-only log file to work reliably, it is essential that replaying the same sequence of commands will result in the same underlying data structure being created. Timing based commands such as EXPIRE cannot be allowed to cause different results on a master and slave once replication lag between the two has been taken in to account. The restrictions on EXPIRE are all caused by this constraint.
Sets are the first of the four Redis collection types (the others being lists, sorted sets and hashes). A set is an unordered collection of bytestrings - you can't nest collection types in Redis.
SCARD tells you the cardinality of the set - how many items it is holding.
SPOP removes and returns a random item from a set; SRANDMEMBER returns a random item without removing it.
SMOVE is another atomic primitive - it removes an item from the source set and adds it to the destination set.
Here's the first example of Redis performing calculations for you: SUNION, SDIFF and SINTER all perform lightning-fast set arithmetic operations and return the newly calculated set.
SUNIONSTORE, SDIFFSTORE and SINTERSTORE perform the same calculation as their shorter alternatives but also take a key to which the resulting set will be written.
Here's the strangest manifestation of EXPIRE's weird semantics: a key with an expiry time set can be used as input to regular set operations (SUNION etc) but will throw an error if used as input to an SUNIONSTORE. Again, this is to ensure that replaying a sequence of commands always results in the same internal state within Redis.
Earlier we saw an example of storing compound data objects in Redis using keys related to each other by a shared ID. With sets, we can also keep track of ALL of the IDs that have been used for records in the system.
Tags and tag calculations are another obvious use of sets. This is a good example of Redis forcing you to denormalise your data - each item that is tagged gets a set of tags, while each tag gets a set of item IDs. SINTER python redis will return the set of IDs that have been tagged with both python and redis.
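A sketch of that denormalised structure in redis-py (invented key names):

    import redis

    r = redis.Redis()

    def tag_item(item_id, *tags):
        for tag in tags:
            r.sadd('item:%s:tags' % item_id, tag)
            r.sadd('tag:%s:items' % tag, item_id)

    tag_item(42, 'python', 'redis')
    tag_item(43, 'python')

    # item IDs tagged with BOTH python and redis (SINTER)
    print(r.sinter(['tag:python:items', 'tag:redis:items']))  # {b'42'}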
The ability to quickly pick a random item from a set using SRANDMEMBER is so useful it might be worth installing Redis for that feature alone. Picking random items from a SQL database generally involves using ORDER BY RAND() LIMIT 1, which is hideously inefficient for large tables. Any time you need to pick random items from your database, storing the IDs in Redis and doing it there could be an excellent way to go.
The example application here is Daniel Vydra's excellent Random Guardian, which shows a random page from guardian.co.uk added in the past 24 hours. This is a bit of a cheat on my part - his service runs on AppEngine and hence can't use Redis - but if I was to replicate it elsewhere I'd strongly consider using Redis.
One potential use for Redis is as a smarter replacement for memcached. A common challenge with caching systems is decaching things based on dependencies - if a blog entry tagged with "redis" and "python" has its title updated, the cache for both the entry page and the redis and python tag pages needs to be cleared. Redis sets could be used to keep track of dependencies and hence take a much more finely grained approach to cache invalidation.
One of my favourite quotes about computer science.
Like sets, lists can only contain bytestrings.
Note that these operations all make sense against a linked list - there’s no “insert multiple items at this index” operation, or even a “remove the item at this index” operation. Redis commands tend to be designed using lock-free algorithms.
Unlike the slice operator in languages such as Python, Redis start/end arguments are inclusive.
Now that we can store things in order, we can build a blog! No NoSQL tutorial would be complete without one. The LRANGE command is great for implementing pagination.
Capped Collections are one of my favourite features of MongoDB. In Redis, they can be emulated by LPUSHing items on to a list and then LTRIMing that list down to a certain size. This is a great way of providing a log of recent events without worrying that it will eventually fill up all available RAM.
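A sketch of the capped log pattern (key name and cap size are arbitrary):

    import json, redis

    r = redis.Redis()
    MAX_EVENTS = 1000  # keep only the most recent 1000 events

    def log_event(event):
        r.lpush('recent-events', json.dumps(event))
        r.ltrim('recent-events', 0, MAX_EVENTS - 1)  # both indexes are inclusive

    def recent_events(n=50):
        return [json.loads(e) for e in r.lrange('recent-events', 0, n - 1)]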
Redis doesn't quite support transactions, but it does have a way of grouping a set of commands together in such a way that they will be executed as an atomic unit - no other client will be able to alter the data store in the middle of the block.
If the server crashes mid-way through applying a MULTI/EXEC block, the partially applied changes will NOT be rolled back.
Wrapping the capped log pattern in MULTI/EXEC is a good idea.
Here's how to check for the existence of multiple keys in one atomic operation, despite Redis having no built-in support for an EXISTSMULTI command.
When you EXEC the block, Redis will return the result of each command in the same sequence as the commands were queued up.
I mentioned that Redis has no command for removing an item from a list by its index. If you need to do that, you can use MULTI/EXEC to atomically set the list key to a special reserved value and then delete items with that value from the list.
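A sketch of that trick in redis-py, which exposes MULTI/EXEC through its pipeline object; the sentinel value is assumed never to occur in your real data:

    import redis

    r = redis.Redis()
    DELETED = '__deleted__'  # sentinel - must never appear as a real list value

    def remove_at_index(key, index):
        pipe = r.pipeline()             # queues commands in a MULTI ... EXEC block
        pipe.lset(key, index, DELETED)  # overwrite the item at that index
        pipe.lrem(key, 1, DELETED)      # then remove the sentinel
        pipe.execute()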
The DISCARD command can be used instead of EXEC to discard the queued up operations without executing them.
The SORT command is incredibly powerful, and this slide really doesn't do it justice. I suggest reading the Redis SORT command documentation instead, or this entry by Chris Wanstrath.
Since SORT can optionally store the result to a key, one pattern for paginating through expensive sort results (millions of items, for example) is to save the result to a temporary key, set an expiry on it and use that for pagination via the LRANGE command.
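A sketch of that pagination pattern - the key names, the BY pattern and the five minute expiry are all invented for the example:

    import redis

    r = redis.Redis()

    def sorted_page(page, per_page=20):
        key = 'sorted:items'  # temporary key holding the expensive sort result
        if not r.exists(key):
            r.sort('items', by='item:*:score', desc=True, store=key)
            r.expire(key, 300)  # recalculate at most every five minutes
        start = page * per_page
        return r.lrange(key, start, start + per_page - 1)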
It's important to remember that Redis is single-threaded, and handles concurrency by processing commands really, really fast. Slow running commands can block the Redis server for every other client. Most commands are fast enough that this will never be an issue, but an expensive SORT could cause a noticeable delay. One potential solution might be to reserve a slave or two for running long-running sorts.
A list works fine as a queue - just push things on one end and pop them off the other.
This is where Redis gets really interesting. BLPOP and BRPOP are the blocking equivalents of the LPOP and RPOP commands. If the queue for any of the keys they specify has an item in it, that item will be popped and returned. If it doesn't, the Redis client will block until a key becomes available (or the timeout expires - specify 0 for an unlimited timeout).
Blocking queue primitives mean message queues without polling!
This is particularly useful for worker queues that respond to user input in web applications - if a worker is available it can get to work straight away.
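A minimal sketch of a worker queue built on BLPOP (handle is a hypothetical task handler):

    import json, redis

    r = redis.Redis()

    def enqueue(task):
        r.rpush('queue:tasks', json.dumps(task))

    def worker():
        while True:
            # blocks until a task arrives; timeout=0 means wait forever
            queue, raw = r.blpop(['queue:tasks'], timeout=0)
            handle(json.loads(raw))  # hypothetical - do the actual work here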
Sorted sets are the most interesting Redis data type - they may seem pretty obscure at first, but understanding them is key to getting the most out of Redis. They work like a set (in that a value can only be included in them once) but each member has an associated score. The set is kept ordered by that score and can answer range queries extremely efficiently, sorted in either direction.
ZADD can be used both to add items to the set and to update the score of an existing member.
The ZRANGE family of commands return items by their index position within the ordered set - similar to LRANGE for lists. The optional WITHSCORES argument returns the score for each item in the same response.
These commands query the ordered set by score, instead of by index.
These commands were only recently added to Redis. They're similar to the SUNION and SINTER set commands, but the AGGREGATE option can be used to calculate the scores for the new set.
An obvious application for sorted sets is maintaining high score tables.
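A sketch of a high score table in redis-py (note that recent versions of the client take the ZINCRBY amount before the member):

    import redis

    r = redis.Redis()

    def record_score(player, points):
        r.zincrby('highscores', points, player)  # add points to the running total

    def top_ten():
        # highest scores first, along with the scores themselves
        return r.zrevrange('highscores', 0, 9, withscores=True)

    def rank(player):
        # 0-based position from the top, or None if the player has no score yet
        return r.zrevrank('highscores', player)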
But this quote from Salvatore reveals their true importance: any time you need to look up data based on range queries, you should be storing it in a sorted set. They're indexes that you have to maintain yourself.
Hashes are brand new and not entirely baked yet, but they're an important addition to Redis. They allow compound objects to be stored in a single Redis value, without needing to use the obj-type:id:value key storage pattern.
They aren't quite as useful as they could be yet since you can't use hash fields as an input to the SORT command. I imagine this will be fixed very shortly.
UPDATE: I was wrong, this has already been implemented. @antirez says: "Btw you *can* use GET/BY of SORT against hash fields. With GET obj:*->fieldname, same for BY".
The single biggest complaint about Redis is that it requires the entire data set to fit in RAM. If you're using it in conjunction with another data store this isn't a problem - put integer keys in Redis, but keep the full content elsewhere.
This constraint no longer holds with Redis trunk - the new virtual memory feature allows infrequently accessed values to be spilled to disk.
Virtual memory needs to be explicitly turned on in redis.conf.
Redis still requires all keys to fit in RAM, so operations like EXISTS will remain just as fast.
It's worth reading Salvatore's complete description of the virtual memory feature, which explains exactly how it works and why it was baked in to Redis rather than just relying on the virtual memory provided by the underlying operating system.
Publish / subscribe is another new Redis feature, which initially feels like quite a strange fit for the product. It doesn't involve key/value storage at all - instead, Redis can now be used as a broadcast server to transmit messages sent to channels to a large number of connected subscribers in real-time.
The feature came about because people kept on asking for the BLPOP and BRPOP commands to be extended to allow multiple connected clients to be alerted at the same time.
The internal architecture of Redis was already oriented around a pub/sub model for communicating with clients, so exposing that ability to developers wasn't that big a stretch. The initial implementation was only 100 or so lines of code.
Clients can subscribe to channels using the SUBSCRIBE command, or PSUBSCRIBE to subscribe to channels matching a wildcard pattern.
Once a subscription has been issued, the subscribing client can no longer issue other Redis commands (until it unsubscribes).
I'll finish by looking at some real-world examples of problems that were solved using Redis.
For the Guardian's MP expenses crowdsourcing application, Redis served two important purposes.
The application works by inviting members of the public to hit a "start reviewing" button, then showing them a random scanned page from an MP's expense receipts and asking them to help categorise it. The implementation of the random page button is critical to avoid wasted contributions. A Redis set, updated continuously to only contain unreviewed page IDs, works perfectly in conjunction with the SRANDMEMBER command.
We offered a number of different "assignments", each with their own set of random pages and individual progress bar. The assignment sets were created using Redis set intersection operations.
You can read more about the implementation in Crowdsourced document analysis and MP expenses.
WildlifeNearYou.com invites people to log their trips to see wildlife, and import their photos from Flickr and tag them with the species they saw and where they saw them.
We needed to select the best photograph for each individual species, so we built a crowd-sourcing feature that shows people two photographs of a random species and asks them which they prefer.
We expected the feature to generate quite a lot of traffic, so we built it using Redis instead of writing directly to our MySQL database. The species is selected randomly from a Redis set, and the photo rankings are stored in sorted sets, one for each species.
A later enhancement to the service uses a capped list for each logged in user to remember the last 200 species they have rated, and ensure that they don't get photos of the same species too close together.
The WildlifeNearYou.com API is currently in development. We're going to use Redis to manage API limiting. This is a great fit for Redis as a rate limiting check needs to be made for every single API hit, which involves both reading and writing short-lived data.
While API keys are looked up against Redis, the actual rate limiting itself currently uses memcached. This is because memcached counters can be incremented even when they are set to expire, without affecting the expiry - something Redis's EXPIRE semantics (described earlier) make difficult.
It should be possible to build rate limiting against Redis sorted sets, without any extra dependency on memcached.
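Here's roughly how that might look - a sliding-window limiter where every API hit is added to a sorted set scored by timestamp; the key names, window and limit are invented:

    import time, redis

    r = redis.Redis()
    WINDOW = 60 * 60  # one hour, in seconds
    LIMIT = 1000      # hits allowed per window

    def allow_request(api_key):
        now = time.time()
        key = 'ratelimit:%s' % api_key
        pipe = r.pipeline()
        pipe.zremrangebyscore(key, 0, now - WINDOW)  # drop hits outside the window
        pipe.zadd(key, {str(now): now})              # record this hit
        pipe.zcard(key)                              # hits remaining in the window
        pipe.expire(key, WINDOW)                     # idle keys clean themselves up
        count = pipe.execute()[2]
        return count <= LIMIT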
A/B testing is another perfect task for Redis - it involves tracking user behaviour in real-time, making writes for every navigation action a user takes, storing short-lived persistent state and picking random items.
Vanity is a good example of an A/B testing framework built on top of Redis.
Activity streams are another great example of a feature that can be tricky to develop using a relational database, but becomes much easier with a high performance data structure server like Redis.
They are notoriously difficult to scale big, and can cause problems for smaller sites as well.
Most people start out by assembling the page showing all actions by a user's friends at read-time - but this is very hard to scale. The large providers of activity streams have all moved to an alternative model where messages are written in to separate "inboxes" for all of a user's followers.
Implementing the inbox method with Redis is simple: each user gets a queue (a capped queue if you're worried about memory running out) to work as their inbox and a set to keep track of the other users who are following them.
When a user performs an action, that action is written to a persistent store (a Redis hash would work well here, especially in conjunction with virtual memory) and assigned an ID. The ID is then pushed on to the inbox queues of every one of that user's followers.
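A sketch of the fan-out step (invented key names, with a cap on inbox length as in the capped log pattern):

    import redis

    r = redis.Redis()
    INBOX_SIZE = 500  # cap each inbox so memory use stays bounded

    def fan_out(user_id, action_id):
        for follower_id in r.smembers('user:%s:followers' % user_id):
            inbox = 'user:%s:inbox' % follower_id.decode()
            r.lpush(inbox, action_id)
            r.ltrim(inbox, 0, INBOX_SIZE - 1)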
Ashton Kutcher has over 5,000,000 followers on Twitter - at 100,000 writes a second it would take less than a minute to fan a message out to all of those inboxes. Sharding the inbox queues across multiple servers would speed this up even more, and a task queue could be used to run the update process using offline workers.
I'm fascinated by comet, in particular the challenge of serving broadcast updates (such as election results) to hundreds of thousands of simultaneously connected users.
Node.js is a great tool for this kind of thing - it's non-blocking and hence can handle thousands of simultaneous connections, ships with an excellent streaming HTTP API and a broadcast comet implementation can be built in only a few dozen lines of code.
To deal with a million connected users would probably take a few Node.js servers, which means messages need to be broadcast to each server so they can forward the messages on to their connected clients. Redis publish/subscribe is perfect for this. Node has a good redis client library which includes a demo showing a subscriber.
I wrote more about serving Comet using Node in Node is genuinely exciting.
Here's a crazy idea I came up with that might just work: I often want a way of restarting Apache or performing some other root-level task from a non-root user account. One solution could be to have a shell script running as root that blocks on a Redis queue. The script could then be made to continue by pushing a message on to that queue.
The example code isn't complete: you'd want to check the output of the redis-cli command once it returns to ensure that a message really was sent, rather than the command terminating due to a timeout or the Redis server being rebooted.
Here's a neat idea from Ezra Zygmuntowicz: have workers periodically report their load average in to a sorted set.
When you want to issue a job, grab the three least loaded workers from the sorted set and pick one of them at random (to avoid the thundering herd problem).
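A sketch of both halves of that pattern (key name invented):

    import random, redis

    r = redis.Redis()

    def report_load(worker_id, load_average):
        # each worker calls this periodically
        r.zadd('worker-loads', {worker_id: load_average})

    def pick_worker():
        # the three least loaded workers, lowest load first
        candidates = r.zrange('worker-loads', 0, 2)
        return random.choice(candidates) if candidates else None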
Ezra's full presentation is available on SlideShare. It's well worth checking out.
Salvatore's full explanation of using a sorted set to implement prefix autocomplete can be found on the Redis mailing list.
Efficient autocomplete against any position in a string is a lot harder than autocomplete against a prefix. I first saw this technique used by Playdar - strings to be indexed are broken up in to consecutive three letter chunks and those chunks are used to build an index. Once a user has entered at least three letters their input can be broken in to three letter chunks and the corresponding sets can be intersected to find the potential matches. A final step outside of Redis is needed to ensure the searched text occurs in the correct order.
A search engine is basically an inverted index - a bunch of sets mapping terms to document IDs. Tim Bray's On Search series is an excellent introduction to search engine implementation fundamentals. Redis sets and set intersection can be used to implement such a search engine. The redis-textsearch library implements this pattern in Ruby.
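A toy sketch of the idea - one set per term, with SINTER answering AND queries (no stemming, stop words or ranking):

    import redis

    r = redis.Redis()

    def index_document(doc_id, text):
        for term in set(text.lower().split()):
            r.sadd('term:%s' % term, doc_id)

    def search(query):
        keys = ['term:%s' % term for term in query.lower().split()]
        return r.sinter(keys)  # documents containing ALL of the terms

    index_document(1, 'redis is an in-memory data structure server')
    index_document(2, 'redis supports sets lists and sorted sets')
    print(search('redis sets'))  # {b'2'}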
The TODO file in the Redis source tree is the best indication of upcoming features. Hashes, virtual memory and publish/subscribe are all features that have not yet been included in an official release.
Some of the more interesting commands to look out for in the future are those that operate on parts of regular Redis bytestrings. APPEND and SUBSTR are likely to be joined by LEN, PEEK, POKE, SETBIT and GETBIT. These will make it much easier to build highly space-efficient custom data structures like bitmaps on top of Redis.
Another exciting feature to look forward to is redis-cluster - currently just a design document, but at the rate that Redis development proceeds it probably won't be a long wait.
redis-cluster aims to provide a fault tolerant sharding layer on top of Redis. The core Redis server will stay the same, but a new layer of proxy servers will arrange for keys to be split across different shards and duplicated a configurable number of times.