At ClassDojo, we’ve recently been building out our push notification infrastructure. Our plans required a rate limiter that met the following criteria:
Distributed: The rate limiter can be shared across multiple processes. This necessitated the use of an external key-value store – we chose Redis, because we use it elsewhere in the stack.
Rolling Window: If we set a maximum of 10 messages per minute, we didn’t want a user to be able to receive 10 messages at 0:59 and 10 more messages at 1:01.
Minimum Distance Between Messages: Regardless of what overall rate we chose, we wanted to also enforce a minimum distance between consecutive messages, so that teachers wouldn’t get an annoying burst of several sounds or vibrations while busy in the classroom.
We looked around at the available rate limiting options, but what we found on NPM didn’t meet the above requirements. So we decided to write our own, which you can download here.
Our first attempt – token buckets
Each user has a bucket associated with them, containing a number of tokens.
When a user attempts to take an action, we check the number of tokens in the bucket.
If the bucket is empty, the user has exceeded the rate, and the action is blocked.
Otherwise, we remove a token from the bucket (or more, for “expensive” actions) and allow the action.
Over time, we refill all buckets at a set rate, up to some maximum capacity.
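A naive single-process sketch in JavaScript (names like `buckets`, `attempt`, and `refillAll` are purely illustrative) might look like:

```javascript
// Naive token bucket, sketched for a single process.
const buckets = new Map(); // userId -> remaining tokens
const capacity = 10;       // maximum tokens per bucket

function attempt(userId, cost = 1) {
  const tokens = buckets.has(userId) ? buckets.get(userId) : capacity;
  if (tokens < cost) return false;    // bucket empty: block the action
  buckets.set(userId, tokens - cost); // "expensive" actions can cost more
  return true;
}

// The flaw: some process must periodically run this over every bucket.
function refillAll(amount) {
  for (const [userId, tokens] of buckets) {
    buckets.set(userId, Math.min(capacity, tokens + amount));
  }
}
```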
This is a clever algorithm that takes very little space (only a single integer counter per user), but the naive implementation has a big flaw – some process needs to keep refilling the buckets. With several million users, and each refill operation requiring a write, this would be an unsustainable load on our Redis instance. Here’s a more sophisticated approach:
Each user has two keys associated with them: the token bucket, and a timestamp of the last time the bucket was refilled.
When a user attempts to perform an action, we fetch the stored timestamp.
We calculate how many tokens the user should have been granted since that last refill.
We can then proceed with the algorithm, using this new token count.
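A single-process sketch of this lazy-refill variant (again with illustrative names; in production the two values live in Redis, which is where the trouble described below begins):

```javascript
// Lazy-refill token bucket: tokens are granted at read time from the stored
// timestamp, so no background refill process is needed.
const state = new Map(); // userId -> { tokens, last }
const capacity = 10;
const refillRatePerMs = 10 / 60000; // e.g. 10 tokens per minute

function attempt(userId, now = Date.now()) {
  const s = state.get(userId) || { tokens: capacity, last: now };
  // Grant the tokens that should have accumulated since the stored timestamp.
  const refilled = Math.min(capacity, s.tokens + (now - s.last) * refillRatePerMs);
  if (refilled < 1) {
    state.set(userId, { tokens: refilled, last: now });
    return false; // not enough tokens: block the action
  }
  state.set(userId, { tokens: refilled - 1, last: now });
  return true;
}
```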
Unfortunately, this algorithm also has a problem: it fails when two processes need to share the rate limiter. Redis can batch operations into one atomic action, but to calculate how many tokens we need to give the user, we need at least two trips to Redis: one to get the last timestamp, and one to set the new token count. Without delving into Redis Lua scripting (something none of us had experience with, and that would be hard to mock out in tests), we couldn’t find a way to combine all needed operations into a single atomic set of Redis actions. Because of this, if two clients using the rate limiter both tried to verify a user action at the same time, we could have the following sequence of operations:
The user has enough tokens left for one action.
Not enough time has passed since the last action for more tokens to be granted.
Client 1 gets the stored timestamp and token count.
Client 2 gets the stored timestamp and token count.
Client 1 calculates that there is no need to add tokens, allows the action, and tells Redis to set the token count to 0.
Client 2 also calculates that there is no need to add tokens, allows the action, and tells Redis to set the token count to 0.
Needless to say, this is not ideal. If we have several dozen workers all processing push notification payloads, it would be possible for a user to be spammed with dozens of pushes all at once.
A better approach – sorted sets
Fortunately, Redis has another data structure that we can use to prevent these race conditions – the sorted set. Here’s the algorithm we came up with:
Each user has a sorted set associated with them. The members and their scores are identical, and equal to the (microsecond) times when actions were attempted.
When a user attempts to perform an action, we first drop all elements of the set which occurred before one interval ago. This can be accomplished with Redis’s ZREMRANGEBYSCORE command.
We fetch all elements of the set, using ZRANGE.
We add the current timestamp to the set, using ZADD.
We set a TTL equal to the rate-limiting interval on the set (to save space).
After all operations are completed, we count the number of fetched elements. If it exceeds the limit, we don’t allow the action.
We can also compare the largest fetched element to the current timestamp. If they’re too close, we also don’t allow the action.
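The steps above can be sketched in-memory, with a plain sorted array standing in for the Redis sorted set (all names here are illustrative):

```javascript
// In-memory sketch of the sorted-set algorithm: an ascending array of
// timestamps stands in for the Redis sorted set.
const sets = new Map();     // userId -> ascending array of action timestamps (ms)
const interval = 60000;     // rolling window: at most maxInInterval per minute
const maxInInterval = 10;
const minDifference = 3000; // minimum ms between consecutive actions

function attempt(userId, now = Date.now()) {
  const stored = sets.get(userId) || [];
  // Drop all elements that occurred before one interval ago.
  const kept = stored.filter((t) => t > now - interval);
  // Record the current attempt unconditionally: blocked actions still count.
  sets.set(userId, kept.concat([now]));
  // Block if the window is full...
  const tooMany = kept.length >= maxInInterval;
  // ...or if the most recent attempt is too close to this one.
  const tooSoon = kept.length > 0 && now - kept[kept.length - 1] < minDifference;
  return !(tooMany || tooSoon);
}
```

In the Redis-backed version, the array operations map to ZREMRANGEBYSCORE, ZRANGE, and ZADD on the user’s set, with the TTL set via EXPIRE; crucially, all of them can run inside a single MULTI block.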
The advantage of this approach is that all Redis operations can be performed as one atomic action, using the MULTI command. This means that if two processes both try to perform an action for the same user, neither can act on stale information, preventing the problem outlined above. It also allows us to use one limiter for both rates we wanted to track (i.e. no more than 10 messages per minute or 2 per 3 seconds).
However, there is one caveat to this approach – one we were comfortable with but may not fit others’ requirements. In our algorithm, you can see that whether or not an action is blocked is determined after all Redis operations are completed. This means that blocked actions still count as actions. So, if a user continually exceeds the rate limit, none of their actions will be allowed (after the first few), instead of allowing occasional actions through.
We’ve open-sourced our limiter as the rolling-rate-limiter module on npm. It can either use a Redis backend or, if you don’t need to run your rate limiter across multiple processes, operate in-memory (using a simple array instead of a sorted set).
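Here’s a sketch of typical usage (option names follow the module’s README; consult it for the authoritative list, and note that identifiers like `userId` are placeholders):

```javascript
const RateLimiter = require("rolling-rate-limiter");
const redis = require("redis");
const client = redis.createClient();

const limiter = RateLimiter({
  redis: client,              // omit this option to use the in-memory backend
  namespace: "UserActionLimiter",
  interval: 60000,            // rolling window, in ms
  maxInInterval: 10,          // at most 10 actions per interval
  minDifference: 3000         // at least 3s between consecutive actions
});

const userId = "user123";     // placeholder identifier
limiter(userId, function (err, timeLeft) {
  if (err) {
    // Redis failure: decide whether to fail open or closed.
  } else if (timeLeft > 0) {
    // Limit exceeded; the user can act again in timeLeft ms.
  } else {
    // Action allowed.
  }
});
```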
You can also easily use it with Express middleware to rate limit requests.
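As a sketch (with illustrative identifiers, and the in-memory backend for brevity), a small Express middleware might look like:

```javascript
const express = require("express");
const RateLimiter = require("rolling-rate-limiter");

const app = express();
const apiLimiter = RateLimiter({
  interval: 60000,   // in-memory backend: fine for a single process
  maxInInterval: 100
});

app.use(function (req, res, next) {
  // Rate limit by IP; any stable identifier (user id, API key) works.
  apiLimiter(req.ip, function (err, timeLeft) {
    if (err) return next(err);
    if (timeLeft > 0) {
      return res.status(429).send("Too many requests; retry in " + timeLeft + " ms.");
    }
    next();
  });
});
```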
Feel free to check out the source code on GitHub. We hope this will be helpful as you’re writing Node.js applications!