Using NodeJS's AsyncLocalStorage to Instrument a Webserver

In the dark times before AsyncLocalStorage, it could be hard to tell why a request would occasionally time out. Were there multiple relatively slow queries somewhere in that route's code? Was another request on the same container saturating a database connection pool? Did another request block the event loop? It was possible to use tracing, profiling, and logs to track down problems like these, but it could be tricky; setting up per route metrics using AsyncLocalStorage makes it a ton easier!

When ClassDojo set up our AsyncLocalStorage-based per-route instrumentation we found things like:

  • a route that occasionally made 30,000+ database requests because it was fanning out over a large list of items
  • another route that blocked the event-loop for 15-20 seconds a few times a day, and caused timeouts for any other requests that our server was handling at the same time
  • a third route that was occasionally fetching 500,000+ items to render simple counts to return to clients

I wrote about this a bit more in AsyncLocalStorage Makes the Commons Legible. If you're not familiar with ClassDojo, it's a parent-teacher communication platform. Our monolithic NodeJS web-server backend API normally serves ~10,000 requests per second.

I'd like to go through some of the details of how we set up this per-request instrumentation. For this post, we'll be starting with a relatively standard NodeJS web-server with pre-router middleware, and a router that finds an appropriate route to handle a request. It should look something like this:

// onFinished comes from the "on-finished" npm package: it fires its callback when the response finishes (or errors)
app.use(({ req, res }, next) => {
  const start = Date.now();
  onFinished(res, () => afterResponse(req, res, start));
  next();
});
app.use(rateLimitingMiddleware);
app.use(bodyParserMiddleware);
app.use(allOfTheRestOfOurMiddleware);

app.use(router);
app.use(notFoundMiddleware);

To add instrumentation to this setup, we'll want to do the following:

  1. Create a per-request async store in our first middleware
  2. Store details about the database requests caused by that HTTP request in the request's async store
  3. Send those database request details to our data lake
  4. If any of the details violate our per-request limits, log a server-error so that a team can see it & take action

Starting our per-request async store

In a NodeJS web server, each middleware calls the next, so if we start an async local storage context in our very first middleware, every subsequent middleware should have access to the same storage context. (I had a lot of trouble understanding why this worked, so I wrote up a simplified gist that hopefully demonstrates what's going on.)
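
Here's a rough, self-contained sketch (with made-up names, standing in for that gist) of what that propagation looks like: the callback passed to run, and everything it awaits or schedules, sees the same store from getStore().

import { AsyncLocalStorage } from "async_hooks";

const exampleStore = new AsyncLocalStorage<{ requestId: string }>();

// stands in for "every subsequent middleware": it runs later, after an await and a timer,
// but still sees the store that was active when the chain started
async function downstreamMiddleware() {
  await new Promise((resolve) => setTimeout(resolve, 10));
  console.log(exampleStore.getStore()?.requestId); // logs "req-1"
}

// stands in for our very first middleware: it starts the context, and everything
// called (directly or indirectly) from inside run() shares that store
async function firstMiddleware(next: () => Promise<void>) {
  return exampleStore.run({ requestId: "req-1" }, next);
}

firstMiddleware(downstreamMiddleware);

The actual middleware we register first looks like this: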

import { AsyncLocalStorage } from "async_hooks";

export const requestTrackingAsyncLocalStore = new AsyncLocalStorage();

// requestTrackingAsyncLocalStoreMiddleware wraps the downstream koa middlewares inside an async local storage context
export function requestTrackingAsyncLocalStoreMiddleware({ req, res }, next) {
  const store = {
    requestCost: { request_count: 0, duration: 0, document_count: 0, byTable: {} },
    req,
    res,
  };
  // running the next middleware in the chain in the context of this 'run' makes sure that all calls
  // to getStore() in the scope of this request are bound to the correct store instance
  return requestTrackingAsyncLocalStore.run(store, next);
}

// register this as the very first middleware! (this would be in a different file)
app.use(requestTrackingAsyncLocalStoreMiddleware);
app.use(rateLimitingMiddleware);
app.use(....);

Store details about request behavior in our per-request async store

Now that we have a per-request async local store, we can grab it and start using it! We'll want to learn:

  1. How many database requests do we make over the course of an HTTP request? Are we running into the N+1 query problem on any of our routes?
  2. How long do those database requests take in total? Requests that take a long time can indicate spots where we're doing a lot of expensive work.
  3. How many documents are these requests returning? If we're processing 10,000s of documents in NodeJS, that can slow down a server quite a bit, and we may want to move that work to our database instead.
A small helper can track all three of these by incrementing counters on the current request's store:

export function increment(type: "request_count" | "duration" | "document_count", table: string, n: number = 1) {
  const store = requestTrackingAsyncLocalStore.getStore();
  // we'll probably want to track this to see if we're losing async context over the course of a request
  if (!store) return;
  _.set(store, ["requestCost", type], _.get(store, ["requestCost", type], 0) + n);
  _.set(store, ["requestCost", "byTable", table, type], _.get(store, ["requestCost", "byTable", table, type], 0) + n);
}

If we add code that wraps our database client's request, it should hopefully be easy to add these increment calls at an appropriate point.
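
For example, here's a minimal sketch of such a wrapper. The db client and its find method are hypothetical stand-ins, not our real database client:

// hypothetical database client -- stands in for whatever client you're wrapping
declare const db: { find<T>(table: string, query: object): Promise<T[]> };

// every query made through this wrapper records its table, duration, and result count
// using the increment helper defined above
export async function instrumentedFind<T>(table: string, query: object): Promise<T[]> {
  const start = Date.now();
  const docs = await db.find<T>(table, query);
  increment("request_count", table);
  increment("duration", table, Date.now() - start);
  increment("document_count", table, docs.length);
  return docs;
}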

Handle the request report

Once we have this request report, we can do whatever we'd like with it! At ClassDojo, we log a server-error whenever a route is doing anything particularly egregious: that way, we get quick feedback when we've made a mistake. We also use a firehose to send this data to Redshift (our data lake) so that we can easily query it. Either way, this is something we can do after we're done sending our response to the client:

import { AsyncResource } from "async_hooks";

app.use(requestTrackingAsyncLocalStoreMiddleware);
app.use(({ req, res }, next) => {
  // this use of new AsyncResource preserves the async context inside the "finish" callback
  res.on("finish", new AsyncResource("requestTrackingLogging").bind(() => {
    const store = requestTrackingAsyncLocalStore.getStore();
    if (!store) throw new Error(`Something has gone awry with our async tracking!`);
    if (isEgregiouslyBad(store.requestCost)) logOutBadRequest(store);
    requestCostFirehose.write(store);
  }));
  next();
});
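
For completeness, isEgregiouslyBad can be as simple as comparing the request's totals against fixed limits. This is a sketch; the thresholds below are illustrative, not ClassDojo's real numbers:

type RequestCost = {
  request_count: number;
  duration: number;
  document_count: number;
  byTable: Record<string, Partial<Record<"request_count" | "duration" | "document_count", number>>>;
};

// made-up thresholds: tune these to whatever "egregious" means for your service
function isEgregiouslyBad(requestCost: RequestCost): boolean {
  return (
    requestCost.request_count > 1_000 ||  // too many database round-trips for one HTTP request
    requestCost.duration > 10_000 ||      // more than ten seconds of total database time
    requestCost.document_count > 50_000   // too many documents pulled into NodeJS
  );
}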

Tracking down places where we lose async context

While the async store might feel like magic, it's not, and some common situations will cause you to lose async context:

  1. Using callbacks rather than promises. In those situations, you'll need to create an AsyncResource to bind the current async context:
setTimeout(new AsyncResource("timeout").bind(() => doRequestTrackingThings()), 1);
redisClient.get("key", new AsyncResource("redisGet").bind(() => doRequestTrackingThings()));
  2. Some promise libraries might not support async-hooks. Bluebird does, but requires setting asyncHooks to true: Bluebird.config({ asyncHooks: true });

It may take a bit of work to track down and fix all of the places where you're losing async context. Setting up your increment calls to log out details about those situations can help!

export function increment(type: "request_count" | "duration" | "document_count", table: string, n: number = 1) {
  const store = requestTrackingAsyncLocalStore.getStore();
  if (!store) {
    logServerError(`We lack async context for a call to increment ${type} ${table} by ${n}`, new Error().stack);
    return;
  }
  ...
}

Increased Observability is great!

Putting effort into increasing the observability of a system can make that system much easier to manage. For a NodeJS web-server, we've found a lot of benefits in using AsyncLocalStorage to improve per-request visibility: it has let us improve latency on a few routes, reduced our event-loop blocking, and given us a better view of opportunities to improve performance.

    A single large query in SQL can be hard to understand, test, debug, or change in the same way that an over-large function in code can be. A large query is also much harder to write! Feedback loops while writing large queries are slow and you'll often find yourself needing to guess at where the problem in your query is.

    When I started writing analytics queries, I wrote some pretty rough ones that are now hard to debug and maintain! (Apologies to everyone who has had to deal with any of my old SQL queries.) Over time, I think I've gotten better at them, and I wanted to write down some of the things I'm doing differently now that I hope will be useful to other people in similar situations.

    Finally, I'm a product engineer who rarely writes these sorts of queries! This advice is much less applicable to someone who does this daily & is using better & more focused tools to do analyses.

    Temporary tables, views, and with

    Large queries are hard, but it's pretty simple to break a large query into smaller pieces. For analytics queries, I normally create small temporary tables (often with the temporary keyword) that normalize data, filter out deleted rows and rows I'm not interested in, and organize my data into a format that makes querying easy. Views or with clauses can accomplish similar things, but I like using temporary tables for this because they cache results and make subsequent queries faster.

    I also try to put constants into a temporary table or with clause. When working on a query, it can be easy to forget to update a constant in one spot and then get completely meaningless results. (Shoutout to Ben Haley for showing me this trick!)

    All of this might sound a little abstract: let's take a somewhat contrived query and try to refactor it. We want to bucket and count US-based teachers who were active in 2021 by how many classes they created during that time period. Here's what that might look like as a single query:

    select 
      case when class_count < 5 then class_count::varchar else 'many' end as bucket,
      count(*)
      from (
        select count(distinct class.classId) as class_count
        from teacher
        join user_teacher ON teacher.teacherId = user_teacher.teacherid
        -- left join class_teacher to make sure we're counting teachers who haven't created classes
        left join class_teacher on class_teacher.teacherId = user_teacher.teacherId and class_teacher.creator
        left join user USING(userId)
        join class using(classId)
        join (
          select distinct teacherId
          from teacher_active
          where active_date between '2021-01-01' and '2022-01-01'
        ) as ats on teacher.teacherId = ats.teacherId
        and class.createdat between '2021-01-01' and '2022-01-01'
        and not class.autocreated_demo
        and lower(user.country) in ('usa', 'us')
        group by teacherId
      )
    group by 1
     
    

    This query isn't particularly complex, but it's still enough logic that I'd be a little worried about changing it or verifying that it's correct. I'd be tempted to try to pull out constants and then separate out the filtering logic from the calculation logic.

    drop table if exists _constant;
    create temporary table _constant as (
     select '2021-01-01' as start_date, '2022-01-01' as end_date
    );
     
    drop table if exists _teacher;
    create temporary table _teacher as (
     -- us_user is probably overkill: this might be better in the `where` clause!
     with us_user as (
       select userId
       from user
       where lower(country) in ('usa', 'us')
     )
     select distinct teacherId
     from teacher_active
     join user_teacher USING(teacherId)
     join us_user using(userid)
     where active_date between (select start_date from _constant)
       and (select end_date from _constant)
    );
    drop table if exists _class;
    create temporary table _class as (
     select classId
     from class
     where class.createdat between (select start_date from _constant)
       and (select end_date from _constant)
       and not class.autocreated_demo
    );
     
    drop table if exists _classes_created_by_teacher;
    create temporary table _classes_created_by_teacher as (
     with class_creator as (
       select class_teacher.*
       from class_teacher
       join _class USING(classId)
       where class_teacher.creator
     )
     select teacherId, count(distinct classId) as classes_created
     from _teacher
     left join class_creator using(teacherId)
     group by teacherId
    );
     
    select
     case when classes_created < 5 then classes_created::varchar else 'many' end as bucket,
     count(*)
    from _classes_created_by_teacher
    group by bucket;
     
    

    It's arguable whether this is actually better! The initial query is short enough that it's not that much logic to understand: it might be the right size for the team that you're working with. There are also certainly better ways of factoring this same query that could make the logic even more clear. Overall though, I'd much rather work with the updated query:

    • if something is broken in the query, I can easily and quickly examine the tables that I've created to see if my results match my expectations
    • if I have a testing library of some sort, I can set up simple assertions about what the tables I'm using look like
    • the overall query will run faster because results are cached and because query-planners don't always generate optimum plans for large queries. While working on this query, I'll have faster feedback loops
    • I'll be able to tell which parts of this query are slow and optimize if necessary
    • it's easier to focus on adapting & improving a single part
    • the sub-queries I've turned into tables are things I could reuse later: if I ever tackle a similar problem, they're good candidates for becoming non-temporary cache tables

    I think many data-focused engineers use Jupyter notebooks and pandas to break down large queries. How you break a large query into smaller pieces matters much less than actually doing the breakdown!

    Make feedback loops FAST!

    One of the most frustrating parts of working on a large query is that feedback loops can be slow. Making a change and waiting tens of minutes can completely kill any programming flow or focus that you have.

    • Break up large queries into smaller ones that are quick to run!
    • Use select without a from to quickly test out behavior. You can run queries like select extract('year' from GETDATE()); or select 1 != null, 1 is not null to quickly check your understanding. This can be especially useful for testing out regular expressions and formatting date fields for charts. There's no need for a from clause or to run a full query
    • If a full table is slow to query, it might make sense to create a temporary table with representative data that you can use to build up your queries
    • Good indexes or sort-keys can drastically improve query speed! The most common mistake I've made is setting up a compound sort key with a too-precise timestamp followed by other keys I'm interested in. If you use a second or millisecond precision key at the start of a compound sortkey, that key's precision will override any possible benefit from the other key. So, rather than creating a sortkey like (createdAtMs, event), it's probably better to instead sort and query on an index like (day, event). (This might mean that queries need to include both day and createdAtMs.)
    • validate your work as you go! The quicker you can realize that your assumptions about a table or column are incorrect the better. Running a query that checks whether a column is unique, what values a column can hold, or just what data looks like can save a ton of time!

    In general, putting effort into how quickly you get feedback while working makes it much easier to find flow and be effective. A little bit of effort put into setting up nice tables, improving data layout, and optimizing sortkeys can pay large dividends.

      In Part 1 we talked about why we switched from Google Analytics to Matomo. In Part 2, we discussed how we designed the architecture. Finally, here in Part 3 we will look at the Matomo specific changes necessary to support our architecture.

      First, we modified the Dockerfile so that we could run commands as part of the container startup. This allows classdojo_entrypoint.sh to run first, while the process that the container ultimately runs is the long-running apache2-foreground:

      # The matomo version here must exactly match the version in the matomo_plugin_download.sh script
      FROM matomo:4.2.1
      ADD classdojo_entrypoint.sh /classdojo_entrypoint.sh
      ADD ./tmp/SecurityInfo /var/www/html/plugins/SecurityInfo
      ADD ./tmp/QueuedTracking /var/www/html/plugins/QueuedTracking
      ADD ./tmp/dbip-city-lite-2021-03.mmdb /var/www/html/misc/DBIP-City.mmdb
      RUN chmod +x /classdojo_entrypoint.sh
      ENTRYPOINT ["/classdojo_entrypoint.sh"]
      CMD ["apache2-foreground"]
      

      Next, we wrote a script to download plugins and geolocation data, to bake into the Docker image:

      #!/bin/sh
      set -e
      
      MATOMO_VERSION="4.2.1"
      
      rm -rf ./tmp
      mkdir ./tmp/
      cd ./tmp/
      
      # This script downloads and unarchives plugins.  These plugins must be activated in the running docker container
      # to function, which happens in matomo_plugin_activate.sh
      curl -f https://plugins.matomo.org/api/2.0/plugins/QueuedTracking/download/${MATOMO_VERSION} --output QueuedTracking.zip
      unzip QueuedTracking.zip -d .
      rm QueuedTracking.zip
      curl -f https://plugins.matomo.org/api/2.0/plugins/SecurityInfo/download/${MATOMO_VERSION} --output SecurityInfo.zip
      unzip SecurityInfo.zip -d .
      rm SecurityInfo.zip
      
      curl -f https://download.db-ip.com/free/dbip-city-lite-2021-03.mmdb.gz --output dbip-city-lite-2021-03.mmdb.gz
      gunzip dbip-city-lite-2021-03.mmdb.gz
      
      cd ..
      

      Then we write the entrypoint file itself. Since we overwrote the original entrypoint, ours needs to copy the Matomo sources into place and fix some permissions first; after that, it activates the plugins that we want to include:

      #!/bin/sh
      set -e
      
      if [ ! -e matomo.php ]; then
        tar cf - --one-file-system -C /usr/src/matomo . | tar xf -
        chown -R www-data:www-data .
      fi
      
      mkdir -p /var/www/html/tmp/cache/tracker/
      mkdir -p /var/www/html/tmp/assets
      mkdir -p /var/www/html/tmp/templates_c
      chown -R www-data:www-data /var/www/html
      find /var/www/html/tmp/assets -type f -exec chmod 644 {} \;
      find /var/www/html/tmp/assets -type d -exec chmod 755 {} \;
      find /var/www/html/tmp/cache -type f -exec chmod 644 {} \;
      find /var/www/html/tmp/cache -type d -exec chmod 755 {} \;
      find /var/www/html/tmp/templates_c -type f -exec chmod 644 {} \;
      find /var/www/html/tmp/templates_c -type d -exec chmod 755 {} \;
      
      # activate matomo plugins that were downloaded and added to the image
      /var/www/html/console plugin:activate SecurityInfo
      /var/www/html/console plugin:activate QueuedTracking
      
      exec "$@"
      
      We tie it together with a Makefile to build and publish these Docker images:
      
      build-img:
      	sh ./matomo_plugin_download.sh
      	docker build . -t classdojo/matomo
      	rm -rf ./tmp
      
      push-img:
      	docker tag classdojo/matomo:latest xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:latest
      	docker tag classdojo/matomo:latest xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:${BUILD_STRING}
      	aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin xxx.dkr.ecr.us-east-1.amazonaws.com
      	docker push xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:latest
      	docker push xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:${BUILD_STRING}
      

      Inside our Nomad job specifications, we inject a config.ini.php file. It is a copy of Matomo's original config.ini.php, with some important changes:

      [General]
      proxy_client_headers[] = "HTTP_X_FORWARDED_FOR"
      force_ssl = 1
      enable_auto_update = 0
      multi_server_environment=1
      browser_archiving_disabled_enforce = 1
      

      proxy_client_headers and force_ssl are used as part of our SSL setup. enable_auto_update prevents containers from updating separately, so that we can coordinate updates across all containers. multi_server_environment prevents plugin installation from the UI and disables UI changes that write to the config.ini.php file. browser_archiving_disabled_enforce ensures that the archiving job is the only job that can run archiving, and that archiving won't happen on demand.

      For our non-frontend ingestion containers, we also set:

      ; Maintenance mode disables the admin interface, but still allows tracking
      maintenance_mode = 1
      

      Another major change is that the Docker command for the queue processor is changed to:

              command    = "/bin/sh"
              args       = ["-c", "while true; do /var/www/html/console queuedtracking:process; done"]
      

      This allows the job to run in a loop, continuously processing the items in the queue.

      Similarly, the archive job is changed to:

              command    = "/var/www/html/console"
              args       = ["core:archive"]
      

      This runs the archiving job directly. The admin and ingestion containers all use the default Docker command and arguments.

      That’s the end of our current journey from Google Analytics to Matomo. There’s more work we have to do around production monitoring and making upgrades easier, but we’re very happy with the performance of Matomo at our scale, and its ability to grow with ClassDojo.
