Lifecycle Emails at ClassDojo (Part I)

We want ClassDojo to be a delightful product for teachers and parents everywhere, and we’ve found that email is an effective channel to showcase awesome features and to offer help if our users need it. These emails are known as lifecycle emails, and the central idea is to send the right content to the right user at the right time.

Up until last year, we had been using third-party services for lifecycle communication, but we grew increasingly frustrated with the limitations they presented and decided to build our own system. So far, we are really happy with its capabilities, performance, and reliability!

This will be a three-part blog series covering how we do lifecycle emails at ClassDojo. We’ll start with an overview of the system and its architecture, move on to our experience using Redshift for this system in Part II, and finally explain some unique advantages our system offers compared to off-the-shelf third-party solutions in Part III.

Challenges

The third party solution we had been relying on served our needs decently well for a while, but we found three main aspects to be inadequate for our needs:

Some of our emails required complex queries touching multiple data sources. For example, we send parents a recurring email summarizing their kids’ weekly activity. Syncing these data sources to a third-party solution is complicated. Moreover, their usage model isn’t set up to support these kinds of recurring campaigns.
We couldn’t send push notifications alongside our emails. Many of our users are mobile-only, and we felt that lifecycle marketing should definitely include mobile push notifications, but we couldn’t find a satisfactory solution that incorporated both.
Translation was really difficult to incorporate. ClassDojo has users all over the world with many different language settings, and our old system could not easily integrate with our translations.

These challenges largely motivated our final design, which includes Redshift as the main data store and “computing engine”, an email templating system from our existing code base, and finally our “notification service” that handles both push and email notifications. We won’t go into details on all of these decisions since they weren’t all that interesting. But we will start by covering how we structured our code to implement the central idea of a lifecycle email campaign: how to send the right email to the right user at the right time.

Campaign Manager

From what we have seen, most systems that manage lifecycle emails follow some form of a “rule based engine”. That is, they are built as a series of “if this then that” conditions such as “X hours after user has completed action A” or “Y days if user has not completed action B”, and they are organized from the perspective of individual campaigns. This is an easy thing to start with, but after trying to build our campaign manager following this mental model, we found it to be pretty repetitive and error prone. We instead settled on a different abstraction.

Start From the Product Requirements

At the end of the day, what our marketing team really wanted was a comprehensive user lifecycle email campaign that would cover various stages of our users’ lifecycle: activation, feature awareness, retention, etc. They compiled a document that contained more than 30 different emails for various user types and appropriate emails to send. As an example, we have something similar to this table for the “Welcome to ClassDojo” and “Do you need help setting up your class” emails:

	Email	Logic
	Welcome/Getting started	Send immediately after sign up
	Add Your Class	Send 1 hour after signing up if no class added with students, with reminders each following day for 3 days

Each row contains the email template and the logic of when it should be sent out to users. The table then goes on for many more email campaigns, each one building off the logic of previous ones. So, this is our starting point in terms of product and we set out to turn the logic of these email campaigns into executable code.

First Iteration

We already had a Redshift table containing all users and whether they have completed certain actions on our product. This made it easy to write SQL queries that answered the question, “For email campaign X, which users fit X’s criteria?” So, we started with just that: for each email campaign, we enumerate its conditions in a SQL query to find the appropriate users. For example, the welcome email campaign becomes this query:

SELECT userIds
FROM user_events
WHERE signed_up > GETDATE() - INTERVAL '4 HOURS'
  AND received_sign_up_email IS NULL;

It’s easy to write and easy to understand, because it is essentially turning simple English into simple SQL. Welcome email campaign = done.

However, the logic quickly becomes more complicated. For example, it would require many AND and OR clauses to express the following criteria: users who have received welcome email, have added class and joined a school, or have not added class but received three “add class” emails and joined a school, or have not added class but received three “add class” emails and have not joined a school but received two “join school” emails. These queries are much harder to write and very error-prone. We needed a way to abstract the logic of individual email campaigns, especially ones later in the lifecycle with intricate dependencies on earlier events.

Lifecycle Tree

As we worked more on these lifecycle email campaigns, we made the key realization that, rather than being individual criteria to be described separately, they are really just one flow chart with a yes/no question at each node, like the following chart:

[Figure: lifecycle tree]

We can express this flow chart as a binary tree, and we can automatically generate SQL queries for the conditions of each campaign based on the paths that lead to the leaf nodes, which are our email campaigns. In code, we can express the logic for our activation emails as follows:

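// ideasNode, retentionNode, the campaign leaves (welcomeCampaign, addClassCampaign0),
// and the happenedHoursAgo helper are defined elsewhere.
const addClassNode = {
  // Has the teacher added a class with students?
  condition: "added_class IS NOT NULL AND added_student IS NOT NULL",
  yes: ideasNode,
  no: {
    // Only continue if the last email happened 3 hours ago.
    condition: happenedHoursAgo("last_email", 3),
    no: "no_action",
    yes: {
      // Only continue if sign-up happened 3 hours ago.
      condition: happenedHoursAgo("signed_up", 3),
      no: "no_action",
      yes: {
        // Send the first "add class" email if it has not been sent yet.
        condition: "add_class_email_0 IS NULL",
        yes: addClassCampaign0,
        no: retentionNode,
      },
    },
  },
};
// Root of the activation flow: users who have not received the welcome email get it first.
const activationNode = {
  condition: "signed_up IS NOT NULL AND welcome_email IS NOT NULL",
  yes: addClassNode,
  no: welcomeCampaign,
};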

Each node is composed of a condition, which is written as a SQL expression to describe a certain action that our user might have taken or received, and two edges, which are downstream nodes following the yes or no of that condition. The leaf nodes with no outgoing edges are our email campaigns (or no_action).
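The happenedHoursAgo helper used in the conditions above is not shown in this post. A minimal sketch of what such a helper could look like follows, assuming it means "this timestamp column is set and is at least N hours old". The exact semantics, the validation, and the NULL guard are assumptions for illustration; only the GETDATE() - INTERVAL style is taken from the welcome email query earlier.

```javascript
// Hypothetical helper: returns a SQL condition that is true when the given
// timestamp column is set and lies at least `hours` hours in the past.
function happenedHoursAgo(column, hours) {
  // Conditions are spliced into SQL text, so only allow plain identifiers.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(column)) {
    throw new Error(`invalid column name: ${column}`);
  }
  if (!Number.isInteger(hours) || hours < 0) {
    throw new Error(`invalid hour count: ${hours}`);
  }
  // The IS NOT NULL guard keeps the condition two-valued: for a NULL column
  // it evaluates to FALSE (not NULL), so NOT(condition) is TRUE.
  return `${column} IS NOT NULL AND ${column} <= GETDATE() - INTERVAL '${hours} HOURS'`;
}

console.log(happenedHoursAgo("signed_up", 3));
```

The NULL guard matters because every condition is also used in negated form on the "no" branch. Without it, a NULL timestamp would make both the condition and its negation evaluate to NULL, and the user would match neither branch.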

This becomes much easier to manage and allows for greater composability. In addition, we can change the logic of email campaigns at any time and we don’t have to manually update a bunch of SQL queries. The generation will be taken care of by our DecisionTree, which is just 32 lines of code:

class DecisionTree {
  constructor (root) {
    this.leafToPaths = new Map();

    const dfs = [{node: root, path: []}];
    while (dfs.length > 0) {
      const {node, path} = dfs.pop();
      const condition = node.condition;

      // leaf node
      if (!condition) {
        if (!this.leafToPaths.has(node)) {
          this.leafToPaths.set(node, []);
        }
        this.leafToPaths.get(node).push(path);
      } else {
        dfs.push({node: node.yes, path: path.concat(`(${condition})`)});
        dfs.push({node: node.no, path: path.concat(`NOT(${condition})`)});
      }
    }
  }

  getLeaves () {
    return Array.from(this.leafToPaths.keys());
  }

  getConditionPathAsWhere (campaign) {
    const paths = this.leafToPaths.get(campaign);
    const possibilities = paths.map((path) => ` ( ${path.join(" AND ")} ) `);
    return possibilities.join(" OR ");
  }
}
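To make the path-to-SQL idea concrete, here is a small self-contained sketch. It uses a condensed, recursive restatement of the same traversal rather than the class itself, and the three-question tree, the string leaf names, and the final SELECT wrapper are made up for illustration (the table and column names are borrowed from the welcome email query above).

```javascript
// Condensed recursive variant of the traversal: maps each leaf to a WHERE
// clause built from every root-to-leaf path that reaches it.
function whereClausesByLeaf(root) {
  const leafToPaths = new Map();
  const walk = (node, path) => {
    if (!node.condition) {
      // Leaf: a campaign or "no_action". Record the path that led here.
      if (!leafToPaths.has(node)) leafToPaths.set(node, []);
      leafToPaths.get(node).push(path);
      return;
    }
    walk(node.yes, [...path, `(${node.condition})`]);
    walk(node.no, [...path, `NOT(${node.condition})`]);
  };
  walk(root, []);

  const clauses = new Map();
  for (const [leaf, paths] of leafToPaths) {
    // AND the conditions along one path, OR the alternative paths together.
    clauses.set(leaf, paths.map((p) => `(${p.join(" AND ")})`).join(" OR "));
  }
  return clauses;
}

// Hypothetical tree; leaves are plain strings to keep the example short.
const exampleTree = {
  condition: "welcome_email IS NOT NULL",
  no: "welcome_campaign",
  yes: {
    condition: "added_class IS NOT NULL",
    yes: "no_action",
    no: {
      condition: "add_class_email_0 IS NULL",
      yes: "add_class_campaign",
      no: "no_action",
    },
  },
};

const exampleClauses = whereClausesByLeaf(exampleTree);
for (const [leaf, where] of exampleClauses) {
  if (leaf === "no_action") continue; // nothing to send for this leaf
  console.log(`-- ${leaf}\nSELECT userIds FROM user_events WHERE ${where};`);
}
```

For this tree, welcome_campaign gets the clause (NOT(welcome_email IS NOT NULL)), while no_action is reachable by two paths, so its clause is two AND groups joined by OR. Adding a question to the tree lengthens every clause below it without anyone editing SQL by hand.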

Before implementing this abstraction, we were spending multiple days of engineering time on each individual campaign, mainly on debugging and testing the SQL queries. But using our “decision tree” pattern, we were able to focus on structuring the overall logic of our campaigns, and each individual campaign takes less than an hour for copywriting and templating. It’s really a great feeling when a simple abstraction produces something like this, which would be impossible to produce by hand.

[Figure: auto-generated SQL]

That was certainly a mouthful!

Conclusion

This simple abstraction has held up pretty well with our changing requirements on email campaigns. It’s been pretty easy to add or remove campaigns and be confident that our logic is still sound. It also allows easy visualization of our entire email campaign flow, which really helps our marketing team decide whether we have gaps in some areas or are being too aggressive in others.
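As one way such a visualization could be produced, the sketch below walks a tree of the same shape and emits Graphviz DOT text. This is an illustration, not a description of our actual tooling: the toDot function and the assumption that campaign leaves are either strings or objects with a name field are hypothetical.

```javascript
// Hypothetical sketch: render a lifecycle tree as Graphviz DOT text.
function toDot(root) {
  const lines = ["digraph lifecycle {"];
  let nextId = 0;
  // Quote a label for DOT, escaping backslashes and double quotes.
  const quote = (s) => `"${String(s).replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;

  const visit = (node) => {
    const id = `n${nextId++}`;
    if (!node.condition) {
      // Leaf: assumed to be a string or an object with a `name` field.
      const label = typeof node === "string" ? node : node.name;
      lines.push(`  ${id} [shape=box, label=${quote(label)}];`);
      return id;
    }
    lines.push(`  ${id} [shape=diamond, label=${quote(node.condition)}];`);
    const yesId = visit(node.yes);
    lines.push(`  ${id} -> ${yesId} [label="yes"];`);
    const noId = visit(node.no);
    lines.push(`  ${id} -> ${noId} [label="no"];`);
    return id;
  };

  visit(root);
  lines.push("}");
  return lines.join("\n");
}

const dotExample = toDot({
  condition: "signed_up IS NOT NULL AND welcome_email IS NOT NULL",
  yes: "no_action",
  no: { name: "welcome_campaign" },
});
console.log(dotExample);
```

The output can be fed to the Graphviz dot command to get a picture. Subtrees that are referenced from more than one place are drawn once per reference, which keeps the sketch simple at the cost of some duplication in the chart.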

The overall system makes heavy use of Redshift to function properly, and we now have some good learnings on the various dials and knobs, such as choosing appropriate dist keys and sort keys, using the right compression algorithms, etc. Stay tuned for our next part covering our experiences with Redshift!