
Authors: An Le, Evan Jones
Background
At ClassDojo, our product teams conduct Weekly Business Reviews (WBRs) to track, analyze, and discuss key metrics in relation to our overall business goals.
One critical metric during the busy back-to-school season is the percentage of new teachers who successfully create a ClassDojo classroom within seven days of signing up. Ensuring a smooth onboarding experience is essential for helping teachers set up their classes and engage students and parents quickly.
Beyond simply tracking the raw number of classrooms created in the current year, we also need additional context to understand trends and impacts:
- How does class creation volume compare to the previous year?
- What effect did a recently launched product feature have on class creation over the past six weeks?
- Are there any sharp spikes or drops in metrics that warrant deeper investigation?
To maintain consistency across teams, we adopted a “6-12” visualization format, displaying data from the previous six weeks and the past 12 months to provide a clear historical perspective.
Now imagine you're on the data team, managing ~20 key metrics that drive weekly product decisions – plus 100 more used for deeper analysis. Then scale that across all major product teams. How do you build a standardized, scalable, and maintainable process?
Do you manually update a BI tool? Do you precompute every possible metric slice in an OLAP cube? More importantly, how can you enable Data Scientists to define and deploy their own metrics – without relying on dedicated Analytics Engineering support?
Challenge: Scaling metrics generation and visualization
When approaching this challenge as a data team (Data Platform, Analytics Engineering, and Data Science), we aimed to create a scalable, end-to-end solution rather than focusing on a single metric. Our goal was to streamline metric generation across all key business metrics, making the process more efficient and self-serve. Before diving into implementation, we first conducted a discovery phase to understand the “life of a metric” across our data stack – mapping its journey from raw data to final visualization.
Step 1: Understand the life of a ClassDojo metric
How do metrics get generated today?
A core part of the Analytics Engineering mission is to drive product impact by delivering usable, scalable data solutions. Effective metric building requires a deep understanding of both product context and the end-to-end data flow – from upstream data sources to downstream consumption – before writing a single line of SQL.
Product Context = What is the business trying to achieve? What are our goals?
Identifying pain points by auditing the metrics journey
To understand which ~20 key metrics matter most to each team, we start by partnering with stakeholders and asking the right questions. Gaining this context ensures we focus on the most impactful metrics and align data solutions with business needs.
Looking at the ❌ pain points, a clear theme emerged: we didn't have a consistent, flexible, and scalable way to generate and visualize metrics.
Step 2: Designing for the future
Ideate! What could the tech vision look like for metrics generation?
We explored many possible solutions for building a metrics engine, ranging from DBT's own semantic layer and Netflix's DataJunction to doing the modeling directly in the BI layer with LookML or Metabase models.
In the end, we went with a simple DBT macro approach that (1) accepts a generic SQL statement from an underlying fact table and then (2) creates a wide and tall table that can easily be fed into a visualization template. What follows is a deep dive into that approach.
What does the end-state look like?
Our goal is a standardized, fast-to-query "wide metrics table" that simplifies period-over-period analysis. The backbone of this table is the metrics period (e.g., daily, monthly, MTD, QTD, YTD), with corresponding metric values for each period.
In practice, this table includes 20+ additional metrics columns, providing a comprehensive view of performance trends. Next, we’ll dive into the process of (1) generating time periods and (2) incorporating all key metrics into this structure.
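Concretely, the backbone plus metric columns give the table a shape like this (a simplified sketch; the column names match the macro output shown later in this post, where date_key and end_date_key mark the period boundaries):

-- conceptual shape of the wide metrics table
SELECT
    period_type      -- 'day', 'week', 'month', 'month_to_date', 'quarter_to_date', 'year_to_date'
    , period_value   -- e.g. the month number 1-12 when period_type = 'month'
    , "year"         -- varchar(4), e.g. '2024'
    , date_key       -- period start date
    , end_date_key   -- period end date
    , perc_created_within_7d
    -- ... 20+ more metric columns
FROM <curated_metrics_table>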
If we’re able to model the above correctly, this would replace many bespoke metrics queries like this one:
SELECT
    COUNT(DISTINCT CASE
        WHEN classroom_created_date <= DATE_ADD(joined_date, INTERVAL 7 DAY)
        THEN teacher_id
    END) * 100.0 / COUNT(DISTINCT teacher_id) AS percent_created_within_7_days
FROM <fct_teacher_events>
WHERE date_key BETWEEN '2025-01-01' AND '2025-03-31';
… with generic "wide" metrics table queries like this one, which we can then feed automatically into a visualization tool:
SELECT perc_created_within_7d
FROM <curated_metrics_table>
WHERE period_type = 'quarter_to_date'
… you could even imagine a metrics cube that has many metric slices like:
SELECT
    period_type,
    metric_name,
    metric_value,
    metric_prev_yoy_value,
    metric_perc_yoy_chg
FROM <curated_metrics_cube>
WHERE metric_name = 'perc_created_new_class_within_7d_signup'
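For example, the "6-12" view described earlier falls out of two simple filters on the wide table. A minimal sketch, assuming the macro emits a 'week' period_type alongside 'month' (both weekly and monthly periods are generated by the macro below):

-- last 6 weeks of the 6-12 visualization
SELECT period_value, "year", perc_created_within_7d
FROM <curated_metrics_table>
WHERE period_type = 'week'
ORDER BY "year" DESC, period_value DESC
LIMIT 6;

-- last 12 months of the 6-12 visualization
SELECT period_value, "year", perc_created_within_7d
FROM <curated_metrics_table>
WHERE period_type = 'month'
ORDER BY "year" DESC, period_value DESC
LIMIT 12;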
Show me the code!
Here's the technical approach we went with:
- Start with a high quality fact table as input
- Use a DBT macro to generate the deep and wide metrics table
- Connect our visualization tool
Deep dive on macro generate_wbr_periods()
See this GitHub gist for the basic scaffolding of our approach.
At the very bottom, we see that six CTEs are UNION ALL'ed to generate the needed "time periods":
-- Union all CTEs at the end
SELECT * FROM daily_stats
UNION ALL
SELECT * FROM weekly_stats
UNION ALL
SELECT * FROM monthly_stats
UNION ALL
SELECT * FROM month_to_date_stats
UNION ALL
SELECT * FROM quarter_to_date_stats
UNION ALL
SELECT * FROM year_to_date_stats
Each of those CTEs has its own unique code for its "time period". For example, the monthly_stats CTE will have values of:
- period_type = 'month' across all rows
- period_value is an integer between 1 and 12 to represent month X, e.g. 6 == June
- year is a varchar(4) to represent the year, e.g. '2024'
Thus, these three columns <period_type, period_value, year> make up the unique Primary Key for the output table of this macro.
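For intuition, here is a minimal sketch of what the rendered monthly_stats CTE could look like (simplified, with a placeholder example_metric; the real macro templates this with Jinja and reads from its base CTE, called dbt_wide_table later in this post):

monthly_stats AS (
    SELECT
        'month'                                           AS period_type
        , EXTRACT(MONTH FROM date_key)                    AS period_value
        , CAST(EXTRACT(YEAR FROM date_key) AS VARCHAR(4)) AS "year"
        -- the metrics_sql_body argument is injected here
        , COUNT(DISTINCT teacher_id)                      AS example_metric
    FROM dbt_wide_table
    GROUP BY 1, 2, 3  -- group by the period columns
)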
Setting up the macro parameters
The top of the DBT macro code shows the mandatory and optional arguments. As used throughout this post, they are:
- source_table – the fact table the metrics are computed from
- date_column – the date column used to bucket rows into time periods
- date_end – the last date to include when generating periods
- metrics_sql_body – the SQL aggregation snippet injected into each period CTE
- optional_entity_filter – an optional WHERE-clause predicate applied to the base table before any aggregation
Calling the macro on an existing fct_ table
To produce the table shown in the screenshot earlier, we call the macro like this:
-- ... other important DBT code here
{{ generate_wbr_periods(
    source_table='fct_teacher_events'
    , date_column='date_key'
    , date_end='2025-02-01'
    , metrics_sql_body="""
        , COUNT(DISTINCT CASE
            WHEN classroom_created_date <= DATE_ADD(joined_date, INTERVAL 7 DAY)
            THEN teacher_id
        END) * 100.0 / COUNT(DISTINCT teacher_id) AS perc_class_created_within7d
    """
) }}
which generates a table like so:
SELECT
    period_type
    , period_value
    , date_key
    , end_date_key
    , "year"
    -- metrics_sql_body starts here
    , COUNT(DISTINCT CASE
        WHEN classroom_created_date <= DATE_ADD(joined_date, INTERVAL 7 DAY)
        THEN teacher_id
    END) * 100.0 / COUNT(DISTINCT teacher_id) AS perc_class_created_within7d
FROM fct_teacher_events
GROUP BY ...
ORDER BY period_type DESC, "year" DESC, period_value DESC;
Lastly, if one needs to calculate a metric across multiple segments, then more complex DBT Jinja code can be written to employ loops, e.g.
/* Loop through specific platforms */
{% set platforms_array = [
    'ios'
    , 'android'
    , 'web'
] %}
{% set platforms_sql = [] %}
{% for platform in platforms_array %}
    {% set sql_piece %}
        , COUNT(CASE WHEN platform = '{{ platform }}' THEN 1 END) AS {{ platform }}_count
    {% endset %}
    {% do platforms_sql.append(sql_piece) %}
{% endfor %}
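To splice the generated snippets into the macro call, the list can be joined into the metrics_sql_body argument. A minimal sketch using the variables above (the join filter and argument style are an assumption, not the exact production code):

{# join the per-platform snippets into one SQL string for the macro #}
{{ generate_wbr_periods(
    source_table='fct_teacher_events'
    , date_column='date_key'
    , date_end='2025-02-01'
    , metrics_sql_body=platforms_sql | join('\n')
) }}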
One word of caution when using DBT Jinja macros instead of more out-of-the-box (OOTB) solutions is that over-customization can introduce significant maintenance overhead. While Jinja macros offer flexibility, they can also add complexity – especially if your Analytics Engineering team is new to macros or if the implementation lacks proper documentation and testing.
Extending this macro after we launched to support DS self-serve
After the initial WBR reviews with the senior leadership team, we quickly learned that some of our first ~20 key metrics required additional filtering, such as "international teachers in countries whose school year begins in the first calendar quarter." We could have simply added more metrics to the metrics_sql_body section, but we realized it would be more flexible to filter all ~20 key metrics at once, so Data Scientists could self-serve by configuring the optional_entity_filter argument in a new DBT model.
{{ generate_wbr_periods(
    source_table='fct_teacher_events'
    , date_column='date_key'
    , date_end='2025-02-01'
    , metrics_sql_body="""
        , COUNT(DISTINCT CASE
            WHEN classroom_created_date <= DATE_ADD(joined_date, INTERVAL 7 DAY)
            THEN teacher_id
        END) * 100.0 / COUNT(DISTINCT teacher_id) AS perc_class_created_within7d
    """
    , optional_entity_filter="teacher_id IN (SELECT teacher_id FROM analytics.dim_teachers_now WHERE LOWER(country_code) IN ('au', 'za', 'my', 'br', 'co', 'nz', 'sg'))"
) }}
The top-most CTE of the macro injects this global filter like so:
-- Base table
WITH dbt_wide_table AS (
    SELECT *
    FROM fct_teacher_events
    WHERE teacher_id IN (SELECT teacher_id FROM analytics.dim_teachers_now WHERE LOWER(country_code) IN ('au', 'za', 'my', 'br', 'co', 'nz', 'sg'))
)
-- from this point on, all other CTEs are filtered by only these specific teachers
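Inside the macro itself, this can be handled with a small Jinja conditional, e.g. (a sketch of the pattern, not the exact macro source):

-- Base table
WITH dbt_wide_table AS (
    SELECT *
    FROM {{ source_table }}
    {% if optional_entity_filter %}
    WHERE {{ optional_entity_filter }}  {# only applied when the caller passes a filter #}
    {% endif %}
)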
Key takeaways
Tip #1 – Start small.
Standardize on a few critical metrics first. How do these metrics get generated and reported today? Pick the most important two or three metrics for a given domain, and don't overload stakeholders with every metric slice – you can link to deep-dive dashboards with filters for exploratory analysis. The aim for these company-critical metrics is consistency across teams.
Tip #2 – Map out the lifecycle of a metric
Engaging with partner teams (e.g., Data Science, Product, and Engineering) early on helps uncover common pain points, such as inconsistent definitions, multiple sources of truth, or performance bottlenecks.
Tip #3 – Address the biggest pain points first in your solution
For us, our DBT macro approach helped solve:
- ✅ Standardized metric generation – A single, reusable macro to generate WBR-ready tables
- ✅ Automated time-period calculations – Built-in period logic (daily, monthly, quarterly, etc.)
- ✅ Self-serve analytics for Data Scientists – A structured way to define and onboard new metrics
- ✅ Consistent reporting look-and-feel – A standard way of presenting metrics for leadership reviews