When the ClassDojo engineering team was in the office, we loved our information radiators: we had multiple huge monitors showing broken jenkins builds, alerts, and important performance statistics. They worked amazingly well for helping us keep our CI/CD pipelines fast & unblocked, helped us keep the site up & fast, and helped us build an engineering culture that prioritized the things we showed on the info radiators. They worked well while the whole team was in the office, but when we went fully remote, our initial attempt of moving that same information into a slack channel failed completely, and we had to find a different way to get the same value.
Most teams have an #engineering-bots channel of some sort: it's a channel that quickly becomes full of alerts & broken builds, and that everyone quickly learns to ignore. For most of these things, knowing that something was broken isn't particularly interesting: we want to know what the current state of the world is, and that's impossible to glean from a slack channel (unless everyone on the team has inhuman discipline around claiming & updating these alerts).
We had, and still have, an #engineering-bots channel that has 100s of messages in it per day. As far as I know, every engineer on the team has that channel muted because the signal to noise ratio in it is far too low. This meant that we occasionally had alerts that we completely missed because they quickly scrolled out of view in the channel, and that we'd have important builds that'd stay broken for weeks. This made any fixes to builds expensive, allowed some small production issues to stay broken, and slowed down our teams.
After about a year of frustration, we decided that we needed to figure out a way to give people a way to set up in-home info-radiators. We had a few requirements for a remote-work info-radiator:
- It needed to be configurable: teams needed a way to see only their broken builds & the alerts that they cared about. Most of the time, the info-radiator shouldn't show anything at all!
- It needed be on an external display: not everyone had an office setup with enough monitor real-estate to support a page and keep it open
- It needed to display broken builds from multiple Jenkins locations, broken builds from GitHub Actions, and triggered alerts from Datadog and Pagerduty on a single display
We set up a script that fetches data from Jenkins, Github Actions, Datadog, Pagerduty, and Prowler, transforms that data into an easily consumable JSON file, and finally uploads that file to S3. We then have a simple progressive web app that we installed on small, cheap Android displays that fetches that JSON file regularly, filters it for the builds that each person cares about, and renders them nicely.
These remote info-radiators have made it much simpler to stay on top of alerts & broken builds, and have sped us up as an engineering organization. There's been a lot written about how valuable info-radiators can be for a team, but I never appreciated their value until we didn't have them, and the work we put into making sure we had remote ones has already more than paid for itself.
A former teacher, Will is an engineering manager focused on database performance, team effectiveness, and site-reliability. He believes most problems can be solved with judicious use of `sed` and `xargs`.