Posts By: Dominick Bellizzi

P1, P2, ... P5 is a broken system: here's what we do instead

Software developers need to prioritize bugs when they are discovered, and most do this with an ordered priority scheme. Tracking tools like Jira provide a priority field on each ticket, and many organizations use numbered levels such as P1, P2, P3, P4, and P5. Some even add a P0 for bugs that outrank P1. Often there will be criteria for each level, for example “all issues preventing users from logging in are P1.”

These systems are good at ranking issues against each other, but don’t on their own communicate what to do with the bugs themselves. When should they get worked on? How important are they to my current work? Just like the criteria for getting assigned a priority, companies define the response to the priority: “Drop everything for a P1” vs “We won’t fix a P5.”

At ClassDojo, we’ve found a few problems with systems like this:

  • Despite defined criteria, there’s lots of debate around the categorization: “Is this really a P3 or more of a P4?”
  • Despite defined criteria for work, there’s lots of debate around prioritization: “It’s a P1, but let’s finish the current set of work before we fix it, instead of doing it now.”
  • There’s a never-ending backlog of items that someone periodically goes through and deletes because either they’ve grown stale, or we don’t know if they are still valid, or we decide they aren't important enough to work on.

So how did we address this problem? We came up with a bug categorization system called P-NOW, P-NEXT, P-NEVER. Here’s how it works:

  • P-NOW represents something that is urgent to fix. These are critical issues that are impacting huge numbers of users or critical features. The whole application being down is obviously a P-NOW, but so is something like not being able to log in on Android, a critical security vulnerability, or a data privacy problem.
  • P-NEXT represents all other bugs that we agree to fix. If this is something that has a real impact on people, it’s a P-NEXT.
  • P-NEVER is everything we won’t fix. We’re honest about it up front: there’s no need to pretend we’ll get to the P5s when we’re done with the P4s, because that’s not going to happen before the bug report itself is invalid.
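To make the three buckets concrete, here is a minimal sketch of the triage decision in Python. The `BugReport` fields (`is_urgent`, `will_fix`) are hypothetical stand-ins for the human judgment described above, not a real ClassDojo schema or a Jira API.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    NOW = "P-NOW"
    NEXT = "P-NEXT"
    NEVER = "P-NEVER"


@dataclass
class BugReport:
    # Hypothetical triage inputs; in practice these are judgment calls made by people.
    title: str
    is_urgent: bool = False  # e.g. outage, broken login, security or privacy problem
    will_fix: bool = True    # the team agrees the bug has a real impact on people


def triage(bug: BugReport) -> Priority:
    """Map a bug report onto one of the three buckets."""
    if bug.is_urgent:
        return Priority.NOW
    if bug.will_fix:
        return Priority.NEXT
    return Priority.NEVER
```

The point of the sketch is that there are only two questions to answer (is it urgent, and will we fix it), which leaves far less room for "P3 or P4?" debates.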

So with those criteria, how do we prioritize this work? It’s right in the name. P-NOW means stop everything and work on this bug. It means wake the team up in the middle of the night, get people in on the weekend, keep asking for help until someone is able to fix it.

It also means once it is fixed we have a postmortem, and as part of the postmortem we find a way to make sure a bug or outage like this never happens again. Nobody likes being woken up for work, and it’s inhumane to keep expecting people on call to do this work. The results of this postmortem are all categorized as P-NEXT issues.

P-NEXT issues are worked on as soon as we’re done with our current task. They go at the top of the prioritized queue, and whenever we’re done with our work, we take on the oldest P-NEXT. In this way we work through all of the bugs that we intend to fix. This is an extension of our definition of done: the work that we released previously had a defect, so we need to fix that defect before we move on to new work.
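The "oldest P-NEXT first, then planned work" rule can be sketched as a small selection function. This is an illustration under assumptions: the `Bug` type, the `reported_on` field, and `pick_next_task` are hypothetical names, and P-NOW bugs are left out because they interrupt work rather than wait in a queue.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Bug:
    title: str
    reported_on: date  # hypothetical field used to find the oldest bug


def pick_next_task(p_next_bugs: List[Bug], planned_work: List[str]) -> Optional[str]:
    """Choose what to do once the current task is finished.

    Open P-NEXT bugs always come before planned work, oldest first.
    The chosen item is removed from its list.
    """
    if p_next_bugs:
        oldest = min(p_next_bugs, key=lambda bug: bug.reported_on)
        p_next_bugs.remove(oldest)
        return f"fix bug: {oldest.title}"
    if planned_work:
        return planned_work.pop(0)
    return None  # nothing left to do
```

Note that there is no list for P-NEVER issues at all: they never enter either queue, so nobody has to groom them.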

Lots of people will be screaming about how you’ll never get anything new done, or how non-pragmatic this is. Remember, P-NEXT doesn’t mean we fix every bug. We are just going to take the ones we don’t intend to fix and categorize them as a P-NEVER.

When do we work on P-NEVER issues? Never! Well, not exactly. We don’t put them on a board, and we don’t track them in a backlog. We don’t want to maintain an inventory of things we don’t intend to work on. That inventory takes away from our focus on higher priorities and it requires someone to periodically look through the list.

But if a one-off issue starts to pop up again, we rethink our decision. This is fine, and with our P-NEXT policy of cutting to the top of the line, it results in these bugs being fixed earlier than they would have been otherwise.

But maybe people are still screaming that they’ve correctly categorized their P-NEXT issues and there are still too many to get anything done. That is exactly where this system becomes a great way to drive quality improvements in your code and process. When you prioritize all the bugs you are going to fix first, and then work to fix them, you’re working towards a zero-defect policy.

Does this sound like a better way to prioritize? Come join us in ClassDojo Engineering. We’re hiring!
