Product Prioritization with ICE·T

ICE·T is a product prioritization model that works well for high-level prioritization and applies to internal products and platforms, too.

ICE·T (pronounced: ‘ice tea’) adds time criticality to the ICE scoring model to account for tasks or features where time is of the essence, in addition to their estimated impact, confidence, and ease of implementation.

The ICE·T score is the product of the following factors: impact (I), confidence (C), ease of implementation (E), and time criticality (T).

Impact (I) and time (T) are mapped to integers as follows:

  • Low: 1
  • Medium: 5
  • High: 10

Ease (E) uses a simplified scale:

  • Low: 1
  • Medium: 2
  • High: 3

The confidence (C) is a probability that expresses how certain we are that the proposed feature will have the estimated impact on the product’s objective:

  • Low: 0.2
  • Medium: 0.6
  • High: 1.0

These numbers are multiplied to yield the ICE·T score. The features are then prioritized in descending order of the ICE·T score.

Simple!
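To make the mechanics concrete, here is a minimal sketch in Python; the icet_score helper and the level names are my own illustration, not part of the model itself:

```python
# ICE·T level values as defined above.
IMPACT = {"low": 1, "medium": 5, "high": 10}
CONFIDENCE = {"low": 0.2, "medium": 0.6, "high": 1.0}
EASE = {"low": 1, "medium": 2, "high": 3}
TIME = {"low": 1, "medium": 5, "high": 10}

def icet_score(impact: str, confidence: str, ease: str, time: str) -> float:
    """Multiply the four level values to obtain the ICE·T score."""
    return IMPACT[impact] * CONFIDENCE[confidence] * EASE[ease] * TIME[time]
```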

Example

To illustrate how ICE·T works, let’s apply it to a set of four features or tasks.

Feature A has potentially the highest impact, but our level of confidence is poor. The development team estimates the feature to be of average complexity. Fortunately, it can be delayed without any consequences.

The team is moderately confident that Feature B will have a medium impact on the product’s goal(s). While the impact is never guaranteed, with continuous discovery and experimentation, product teams can be reasonably sure they are on the right track. Unfortunately, it is difficult to develop yet extremely urgent.

Feature C is somewhat urgent. The team is confident it will have an average impact. Moreover, it is relatively easy to accomplish.

Feature D is urgent and easy, but is not really expected to have much of an impact on the objective of the product. Such a feature could be a request from a single customer that does not quite fit into the overall roadmap but may be needed to close an important deal or to solve an urgent problem for a key customer.

Translated to ICE·T scores we have the following table:

Feature      A     B     C     D
Impact       10    5     5     1
Confidence   0.2   0.6   1.0   0.6
Ease         2     1     2     3
Time         1     10    5     10
ICE·T        4     30    50    18

The ICE·T model prioritizes these features as follows: C, B, D, and A. While B and C have the same expected impact, we are more sure of C’s. Even though B is more urgent, C is easier to implement. All in all, C seems the better choice to tackle immediately.
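Continuing the sketch from above (with the same hypothetical icet_score helper), the example and its ranking can be reproduced as follows:

```python
features = {
    "A": ("high", "low", "medium", "low"),       # impact, confidence, ease, time
    "B": ("medium", "medium", "low", "high"),
    "C": ("medium", "high", "medium", "medium"),
    "D": ("low", "medium", "high", "high"),
}

# Score each feature and rank in descending order of the ICE·T score.
scores = {name: icet_score(*levels) for name, levels in features.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:g}")   # C: 50, B: 30, D: 18, A: 4
```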

While A’s expected impact is the highest, we are not at all confident of that fact. Moreover, A can easily be delayed, whereas the other features cannot.

Background

Product prioritization frameworks abound. Some are hard to use consistently, others are tricky to explain to stakeholders, while a few do not apply to both new and existing internal products or platforms. In fact, the product management literature has largely ignored internal product management. Since internal technologies are my niche of product management, I have come up with ICE·T to work around the limitations of existing methodologies.

So, how did I arrive at ICE·T?

Principles

I drafted a laundry list of requirements for my desired product prioritization method:

  1. It is easy to use
  2. It is easy to remember
  3. It is easy to explain
  4. It is easy to understand
  5. It focuses on outcomes
  6. It is quantitative
  7. It is discriminative
  8. Its constituents are orthogonal
  9. It is applicable to new and existing products
  10. It is applicable to internal products and platforms
  11. It does not ignore important (but complex) tasks
  12. It does not require detailed planning
  13. It does not invite cheating

That’s a rather long list, but it helps in figuring out where to begin. Initially, I listed all the models I had worked with in the past: value/effort, RICE, ICE, WSJF, MoSCoW, Kano, story mapping, weighted scoring, opportunity scoring, and product tree. From there, it was a matter of pruning.

For instance, value/effort is a very simple model that is easy to remember and explain to stakeholders. However, it’s not quantitative (i.e. numeric) or very discriminating, and it mostly applies to MVPs.

Story mapping is also best for new products with qualitative prioritization. Product trees represent another purely qualitative prioritization method, so they are off the shortlist.

MoSCoW is easy to use, remember, and explain, but it is focused on output (features) rather than outcome (value), and it lacks the ability to prioritize within each of its four groups: must have, should have, could have, and won’t have.

Weighted scoring is out because the weights often become political, can be gamed, and leave too much room for interpretation.

Opportunity scoring is highly subjective as it places the customer in the driver’s seat, which means a fairly sizable survey is needed to reach some level of objectivity. It is also tough to apply to entirely new products.

The Kano model is best for product improvements. Its focus on ‘delight’ also means it does not quite cover internal products: few developers would be ‘delighted’ by a well-designed REST API, SDK, or CLI. While that may sound pedantic, the overuse of the word ‘delight’ in UX design ought to make anyone cringe, as it devalues its true meaning. Moreover, “How likely are you to recommend our internal tool to a friend?” as a survey question is mostly moot: unless your friends all work at the same company in similar roles, you cannot recommend or typically talk about internal tools to the outside world.

RICE is a simple, proven method, but the E refers to effort and usually requires a good deal of up-front estimation of tasks. For internal products, reach (R) is ultimately limited by the number of employees. In addition, it’s often hard to extricate the reach from the impact.

WSJF is terrible in practice and equally atrocious to explain to stakeholders. Its “user/business value” concept mixes two very different things. Even after applying the method for almost a year I still have to look up what “risk reduction / opportunity enablement” really means, and I have no idea what scale to use. The only thing I like about WSJF is that it includes the cost of delay: What happens if we do not build this now? Monetary amounts do not make much sense though, as internal products usually improve productivity and do not affect the bottom line directly. WSJF also suffers from short-sightedness when applied mindlessly: it tends to favour short tasks.

That leaves ICE, which is often good enough. It is easy to use, remember, explain, and understand. With different levels instead of numeric scales it can be discriminative while still retaining its simplicity. Impact, confidence, and ease are hard to conflate. The only gripe is that it ignores the cost of delay, which is why ICE·T adds the time criticality to ICE.

Levels and Scores

That leaves two important, technical questions:

  • Why have I defined three levels for each constituent?
  • Why are the values the way they are?

First, the numbers are irrelevant. They are dimensionless figures that are multiplied only to reveal a relative order. Consequently, it’s best to pick numbers that are easy to remember.

Second, to keep the overall method simple yet discriminative, we need at least three levels: four are acceptable, but at five the discussions will revolve around the same inconsequential differences as observed in Fibonacci sequence-based planning sessions: Is this task really a 3 or a 5? Maybe a 2? Three strikes a good-enough balance.

An order of magnitude is also good enough to see clear differences in scores, especially when multiplied. So, let’s pick 1 (low), 5 (medium), 10 (high).

The impact and time criticality are both essential ingredients of the prioritization. If a feature is easy to implement but has no impact, why bother? The same goes for urgency: if time is of the essence but the impact is low, it is questionable whether that feature ought to be prioritized at all. So only the impact and the time criticality use the full order-of-magnitude scale; the ease is compressed, and the confidence is a probability.

Third, to reduce the number of degrees of freedom, let’s fix the hard-to-implement features to E = 1 and the ones for which we have high confidence to C = 1. That leaves four degrees of freedom:

\[I \in \left\lbrace I_{\mathrm{low}} = 1,\,I_{\mathrm{med}} = 5,\,I_{\mathrm{high}} = 10 \right\rbrace\] \[C \in \left\lbrace C_{\mathrm{low}} >0,\,C_{\mathrm{med}},\,C_{\mathrm{high}} = 1 \right\rbrace\] \[E \in \left\lbrace E_{\mathrm{low}} = 1,\,E_{\mathrm{med}},\,E_{\mathrm{high}}\right\rbrace\] \[T \in \left\lbrace T_{\mathrm{low}} = 1,\,T_{\mathrm{med}} = 5,\,T_{\mathrm{high}} = 10 \right\rbrace\]

An impactful (I = 10) yet complex (E = 1) feature beats a feature that has low impact but is a breeze to build, all other variables being equal:

\[\forall C,T\colon\,I_{\mathrm{low}}CE_{\mathrm{high}}T<I_{\mathrm{high}}CE_{\mathrm{low}}T\] \[I_{\mathrm{low}}E_{\mathrm{high}}<I_{\mathrm{high}}E_{\mathrm{low}}\] \[E_{\mathrm{high}}<10\]

That confirms that the scale for ease is not a full order of magnitude. We can sharpen the upper bound by positing that a simple task with low urgency is less important than a hard task with medium urgency:

\[\forall I,C\colon\,ICE_{\mathrm{high}}T_{\mathrm{low}}<ICE_{\mathrm{low}}T_{\mathrm{med}}\] \[E_{\mathrm{high}}T_{\mathrm{low}}<E_{\mathrm{low}}T_{\mathrm{med}}\] \[E_{\mathrm{high}}<5\]

Furthermore, we postulate that an urgent (T = 10) but complex (E = 1) feature is roughly equivalent to a feature with the same impact and confidence but of medium complexity and medium time criticality: the urgency trumps the complexity only if that urgency is non-negotiable or the feature is unavoidable. It must be a conscious trade-off, which the model cannot make.

\[\forall I,C\colon\,ICE_{\mathrm{low}}T_{\mathrm{high}}=ICE_{\mathrm{med}}T_{\mathrm{med}}\] \[E_{\mathrm{low}}T_{\mathrm{high}}=E_{\mathrm{med}}T_{\mathrm{med}}\] \[10=5E_{\mathrm{med}}\] \[E_{\mathrm{med}} = 2\]

With that we can set easy tasks to have E = 3 < 5, which yields a 1-2-3 scale that is easy on the eye and brain, in line with the principles outlined above.

In a similar vein, we require that high-impact (I = 10) tasks for which we have low confidence are more important than tasks we are certain (C = 1) will have little impact (I = 1):

\[\forall E,T\colon\,I_{\mathrm{high}}C_{\mathrm{low}}ET>I_{\mathrm{low}}C_{\mathrm{high}}ET\] \[I_{\mathrm{high}}C_{\mathrm{low}}>I_{\mathrm{low}}C_{\mathrm{high}}\] \[10C_{\mathrm{low}}>1\] \[C_{\mathrm{low}}>0.1\]

Let’s pick 0.2. That leaves C = 0.6 for the medium level of confidence as an average of the low and high values.
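As a quick sanity check (a throwaway sketch of mine, not part of the derivation itself), the chosen values satisfy all of the constraints above:

```python
I_low, I_med, I_high = 1, 5, 10
C_low, C_med, C_high = 0.2, 0.6, 1.0
E_low, E_med, E_high = 1, 2, 3
T_low, T_med, T_high = 1, 5, 10

assert I_low * E_high < I_high * E_low   # complex-but-impactful beats easy-but-trivial
assert E_high * T_low < E_low * T_med    # hard, medium urgency beats easy, low urgency
assert E_low * T_high == E_med * T_med   # urgent-but-complex equals medium/medium
assert I_high * C_low > I_low * C_high   # high impact at low confidence beats certain low impact
```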

With these figures, ICE·T yields 31 unique scores. That is enough for most cases.
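The count of 31 is easy to verify by enumerating all 3^4 = 81 combinations (again a small sketch):

```python
from itertools import product

impact = time = (1, 5, 10)
confidence = (0.2, 0.6, 1.0)
ease = (1, 2, 3)

# Collect all distinct products; rounding removes floating-point noise.
unique_scores = {round(i * c * e * t, 6)
                 for i, c, e, t in product(impact, confidence, ease, time)}
print(len(unique_scores))  # 31
```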

Conclusion

ICE·T is a quick prioritization method that extends ICE by including the time criticality. It applies to new and existing products, both external and internal.

While the exact numbers for each level may be tricky to remember, they only have to be implemented once in your preferred product management software, Jira (e.g. with Foxly), or even a spreadsheet.

Now, who wants some ICE(d)·T(ea)?