Estimation exercises that require the team to come around a table and discuss whether a task is a three or a five are a typical component of today’s software development rituals. Do the benefits of this process outweigh its cost?
I have worked in teams with and without this process, and I have found myself on both sides of the debate. I think it is time for a more nuanced discussion. In this article, I want to go over some good and some bad reasons for using story points. Once we have a good sense of their value and their limitations, and once we are equipped with alternative techniques to address the same goals, we can make an informed decision.
Let us start with two arguments that are brought forward in favor of story points: Their use when planning work and their contribution to developing a shared understanding by the team.
The most obvious value of story points is that they help to project timelines for your work. This is important if other teams depend on your output. Whenever there are initiatives that need to come together down the road, some form of effort estimation is required. Similarly, the customer might need to adapt their processes to a new feature and will therefore need a sense of when a delivery can be expected.
Story points can serve as a tool here, as they indicate the complexity of a task. When the total is divided by the team’s velocity (the number of points the team typically completes per sprint), you get an idea of the time required for completion.
The problem is that these plans usually require time predictions far into the future. This means you need to estimate very far ahead into the backlog, making these estimates imprecise. It is for good reason that many teams that practice story pointing plan at most two sprints ahead: Once they get there, reality will have changed.
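To make the arithmetic concrete, here is a minimal sketch in Python. It assumes velocity is measured in points per sprint, so the projection is the backlog total divided by velocity. The backlog size, velocity and sprint length are made-up numbers, and `sprints_needed` is a hypothetical helper, not part of any tool.

```python
import math


def sprints_needed(backlog_points: float, velocity: float) -> int:
    """Whole sprints needed to work off backlog_points at `velocity` points per sprint."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    # Round up: a partially filled sprint still takes a full sprint on the calendar.
    return math.ceil(backlog_points / velocity)


# Hypothetical numbers, for illustration only.
backlog_points = 120   # sum of the estimates for the initiative
velocity = 25          # points the team usually completes per sprint
sprint_weeks = 2

sprints = sprints_needed(backlog_points, velocity)
weeks = sprints * sprint_weeks
print(f"{sprints} sprints, roughly {weeks} weeks")
```

The result is only as good as the two inputs, and it says nothing about dependencies or waiting time.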
A second problem with story points for long term planning is that they do not take into account dependencies. In my experience, dependencies on other work items or other teams tend to have more impact on the projected schedule than the complexity of the task itself.
The reality is that story points are often mainly used to plan what can get done in the upcoming sprint. That, however, is of limited value in my experience. Ask yourself this: Other than reporting, what are you actually doing in practice with that semi-accurate view of what the team can get done in the next two weeks? Assuming that all work items are sorted by priority, every person on the team will know what to work on next, whether or not there is a Fibonacci number attached to that task.
The biggest advantage of story points, in my opinion, is their value in aligning the team around the work that is coming up and the changes that will need to be made to the code. When estimates are widely divergent, it is clear that the team is not yet on the same page and that more explanation is required.
A team that needs to decide how much effort is required for a story will have to explicitly ask the question “what will we need to do to make this happen” and they will have to ask clarifying questions, forcing the person that formulated the task to be as clear and detailed as possible.
This is an opportunity to eliminate potential sources of misunderstanding early and it creates a shared expectation about the scope of the task. Is it a quick fix or does it require a large refactoring? Having this shared understanding means that no one is surprised by the changes landing in the code during execution.
Let us now look at some downsides of story points:
Everyone working in software must have had the experience of sitting in a meeting, discussing for fifteen minutes whether a task is a five or an eight. This exercise is frustrating, especially because the standard deviation is so large that, at the end of the day, both numbers are likely well inside the margin of error.
The pure cash investment of the exercise is worth considering: If you want to benefit from the “shared understanding” bonus described above, a decent number of team members need to be involved. Assuming five to ten team members at a loaded cost of 60€ – 70€ per hour and assuming you are able to estimate six work packages per hour, that makes 50€ – 120€ per work package. Importantly, this does not yet factor in opportunity cost.
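The calculation behind that range can be sketched in a few lines of Python. The inputs are the ones from the paragraph above; `cost_per_package` is a hypothetical helper name.

```python
def cost_per_package(members: int, hourly_rate_eur: float, packages_per_hour: float) -> float:
    """Loaded cost in euros of estimating one work package with the whole group present."""
    return members * hourly_rate_eur / packages_per_hour


# Lower bound: five people at 60 EUR/hour; upper bound: ten people at 70 EUR/hour.
low = cost_per_package(members=5, hourly_rate_eur=60, packages_per_hour=6)
high = cost_per_package(members=10, hourly_rate_eur=70, packages_per_hour=6)

print(f"{low:.0f} EUR to {high:.0f} EUR per work package")
```

The upper bound comes out at a little under 117€, which I round to 120€ above.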
If your project is more mature and many tasks are routine, this cost may be lower – but then again, if work is really routine then the added value of story pointing exercises is even more questionable.
That being said, I’m sympathetic to the argument that we spend money on a lot of silly things in business life. 100€ for knowing whether a story is an eight or a five is not going to ruin you. I have heard of teams where estimation meetings are the only occasion to ask a Product Manager questions – sometimes you just have to play along with the organizational circus.
Misleading Sense of Control
Often (but not always) the people asking for estimates are either engineering managers or product management. As mentioned above, there are good reasons for people in these roles to ask for such numbers when they help to plan ahead.
However, I suspect that organisations that are extremely serious about estimations are also in part doing this to regain some control over the messy world of engineering. It is neat to collapse work down to a simple number that hides all the details of technical debt, dependencies, firefighting, refactoring, faulty assumptions and other considerations that dominate the actual execution of a task.
This regained sense of control is dangerous if not handled with care. Some organizations succumb to its siren song, starting to tie individual or team performance to such metrics, using them as a productivity measure to benchmark workplace improvements, or using them as an excuse to make lofty promises about delivery dates.
No simple number can completely eradicate the need to dive deep into the messy details from time to time to get an accurate understanding of what is really going on.
Delaying Important Work
You might have noticed that I did not mention “prioritising work based on ROI” as one of the valuable uses of story points. In fact, I am very skeptical about the use of story points to change the order of work based on a cost-benefit argument. The reason is that while measuring the cost of a story with story points is imprecise, we are even more blind regarding the business value of a story.
As we have almost no visibility on business value, this quickly leads to “uh-oh, this looks expensive” and “my boss has been shouting at me for weeks because that feature is not yet done” being the only yardsticks for assessing priority. This in turn can cause important work to be delayed because of its perceived cost.
You might say that this is not really a problem with estimates per se. I disagree: Estimates reinforce the focus on cost, which leaves even less attention for the question of what work will deliver the most value.
Now, this is not to say that embarking on a six-month quest to implement a very valuable feature is necessarily the right move. I am all in favour of making a distinction between a six-month project and a task of two days! What I am saying is that it is likely not ideal to delay an important task just because the team estimated it at thirteen.
This brings me to a particular pet peeve of mine: In some teams, the planning meeting ends with the ritual collective solving of the knapsack problem, squeezing as many story points as possible into a pre-computed sprint capacity. Large stories are less likely to make it in just because they are more likely to collide with the capacity limit. Sometimes, extra time is spent trying to find a useful split that turns the thirteen into two stories of five and eight. What is the use in that? Why not simply draw tasks from the top in order of priority, and if one of them carries over to the next iteration, then so be it?
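The difference between the two approaches can be illustrated with a small Python sketch. The backlog, the story names and the capacity of 20 points are invented for the example. `knapsack_fill` is a simple subset-sum search standing in for what a team does by hand, and `priority_draw` simply takes stories from the top.

```python
def knapsack_fill(stories, capacity):
    """Choose the subset of stories with the highest point total that fits the capacity."""
    best = {0: []}  # reachable point total -> story names that produce it
    for name, points in stories:
        # Iterate over a snapshot so that each story is used at most once.
        for total, chosen in list(best.items()):
            new_total = total + points
            if new_total <= capacity and new_total not in best:
                best[new_total] = chosen + [name]
    top = max(best)
    return best[top], top


def priority_draw(stories, capacity):
    """Take stories in priority order until capacity is reached; the last one may carry over."""
    chosen, total = [], 0
    for name, points in stories:
        if total >= capacity:
            break
        chosen.append(name)
        total += points
    return chosen, total


# Hypothetical backlog, already sorted by priority (most important first).
backlog = [
    ("checkout-rewrite", 13),
    ("login-fix", 8),
    ("export-csv", 8),
    ("tooltip", 3),
]

print(knapsack_fill(backlog, capacity=20))
print(priority_draw(backlog, capacity=20))
```

With these numbers, the knapsack approach fills the sprint with the three smaller stories (19 points) and leaves the top-priority thirteen out, while the priority draw starts with the thirteen and lets the second story spill over into the next iteration.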
Replacing Retrospective Debate with Planning Theatre
I think retrospectives (or whatever you call the place where you introspect into your way of working) can be a very valuable ritual. Many of today’s engineering teams practice it. While not all setups favour effective retrospectives, they can help a team improve incrementally by addressing the things that are holding it back.
The problem is that if estimations are part of the team process, you often also set sprint goals based on these estimates. That in turn makes it tempting to primarily discuss missed goals at the end of a sprint: Why did we not achieve what we set out to do?
The answer in many cases is simply: The initial estimate did not account for an important unknown or included a faulty assumption and was thus wrong.
The debate then quickly starts to focus on improving the estimates or optimizing the planning process (“Let us add a buffer to account for unknowns”).
This takes up time that could be spent on topics that actually matter to the team’s productivity (“How can we reduce our dependency on the Funky Parrots team?”). Furthermore, it regularly places team members in the uncomfortable position of having to justify why their task took longer than expected.
Sometimes missed sprint goals can reveal an underlying problem. For example, maybe an engineer misunderstood the task and went off in the wrong direction at first. However, usually such problems are apparent and can be addressed without the detour via estimates and sprint goals.
So from the analysis above, it appears that story points are valuable to enable the team to develop a shared understanding of the tasks. Furthermore, they provide some limited value when planning ahead.
For the latter I think more appropriate tools exist. First of all, not all of your work requires this level of planning. For some things it is sufficient to call them done when they are done. That’s the ideal “agile” scenario. You’re developing a new product and it’s impossible to know one year in advance what the most useful features will be. So you work in small increments, starting with whatever looks most promising. What’s central to this scenario is the discovery of the problem space.
Not all projects work like that. If you are building a nuclear power plant, you will probably require some degree of coordination between teams. For all the work where such visibility is required, we can drop the pretense of estimating complexity instead of time. We want to know when it’s done and what the implications will be for staffing, sales and budgeting, as simple as that.
With that masquerade out of the way, someone with knowledge of the inner workings of the system can do the upfront work of planning the actual execution of a larger, long term initiative. I know, I know, that is revolutionary stuff straight from the eighties! Waterfall ahoy! Part of that execution plan is breaking the work down into presentable milestones, formulating assumptions, clarifying dependencies, assessing if work can be divided across team members and how much time it might take to achieve each of the milestones assuming 100% focus on this initiative.
Now, we all know the timings coming out of such an exercise will be wrong – but having milestones has the advantage that progress can be tracked and teams waiting for our input can be notified. Planning is never a great exercise because reality will end up punching you in the face – but if the goal is really to synchronize a large activity between teams then it is simply hard to avoid, no matter what we end up calling it. In other words, yes, the plan will change, but that’s not an excuse for not having a plan and it is still a more appropriate tool than story points.
For developing a shared understanding in the team I’m a bit less confident about the alternatives.
An engineer picking up a work package could be asked to sketch out the changes they are planning and share these with the team. However, there are two possible outcomes here: Either this execution plan will just stay there, uncommented, meaning that everyone was probably too busy with their own stuff to really take in what is going to happen, or it will receive a lot of comments and questions, in which case we are almost back to an estimation exercise.
A nice alternative could be pair- or mob programming. The advantage here is that the shared understanding is developed while making the change. This is very tangible and leaves less room for misunderstandings. However, this is probably not going to get the entire team on board if that team is large and not everyone likes that way of working.
So, are story point estimates useful? I think it depends on the exact set of circumstances, but I have become increasingly skeptical. If you don’t have other teams waiting for your output, the planning value likely does not justify the cost, and if you do, there are alternatives.
Using them as a communication tool can be useful – but we need to consider whether it would not be better to keep teams small enough that this shared understanding can be achieved through informal interactions, and to provide other, more formalized formats to exchange information between teams on the same project.
Then again, if you are good at running efficient meetings and the Product Owner / Project Manager / Engineering Managers involved have the maturity to not use story points beyond the limits of their usefulness, the team might benefit from discussing the work that needs to happen.
Also, like other rituals, formats such as story pointing will start shaping the team’s work culture. And if you already have a culture that works, why change it?