by giulianob 1932 days ago
Definitely my biggest gripe is story points + estimations. It's just completely flawed. We should be thinking more probabilistically (e.g. there's an 80% chance we'll get this done in 2 weeks). We should be doing that by running actual models on past work rather than relying on gut feelings.

If you want to listen to a couple of guys I used to work with who really know what they are talking about, check out this series: https://www.youtube.com/channel/UC758reHaPAeEixmCjWIbsOA/vid...
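
One way to get a statement like "80% chance in N weeks" out of past work is a small Monte Carlo simulation over historical throughput. The following is only a sketch under stated assumptions: the `history` numbers are made up, the function names are hypothetical, and each simulated week simply redraws one past week's throughput at random, which ignores trends and dependencies between weeks.

```python
import math
import random

def forecast_weeks(weekly_throughput, backlog_size, trials=10_000, seed=0):
    """Simulate how many weeks it takes to finish `backlog_size` items.

    Each simulated week draws its throughput at random from the historical
    samples (a simple bootstrap). Returns the simulated week counts, sorted.
    """
    if max(weekly_throughput) <= 0:
        raise ValueError("need at least one week with completed items")
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    results = []
    for _ in range(trials):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)
            weeks += 1
        results.append(weeks)
    return sorted(results)

def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    rank = math.ceil(len(sorted_values) * p / 100)
    return sorted_values[max(0, rank - 1)]

# Hypothetical history: items finished in each of the last 8 weeks.
history = [3, 5, 2, 4, 6, 3, 4, 1]
weeks = forecast_weeks(history, backlog_size=20)
print("50% of simulations finished within", percentile(weeks, 50), "weeks")
print("80% of simulations finished within", percentile(weeks, 80), "weeks")
```

The 80th percentile is the number you would quote as "there's an 80% chance we'll get this done in that many weeks", with the caveat that it is only as good as the history fed into it.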

9 comments

I remember noticing this on my first frontend job 5 years ago. How in hell are they accurately measuring any kind of performance? I'm sure it can actually be done but enterprise half-asses even that, likely because keeping it all slightly arbitrary allows for more ambiguity to leverage against workers (keeps them paranoid rather than certain, elites don't like safety amongst workers).
> How in hell are they accurately measuring any kind of performance?

You aren't, and that's the grift.

I force developers to do something they are bad at. They get better, but they're never 'great' at it. So I can withhold raises they deserve because I'm punishing them for what they're not good at instead of rewarding them for what they are great at.

It’s not a measurement tool though - it’s a tool for self-calibration. If anyone outside the software team sees those estimates then the process is broken. Doubly so if they’re tied to any performance evaluation.
Well, but that's what actually happens everywhere. The estimates are not for the team itself, but used for managers and the company. Even in the comments here it was justified as "the company needs to know that to sell it to customers".

So reality trumps ideals.

> It’s not a measurement tool though - it’s a tool for self-calibration.

This was my complaint about the term “velocity” because it sounds objective and absolute when it’s really much more arbitrary and not even comparable for the same team from year to year.

Exactly, and since there's no human whose job is to figure out the ontology of dev work...
I've wondered how well it would work to just have everyone on the team anonymously estimate what percentage of the project (or sprint, etc) is complete.

But I think that would get thrown out at the first sign of trouble. In many companies if the anonymous survey showed we were behind schedule the fix would be to stop doing the anonymous surveys.

That's why Kanban with throughput/cycle-time makes so much more sense.
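
For anyone unfamiliar with the two metrics, here is a rough sketch of how cycle time and throughput can be computed from nothing more than start and finish dates. The work items are invented and the helper names are hypothetical; a real board would export these dates from its tracker.

```python
import math
from datetime import date

# Hypothetical finished work items: (started, finished).
items = [
    (date(2024, 1, 2), date(2024, 1, 4)),
    (date(2024, 1, 3), date(2024, 1, 8)),
    (date(2024, 1, 5), date(2024, 1, 9)),
    (date(2024, 1, 8), date(2024, 1, 11)),
    (date(2024, 1, 10), date(2024, 1, 19)),
]

def cycle_times(items):
    """Calendar days from start to finish for each item, sorted ascending."""
    return sorted((done - start).days for start, done in items)

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    rank = math.ceil(len(sorted_values) * p / 100)
    return sorted_values[max(0, rank - 1)]

def weekly_throughput(items):
    """Number of items finished in each ISO (year, week)."""
    counts = {}
    for _, done in items:
        key = tuple(done.isocalendar()[:2])
        counts[key] = counts.get(key, 0) + 1
    return counts

times = cycle_times(items)
print("cycle times (days):", times)
print("85th percentile cycle time:", percentile(times, 85), "days")
print("throughput per ISO week:", weekly_throughput(items))
```

Nobody has to guess anything here: both numbers come from what actually happened, which is the appeal over pointing.
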
https://owenmccall.com/high-performance-role-process-vs-outc...

This article gave me clarity between the two philosophies. Kanban is about perfecting the process and trusting that results will naturally come as a result of that process. Agile is about attaining the outcomes and then focusing on what works for those outcomes.

I agree that Kanban makes more sense but Agile allows managers to point the finger at devs instead of having to point the finger at themselves.

I understand your gripe but in a business setting you need some estimate of cost, even if it's on the order of hours, days, weeks etc. In my experience it's often cheaper to just get stuck in rather than trying to accurately predict the effort, since a single developer prototyping some solution for 10 hours goes further than 10 developers in a meeting for an hour. I'd say call a spade a spade, and put time estimates on these tasks, and admit that velocity is a measure of your bias (if you correctly estimate it will always be 1).
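
To make the "velocity is a measure of your bias" point concrete, here is a minimal sketch that assumes tasks are estimated in hours and that velocity is defined as estimated hours delivered per actual hour spent. Both the definition and the sample numbers are assumptions for illustration only.

```python
def velocity(tasks):
    """Estimated hours delivered per actual hour spent.

    `tasks` is a list of (estimated_hours, actual_hours) pairs for finished work.
    """
    estimated = sum(est for est, _ in tasks)
    actual = sum(act for _, act in tasks)
    return estimated / actual

# Hypothetical finished tasks.
accurate = [(4, 4), (10, 10), (2, 2)]
optimistic = [(4, 6), (10, 15), (2, 3)]  # everything took 50% longer than estimated

print(velocity(accurate))    # exactly 1.0: estimates matched reality
print(velocity(optimistic))  # below 1: the shortfall is the estimator's bias
```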

What really gets me about story points are the Agile folks who say "story points are a measure of complexity not effort/time". As if adding more complexity in the same amount of time were a good thing...

Probabilities would be estimates of costs too, they are just less naive ones.
Big surprise to most people: The purpose of estimating is not to come up with an estimate.

In a team of five, we might get 2,5,8,8,20. The value of the estimation was in discovering that someone thinks it's a 20, while someone else thinks it's a 2. They tell the rest of the team why, and we estimate again. Another useful signal is ?,?,?,?,? or (50,50,50,50,50). And of course, 5,5,5,8,8 (or other general agreement) suggests that this is low risk.
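
The reading of a round of votes described above can be sketched as a tiny classifier. This is an illustration, not a rule from any Scrum guide: the thresholds `max_ratio` and `large` are arbitrary assumptions, and `None` stands in for a "?" card.

```python
def read_round(votes, max_ratio=3, large=20):
    """Classify one round of estimates by how much the team agrees.

    `votes` holds positive numbers, or None for a "?" card.
    `max_ratio` and `large` are hypothetical thresholds; tune them per team.
    """
    known = [v for v in votes if v is not None]
    if not known:
        return "unknown"        # nobody can size it: investigate first
    low, high = min(known), max(known)
    if high > low and high >= max_ratio * low:
        return "disagreement"   # outliers explain their reasoning, then re-vote
    if low >= large:
        return "agreed-large"   # everyone thinks it is big: consider splitting
    return "agreement"          # low risk

print(read_round([2, 5, 8, 8, 20]))
print(read_round([None] * 5))
print(read_round([50, 50, 50, 50, 50]))
print(read_round([5, 5, 5, 8, 8]))
```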

You certainly won't hear that from a scrum consultant.

> Definitely my biggest gripe is story points + estimations. It's just completely flawed. We should be thinking more probabilistically (e.g. there's an 80% chance we'll get this done in 2 weeks). We should be doing that by running actual models on past work rather than relying on gut feelings.

Why? If you made those more detailed estimates and models (which I'm sure would have a significant time cost), what would you do differently based on the results of them?

You need a rough sense of how relatively costly different tasks are, so that you can prioritise - if you ask the business/product side to prioritise completely blind, you'll end up not doing small things that could have brought a lot of value because they assume those things are hard, and you'll spend far too long doing things that they thought would be easy but are actually hard. So you want developers to do just enough estimation to allow the business to prioritise. Which means giving a low-detail estimate and giving developers assurance that it won't be used as a deadline. Story points are the most effective version I've seen.

(I don't watch videos, I'd be happy to read text articles)

I wonder how much more accurate estimations would be if we had engineers actually draw probability curves that are then aggregated and analyzed. As it stands now, having had some experience getting burned, I always give an estimate that's really at the tail end, and I have become exceedingly efficient at explaining why that estimate is how long it will "really take."
Estimates somehow magically become commitments. Like, if you don't make it in the estimated time, it's your fault, you should have estimated better.

Okay, so suppose my best guess is that there is a 10% chance something could be done in a day, 80% chance it would take two or three days, and 10% chance it would take five days. What is supposed to be my "estimate"?
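
To see why no single number is "the" estimate, the distribution above can be written out and summarized in a few lines. One assumption is needed that the comment does not state: the 80% for "two or three days" is split evenly between 2 and 3 days. The helper names are hypothetical.

```python
# Days -> probability. Assumption: the 80% "two or three days" case is
# split evenly between 2 days and 3 days.
outcomes = {1: 0.10, 2: 0.40, 3: 0.40, 5: 0.10}

def mean(dist):
    """Probability-weighted average duration."""
    return sum(days * p for days, p in dist.items())

def quantile(dist, q):
    """Smallest duration whose cumulative probability reaches q."""
    cumulative = 0.0
    for days in sorted(dist):
        cumulative += dist[days]
        if cumulative >= q - 1e-12:  # tolerance for float rounding
            return days
    return max(dist)

print("mean:", round(mean(outcomes), 2), "days")
print("median:", quantile(outcomes, 0.5), "days")
print("90% confidence:", quantile(outcomes, 0.9), "days")
print("95% confidence:", quantile(outcomes, 0.95), "days")
```

Under that assumption the median is 2 days, "3 days" holds nine times out of ten, and only "5 days" is safe at 95%, so the honest answer depends on which confidence level the person asking actually wants.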

If I keep giving the longest time interval, I will seem lazy and incompetent, because why am I always saying "five days" for tasks that everyone knows usually take only two or three days? But if I say "three days", then in those 10% of cases when it actually takes five days, I have estimated wrongly.

In a long sprint, this will usually average out somehow; one "three-day" task will take five days, another "three-day" task will take one day. In a short sprint, things are less predictable.
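
The averaging-out effect can be quantified with a short calculation. This sketch makes two assumptions that are not in the comment: every task follows the same distribution (10% one day, 40% two, 40% three, 10% five, i.e. the 80% split evenly), and tasks are independent of each other, which real tasks often are not.

```python
import math

# Days -> probability for a single task (assumed split, see above).
outcomes = {1: 0.10, 2: 0.40, 3: 0.40, 5: 0.10}

def relative_spread(dist, n_tasks):
    """Standard deviation of the total duration of n independent tasks,
    expressed as a fraction of the expected total."""
    mean = sum(days * p for days, p in dist.items())
    variance = sum((days - mean) ** 2 * p for days, p in dist.items())
    return math.sqrt(n_tasks * variance) / (n_tasks * mean)

for n in (1, 4, 16):
    print(n, "tasks: total is uncertain by about",
          f"{relative_spread(outcomes, n):.0%}", "of its expected value")
```

The relative uncertainty shrinks with the square root of the number of tasks, so a sprint with four times as many tasks is only half as unpredictable in relative terms, and only if the independence assumption holds.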

(yeah, technically it's not days, it's story points, but the idea is still that "medium complexity" only means "medium complexity unless something unexpected happens" and the unexpected things sometimes do happen, you can't simply commit to unexpected things never happening.)

Estimating a single task is only really important to keep your cycle time in check and have the conversation about "should this task be broken down?"

Estimations from engineers shouldn't be used to forecast when a feature will really be done. That's where the model and probabilities come in.

Interestingly, the Manifesto for Agile Software Development never mentions estimation. Yet it is a big part of Scrum. Measuring accuracy in estimation is a big part of tooling for Agile project management. But should it be?

I think this is suspicious. It smuggles things from the Old Ways into Agile. Estimation sucked, to a fatal extent, when trying to do critical path analysis of software projects. Why should it suck less in Scrum?

The manifesto mentions retrospectives. Retrospectives have hard reliable data. You can learn from retrospectives. Estimation is unreliable. Would project outcomes actually differ with less emphasis on estimation? Would they improve with more emphasis on retrospectives?

Yes, exactly. The Mythical Man-Month was written 45 years ago. And since then we have gone from bad to worse.

Story points are even worse than man-months at measuring work.

Because in scrum not only is the assumption that the estimate for a task holds no matter who works on it and how many people do, but dependencies are also not handled well for either tasks or backlog items. To some extent a team can try to handle it in a sprint, but it is not part of the estimation.

For example it can be a lot faster if the same developers can work on corresponding frontend and backend jobs. If they are split in different sprints or if the developers also have to work on other tasks, it could take a lot longer.

And the whole story points, Fibonacci, etc. thing is just nonsense. If an experienced developer estimates that a given job would take somewhere between 10 days and two months, depending on whether they can reuse X, how the Y algorithm performs, and whether they and developer Z work on it, then that is the best estimate you will get.

The only thing that makes scrum estimates more precise, is that once you have broken everything into 100 tiny tasks, you have added about 20 hours of writing commit messages, updating JIRA, and preparing for the next task.

I see probabilistic estimates as accounting for unknowns. An 80% chance would carry a small amount of unknown, so if the task was 3 points, I would bump it to a 5 to account for the unknown. And, since points are not linear, the larger task automatically grows proportionally, i.e. 8 -> 13.
Points are flawed. Whether you have them grow linearly, exponentially, or Fibonacci-style doesn't make a difference. The fundamentals are flawed. They cover it in their videos and, in more depth, in their talks/books.
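
The bump described here amounts to moving one notch up the point scale. A minimal sketch, assuming a Fibonacci-style scale (the exact `SCALE` values and the `bump` helper are hypothetical):

```python
SCALE = [1, 2, 3, 5, 8, 13, 21]  # assumed Fibonacci-style point scale

def bump(points, steps=1):
    """Move an estimate `steps` notches up the scale to pad for unknowns.

    Estimates already at the top of the scale stay where they are.
    """
    index = SCALE.index(points)
    return SCALE[min(index + steps, len(SCALE) - 1)]

for points in (3, 8):
    padded = bump(points)
    print(points, "->", padded, "(padding of", padded - points, "points)")
```

The padding is 2 points on a 3 and 5 points on an 8, so the buffer grows with the size of the task without anyone having to compute a percentage.
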
Monte Carlo estimations and probabilistic estimates take longer to do.

I don't know how they do it, but you used to have to group stories with stories of similar effort done in the past in order for the simulation to take story size into account. Otherwise your model will be affected by things like stories in different components taking more effort to do.

Putting stories into rough size pots is essentially what pointing is.