Hacker News
by sytse 2015 days ago
The article argues that there is no useful measure that operates at a finer grain than “tasks multiplied by complexity”.

I think that complexity is hard to measure and therefore easy to game.

At GitLab we only measure tasks completed, that is, the number of changes shipped to production, with the requirement that every change has to add value. This measure has been used throughout R&D https://about.gitlab.com/handbook/engineering/performance-in... to assess productivity for multiple years now, with good success https://about.gitlab.com/blog/2020/08/27/measuring-engineeri...

When you tell new engineers about this target, they see a great opportunity to game it: just ship smaller changes. It turns out that smaller changes are quicker to ship, lead to better code and tests, have a lower risk of cancellation and of problems in production, and lead to earlier and better feedback.

Inspired by Goodhart’s Law, I'll propose the following: a measure that, when it becomes a target, improves productivity. ~Sijbrandij's Law

4 comments

Your proposed law has been tried for many years, by just as many good-willed people who believed their measures would result in target increases. In fact, the entire industry is being bombarded by one such methodology that includes those measures: Scrum. Must we really repeat the years of complaints, criticism and debates to show any measure can get warped and gamed to the point it only vaguely resembles a tool of productivity?

So we get young, naive engineers to focus on small changes. Cool, probably as it should be, you gotta start somewhere. And when these developers get hungry for bigger projects, when they get bored implementing the umpteenth small and by that point (for them) trivial change, how do you encourage them to tackle bigger technical problems? Those that lay the foundation for the new people to do their job more easily and on-board quicker? Or did you actually not tell us all, and you measure far more than just the number of changes?

Thanks for the feedback. I agree there is risk of larger technical improvements not happening enough if you only measure the number of changes.

Maybe a few things are happening:

1. Some large technical improvements can be shipped in multiple changes that add value.

2. Most companies do more large technical changes than is optimal.

3. Engineers are motivated to make the large technical changes since they are interesting and make their future work easier so they will prioritize them despite the measure.

4. GitLab is making fewer large technical improvements than is optimal.

5. Our dual career structure ensures that there are engineers who can do these larger technical improvements without being below average themselves because they are more productive than others.

6. We are not pushing very hard on this metric since we do it in a group setting instead of per individual.
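
To make point 6 concrete, here is a rough sketch of counting shipped changes per group rather than per individual. The record layout, group names, and team sizes are all made up for illustration; this is not GitLab's actual tooling or data.

```python
from collections import Counter

# Hypothetical records of changes shipped to production: (author, group).
# All names and numbers here are invented for the example.
shipped_changes = [
    ("alice", "create"),
    ("bob", "create"),
    ("alice", "create"),
    ("carol", "verify"),
]

# Hypothetical head count per group. It includes people who shipped nothing
# themselves, for example someone who mostly reviews and advises.
team_sizes = {"create": 3, "verify": 1}


def changes_per_group(changes):
    """Count shipped changes per group, deliberately ignoring the author."""
    return Counter(group for _author, group in changes)


def changes_per_member(changes, sizes):
    """Divide each group's count by its head count to compare groups of different size."""
    totals = changes_per_group(changes)
    return {group: totals[group] / sizes[group] for group in totals}


print(changes_per_group(shipped_changes))
print(changes_per_member(shipped_changes, team_sizes))
```

Because the author is dropped before counting, a team member who spends the week unblocking others still counts toward the group's head count but is never ranked individually.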

> how do you encourage them to tackle bigger technical problems? Those that lay the foundation for the new people to do their job more easily and on-board quicker?

Isn't GitLab known for disastrously poor infrastructure, with all the long outages? That is exactly the kind of area where people need to take their time to tackle bigger technical problems, not complete short tasks. I guess this attitude explains it, at least partially.

Would sure be nice to back up those claims.
Would sure be nice to back up those databases: https://about.gitlab.com/blog/2017/02/01/gitlab-dot-com-data...

In all fairness though, this is the only major GitLab incident I can recall, and it's more than three years old at this point.

It seems like a useful aggregate metric, but is it also used to rate individuals? For that purpose, it seems like it would be terrible. What if you have an experienced staff member from whom everyone else constantly seeks advice? That person may be having a positive impact that isn't visible as merge requests.
We try not to go below the level of a group when making productivity assessments, so we measure the team instead of the individual. This is indeed to encourage helping each other out, as you noted.
> We try

And what do you do in the end?

If you were a manager, could you honestly say it was ever in a subordinate's best interest to do more than the bare minimum?

This should illuminate why these conversations constantly go in circles.

I have never had the opportunity to try this method, but after much theorycrafting, many useless pointing meetings, and a statistical investigation that showed a negative correlation between "complexity" and "time to completion", I can only think that this is the correct way to get velocity.
What does "good success" mean in this context?