Hacker News new | ask | show | jobs
by Groxx 3413 days ago
Ehh... nice in principle (and I do small deploys all the time for work), but too many artificially-small changes can easily cause you to "miss the forest for the trees". Each change is small and LGTM-able, but they can add up to a misbehaving system unless you have full context (which, because they're small, does not exist in the diffs).

If it's conceptually a single unit, keep it a single unit. Pushing dead code in small pieces that waits for a small master switch doesn't give you additional safety (though it can give you easier merges).

>Reviews for large diffs are closed with a single “lgtm,” or miss big-picture problems for the weeds.

That means you have bad reviews. If it can't be figured out, how was it written? Take the time and do it right. Yes, it's a large commitment - is it larger than the value being created? If no, why does it exist?

3 comments

The problem is that review complexity is quadratic in size, and quadratic in number of interacting parts.

Each thing can interact with each other thing (in theory), so if you have double the diff, you might need 4 times the amount of work. And the increased workload increases the risk of a mistake.

But splitting up the diff in two doesn't change the result, right? The reality is that everything doesn't interact with everything else. But the reviewer doesn't know this, and needs to confirm non-interaction (hence a review).

If you, as an implementer, know the divisions, then shipping the pieces that are independent will do two things. First, it makes the review easier, letting it serve its main purpose: catching errors in implementation. Secondly, it will let _you_ confirm you know the divisions. It could be that your conception is wrong! Your small diff might not work because of an interaction you didn't realize.

Personally, I think that by the time you're sending diffs, the general architecture should have been decided. We tend to have at least one person (not the implementor) who knows how a feature is going to be implemented, so the architecture can be confirmed. You want someone to be looking at the forest! I think that doing this during code review is a bit harder though.

Taking the time to do it right is hard, and chopping things up makes it easier to do "it" (implementation review) right. Architecture review should probably be happening in a different space.

On a social level, you are asking for your code to be reviewed. Please have empathy for the reviewer.

>Secondly, it will let _you_ confirm you know the divisions.

Absolutely. Which is why I draw the line at "conceptual units". If there are rational sub-divisions that don't require context, go ahead, divide! If you're baking in cleanup + feature + bugfix + convert tabs to spaces, you're mixing things in the same way that you would usually avoid when coding. Same thing with too-large commits - we avoid 10,000 line files and functions, do the same with commits.

I caution only against artificially small diffs. Small is a fine target, but there's a lot of dogma around it everywhere I've been, which is why I think it's worth calling out. Breaking things up too much can remove context, which can be dangerous.

---

If you're introducing a large thing in 1 vs N units (say a full feature in one go, which is unusable until complete), the line is of course fuzzier. In some ways, sub-units may be more reviewable - that's a solid benefit, though they're not likely a dozen or two lines of code (if they are, and it's a large feature, you probably now have dozens of reviews). But you aren't getting any additional safety, because it's all either on or off - which is half of the post.

For the other half of the post, if you're not getting the same reviewer(s) for all the pieces, unless you've had a detailed architectural review[1], has anybody but you read and understood the whole system? When it's turned on and it breaks in a subtly-interacting way and you're on vacation, does someone else know where to look? Doing it all in one go, as it will be when turned on, forces an architectural review of the system implemented, not just the plans. Say it takes a day or two to thoroughly review - is that bad?

---

Always tradeoffs. For the situation described in the blog post, where large reviews get insufficient attention, I don't think small diffs will solve the problem. The problem is the lack of attention given to reviews - fix that. It's critical regardless of diff sizes.

[1]: No place I've worked for has truly done this, which I would claim is the (vast) majority, but I wholly believe some places do. I would probably even like it! But it's perceived as slowing things down, so it hasn't happened for me yet.

Great notes. Thanks for writing this!

Architecture review distinct from deploys is key, as is a separate operability review. And it's also important to mentally decouple the feature release to end users from the code releases.

Review of 10 lines: 10 issues. Review 1000 lines: LGTM. As the article says - even the best strongest teams and culture will fall in that trap.

Avoiding review fatigue has to be done some other way than merging smaller diffs, for example splitting changesets among reviewers so that person A reviews the backend code and person B reviews the frontend code etc. A meaningful feature of 1000 lines can still be reviewed in terms of 20 50-line changes, and doesn't have to be reviewed in one sitting or by one person.

You are perfectly right that pushing dead code and half baked features (that won't be used, so aren't actually in production!) is useless and adds no safety. Adding a new feature is risky, and adding it as 99% dead code and then finally adding the 1% code with the button that actually enables the feature in the UI doesn't help - 100% of the new code will hit production with that final change.

I read such articles more like a general guideline than a strict requirement. Of course it is bad to split up commits, which are one unit, into several parts. But when in doubt, you should try to commit in smaller batches. And by all means try not to commit unrelated changes in one commit.