Hacker News new | ask | show | jobs
by hugocbp 861 days ago
I'm probably in the minority here, but personally I'd much rather review a 300 line PR instead of 6 50-line ones if the change is a single context.

I briefly worked with a hard line-count-limit for PRs and I thought it made everything much worse. It is fine for changes that are actually small, but when you need to go back and re-open 4, 5 merged PRs in different tabs to get the full context again, the time to review increases tenfold with tiny PRs that don't really make complete sense by themselves.

I have worked with co-workers that have the complete opposite preference, though, and anything over a set amount of lines wouldn't even be reviewed.

Interesting to see the numbers on the article, however. My anecdotal experience would make me guess the opposite. I feel like work slows to a crawl once the PRs are artificially limited and broekn up like that, specially in fast moving companies and startups.

5 comments

Line count (or count of any units in the PR) is completely meaningless.

I feel that in the last decade somehow there has been a loss of context of what is the purpose of version control.

Why do we bother with version control? Sure, there are many reasons.

But a primary reason is that if we discover behavior X turns out to cause regressions (or is otherwise a bad idea), we can easily revert it by reverting its commit.

That's a why a single commit should contain a single behavior change, and contain the entirety of that single behavior change.

If the change is split among multiple commits, it'll be a pain to revert it. If the change is contained within a commit that changes other things, it'll be an ever larger pain to revert it.

So that's my rule. A commit is single behavior change, all of it, and nothing else.

You can't express that in lines. Might be 1 line might be lots.

100% agree, if you have multiple commits to complete the change during development, just squash them into one for the PR.

That being said many companies (including some FAANG) use commit counts as a performance metric, thereby directly disincentivizing clean, readable commits.

Sure, and at the same time a PR can contain multiple commits. You don't have to squash and merge.
I've rarely seen cases where multiple commits per PR are a good thing. When that happens, it's (usually) that either the commits can be squashed into one logical/consistent commit, or there are too many changes for one PR.

The exception being that many ci/cd setups will squash your commits for you when merging a PR.

> but when you need to go back and re-open 4, 5 merged PRs in different tabs to get the full context again, the time to review increases tenfold with tiny PRs

You can't really say much about the first 4 PRs because if you don't know how that code is going to be used, it's kind of hard to giving meaningful feedback. By PR 5, it's too late to give feedback on PRs 1-4.

I also have worked with people that prefer that but I never understood it. Code without context isn't reviewable imo

A skill that you see in some of the more experienced devs is the ability to front-load the most important stuff in a series of changes, or make changes that can at least be understood in isolation.

It’s not always reasonable to do that, but you can often deliver incremental changes where each incremental change either brings a whole feature visibly closer to completion, or provides some self-contained utility, or makes some refactoring changes with the promise of “this will make it possible to do X, later”.

That's a talented approach, yes. But something I've observed in a couple of former coworkers is they will put together 6 PRs to the main branch where the code won't even (purposefully) build yet. By PR 6, the code functions, but is in a terrible state because PRs 1-5 were essentially pointless to review, and many of the issues you have with the code base are now firmly outside of the 6th PR's diff. I've likened this phenomena to the H.H. Holmes murder hotel, where he hired contractors to do extremely small jobs individually, and none of them realized (or at least had some plausible deniability) that they were creating a hotel made perfectly for the owner to murder residents. Except in this situation, I perceive the author was hiding the fact that they have no clue how to write code, which I don't know who they thought they fooled.
Right, in my experience large changesets often don't add a single thing, but do some refactoring, which enables the change, and then add some new infrastructure and the individual feature.

If you split it, you can verify the refactoring is a pure refactoring, not changing functionality (much), can understnad the infrastructure added and then understand the feature built on top.

Of course often things aren't developed in isolation, thus preparing the review for isolating those aspects is work again, but usually that in a side effect leads to better architecture, as you look at the parts in isolation, leading to half a refactoring, half an infrastructure and a feature hacked in.

Definitely. The good version of small PRs are small but useful changes that incrementally produce value on their own

The bad version is people adding random unused classes and methods that don't make sense until the final PR comes in and uses them

My bar is if its more than 20 files, or breaks the version control UI, which I've seen in bitbucket some years ago, then your PR is a little too much, and if its a major automated refactor of some sort, it should not contain any other code changes outside of what's necessary to make that specific change work as intended. Other changes should be their own discrete PRs as follow ups.

I just try to be reasonable about PRs. Everyone will have their preference, I just think as long as you can go back and figure out what happened, why the change was made, and do it quickly enough, then you're golden.

Lines with actual code changes aren't a problem for me, it's the automated IDE indent/spacing/bracketing that really drives me up a wall and fatigues the hell out of me.

But this might only be a problem with those of us working on legacy codebases. The kind of PRs I see in OSS projects I could review 1000 lines at a time - it's so clean!

Split the reformatting and real change. Make the reformatting change first and have a policy of merging those change fast.
Some teammates agreed to do this, but it's not known what files will be edited ahead of time, so a large job often has formatting edits anyway.
Yeah, it's really something that you have to get buy-in for.

You could always make a commit that just has the reformatting when you do start editing a file. If it turns out you didn't need to edit that file after all, then you can revert just that commit easily. And if you did, you can collect up all the reformatting commits into a PR as a first step.

The flip side of this is making minimal surgical small changes, and then doing the refactoring after the change. The key message though is split them.

Refactoring doesn't change behavior, so refactoring efforts should be judged only on "tests exist that cover the functionality, those tests continue to pass, the code is easier to maintain". It's when you add "changed functionality, so we changed the test" to review at the same time, that this becomes a hassle.

I tend to agree.

For my part, ever since I was a wee lad learning to program for the first time, I've been of the opinion that line counts themselves are a largely worthless metric for anything.

How many lines code takes up is too variable between languages, too variable within the same language, too dependent on irrelevancies like formatting, etc.

What actually matters is logical and conceptual complexity, not line counts.