Hacker News new | ask | show | jobs
by appplication 862 days ago
I have found that teams who allow larger PR sizes without complaining about it can get a lot done a lot faster. Which is to say having 50 line PRs are not ubiquitously ideal, but are somewhere on the spectrum of acceptable but come with definitive tradeoffs. Reviewing and merging 10x60 line PRs is, in my experience, more time intensive than reviewing one 600 line PR.

Most of the tests for and new functionality alone in any of our PRs well exceed 50 lines. Indeed we have 2-3x more test code than actual code. So should we be writing less tests? Merging the code first and later the tests? Merging the broken, codeless tests first and then the code?

It’s all ridiculous. Just make PRs that represent unit change, whatever the means to you. The unit functionality is independent of lines of code. Sometimes that is 12 lines, sometimes it’s 800. Yeah large PRs are harder to understand but that’s why you have tests. Also XXL PRs aren’t usually happening every day. If they are, you have a very productive team and maybe you should count yourself lucky.

6 comments

But why does it take longer for 10x60 line PR's?

If it's because those PR's are getting reviewed more thoroughly, that's likely a great use of time.

If it's because the small PR's are getting bikeshedded, it's not.

If it's because of tooling, then the OP at graphite.dev has something to sell you. I'd be buying but we're a GitLab shop.

> Indeed we have 2-3x more test code than actual code. So should we be writing less tests? Merging the code first and later the tests? Merging the broken, codeless tests first and then the code?

I find that test code is rarely reviewed properly. If you want it reviewed properly, then create MR's that are just tests. You can create an MR with the change and happy path code, and then subsequent MR's with each of the rest of the tests.

If you're fine with tests not getting the same level of attention (and that's likely OK), then tests don't count as part of the 50 lines, IMO.

It's because every time I switch from coding to reviewing, there's a context switch, and every time I switch back from reviewing to coding, there's another context switch. I can easily spend 30 minutes getting back into the flow of coding after doing something else.

One PR has 1 context switch. 10 PRs have 10.

To say nothing of the cascading effects of CR on early PRs. Oh, you want me to rename a variable that I used in the first PR? Looks like I'm spending the next 10 minutes combing through my other 9 diffs updating names. That would have been a 5 second refactor if I only had a single monolithic PR.

Yeah—on the other hand, I’ve gotten PRs that are just too damn big and that means that I can’t review them without sitting down and getting into the flow just for that review. What I end up doing is, after some false starts, getting myself a block of time to review the massive PR.

Let’s say I just finished up some change I’m working on, and I take a look at my coworker’s PR. It’s 11:45, and I’m heading to lunch in 15 minutes. If it’s a 10-150 line PR, I can probably review it within 15 minutes, and maybe I can review six or eight reviews like that in a day, when I stand up to get coffee or something like that.

If I see some 700-line PR coming through, I have to allocate some focus time to that, just like I were programming. It’s a burden.

I totally agree, and this was a point of conflict between my last manager and I. On one hand, he'd be on my case for not getting some coding done, and on the other he'd be on my case for not jumping on a massive PR right away. Part of the problem I think was that we had a very bottlenecked QA process and decoupled feedback systems for reviewing and discussing.

I'd get a large PR and some smaller ones on GitHub that would likely have tests included, so I'd have to check those too, while also maintaining conversation over on Jira about some other ticket and trying to actually get code written. There was no concept iteration because every unit change needed to simultaneously be broken up by 2 week sprint AND be completely unit tested and flawless before asking someone to review it or getting it merged. Meanwhile, I'd never know when someone updated their PR or Jira thread after discussing unless I manually checked or always kept the browser tab open.

So I'd take a full morning to review a PR, then respond to things on Jira, and if I'm lucky, write some code. Eventually the PR would be updated but I'd forget to check because there wasn't a good system in-place to manage those notifications, and when I did it'd be another massive change that I'd need to take quite a lot of time to check.

So I started using Draft PRs for bigger or more complex changes, and explicitly note what wasn't functioning or written yet, which would provide for lower pressure time to get early feedback instead of trying to complete everything up front, and updates to those PRs would be more incremental.

I've found that if a PR is sufficiently small then reviewing it doesn't entail quite as severe of a context switch as reviewing a larger PR. Something about it being shorter means I can keep some of the coding context fresh in my head so I can switch back to it quicker (whereas for a longer PR I need to jettison that context in order to comprehend the whole PR).
Not the OP, but in my experience, it's because those 60 line PRs lack a lot of context, and it's not until you see the cumulative changes across those 10 60-line PRs that you really understand how they come together to form a feature--and are able to review them within that context. So instead of having one 600 line PR in which the whole picture is clear, you have to go back and forth between 10 different PRs and comparing notes between them to make sure you really understand any of them.

I still think 600 is a lot, but there's a happy medium between "minimal change that excluded broader context" and "monster PR that has all context, all changes, and eats left-pad packages for breakfast". Line count shouldn't be the goal. The goal should be making as small a change possible that is self-contained (ie all context is either within the PR or already in the codebase), and (IMO) that can be delployed to production as-is.

100%. It's missing the forest for the trees.

You can focus on those 10 60 line PRs and completely miss the bigger context. Not to mention these PRs can be reviewed by different teammates that can easily lead to missed issues within the whole scope of the change.

As always, there's no one right answer. Sometimes yeah, those 10 PRs will be better. Sometimes one or two PRs will facilitate and deliver a much better final solution with less bugs.

It really bothers me how these articles/companies try to claim "ideal" scenarios completely devoid of context. That's how bad managers will use your PR sizes to say you're doing a "bad job" despite not knowing a single thing about coding.

An ex-manager (ex-Google, too) insisted we move from GitLab to Gerrit, and mandated that all commits be a maximum of 5 lines, ideally fewer.

This led to a complete loss of all context; without that overview, you might very well be releasing code faster, but it could be the wrong solution to the original problem - and crucially, you won't know until it's out in the wild.

A preference for smaller, more focused PRs is fine - it certainly makes them easier to review independently - but I think putting any sort of limit on the size makes your team more likely to omit things like tests and also to not have sufficient context to understand the overall problem space.

> mandated that all commits be a maximum of 5 lines, ideally fewer

Yeesh. How would you ... rename anything? Imagine a linter rule which prevents referencing a given name on more than 5 different lines of code.

Many language allow an alias name. Which means you change the name and then alias the old name to the new line - 2 line PR. Then a bunch of one line PRs to change easy use of the old name. Finally a PR to remove the alias.

Which is why I don't have such a rule and would oppose it.

5????

Please tell us more. Did 6 line commits get rejected? How did anything get done?

Well, things did get done, but not the right things.

The rule wasn't hard and fast, but you had to have a damn good reason why you wanted to add bigger commits. Protobuf definitions didn't count, so you could get away with doing longer commits for those.

Tests were discouraged anyway ("Google just tests in production", we were told), so there wasn't a lot of that sort of thing.

This must be a typo - 50 or 500 lines maybe?

...or satire?

I wish it had been satire. This was fundamentally one of the worst parts of my coding career, and it was disappointing because the company had been amazing to work for prior to this individual's arrival.

Post-arrival, I was ordered to rewrite a working, stable PHP web app into Go ("Google doesn't use PHP, because it doesn't scale"). We weren't allowed to write RESTful APIs any more, everything had to use Protobuf.

And there was a physical fight in the office one day between two of the developers, that culminated in one of them storming out and never coming back.

That each MR be deployable is a great guideline and should trump any sort of line-number guideline. It's possible that an MR could be deployable but lack context for the bigger change, but it's less likely.
"I have found that teams who allow larger PR sizes without complaining about it can get a lot done a lot faster".

Sure.

How many times is the code touched after the PR has been merged is a better question.

Faster is not better. Better is not faster.

Weirdly, what the article claims is that it's the ideal size because of speed. So they are trying to say it's faster, therefore it's better.

Which is in my view a very silly claim devoid of any context.

The article also claims it's the ideal size because it minimizes reverts and ends up with PRs getting more thoroughly reviewed (based on # of comments). I think that makes for a more compelling case.
Not unless we know what those changes are.

Guess what, all typo commit fixes are <50 lines and never get reverted.

I wish never, definitely have been burned by a thinko in a supposed "typo fix". But I'll give you "rarely".
Haha, that's a fair point. Cries in dynamic languages

I've definitely had code with typos consistently replicated due to auto completion, then had to fix it all later on, haha.

I couldn't disagree more. My current team has a history of large PRs and everything was a dumpster fire until we started really hammering down on PR size. Prod broke all the time, reverting code was a nightmare because of how big the "units" were, code review was not thorough, problems problems problems.

> large PRs are harder to understand but that’s why you have tests

I'd rather have comprehensible, well-reviewed code than test coverage (I'd rather have both, of course). We've got a mission critical module that is a crusty piece of junk that everyone is scared to change because of how much of a mess it is, even with its amazing test coverage (frankly the code is so bad that I assume the tests are just as bad anyway).

> XXL PRs aren’t usually happening every day. If they are, you have a very productive team and maybe you should count yourself lucky

I've never seen an XXL PR that improved productivity. They pile up tech debt, bugs, and down time. I don't care how fast the devs are moving if they bring prod down every week. That is absolutely unacceptable.

On a team of superstar, perfect developers that all understand the code base perfectly, I'd probably agree with you. If that's your situation then I envy you :) But I've never been on such a team.

> everyone is scared to change because of how much of a mess it is

As a rule messy code a way more easier to cleanup in one big chunk. Teams with CR limit prefer not to touch such nests.

My team has a history of large (and small) PRs. If it's a huge feature / vertical, it's often done in a few large PRs. Very few reverts in our 5 year history.

I guess, what I'm saying is, it depends on the team, on their seniority and capabilities, on the product, the culture, etc.

Measuring the total size of the PR seems to me to be missing a dimension, namely the commit size. What if a 600-line PR consist of 10 60-line reviewable commits? Which, if not for GitHub’s broken review UI, is what I would hope is the goal for such a thing. Then you have a reasonable amount of code to look at per commit, and all in context of the larger overall change.
Refactorings and redesings could easily span 1k lines even for small projects. Yes, almost always you could make it incremental, but total lines counts would increase and it requires hard planing beforehand. Too much fuzz.
In re: to

> Reviewing and merging 10x60 line PRs is, in my experience, more time intensive than reviewing one 600 line PR.

The article disagrees, and the article has data:

> Some folks might wonder if writing small PRs leads to less code output in total. We all want to be high velocity, but there are times when you need to be high volume. Consistently writing sub-20 line PRs will have a significant impact on your net coding ability - but interestingly, so will writing PRs greater than 100+ lines. The highest volume coders and repositories have a median change size of only 40-80 lines. I suspect this is because the volume of code is change-size * speed of changes. Too small of changes, and the faster merge time doesn't make up for it. Too large, and you start getting weighted down by slower review and merge cycles.

I agree that you shouldn't be dogmatic about PR size - there's an art to engineering as much as it's a science. But part of that art might also be recognizing that "unit change, whatever that means to you" is - as you suggest - a flexible concept. The takeaway for me here is, given what the data shows, when you end up with a 600-line unit it might be worthwhile to try to, if possible, find a way to break that unit down.

> and the article has data

How does that data prove anything? The ideal PR size is 50 lines because that’s the median size on Github. Seem like a pretty worthless claim.

> The ideal PR size is 50 lines because that’s the median size on Github.

???

No, that's not why it's arguing that it's the ideal size. Where did you get that idea? (TBH I doubt it's the case - apparently the average PR size on GH is nearly 1000 lines [0])

It's arguing that it's the ideal size because (1) PRs of that size end up getting reviewed and merged the fastest; (2) PRs of that size end up getting reverted the least, implying that they're less likely to cause breakage; (3) PRs of that size end up getting more review comments, implying higher engagement and less rubber-stamping (there's a concern that this could imply unnecessary bikeshedding that slows things down, but see 1); and (4) PRs of that size end up creating the highest total throughput.

I suppose that doesn't "prove" anything, because correlation doesn't imply causation. But it gestures towards it.

[0] https://www.keypup.io/product/average-pull-request-size-metr...

> (1) (2) (3)

Which IMO is still not worth much or hardly anything at all if we ignore the content of those PRs. I would assume that most are bugfixes or trivial changes (since no tests are needed) so it makes sense that they are causing the least issues.

That’s hardly relevant if you are trying to introduce actual features which require hundreds if not thousands of new lines.