Hacker News new | ask | show | jobs
by wk_end 863 days ago
In re: to

> Reviewing and merging 10x60 line PRs is, in my experience, more time intensive than reviewing one 600 line PR.

The article disagrees, and the article has data:

> Some folks might wonder if writing small PRs leads to less code output in total. We all want to be high velocity, but there are times when you need to be high volume. Consistently writing sub-20 line PRs will have a significant impact on your net coding ability - but interestingly, so will writing PRs greater than 100+ lines. The highest volume coders and repositories have a median change size of only 40-80 lines. I suspect this is because the volume of code is change-size * speed of changes. Too small of changes, and the faster merge time doesn't make up for it. Too large, and you start getting weighted down by slower review and merge cycles.

I agree that you shouldn't be dogmatic about PR size - there's an art to engineering as much as it's a science. But part of that art might also be recognizing that "unit change, whatever that means to you" is - as you suggest - a flexible concept. The takeaway for me here is, given what the data shows, when you end up with a 600-line unit it might be worthwhile to try to, if possible, find a way to break that unit down.

1 comments

> and the article has data

How does that data prove anything? The ideal PR size is 50 lines because that’s the median size on Github. Seem like a pretty worthless claim.

> The ideal PR size is 50 lines because that’s the median size on Github.

???

No, that's not why it's arguing that it's the ideal size. Where did you get that idea? (TBH I doubt it's the case - apparently the average PR size on GH is nearly 1000 lines [0])

It's arguing that it's the ideal size because (1) PRs of that size end up getting reviewed and merged the fastest; (2) PRs of that size end up getting reverted the least, implying that they're less likely to cause breakage; (3) PRs of that size end up getting more review comments, implying higher engagement and less rubber-stamping (there's a concern that this could imply unnecessary bikeshedding that slows things down, but see 1); and (4) PRs of that size end up creating the highest total throughput.

I suppose that doesn't "prove" anything, because correlation doesn't imply causation. But it gestures towards it.

[0] https://www.keypup.io/product/average-pull-request-size-metr...

> (1) (2) (3)

Which IMO is still not worth much or hardly anything at all if we ignore the content of those PRs. I would assume that most are bugfixes or trivial changes (since no tests are needed) so it makes sense that they are causing the least issues.

That’s hardly relevant if you are trying to introduce actual features which require hundreds if not thousands of new lines.