Hacker News new | ask | show | jobs
by k_dumez 935 days ago
> artificially minimizing PR size

Not sure I understand the "artificial" part here. There's nothing "artificial" about breaking up your larger changes into smaller PRs. It's just good practice.

Helps reviewers who are reviewing the code, and helps the author be more focused with their changes.

Even in net new feature development it's a good idea to break up your large changes to something more manageable.

Sorry if I'm not understanding, what do you believe the downside to be?

3 comments

It's artificial to break up a PR to satisfy the rule of small PRs.

Often you need the full context when evaluating a new feature end to end.

Or you spend 2 days splitting up a PR into smaller PRs so that a person can review it in 30 minutes instead of 2 hours.

I can't say I've ever seen benefit from it both as a reviewer or as a developer, but it could be an effect of different companies and different teams.

I'm not sure I agree? If your code isn't trivial to compartmentalize changes then that might be a code smell.

I'd agree keeping the unified context is preferable but it's probably easier to do that by having developers rebase their changes into discrete commits that can be reviewed one by one.

It's not necessarily about the units of code, it's about the bigger picture.

Let's give a hypothetical situation where you split up a PR by the backend and frontend components, but you have some extra fields and endpoints that are not used in your final product. They accidentally get left in because you miss them.

I believe the chance that it would be caught in review is significantly higher in a unified PR review instead of artificially splitting front and back into separate PRs.

And if you need to have the 2 PRs open side by side, then why split it up in the first place?

> And if you need to have the 2 PRs open side by side, then why split it up in the first place?

I agree however I think PR's should be split up. Just not into separate PRs. The solution is to actually start reviewing PRs by commit (you can do this in the PR web interface). Everything is split up and can be reviewed separately but you can still see the final diff and while each discrete unit is reviewable, the greater feature is also still one discrete reviewable item.

I agree big time with this.

Organizing by clean commits is definitely important.

Why’s that? Do you step through commits diff by diff when reviewing?
But if you are doing by commit and it later gets iteratively refactored, wouldn't that be a huge waste of time?

How would you know what to critically evaluate in such case?

I'm confused what you mean?

If you are talking about refactoring prior to merging into the tree? then no it's not a waste of time. That's the intended workflow. You make your changes to the commits or add new commits in between using rebase. This is how all the development for linux is done.

If you are talking about after they are merged into the mainline? That's also not a waste of time. You don't have to go back and change those commits because they are now finalised and your iterative refactoring can be done via small one off patches gradually merged into main, a series of large patches merged into main, or a set of patch series all merged into an incubation branch that is kept up with main and eventually merged back into main once the refactor has made sufficient progress to take over.

> "They accidentally get left in because you miss them."

In my experience, this kind of thing happens much more frequently with large PRs.

If a PR is large enough, things will slip through the cracks. If there are too many change requests, even if they're sensible and make sense, things will slip through the cracks. If there's the need for multiple checks by the reviewer, even more things will slip through the cracks.

Also, breaking up PRs in frontend vs backend is not the best idea. Build the feature iteratively if possible.

EDIT: Since the grandparent is talking about reviewing by commit: if those are more reviewable, just use those as the way to separate PRs instead, maybe?

What is the advantage of breaking up a large change into multiple small PRs, compared to multiple small independently-reviewable commits as part of a single PR?

If a PR introduces a new function that will be used in the next commit, I would much rather see that next commit using git, than hunt for it in the PR queue.

To me in a lot of cases it seems even as a reviewer, it's harder to understand big picture if someone splits up the PRs.

But as a code writer myself, for example, if I am building a new feature, firstly it's really hard for me to know what the whole thing would look like without going through it all and it's probably very iterative process as I'm doing it, so I usually wouldn't be able to split it up or it would very suboptimal to split it up before I've finished everything.

Then I would try to split it up as I've finished to appease reviewers, but again, it requires whole lot of creativity to do. Should I try to split up shared component first? Because I surely can't split up whatever is using those shared components. If I do then, people won't see whatever is implementing those shared components so they won't have understanding on why those shared components provide certain functionality etc.

Overall it complicates a lot it seems because if I was to do it during my iterative process then I would write a PR, later refactor bunch of it anyway, and I would do it in the order that feels best for me, but wouldn't necessarily be easy to understand for anyone not within it.