Hacker News new | ask | show | jobs
by sb8244 942 days ago
It's artificial to break up a PR to satisfy the rule of small PRs.

Often you need the full context when evaluating a new feature end to end.

Or you spend 2 days splitting up a PR into smaller PRs so that a person can review it in 30 minutes instead of 2 hours.

I can't say I've ever seen benefit from it both as a reviewer or as a developer, but it could be an effect of different companies and different teams.

1 comments

I'm not sure I agree? If your code isn't trivial to compartmentalize changes then that might be a code smell.

I'd agree keeping the unified context is preferable but it's probably easier to do that by having developers rebase their changes into discrete commits that can be reviewed one by one.

It's not necessarily about the units of code, it's about the bigger picture.

Let's give a hypothetical situation where you split up a PR by the backend and frontend components, but you have some extra fields and endpoints that are not used in your final product. They accidentally get left in because you miss them.

I believe the chance that it would be caught in review is significantly higher in a unified PR review instead of artificially splitting front and back into separate PRs.

And if you need to have the 2 PRs open side by side, then why split it up in the first place?

> And if you need to have the 2 PRs open side by side, then why split it up in the first place?

I agree however I think PR's should be split up. Just not into separate PRs. The solution is to actually start reviewing PRs by commit (you can do this in the PR web interface). Everything is split up and can be reviewed separately but you can still see the final diff and while each discrete unit is reviewable, the greater feature is also still one discrete reviewable item.

I agree big time with this.

Organizing by clean commits is definitely important.

Why’s that? Do you step through commits diff by diff when reviewing?
This is actually the intended way of using git. Pick any random mailing list of the lore [1] and select a thread that has [PATCH] in the name to see this in action.

I just grabbed a random patchset off of the git mailing list and linked it below [2] to demonstrate this.

You'll notice that on top of the overall patchset (equivalent to a PR) having a detailed description (in the coverletter, i.e. PATCH 00/xx), each commit has a descriptive name, message detailing what the changes within do, and signoffs by everyone who contributed to it.

Then as each reviewer comes in, they review each individual patch (i.e. commit) separately, replying to the message for that patch. And any high level concerns can be in reply to the coverletter (addressing the entire patchset).

As the contributor responds to comments and makes the requested changes, those changes get made per patch/commit via rebasing rather than just added on top. And when they are finished making the revisions, a new v2 patchset is released in reply to the cover letter of the first patchset, now also containing a diff per commit/patch against the previous revision.

Then the cycle repeats until everyone is happy. At that point the maintainer will merge the changes into their incubation branch (the git devs call theirs `next`) and after some time has passed, they merge it into master/main and it becomes established history.

The worst part of the whole workflow is that it uses email but other than that it is a far preferable reviewing experience to using github's stock pull request workflow.

1. https://lore.kernel.org/

2. https://lore.kernel.org/git/20231010123847.2777056-1-christi...

But if you are doing by commit and it later gets iteratively refactored, wouldn't that be a huge waste of time?

How would you know what to critically evaluate in such case?

I'm confused what you mean?

If you are talking about refactoring prior to merging into the tree? then no it's not a waste of time. That's the intended workflow. You make your changes to the commits or add new commits in between using rebase. This is how all the development for linux is done.

If you are talking about after they are merged into the mainline? That's also not a waste of time. You don't have to go back and change those commits because they are now finalised and your iterative refactoring can be done via small one off patches gradually merged into main, a series of large patches merged into main, or a set of patch series all merged into an incubation branch that is kept up with main and eventually merged back into main once the refactor has made sufficient progress to take over.

I mean for example if I need to create a new feature:

1. I open a branch.

2. I start coding.

3. During coding I do various commits.

4. I realise I need to refactor/rethink something, I do more commits, overriding the code/ideas of previous commits.

5. Then I finally put it up for code review. But in such case there's no point in going through all commits, since a lot of that gets changed anyhow.

6. Usually I would squash everything into a single commit and merge after doing that. Because a lot of my commits would've been pointless anyway.

What is the suggestion exactly?

> "They accidentally get left in because you miss them."

In my experience, this kind of thing happens much more frequently with large PRs.

If a PR is large enough, things will slip through the cracks. If there are too many change requests, even if they're sensible and make sense, things will slip through the cracks. If there's the need for multiple checks by the reviewer, even more things will slip through the cracks.

Also, breaking up PRs in frontend vs backend is not the best idea. Build the feature iteratively if possible.

EDIT: Since the grandparent is talking about reviewing by commit: if those are more reviewable, just use those as the way to separate PRs instead, maybe?