Hacker News new | ask | show | jobs
by CuriousCosmic 935 days ago
> And if you need to have the 2 PRs open side by side, then why split it up in the first place?

I agree however I think PR's should be split up. Just not into separate PRs. The solution is to actually start reviewing PRs by commit (you can do this in the PR web interface). Everything is split up and can be reviewed separately but you can still see the final diff and while each discrete unit is reviewable, the greater feature is also still one discrete reviewable item.

2 comments

I agree big time with this.

Organizing by clean commits is definitely important.

Why’s that? Do you step through commits diff by diff when reviewing?
This is actually the intended way of using git. Pick any random mailing list of the lore [1] and select a thread that has [PATCH] in the name to see this in action.

I just grabbed a random patchset off of the git mailing list and linked it below [2] to demonstrate this.

You'll notice that on top of the overall patchset (equivalent to a PR) having a detailed description (in the coverletter, i.e. PATCH 00/xx), each commit has a descriptive name, message detailing what the changes within do, and signoffs by everyone who contributed to it.

Then as each reviewer comes in, they review each individual patch (i.e. commit) separately, replying to the message for that patch. And any high level concerns can be in reply to the coverletter (addressing the entire patchset).

As the contributor responds to comments and makes the requested changes, those changes get made per patch/commit via rebasing rather than just added on top. And when they are finished making the revisions, a new v2 patchset is released in reply to the cover letter of the first patchset, now also containing a diff per commit/patch against the previous revision.

Then the cycle repeats until everyone is happy. At that point the maintainer will merge the changes into their incubation branch (the git devs call theirs `next`) and after some time has passed, they merge it into master/main and it becomes established history.

The worst part of the whole workflow is that it uses email but other than that it is a far preferable reviewing experience to using github's stock pull request workflow.

1. https://lore.kernel.org/

2. https://lore.kernel.org/git/20231010123847.2777056-1-christi...

But if you are doing by commit and it later gets iteratively refactored, wouldn't that be a huge waste of time?

How would you know what to critically evaluate in such case?

I'm confused what you mean?

If you are talking about refactoring prior to merging into the tree? then no it's not a waste of time. That's the intended workflow. You make your changes to the commits or add new commits in between using rebase. This is how all the development for linux is done.

If you are talking about after they are merged into the mainline? That's also not a waste of time. You don't have to go back and change those commits because they are now finalised and your iterative refactoring can be done via small one off patches gradually merged into main, a series of large patches merged into main, or a set of patch series all merged into an incubation branch that is kept up with main and eventually merged back into main once the refactor has made sufficient progress to take over.

I mean for example if I need to create a new feature:

1. I open a branch.

2. I start coding.

3. During coding I do various commits.

4. I realise I need to refactor/rethink something, I do more commits, overriding the code/ideas of previous commits.

5. Then I finally put it up for code review. But in such case there's no point in going through all commits, since a lot of that gets changed anyhow.

6. Usually I would squash everything into a single commit and merge after doing that. Because a lot of my commits would've been pointless anyway.

What is the suggestion exactly?

Ohhh yeah. My workflow is:

1. I open a branch. i.e. `ID-XXXX-branch_name-working`.

2. I start coding.

3. I make a ton of quick tiny commits. (I generally label these commits "NO-MERGE: ").

4. I make a bunch of changes.

5. I finally get everything done.

6. I now create a new branch `ID-XXXX-branch_name` from `ID-XXXX-branch_name-working`.

7. I rebase that branch to get my code cleanly formatted by commit with each discrete feature or change getting its own commit.

8. My code goes up for review.

9. I get changes requested.

10. I make a new branch `ID-XXXX-branch_name-v2-working` from `ID-XXXX-branch_name`.

11. I make the requested changes as a bunch of new small "NO-MERGE: " commits.

12. I am now ready for re-review.

13. I now create a new branch `ID-XXXX-branch_name-v2` from `ID-XXXX-branch_name-v2-working`.

14. I rebase those NO-MERGE changes into my "presentable" commits, adding or removing well documented commits as necessary.

15. I now send out my v2 revision to the mailing list or I change the HEAD of my PR from `ID-XXXX-branch_name` to `ID-XXXX-branch_name-v2`. If I'm using a PR workflow, I link a diff between the two revision branches (you can do this in github using `https://github.com/org/repo/compare`). That isn't necessary with patchsets since I can easily do a range diff there. I suppose I could do a cover letter with range diff and paste it to github but it somehow doesn't seem as nice.

16. Rinse repeat steps 8-15 as necessary for each new revision.

17. "LGTM"

18. Merge into `main`/`master` (like actual merge, not rebase or squash merge) and close PR.

19. Clean up branches. Either save them somewhere for prosperity if you have trust issues like me or just delete them.

This looks like a lot but I was trying to be as detailed about that workflow as I can be. Realistically it's not so bad and you can get a hold of it very quickly.

Also with regards to rebasing changes into well documented, discrete commits, if when you are doing your development you make your commits small and self contained, with `git rebase -i` you can actually just reorder the list of commits to chunk together the related commits and 99% of the time it'll rebase with little to no merge conflicts. Then you can just squash those chunks down into your presentable commits. This also applies to your v2 changes and on. You can just move those commits in the rebase TODO to put them after the commit you want to squash/fixup them into and if they are small clean changes, they should rebase without conflict. Things only get nasty and break when your commits are spanning multiple unrelated files and you try to break those up.

I'd estimate rebasing new changes into an already documented set of commits probably takes me 1-2 minutes on average so I consider it well worth the extra 30 minutes spent over the course of the week.