Hacker News new | ask | show | jobs
by fishywang 1907 days ago
(squash-n-merge isn't new on github, unless you are not talking about the same thing I'm thinking about)

Yes squash-n-merge is often needed in github's PR workflow because no one need those un-bisect-able fixup commits in the final merged master/main branch, and also they make the diff between different states of the PR more readable, but it comes with its own problems.

Main problem is commit message. As the contributor (the one sending out the PR for maintainer to review), you have no control on what the final commit message in the merged single commit is. The maintainer doing the merge decides that for you, and by default github generates that message by combining all the commit message titles (the first line of the commit messages) of all the commits in that branch, and that's almost never the good choice for the final commit message.

Another problem with that is the email in the final commit. When the maintainer use squash-n-merge, github uses your default email on file on your github account, regardless whichever email(s) you configured your git to use and associated with those individual commits inside the PR.

As a result, squash-n-merge is more suitable for contributors less familiar with open source contribution, for example people not yet realized the value of a good, concise commit message, and people don't have different email addresses for different projects. For advanced contributors, there's no wonder they would prefer force-push with rebase-merge when they are making contributions on github, because rebase-merge makes sure the exact state of their final commit is preserved, including commit message, email address associated with it, and gpg signature if they use that. But github's rebase-merge strategy has its own issues, as described by the author and more.

2 comments

You can work around all of this as a contributor by squashing on your own end before the final merge.
That comes with all the problems with force push and rebase, bar the history during code review one.

For example this still has a commit message issue, just on the maintainer's side: As the maintainer if you are going to use rebase to merge this PR, that means you need to accept whatever commit message the contributor wrote as-is. Are you happy with that? If not, you can't even leave inline comments on that, and it's usually pretty hard to communicate and give feedback on how you want the commit message to be.

> un-bisect-able fixup commits in the final merged master/main branch

If you require PRs to create merge commits you get the nice world where git bisect --first-parent bisects at the PR level, you don't have to worry about the individual commits inside the PR/below the PR level when bisecting, but you still have that commit history "as-is" for deep archeological dives when you need it.

(And you can use --first-parent to cleanup git log and git praise too.)

And those commits rarely provide useful information because they're of the variety where people fix syntax errors, add missing files, remove changes they didn't mean to commit, etc.
There's plenty of information in all those types of commits even if you personally don't find that information "useful". I've had to do the sort of archeology digs to figure out "what syntax errors did a build tool miss", "why is this type of file often missed to be added, and how often do we miss it", "what was still TODO in this feature effort that got removed at the last minute", etc. All of which needs information from those sorts of "low level" commits.
In the instance where a file was missed and added in a later commit, then running git blame would show the sha1 referencing a commit that has a title that says something like "Added missing file". That's not going to tell me anything about why that file was added.

Instead, if you had a commit that explained what the file was for or if some of the lines in that file were added by a commit that explained the change and why it was made, then that would be useful history.

Many times, investigations start with running git blame on a file you plan to make changes to. The usefulness of commit messages associated with each line in a change and whether the diff associated with the commit shows a logical change rather than a fix for a syntax error is the difference between an investigation that leads to results versus one that leads to a dead end.

I already mentioned `git blame --first-parent` just a few comments up! You get the sha1 referencing a commit that has a title like "Merged PR #327". You can dig down deeper than that --first-parent level if need be, but you have the power of the git graph to show/hide details if when you do/do not need them.
Does the --first-parent flag handle the case where the the line was change as part of a conflict resolution in the merge commit itself?