Assuming you squash when you merge the PR (and if you don't, why not?), why even care? Do people actually look through the commit history to review a PR? When I review I'm just looking at the finished product.
I don't typically review commit by commit, but I do really appreciate having good commit messages when I look at the blame later on while reading code.
Which is just the PR message if you squash. To be clear, I'm not advocating for bad messages, but I am saying I don't worry about each commit and focus instead on the quality and presentation of the PR.
Definitely, but if you've made your PR with several different commits the message will contain all of the information for the whole PR, instead of just the message that pertains to the changes in that commit. It's not a HUGE deal, but it can make it harder to understand what the commit message is saying.
Indiscriminate squashing sucks. Atomic commits are great if you want the git history to actually represent a logical changelog for a project, as opposed to a pointless literal keylog of what changes each developer made and when. It will help you if you need to bisect a regression later. It sucks if you bisect and find the change happened in some enormous incohesive commit. Squashing should be done carefully to reform WIP and fix type commits into proper commits that are ready for sharing.
> It sucks if you bisect and find the change happened in some enormous incohesive commit.
But why are any PRs like this? Each PR should represent an atomic action against the codebase - implementing feature 1234, fixing bug 4567. The project's changelog should only be updated at the end of each PR. The fact that I went down the wrong path three times doesn't need to be documented.
That's true, some are big and messy, or the change has to be created across a couple of PRs, but I don't think that the answer to "some PRs are messy" is "let's include all the mess". I don't think the job is made easier by having to dig through a half dozen messy commits to find where the bug is as opposed to one or two large ones.
> I don't think that the answer to "some PRs are messy" is "let's include all the mess"
Hey look at us, two alike thinking people! I never said "let's include all the mess".
Looking at the other extreme someone in this thread said they didn't want other people to see the 3 attempts it took to get it right. Sure if it's just a mess (or, since this is 2025, ai slop) squash it away. But in some situations you want to keep a history of the failed attemps. Maybe one of them was actually the better solution but you were just short of making it work, or maybe someone in the future will be able to see that method X didn't work and won't have to find out himself.
I can see the intent, but how often do people look through commit history to learn anything beside "when did this break and why"? If you want lessons learned put it in a wiki or a special branch.
Main should be a clear, concise log of changes. It's already hard enough to parse code and it's made even harder by then parsing versions throughout the code's history, we should try to minimize the cognitive load required to track the number of times something is added and then immediately removed because there's going to be enough of that already in the finished merges.
I can see your point and sometimes I myself include PoC code as commented out block that I clean up in a next PR incase it proves to be useful.
But the fact is your complete PR commit history gives most people a headache unless it's multiple important fixes in one PR for conveniency's sake. Happens at least for me very rarely. Important things should be documented in say a separate markdown file.
This simply isn’t true unless you have to put everything in one commit?
To be honest, I usually get this with people who have never realized that you can merge dead code (code that is never called). You can basically merge an entire feature this way, with the last PR “turning it on” or adding a feature flag — optionally removing the old code at this point as well.
So maintaining old and new code for X amounts of time? That sounds acceptable in some limited cases, and terrible in many others. If the code is being changed for another reason, or the new feature needs to update code used in many places, etc. It can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.
My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.
> So maintaining old and new code for X amounts of time?
No more than normal? Generally speaking, the author working on the feature is the only one who’s working on the new code, right? The whole team can see it, but generally isn’t using it.
> If the code is being changed for another reason, or the new feature needs to update code used in many places, etc. It can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.
If you have people good at what they do ... maybe. I’ve seen this end very badly due to merge artefacts, so I wouldn’t recommend doing any merges, but rebasing instead. In any case, you can always copy a function to another function: do_something_v2(). Then after you remove the v1, remove the v2 prefix. It isn’t rocket science.
> My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.
I can’t think of any regulations in any country (and I know of a lot of them) that dictate how you do code changes. The only thing I can think of is your own company’s policies in relation to those regulations; in which case, you can change your own policies.
Our regulatory compliance regime hates it when we run non-main branches in production and specifically requires us to use feature flagging in order to delay rollouts of new code paths to higher-risk markets. YMMV.
yes, that would be ideal. especially in a world with infrastructure tied so closely to the application this standard cannot always be met for many teams.
I so miss bazaar's UI around merges/commits/branches. I feel like most of the push for squashing is a result of people trying to work around git's poor UI here.
Alternative to squashing is not a beautiful atomic commits. It is series of commits where commit #5 fixes commit #2 and intruduces bug to be fixed on commit #7. Where commit #3 introduces new class that is going to be removed in commits #6 and #7.
Yeah, I don't see the value in looking through that. At best I'll solve the problem, commit because the code works now, create unit tests, commit them, and then refactor one or both in another commit. That first commit is just ugly and that second holds no additional information that the end product won't have.
I feel like that requires a lot of coordination that I, in the midst of development, don't necessarily have. Taking my WIP and trying to build a story around it at each step requires a lot of additional effort, but I can see how that would be useful for the reviewer.
We can agree that we don't need those additional steps once the PR is merged, though, right?
I have literally never met a developer who does this (including myself). 99% of all PRs I have ever created or reviewed consist of a single commit that "does the thing" and N commits that fix issues with/debug failure modes of the initial commit.
You never design a solution which needs multiple architectural components which _support_ the feature? I do, and would make little sense to merge them as separate PRs most of the time as that would mean sometimes tests written on the inappropriate level, also a lot more coordination and needs a lot more elaborate description then just explain how the set of components work in tandem to provide the user value.