Is that why people are always saying you should squash commits? To help with collecting metrics?!
I view it as a clear antipattern (since the history within a branch can be valuable later if you need to cherry-pick apart a feature or find a bug with git-bisect) and have asked superiors in numerous places why they require it, and the response is usually a vague mention of “it cleans things up” and “history isn’t important”. It feels like the kind of practice that was mentioned on a screencast and just got cargo-culted, but I have to think it originally had some purpose.
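To make the git-bisect point concrete, here is a rough sketch of the binary search that bisecting boils down to. It is only a toy model: the commit names and the `is_bad` check are hypothetical, and it assumes the last commit is known to be bad and that every commit after the first bad one is also bad.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and once a commit is bad every later commit is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # the culprit is mid or something earlier
        else:
            lo = mid + 1      # the culprit is strictly after mid
    return commits[lo]


# Hypothetical branch history kept as five small commits.
granular = ["add model", "add endpoint", "add cache", "tune cache", "add docs"]
broken_from = granular.index("tune cache")
culprit = first_bad(granular, lambda c: granular.index(c) >= broken_from)
```

With the granular history the search lands on one small commit. If the same five commits had been squashed into one, the best the search could do is point at the whole feature.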
To me it's rebase > squash > merge. Nobody wants to read a commit history and see a page-long list of "fix audit", "fix typo", "retry ci", "test: change foo", "Revert: test: change foo", "Revert: Revert: retry ci"... That's why you rebase into a sequence of logical, if fictional, worksteps. But if you can't do that, squash is second-best.
My preference is to rebase (exactly as you describe) but keep the original fork point (unless it becomes necessary to depend on later work from another branch, whether that's the original base or not), and then merge.
So you end up with a cleaned-up branch history via rebase, and a master branch that's similarly clean, with a higher level view of 'logical commits' that 'merge X feature', 'merge Y bug fix'.
With rebase I keep my commits small, which makes them easier to merge. I've seen strange things slip through in large commits. E.g. someone (me) missed the deleted files, rebased, and hence recreated them.
Then I tried to squash them before merging. But if you squash your commits you get a large commit and you get the same problem. Oh well...
If your workplace requires just 1 commit per change, no matter what the scope, that doesn't make a lot of sense, but there's a lot of room between never squashing and squashing all changes always to 1 commit. Both those extreme approaches don't make much sense to me. Some history is important, some is not.
Squashing commits doesn't have to mean turning 50 commits into 1. It can mean reordering, squashing some commits, tidying up commit messages and generally editing until the set of changes is clear and coherent. This lets you commit early and often during development on an unpublished branch without concern, then tidy that up into a coherent set of changes for readers (including your future self). The absolute numbers don't really matter, for me at least it's more about reorganising and editing changes to read coherently and be properly separated. For example if you need steps 1,2,3,4 to make a change, keep those separate but don't include 2a,2b,2c which were exploring 2 and finding a few places you missed a change when you tested it.
I see it as basic respect for future readers. Much as you might revise and edit an essay or novel before publication, revising and editing your code changes at least once often makes them better and clearer.
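As a rough illustration of the "keep 1,2,3,4 but drop 2a,2b,2c" idea above, here is a toy model of folding fixup commits into the step they belong to. It borrows the "fixup! <subject>" message convention that git's autosquash feature recognises, but the `Commit` class, the file names, and the `fold_fixups` helper are all made up for this sketch; it is not how git implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    message: str
    files: set = field(default_factory=set)  # files touched (hypothetical)


def fold_fixups(commits):
    """Fold each 'fixup! <subject>' commit into the earlier commit
    whose message equals <subject>; keep everything else in order."""
    prefix = "fixup! "
    tidy = []
    for commit in commits:
        if commit.message.startswith(prefix):
            subject = commit.message[len(prefix):]
            for kept in tidy:
                if kept.message == subject:
                    kept.files |= commit.files  # absorb the correction
                    break
            else:
                tidy.append(commit)  # no matching step: leave it alone
        else:
            tidy.append(Commit(commit.message, set(commit.files)))
    return tidy


# Messy history as it was actually written during development.
messy = [
    Commit("step 1", {"a.py"}),
    Commit("step 2", {"b.py"}),
    Commit("fixup! step 2", {"c.py"}),   # "2a": a place that was missed
    Commit("step 3", {"d.py"}),
    Commit("fixup! step 2", {"e.py"}),   # "2b": found while testing
    Commit("step 4", {"f.py"}),
]
tidy = fold_fixups(messy)
```

The result still has four commits, one per logical step, and the exploratory corrections end up inside step 2 rather than as separate entries.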
Commits should be squashed to clean up iterative work, corrections, etc. I commit constantly while developing and testing, especially if I want/need to let a CI pipeline do builds during development. It's good to squash them all when the work is ready to be merged, so that there's a single, clean, clearly explained, atomic commit to add functionality. No one gains anything from seeing a dozen work-in-progress/cleanup commits. The history of how I got to the merge point isn't important. A single unit of new, tested functionality that's ready to merge only needs one clear commit in most cases.
Squashing commits actually makes history important. The history of how you reached the solution in your branch is not important; I am not interested in a WIP commit, which may well have a valid commit message but still isn't final. Only the committer is interested in that.
So I think squashing the commits just before merging makes complete sense. Before that the committer is free to do whatever he/she likes.
Of course, if the diff is so big that multiple commits make sense, that means the PR wasn't broken up correctly. Then it's fine to have multiple commits, but the problem lies elsewhere, not in squashing them.
I think that squashing commits sometimes makes sense. Like, is it worth retaining two commits if the second one is just fixing the formatting, logs, or metrics from the first one?
Also in open source I think it can be easier to keep track of the history if one commit == one PR.
I agree people kinda just cargo cult it though. It's good to be thoughtful about the trade-offs for your team or project.
I think it's just people protecting their egos and hiding their dev process. I've never been reading the commit history and thought, "boy I wish these were squashed." Much more often I wish commits were in smaller, more reviewable bites.
I understand why it's there. People want to sweep the details away. I feel it about my own commits as well. However, if I have to go back and read the history, I'd much rather read the ground truth.
It might depend on the CI process. If the CI is only run on the tip of the branch being merged, then the PR should be squashed; otherwise, if a rollback is required, it would be possible to roll back to a commit that was not tested by CI. Unless there is a list of 'known good commits' somewhere.
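A small sketch of that CI argument, under loudly stated assumptions: CI ran only on the branch tip, and the squashed commit has the same content as that tip (which is not guaranteed if main moved in the meantime). The `Commit` class, the SHAs, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str          # hypothetical identifier
    ci_tested: bool   # True only if CI ran on this exact commit


def land_on_main(branch, squash):
    """Return the commits that show up on main after landing `branch`.

    Assumption: when squashing, the single new commit inherits the
    CI status of the branch tip because its content matches the tip.
    """
    if not branch:
        return []
    if squash:
        tip = branch[-1]
        return [Commit(sha="squash-of-" + tip.sha, ci_tested=tip.ci_tested)]
    return list(branch)  # plain merge keeps every branch commit reachable


def safe_rollback_targets(main):
    """Only commits that CI actually tested are safe to roll back to."""
    return [c.sha for c in main if c.ci_tested]


# CI ran on the tip only, so the two earlier commits are untested.
branch = [Commit("a1", False), Commit("a2", False), Commit("a3", True)]
merged = land_on_main(branch, squash=False)
squashed = land_on_main(branch, squash=True)
```

After a plain merge, two of the three commits on main are not safe rollback targets. After a squash, the only commit that landed is one CI has vouched for. A maintained list of 'known good commits' would play the same role as the `ci_tested` flag here.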