by anonymars 79 days ago
It boggles my mind that instead of this being a UI projection, git instead ingrains a process where developers habitually destroy their history (and bisection options, and merge conflict resolution), therein loading an additional footgun that goes off every now and again when it turns out a now-squashed branch was the basis of (or merged into) some other branch

It’s important to note that not all history is worth keeping, and keeping a dozen commits titled “fix” that fix build / CI errors from the original changes is a lot worse for bisecting than squashing them all into just one.

I very much prefer keeping histories by default (both my personal workflows and the tools I build default to that) but squash is a valuable tool.

> keeping a dozen commits titled “fix” that fix build / CI errors from the original changes is a lot worse for bisecting than squashing them all into just one.

How so? When I bisect I want to get down to a small diff. Landing on a stretch of several commits (because some didn't build) is still better than landing on a big squashed commit that includes all those changes and more. The absolute worst case when you keep the original history is the same as the default case when you squash.
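
To make the worst-case claim concrete, here is a small Python sketch of what a bisect can narrow a regression down to when some commits can't be tested. It is a toy model, not git's actual algorithm: the status labels and the function name `suspect_range` are invented for illustration, and it assumes a linear history with a single regression.

```python
def suspect_range(statuses):
    """Given per-commit results in oldest-first order, return the inclusive
    (start, end) indices of the commits that could have introduced the bug.

    Each status is 'good', 'bad', or 'skip' (the commit could not be tested,
    e.g. it did not build). Assumes a linear history with one regression.
    """
    # Newest commit known to be good; -1 if none was tested good.
    last_good = max((i for i, s in enumerate(statuses) if s == "good"), default=-1)
    # Oldest commit after that point known to be bad.
    first_bad = min(i for i, s in enumerate(statuses) if s == "bad" and i > last_good)
    return last_good + 1, first_bad


# Index 0 is the known-good base; indices 1-4 are one PR's four commits.
some_broken = ["good", "skip", "good", "skip", "bad"]
all_but_last_broken = ["good", "skip", "skip", "skip", "bad"]

print(suspect_range(some_broken))          # (3, 4): two of the four commits
print(suspect_range(all_but_last_broken))  # (1, 4): the whole PR, same as a squash
```

In this model the untestable commits only widen the suspect range; they never make it larger than the squashed commit would have been.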

Because they’re broken and their only purpose is to fix up the original change, so it’s functionally the same change.

> The absolute worst case when you keep the original history is the same as the default case when you squash.

No, now you have a bunch of worthless broken commits that you need to evaluate and skip because they’re not the problem you’re looking for.

> Because they’re broken and their only purpose is to fix up the original change, so it’s functionally the same change.

Do you restrict yourself to 1 non-broken commit per PR? I don't, and nor does anyone I've worked with. If there were even 2 non-broken commits in the PR, then bisecting with the original history lands you on a diff half the size of the one you'd land on with squashed history, which is a significant win. (If you didn't care about that sort of thing you wouldn't be bisecting at all.)

> No, now you have a bunch of worthless broken commits that you need to evaluate and skip because they’re not the problem you’re looking for.

What are you "evaluating"? If you want to ignore the individual commits and just look at the overall diff that's easy. If you want to ignore the individual messages and just look at the PR-time message that's easy too. Better to have the extra details and not need them than need them and not have them.

> Do you restrict yourself to 1 non-broken commit per PR?

No. To the extent that I can, however, I do restrict myself to only non-broken commits.

> If there were even 2 non-broken commits in the PR, then bisecting with the original history lands you on a diff half the size that bisecting with squashed history would, which is a significant win

It is not a significant win when the bisecting session keeps landing me in your broken commits that I have to waste time evaluating and skipping.

And splitting out fixups doesn’t save anything (let alone “half the size”), most commonly those fixups are just modifying content the previous commits were touching already, so you’re increasing the total diff size you have to evaluate.

> What are you "evaluating"?

Whether the commit is the one that caused the issue I’m bisecting for.

> If you want to ignore the individual commits and just look at the overall diff that's easy. If you want to ignore the individual messages and just look at the PR-time message that's easy too.

Neither of these is true. `git bisect` (or `git bisect run`) lands me on a commit and it’s broken; now I need to check whether the commit is broken in a way that is relevant to what I’m seeking.

> Better to have the extra details and not need them than need them and not have them.

Garbage is “extra details” only in the hoarder sense.

> It is not a significant win when the bisecting session keeps landing me in your broken commits that I have to waste time evaluating and skipping.

Skipping a commit that doesn't build is trivial (especially if you're automating your bisects).
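
For readers who haven't automated a bisect: `git bisect run` treats exit code 0 as good, 125 as "can't test this commit, skip it", and other codes from 1 to 127 as bad. A minimal Python sketch of a step script built on that convention follows; the function name `bisect_verdict`, the file name, and the build/test commands are placeholders, not anything from this thread.

```python
import shlex
import subprocess
import sys

SKIP = 125  # exit code that `git bisect run` interprets as "skip this commit"


def bisect_verdict(build_cmd, test_cmd):
    """Return the exit code to hand back to `git bisect run`.

    build_cmd and test_cmd are argument lists for subprocess.run.
    A failed build means the commit is untestable, so it is skipped
    rather than being marked bad.
    """
    if subprocess.run(build_cmd).returncode != 0:
        return SKIP
    return 0 if subprocess.run(test_cmd).returncode == 0 else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical invocation:
    #   git bisect run python3 bisect_step.py "make" "make test"
    sys.exit(bisect_verdict(shlex.split(sys.argv[1]), shlex.split(sys.argv[2])))
```

With a script like this, commits that don't build are skipped without any manual evaluation. If the regression falls inside a run of skipped commits, git reports the range of candidates instead of a single commit.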

> And splitting out fixups doesn’t save anything (let alone “half the size”), most commonly those fixups are just modifying content the previous commits were touching already, so you’re increasing the total diff size you have to evaluate.

If you feel the need to rebase to squash one-liner fixups into the commits they fix then that's a more subtle tradeoff and there are reasonable arguments. But squashing your whole PR for the sake of that is massive overkill, and the costs outweigh the benefits.

Yes, a good Git log viewer that would auto-squash branches down to a summary, and allow "expanding" them, would be useful. But the way branching and merging creates confusing train-track graphs is IMHO one of the reasons why many teams end up using the squash-and-merge workflow. There's definitely room for improvement there...
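
As a rough illustration of that "collapsed by default, expandable on demand" view, here is a toy Python model. It follows only first parents, which is the idea behind `git log --first-parent`, and then lists what a given merge brought in. The commit names and the helpers `first_parent_log` and `expand_merge` are made up for this sketch; it does not talk to a real repository.

```python
# Toy commit graph: each commit maps to its parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "f1": ["B"],       # feature branch work
    "f2": ["f1"],      # a "fix" commit on the feature branch
    "M": ["B", "f2"],  # merge commit: first parent is mainline, second is the branch
    "C": ["M"],
}


def first_parent_log(head, parents):
    """Collapsed view: walk from head following only first parents, newest first."""
    log = []
    node = head
    while node is not None:
        log.append(node)
        node = parents[node][0] if parents[node] else None
    return log


def expand_merge(merge, parents):
    """Expanded view: commits reachable from the merged branch but not from mainline."""
    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(parents[n])
        return seen

    mainline, branch = parents[merge][0], parents[merge][1]
    return reachable(branch) - reachable(mainline)


print(first_parent_log("C", PARENTS))     # ['C', 'M', 'B', 'A']
print(sorted(expand_merge("M", PARENTS))) # ['f1', 'f2']
```

The branch commits stay in the history; the viewer just decides whether to show them.
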
For sure. It just bugs me that we're stuck between two bad options.

Now let's also talk about renames...

I would assume most people who would enable an "auto squash" option also aren't carefully creating and curating commits. Bisect is useless if half your commits are broken. People regularly make commits that don't even build, much less pass QA and deliver a valid version of the software. These are works in progress, broken versions and should be deleted.

If you actually do like to deliver the correct number of commits then it's frustrating to work with people who don't care. In that case I would suggest making the squash optional but you could also try selling your team on doing smaller commits. In my experience you either "get it" or you don't, though. I've never successfully got someone to understand small commits.

> I would assume most people who would enable an "auto squash" option also aren't carefully creating and curating commits.

Or don't have a choice. Our department-wide rules almost required that for all repos; I had to push hard just to make it "strongly suggested" instead.

Git doesn't do that. People needlessly destroying history do that.

Git will happily let you merge branches and preserve the history there. GP seems to like that history being in PRs only on github instead. I don't get why, that just seems worse to me.

The why is that most people, when given the merge option, don’t clean up their history, so you end up with tons of garbage fixup commits.

That is an issue of ignorance, not laziness. It’s not at all obvious to an average developer who only uses `add/commit/merge/fetch/push/pull/rebase/restore/reset` that they can manipulate their change history.

The cause stops mattering after a while: either you go on a full-time campaign to educate people… or you switch the setting to rebase and squash and are done with it.

Off-topic, but what does GP stand for? I know OP usually means Original Poster, but I'm not familiar with GP.

Grandparent, as in the parent of my parent comment.