Hacker News
by Aethelwulf 779 days ago
Do people really look at their git commit history like this? Why?
5 comments

Yes. To figure out what happened, and which path it took to get into my repo. Or to see how branches have diverged.

I also recommend it to git newbies, as a tool to understand the state of their repo when something has gone wrong (e.g. they did a bad merge or rebase).

But with rebase, the history does not show what happened. It shows the changes to the code, yes, but not how they came into being.
People often bring up this objection, but I don't want a complete history—I want an easy-to-understand history.

If I make a variable in commit A and think of a better name for it in commit C, why wouldn't I use rebase to squash C into A? Some sense of purity of history?

Or more directly related to the rebase vs. merge debate, if I fix an issue across three functions, and in the meantime someone has removed one of those functions on the main branch, rebasing eliminates the "history" of me fixing that removed function and I think that's good. It makes my commit simpler.

Rebase can certainly be used to simplify the history too much, but that will always be a judgement call. That shouldn't keep us from editing our branches in ways that are clarifying instead of confusing.

We never capture the full complexity of what we went through writing code in our source control. It would be bad if we did.

I think a lot of people are anti-rebase in general, but doing it on your own branch is fine IMO. I have a different issue.

The issue is with rebasing multiple commits onto master instead of merging: the intermediate commits were never code that anyone actually wrote or tested, so any issues with them stem purely from the fakeness of the history.

If you have commit A then you write a branch A -> B -> C while someone else writes A -> D and they merge first, rebasing to get A -> D -> B' -> C' means B' was never something you wrote or tested. This code never existed on anyone's machine or had CI run on it before the rebase.
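To make that concrete, here is a toy model in Python (not real git, just a sketch). It assumes a commit is a patch that overwrites whole files and a snapshot is a dict from file name to contents; the file names and the `apply`/`states` helpers are made up for illustration.

```python
def apply(snapshot, patch):
    """Return a new snapshot (file -> contents) with the patch's files overwritten."""
    return {**snapshot, **patch}


def states(base, patches):
    """Snapshots produced by applying the patches one after another on top of base."""
    out = []
    current = base
    for patch in patches:
        current = apply(current, patch)
        out.append(current)
    return out


# Hypothetical commits: A is the shared base, B and C are my branch, D is theirs.
A = {"core.py": "v1"}
B = {"feature.py": "step 1"}
C = {"feature.py": "step 2"}
D = {"core.py": "v2"}

# Snapshots that actually existed on someone's machine and could have been tested.
tested = states(A, [B, C]) + states(A, [D])

# Rebasing replays B and C on top of A -> D, producing B' and C'.
rebased = states(apply(A, D), [B, C])
untested = [s for s in rebased if s not in tested]

# Squashing B and C first leaves a single new snapshot on top of D.
squashed = states(apply(A, D), [apply(B, C)])

print(len(untested), "rebased snapshots were never tested")
print(len(squashed), "snapshot is created when squashing first")
```

In this model both B' and C' are snapshots nobody ever had on disk, while squashing first leaves only the branch tip as a new snapshot, which is the one CI tests anyway.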

Does it really make sense to run CI on all of the new intermediate commits that rebase invented? What if some of those fail? Are you really going to go through and fix tests for fake intermediate commits?

The solution for me has always been to squash. Inventing fake history is totally pointless and counterproductive. If you want cleanliness and bisectability via destroying the "real" history, just go ahead and really destroy it, don't invent a fake, possibly broken history.

I think the Rust project strikes a pretty good compromise for this issue: rebased linear branches with conflict-free merges onto master. When you make a PR, the CI sees whether your branch can be merged onto master without causing a conflict; if not, it directs you to rebase your branch onto the latest version of master. This check is repeated for all open PRs whenever the master branch is updated.

Once it's satisfied that your branch can be merged, it runs a subset of the tests, and throws an error if they fail. This way, even if you do rebase your branch, its latest commit will still be tested. (Having intermediate commits pass tests is encouraged but not required.) Finally, it regularly takes groups of 8 or so accepted PRs, tries merging them all in sequence, and runs the full test suite on the result. If it succeeds, the merge commits are pushed to master; if not, a human operator gets it to try again without the offending PR.
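The batching step might look roughly like the loop below. This is a sketch of the description above, not the Rust project's actual tooling: `land_batch`, `merge`, `full_tests_pass`, and `pick_offender` are hypothetical names, and `pick_offender` stands in for the human operator.

```python
def land_batch(master, batch, merge, full_tests_pass, pick_offender):
    """Try to land a batch of accepted PRs on master.

    Returns (new_master, landed_prs). If the full suite fails, the PR named
    by pick_offender is dropped and the remaining PRs are retried.
    """
    pending = list(batch)
    while pending:
        candidate = master
        for pr in pending:  # merge the PRs in sequence
            candidate = merge(candidate, pr)
        if full_tests_pass(candidate):
            return candidate, pending
        pending.remove(pick_offender(pending))
    return master, []


# Toy stand-ins: master is a list of landed PR names, and the suite
# fails whenever the PR named "pr2-broken" is part of the result.
def merge(state, pr):
    return state + [pr]


def full_tests_pass(state):
    return "pr2-broken" not in state


def pick_offender(pending):
    return next(pr for pr in pending if "broken" in pr)


new_master, landed = land_batch(
    ["base"], ["pr1", "pr2-broken", "pr3"], merge, full_tests_pass, pick_offender
)
print(new_master)  # ['base', 'pr1', 'pr3']
```

The point of the design is that the full suite only ever runs on the merged result, so master never moves to a state that was not tested.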

By your terminology, I suppose this would count as running CI on all the "invented" commits, and forcing PR authors to fix all their tests. But in practice, it's not too odious, since most PRs don't conflict (unless you're touching half the codebase), and any test failures from a non-conflicting change will get caught by the merge step.

An easy-to-understand history that is correct about the relevant details... but "David didn't happen to run the formatter before his WIP commit that time" isn't ever going to be a relevant detail.
That is fine. No one in 5 years cares that a dev made 10 commits like “fix test”, “ci format”, or “fix misc derp”. Changes should be single units, so that bisect can cleanly find the commit that introduced a bug.
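For what it's worth, the reason single-unit commits matter for bisect is that it is basically a binary search over history, so every commit it lands on has to build and be testable. A rough Python sketch of that search, assuming a linear history whose oldest commit is good and newest is bad (`first_bad_commit` and the commit names are made up; this is not how git implements it):

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.

    commits is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad. Returns (commit, number_of_checks).
    """
    good, bad = 0, len(commits) - 1
    checks = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        checks += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], checks


# Hypothetical history of 16 commits where the bug arrives in c11.
history = [f"c{i}" for i in range(16)]
culprit, checks = first_bad_commit(history, lambda c: int(c[1:]) >= 11)
print(culprit, checks)  # c11 4
```

If a midpoint commit is a half-finished "fix test" commit that doesn't even build, `is_bad` can't give a meaningful answer there, which is why noisy WIP commits get in the way.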
It shows where you ended up, and with the help of `git reflog`, you can also show where you came from (in the same graph!).
If you only rebase feature branches before merging to main/master/trunk then you still get most of the history.
Why would I care how the changes came into being?
I've been using git since the first year it came out, and I've never looked at it visually.
I've been using git for ~15 years and I've also never looked at it visually, except by accident. And yes, this does include working on big repos with lots of other people. Maybe we're the weird ones.
You can look at it visually?
Same reason anyone visualises any data. It makes it easier and quicker to understand (unless it is a mess due to not rebasing and overly fine-grained commits).
The map? Sometimes it helps to get a visual cue (at least for me) but most of the time I don't need it.
I've asked this before. I think it's down to flaws in development practices elsewhere.