by rich-tea 2594 days ago
What do you use your git history for? History is either worth keeping, in which case you should maintain it like any other artifact, or it's not, in which case you should squash down master to a single commit every time you merge.

But maybe you use your history for something else that I haven't considered.

4 comments

While I like the idea of rearranging commits to convey a nice (but "not how it originally happened") development sequence, I think in practice this matters less than (say) good commit messages, or the difference between merging and rebasing.

(--fixup type commits aside).

Practical benefits from not squashing history:

- Can bisect to find bug introduction.

- Can annotate/praise/blame to find who made a change and when.

- Adam Tornhill's "Code as a Crime Scene" argues that it'd be beneficial to consume VCS history to provide health metrics on the codebase, e.g. use the VCS to check which sources have many contributors (and thus potentially many defects), or check for "lost knowledge" from developers who have left.

- Can build/run an older version of the software.
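To make the bisect point concrete, here is a minimal sketch of the binary search that bisecting performs. It is an illustration rather than how git implements it: the commit names, the `first_bad_commit` helper and the `is_bad` predicate are all hypothetical, and it assumes a linear history, ordered oldest to newest, in which every commit after the bug was introduced is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and once a commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo]


# Hypothetical history of eight commits; the bug was introduced in "c5".
history = [f"c{i}" for i in range(1, 9)]
tested = []  # records which commits the search actually had to test


def is_bad(commit: str) -> bool:
    tested.append(commit)
    return int(commit[1:]) >= 5


culprit = first_bad_commit(history, is_bad)
print(culprit, "found after testing", len(tested), "commits")
```

With eight commits the search tests three of them. If master is squashed down to a single commit, there is nothing left to narrow down.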

But is there really a big advantage from putting time into maintaining a sequence of commits? EDIT: Ah, I see another comment points out that "maintaining a nice history" tends to mean fixing very borked commits. That makes sense. :-)

None of these advantages hold if half your commits are broken versions of the software. Rebasing helps ensure that each commit is valid. That's important for the reasons you mention. Having a log of what you actually did is not important.
History is the cleaned-up story we tell after the fact.

The fact that I had a bunch of stupid typos and broken tests that I didn't realize were broken before I committed doesn't need to be in the final history. What I really want for the preserved history is the conceptual chunks of changes I made along the way.

Is this really good for your team and the project?

If you have a safe work atmosphere, and your teammates reviewing the work can discover pitfalls in your project's workflow, then you as a team can discuss them and improve your testing. You can also go back through the history and see how often this kind of normal human mistake has happened on other branches and with other developers.

You can still diff through the PR as a unit before merging, without getting bogged down in low level commits.

There’s a vast middle ground between those two extremes. Some history is worth keeping, and some is not. Noise commits are of the form “forgot a closing paren,” “comment/uncomment section while debugging,” “finally got it to compile,” “checkpoint,” “fix typo,” or “going home for the day.”

Code, and by extension history, should be easy for humans to read. For that reason, the signal is very much worth keeping and polishing, but the noise is not. Documentation of false starts, appealing but ultimately problematic design choices, and “why” information belong in comments, commit messages, or design documents, where they are stated explicitly rather than littered implicitly around the history.

Can't you do something similar by tagging, and then later just diffing against the tags?
I'm not talking about squashing the feature branch. I'm talking about squashing all of master down to one commit (initial commit). If you don't take care of your history, my question is why do you keep it at all?