by gmfawcett 1938 days ago
In fairness, you're only seeing 5% of the typos. We caught the other 95% before committing. :)

I love your question, "why not a commit per keypress?", because it raises an interesting follow-up: why not squash and rebase entire months or years of project work into single commits? If squashing is so useful, why do we only apply it at low-grain scales? Could we read and understand massive projects quickly and easily, if they only had a few commits to them?

I'm sure that we don't experiment with larger-scale rebases because of the limitations in the technology -- we all know that we're not supposed to 'git rebase' in public, and why that is. But suppose those obstacles were lifted. Now that we can rebase and rewrite at any time scale, which scale(s) is the right one(s) to choose?

3 comments

> why not squash and rebase entire months or years of project work into single commits?

The argument here is that one should rebase and carefully craft commits that isolate each functional change into a separate commit, where each change is motivated and builds on the previous one, before pushing anything. Every commit should build cleanly and preferably even pass tests. That makes changes easier to reason about, and enables the use of tools such as bisect. Look at git itself for an example of this type of history.
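
To make the bisect point concrete, here is a minimal sketch of the idea behind it: a binary search over a linear list of commits. This is not how git implements bisect (real history is a graph, not a list), and `first_bad_commit` and `is_bad` are hypothetical names made up for illustration. It assumes the oldest commit is good, the newest is bad, and that the bug stays present once introduced.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumptions (all hypothetical simplifications):
    - commits is ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once a commit is bad, every later commit is bad too.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # This check only works if the commit at mid builds and can be tested.
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Toy history: the bug is introduced in "c5".
history = [f"c{i}" for i in range(8)]
print(first_bad_commit(history, lambda c: int(c[1:]) >= 5))
```

The search halves the candidate range on every step, but each step needs a commit that can actually be built and tested. A history full of broken intermediate commits forces you to skip them (`git bisect skip`), which weakens the result.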

The counterargument to that was that it presents a false view of history. Maybe there were false starts and mistakes made along the way. If these are not preserved in the history, the reader never gets to learn from them. This is not an uncommon argument. Some people argue rebase should never be used.

This view suggests that a more detailed history is preferable. Taken to its logical extreme, that would mean every keypress and editor command.

But "why not delete all of history" is not an example of "carefully crafted commits" taken to an extreme. Quite the opposite.

Basically, you want to keep the history of individual logical patches to the codebase, but not the meta-history of how those patches were made.

It helps to think about how git grew out of an email-based workflow.

A commit is essentially an email. It has a sender, a date, a subject line and a message body. The commit message format is subject, empty line, body. Think of a git repository as an archived mailing list's worth of patches.
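
As a rough illustration of that subject, empty line, body layout, here is a small sketch that splits a commit message the way you would split an email into subject and body. `split_commit_message` is a hypothetical helper written for this comment, not part of git or any library, and the sample message is invented.

```python
from typing import Tuple


def split_commit_message(message: str) -> Tuple[str, str]:
    """Split a commit message into (subject, body).

    Assumes the conventional layout: subject, one empty line, then the body.
    A message with no empty line is treated as subject only.
    """
    subject, _, body = message.partition("\n\n")
    return subject.strip(), body.strip()


# Invented example message, for illustration only.
message = (
    "Fix off-by-one in pager\n"
    "\n"
    "The last page was dropped when the item count was an exact\n"
    "multiple of the page size."
)
subject, body = split_commit_message(message)
print(subject)
print(body)
```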

You wouldn't send an email describing several days of work without proofreading it, and you should treat your commits the same way. Git's design grew out of this usage, which was much harder in something like Subversion.

No one would send an email to the kernel mailing list suggesting a patch set that included errors, false starts, and reverts. That would waste reviewers' time. Code history is a craft that aids understanding of the code; it is not an undo log.

> it raises an interesting follow-up: why not squash and rebase entire months or years of project work into single commits?

That's effectively what happened before version control, and before the small-scale rebases we enjoy now were possible. The reason we moved away from it is that it's hugely valuable in certain circumstances to be able to see some granularity of the history. (Though clearly people disagree about what the grain size should be.)

> Could we read and understand massive projects quickly and easily, if they only had a few commits to them?

I don't think so. The current state is visible at the top of the git tree regardless. History comes in when you are trying to understand why the state is what it is. Usually this is for troubleshooting in my experience, but sometimes also when doing a refactor. Meaningful commit messages attached to meaningfully-clumped patches are, in my opinion, absolute gold in those cases.

There's little benefit to squashing down a year's worth of work into 5 commits because you can just as easily tag each of those 5 commits with a version number, give it a little write up, and call it a release.

I think the reason to squash commits is to cut out the noisy bits that were only useful to the original developer that day and create a timeline that's helpful for future readers. It doesn't really make sense to get more granular than the level of a single commit with a good comment and a small set of cohesive changes. So you store your history at that granular level and you can take care of the rest with tags, minor and major versions, etc.