| So I'm relatively new to version control entirely, but in the last few years my group has been making a big push to institute Git. I have been wondering lately, however: how much history cleaning is expected/desirable? When I develop, I split my commits into as many small changes as I can so that the commit messages are single topic. I thought that was basically the idea. Every once in a while I use rebase to combine a few commits that should have been done together as they all addressed the same issue. This all seems right to me. I am left with a clean history of everything I have done on a very fine grained time scale. But the large number of commits, each with little significance to whole program hides the large scale structure of the development. However, I could use rebase to start combining loosely related commits, trading the time resolution for clarity in the commit history. There seems to be a continuum along this scale. Where is the proper place in that continuum to say this is clean enough? Also, I don't like making changes where I am losing perfectly good information. I know that I can group certain commits by defining a branch, developing on it, then merging (non-fast-forward) back to the original. The branch should keep the grouping in the commit history. I even suppose that this is can be done after the fact using rebase with the proper amount of git-fu. Is branching and non-fast-forward merges the preferred method of grouping related commits in the history? If so, this seems troubling as it means that partially fixing something is difficult to do with a clean history. Until the piece of the program you wish to fix is completely working, it shouldn't be merged into master because it would ruin the grouping of the related commits. This means that there can't be any partial thought's like fixing bugs as you find them, because presumably you might want to group all bug fixes of a function together, but have a distinct commit for each. Now I'm more confused than when I started. Seriously, any references or advice on this sort of topic are welcome. |
In general, your commits should be the smallest atomic operation that makes sense. When people talk about 'clean history,' they're talking about working in the awesome workflow git provides:
1. Write half-written broken code. 2. Fix that code up. 3. Add some more onto that. 4. Fix a typo! 5. Forgot to update the README.
Now, you could push that to master, but then the main master is littered with commit messages like 'oops' and 'typo.' Instead, you can rebase 5-1 onto the latest master, squash them together, and have one 'nice' commit that only has the cleaned up final changes.
This is one of the most powerful things about git: in a private repo, you can commit all kinds of garbage and half-written stuff without caring. When you want to make your stuff public, rebase and squash, then send it out. Be careful though! Only rebase your own private branches, or you're gonna have a bad timeā¢.