"Every commit should be runnable, that is we should be able to git checkout any commit and get a functional code base. This means no “WIP” commits..."
I strongly disagree with this.
You can still have WIP commits, you just clean them up before you share them, i.e. squash. It doesn't mean you have to reduce your work to a single commit -- that's silly -- but it does mean building a narrative of changes.
Take a look at patches in the LKML and their associated revisions. This is the most sane approach to history. Changes are broken up logically, nearly all commits in a patch series are individually functional/correct, and in their presented order they tell a story for the given feature.
Your "update" or "whoops" or "wip" or "address feedback" commits are, almost universally, entirely useless. There's no point in checking out one of those commits. They don't help tell a narrative to uniformed readers. They're just noise.
In the end a lot of features end up being single commits. Crafting commits is hard if one wants perfect commits and most of us don't have time to do that.
If I implement text-box with a save button there is no way I can functionally commit "working change" where I only have a text-box or have a save button separately.
There is nothing wrong with that I believe. I think there is much more wrong with insisting on keeping "history" so people think that their "whoops" or "wip" are important.
> Crafting commits is hard if one wants perfect commits and most of us don't have time to do that.
I can agree with the first part, because in my current job we do it. Some more than others, but it's recommended. Throwing some naturally grown git history to review is making the life of the reviewer hard. Doesn't matter whether you belong to those who make a lot of WIP commits because you are paranoid of losing some idea or whether you only commit once when everything is ready, because you are shy of showing your trial and error mess.
Probably you are also correct that most people don't invest the time to do it. That's why there is more bad code on the planet than good. It's hacked together hastily and not reviewed properly because the history is not reviewer-friendly.
I'm sure in some organizations quality work is not tolerated because "we don't have the time".
You take me too literally. Not every feature needs to be broken out into separate commits. Usually the narrative looks something like this:
- Fix some bugs impeding the work being done. Zero or more commits.
- Refactor or lay the groundwork for the change. This could be something like plumbing dependencies, writing some feature flags, a new API, database migrations. Zero or more commits.
- Implement the feature. If you previously plumbed an API to save some text, now you can wire up the frontend. One or more commits.
- Lastly, you may have some cruft leftover that you want to remove, like an old method that you now mark as deprecated or issue a warning, or change some links or remove some feature flags. Zero or more commits.
In the end you have a series of changes that describe a cohesive item of work, or feature. It's rarely as simple as just slapping some extra markup somewhere. This makes it easy for reviewers to see the important bits and makes it easy to bisect or blame. Squashing all these changes would be a mistake.
I highly recommend folks become familiar with interactive git rebase and fixup commits. Instead of thinking linearly, and just slapping work on a branch, it should be more akin to writing a novel. Some background, set the scene, introduce the characters, some exposition, advance the plot, resolution, etc.
All comments regarding “how you should use Git” (and many comments in general) could be prefaced with nature of the project. A predominantly markup based project worked on by a whole team of juniors will have different needs to a C codebase worked on by a single senior.
There’s not one correct way, just alternative ways that are optimal in some situations.
I disagree too, I prefer a lot of small commits. What some people call "keeping a clean history" is actually destroying history and discarding a lot of important information.
I saw a lot of people merging branches a single commit "to keep the history clean" while in fact it just made the history less legible and less reversible.
The article is suggesting that while you are working you can commit as you go, quick and dirty, irrelevant of the changes. Once you get something working and it's ready to be merged, then you can go back rewrite the branch, removing back and forth changes and writing a good commit message with the reason for it for every one change.
This way, the history would be meaningful instead of it be littered by fix up commits and it makes reverts easy.
> The article is suggesting that while you are working you can commit as you go, quick and dirty
Why would someone do that?
I’d argue that if there is cleaning to be done, it should be done way before something is ready to be merged, while context is fresh in your head.
Also, cleaning before something is ready to be merged will lead to sloppy history removal as, often, people are eager to merge for a variety of reasons, and rewriting history may not be seen as a priority.
I personally commit and push to a branch when I think there's enough changes for someone else to review my work (of course with WIP flag), in case if someone is curious and once any of the changes are ready to be merged (this can be before the review changes), I'll clean up and write nice commits.
I guess it's personal preference, I don't recall losing context, as I don't let too much of time passed. And I actually ensure nice commits are written before code review otherwise it might be harder for the reviewer to get the context of the changes.
According to logic that you should also always commit before you hit Ctrl+Z.
Of course you have to consider what history is relevant and create relevant commits for those. But a WIP commit at the end of the day is rarely more relevant when split up from the finishing up of that work the next morning.
I agree with the idea, but disagree with the approach. This seems like a lot of extra work for not much, or maybe even less, benefit.
The process I've set up for my team allows us to get the best of both worlds, in my opinion, with almost no extra effort:
1. Create a branch and open a pull request. Nothing gets pushed directly to master.
2. After the pull request has been tested and reviewed, we Squash and Merge into master on GitHub with a single button click.
This way we end up with a single commit in master for each feature, bugfix, etc. There's also a link to the PR in the commit message where you can view the individual commits that were squashed as well as links to any related Jira issues.
I think I'd go the other way around. Instead of squash/rebase, you should instead use `merge --no-ff`. The allows your main branch to have a clean history with the `--first-parent` option (e.g. `git log --first-parent`), while still maintaining the history on the branches. If you squash the commits, the original is no longer part of the commit history. Yes, there is a link in the commit message, but being able to manually click through to the commits on an external web page is a huge usability drop as compared to getting the information straight from `git bisect`.
Squash/rebase also plays horrendously with my local branches. If I make a PR, I should be able to use `git branch -d branch_name` to safely verify that the branch has been merged into main and can be deleted. If you squash/rebase instead, then I need to use the unconditional `git branch -D branch_name`. git can no longer verify that I'm performing a safe operation, because the history needed to determine whether it is safe has been rewritten.
> Squash/rebase also plays horrendously with my local branches.
This is a very good point, and it has been slightly annoying at times with this approach. It hasn't been a major pain point for me, so I've just dealt with it. I've seen some scripts/aliases that claim to solve this, but I haven't spent much time looking into it.
My current workflow is to watch for a PR to be accepted on github, then use the delete branch button in the GUI. On my local repo, I'll then use `git remote prune origin`, and only call `git branch -D` on branches that were pruned. It's a workable solution, and it could be scripted around, but I don't want to. Reproducing the functionality that already exists in git feels like a waste of time, when the entire purpose would be to work around the existence of a rebase workflow. Better would be to just use a merge/no-ff workflow in the first place.
GitHub actually keeps these in the pull request indefinitely if you delete the branch. You can also restore the branch at any time if needed. We have the branches set to automatically delete when a PR is merged.
Exactly! Suppose feature A is implemented (commit 1). Then it is discovered that the original implementation breaks feature B, and that is resolved (commit 2). In merge workflows, you have a merge commit on main, while commit 1 and 2 are maintained. In rebase/squash workflows, you have a single commit on main, with both changes. If 6 months later I find that the commit broke some feature C, I really want to know if it was the main change or the compatibility fix that did it.
I'll take an intermediary position. Commits on a WiP branch don't need to let the repo in a working state, however I like when said branch has its history rewritten so that each commit is mostly a working unit change before submitting the branch for merging.
You definitely shouldn't restrain from committing often during development, however having a cherrypickable branch history when merging instead of "let's squash this mess and forget about it" has value to me.
I'm not experienced enough to form a strong opinion on this. Could you provide some reasons why you strongly disagree?
At first glance, it seems like a beneficial goal. If I ever need to checkout different points of history to find a breaking change, it'd be nice if I can run the code after every `git checkout`.
Why? It's an excellent principle, especially for debugging. The biggest benefits of your commits and commit messages are not you but future code reviewers. Imagine someone reads the blame on a file and sees WiP commits (that absolutely will eventually happen), that's nightmare fuel.
As someone who has had to do some "deep archeology debugging" of legacy code, I'd much rather see that git blame full of WIP commits from junior developers than squash/rebase commits from the same. Questions like "was this an intentional regression or bad test behavior from an in progress commit or an accidental regression from a bad merge?" are much easier to answer when you've got "more parts of how the sausage was made" to work with. When I'm not doing "deep archeology" I have tools like git praise --first-parent to focus on the high level (completed PR merge commits).
There are two competing interests, and the author suggests cleaning up after so that you can satisfy both.
The first interest is what you said, for every commit to build.
The second interest is to help prevent loss during development. If I am working on something and need to finish for the day, but haven't yet got the code to a working state, I want to be able to commit and push my code in case something happens to my laptop.
Also, sometimes I am making a series of big changes, but I need to make them all for the code to work. I want to have checkpoint along the way, in case I need to undo something (or even just see what I have changed since the checkpoint!) committing WIP code that doesn't build lets you do that.
We usually solve this problem with squash merges, so every commit on the main branch is the full working feature, but there are downsides to that technique.
Take a look at patches in the LKML and their associated revisions. This is the most sane approach to history. Changes are broken up logically, nearly all commits in a patch series are individually functional/correct, and in their presented order they tell a story for the given feature.
Your "update" or "whoops" or "wip" or "address feedback" commits are, almost universally, entirely useless. There's no point in checking out one of those commits. They don't help tell a narrative to uniformed readers. They're just noise.