Hacker News new | ask | show | jobs
by pm215 3203 days ago
The major usecase I have for rebase is code review. If your code review stage happens at the point of merge/rebase/whatever into master, then rebase allows you to present the feature changes in a digestible way for the reviewers (split up into nice individual commits that make sense individually and are small enough to read and review without too much effort). The classic open source "send patches by email" workflow works this way.

The author is correct that a rebase may require resolving conflicts; but then those conflicts need resolving anyway if you choose to merge instead. It is also possible to miss a "semantic conflict" that doesn't make git complain but produces a commit that doesn't build -- you can put in tooling that checks that each commit builds individually to avoid the "oops, bisect on master isn't much use" issues.

1 comments

Ok, if your set of changes is too large to review in one piece, and you put in the work of refactoring your changes into a speedrun-style best possible history so they are cognitively review-friendly, I grant that this is a valid use case for rebase. Though erasing the real history is still a serious drawback, and the unit of code review is still too large. Not many projects work like this, however.
You don't have to erase the whole history, just duplicate it.

What we do is to have a developement branch, with the real history. When ready for merge/review, we duplicate that branch, rebase -i (on the same origin) to clean it all up without making any changes -- changes aren't allowed, only history split then merge the clean branch in the dev one (no conflicts: no changes!) and THEN ask for reviews on that clean branch.

If there are further changes, park the 'clean' branch, continue working on dev branch as before, and make any changes needed (typically we use the 'autosquash' naming convention), and re-do the cleanup before re-submitting.

Once the review is complete, just merge again dev<->clean, and merge that in the trunk.

That way you have the whole history complete, and you also have a set of 'public' patches that have been reviewed and potentially can be upstreamed.

My definition of "too large to review in one piece" is "more than 200-250 lines", so most non-trivial changes benefit in my view from being split into clean patchsets. It is extra effort that's purely to reduce the code review workload, so it makes most sense when the project is very short of code review resources and the reviewers aren't all the same small set of people as the coders (ie not all working for the same company). Keeping a clean set of commits gets easier with practice though, especially if you do it as you go along rather than trying to do it all at the end. I like stgit for tooling that allows you to think of your branch as a stack of patches and avoid the horrible UI of raw git rebase.
On a given feature branch the history is only important if there is more than one review until the work is complete. What the developer does in between these visible/public events is almost always not important. For example in a "make work, make right" situation where the original work is thrown away and rewritten you definitely are not interested in the history.
Have you ever gone back to your (or someone else's) code from 6 months ago and needed to refresh your memory about what you tried and what didn't work, or why some of your test cases/data are what they are? Or why you commented out some test? It can be very helpful, I think this history is often important. And you won't know if a given bit of code history is important until you need it.