Hacker News new | ask | show | jobs
by fulafel 3203 days ago
Ok, if your set of changes is too large to review in one piece, and you put in the work of refactoring your changes into a speedrun-style best possible history so they are cognitively review-friendly, I grant that this is a valid use case for rebase. Though erasing the real history is still a serious drawback, and the unit of code review is still too large. Not many projects work like this, however.
3 comments

You don't have to erase the whole history, just duplicate it.

What we do is to have a developement branch, with the real history. When ready for merge/review, we duplicate that branch, rebase -i (on the same origin) to clean it all up without making any changes -- changes aren't allowed, only history split then merge the clean branch in the dev one (no conflicts: no changes!) and THEN ask for reviews on that clean branch.

If there are further changes, park the 'clean' branch, continue working on dev branch as before, and make any changes needed (typically we use the 'autosquash' naming convention), and re-do the cleanup before re-submitting.

Once the review is complete, just merge again dev<->clean, and merge that in the trunk.

That way you have the whole history complete, and you also have a set of 'public' patches that have been reviewed and potentially can be upstreamed.

My definition of "too large to review in one piece" is "more than 200-250 lines", so most non-trivial changes benefit in my view from being split into clean patchsets. It is extra effort that's purely to reduce the code review workload, so it makes most sense when the project is very short of code review resources and the reviewers aren't all the same small set of people as the coders (ie not all working for the same company). Keeping a clean set of commits gets easier with practice though, especially if you do it as you go along rather than trying to do it all at the end. I like stgit for tooling that allows you to think of your branch as a stack of patches and avoid the horrible UI of raw git rebase.
On a given feature branch the history is only important if there is more than one review until the work is complete. What the developer does in between these visible/public events is almost always not important. For example in a "make work, make right" situation where the original work is thrown away and rewritten you definitely are not interested in the history.
Have you ever gone back to your (or someone else's) code from 6 months ago and needed to refresh your memory about what you tried and what didn't work, or why some of your test cases/data are what they are? Or why you commented out some test? It can be very helpful, I think this history is often important. And you won't know if a given bit of code history is important until you need it.