by Vinnl 3203 days ago
There's no such thing as "real" history - you don't commit every line of code you add or remove, or even every character. Rather, you choose some points in time to commit.

For me, those are often arbitrary - I can't get something to work at a certain point, so I make a WIP commit with the buggy work at that point, and will come back to it the next day.

Before I merge my branch back into master, though, I want my commit history to be useful. "This is the point where I went home or was disturbed that day" is not useful to future developers. "This is the work I did on this individual feature and everything that's needed to run it and to have the tests succeed is in this commit, and this was the reasoning behind what I did", however, is.

In other words, I rebase to divide my codebase into non-arbitrary units of code, not based on chronology, but on what is useful together.

2 comments

Well, assuming that you are adding features in parallel, you do have a history. You have a branch with feature A and are about to build feature B, which is now based on A; someone else is building feature C, which is also based on A. Building B and C might require different or identical changes to A.

There's no guarantee that these features will be finished in any particular order, since they might have different priorities, difficulties or levels of acceptance, so it's hard to tell which one must or will be done first. Rebasing pretty much fixes that order up front, while merging is a much saner fit for that workflow. It's harder to keep everything functional, yes, but it enables parallel development. Whether to rebase and/or merge isn't a source control problem but a feature management problem.

Yeah, I think my point was more that a blanket ban on rebasing is too strict. I agree that once you want to integrate B and C (or both into A), a merge is usually the best way to do it - and I think that's what TFA was actually referring to.

It fails to consider the case, however, of rewriting B's history internally, not as a way of integrating with A or C, but as a way of making its commits clear. Afterwards, you'd still merge A into B and then merge B back once you see that's successful.

This makes a lot of sense to me. If you're maintaining a large project, having lots of non-cohesive commits makes it much harder to figure out what the logical changes were. I care about the set of code changes that were required to add a feature or fix a bug. I don't care about the set of changes that were required to get halfway to working code.

This is also potentially a big deal if you're doing maintenance bugfix releases for a project - that's way easier if porting the bugfix to an older branch just requires cherry-picking a single commit.

This is precisely why we rebase and squash. We maintain about 4 release branches at any given point and having everything squashed into one or two commits makes a bug fix on master simple and straightforward to downstream.
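
To make the "one squashed commit per fix" point concrete, here is a minimal sketch of what porting such a commit to several release branches could look like. It only builds the git command lines and doesn't run them. The function name, the branch names and the commit id are made up for illustration; `git checkout` and `git cherry-pick -x` are standard git commands.

```python
from typing import List


def backport_commands(fix_sha: str, release_branches: List[str]) -> List[List[str]]:
    """Build the git invocations that port one squashed fix to each release branch.

    Hypothetical helper: it returns argument lists and executes nothing.
    """
    commands: List[List[str]] = []
    for branch in release_branches:
        # Switch to the release branch that should receive the fix.
        commands.append(["git", "checkout", branch])
        # -x appends a "(cherry picked from commit ...)" line to the commit message.
        commands.append(["git", "cherry-pick", "-x", fix_sha])
    return commands


if __name__ == "__main__":
    # Branch names and commit id are placeholders, not taken from a real project.
    for cmd in backport_commands("abc1234", ["release/1.2.0", "release/1.3.0"]):
        print(" ".join(cmd))
```

With a single squashed commit, that's two commands per branch. If the fix were spread over a handful of WIP commits, each branch would need the whole series picked in the right order.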

More frequently, though, we do the bug fix on the furthest downstream branch. Because we tag our branches with a semver scheme, we just have our prep-for-deploy build automatically walk back up the branches to master and attempt to merge it in along the way.
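
A rough sketch of how that "walk back up" order could be computed, assuming release branches are named like `release/MAJOR.MINOR.PATCH` (that naming, the function names and the `trunk` parameter are my assumptions, not details of the setup described above). It only works out which branch gets merged into which; running the merges is left out.

```python
import re
from typing import List, Tuple

# Assumed naming scheme for release branches, e.g. "release/1.10.0".
SEMVER_BRANCH = re.compile(r"^release/(\d+)\.(\d+)\.(\d+)$")


def version_key(branch: str) -> Tuple[int, ...]:
    """Turn "release/1.10.0" into (1, 10, 0) so branches sort numerically."""
    match = SEMVER_BRANCH.match(branch)
    if match is None:
        raise ValueError(f"not a semver release branch: {branch}")
    return tuple(int(part) for part in match.groups())


def merge_chain(
    fixed_branch: str, release_branches: List[str], trunk: str = "master"
) -> List[Tuple[str, str]]:
    """Return (source, target) merge steps from the fixed branch up to trunk."""
    ordered = sorted(release_branches, key=version_key)
    start = ordered.index(fixed_branch)  # raises ValueError if the branch is unknown
    chain = ordered[start:] + [trunk]
    # Pair each branch with the next newer one: merge source into target.
    return list(zip(chain, chain[1:]))


if __name__ == "__main__":
    branches = ["release/2.0.0", "release/1.2.0", "release/1.10.0"]
    for source, target in merge_chain("release/1.2.0", branches):
        print(f"merge {source} into {target}")
```

Sorting on the parsed numbers rather than on the raw strings matters here: as plain text, `release/1.10.0` would sort before `release/1.2.0`.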