Hacker News new | ask | show | jobs
by rhpistole 3804 days ago
Furthermore it's mostly a rant against subversion rather than going making any kind of argument about his assertions.

Plus anybody who advocates the use of git rebase --interactive as a sensible response to code review feedback is clearly insane.

3 comments

> Furthermore it's mostly a rant against subversion rather than going making any kind of argument about his assertions.

A completely misguided rant at that. The author even admits it: "To be fair, you can adopt this workflow with Subversion, using either task branches or patches..." This is exactly how my team uses Subversion (and used CVS before it; yes, we used this exact "branch per issue, code review before merge" workflow with CVS up through 2013) and we don't find the overhead of branching and merging onerous. Granted, I haven't ever tried contributing to an open source project using SVN, but arguably Git is the more-correct tool for that kind of development anyways.

We use branches with SVN, but it's slow and creates a lot of overhead on quick tasks. People who can get away with it work out of trunk and commit directly to trunk when they can.
> Plus anybody who advocates the use of git rebase --interactive as a sensible response to code review feedback is clearly insane.

Why? Prior to a series of changes being pulled into someone else's history you have all the incentive in the world to present the cleanest view of history possible. Moreover, if you can rebase -i on top of master you can turn the merge into a fast-forward rather than an actual merge. A more linear history with fewer merges is much easier to bisect as well as to read, generally.

> Moreover, if you can rebase -i on top of master you can turn the merge into a fast-forward rather than an actual merge. A more linear history with fewer merges is much easier to bisect as well as to read, generally.

Au contraire, rebase can break bisect by creating commits that were never actually tested or compiled. And if your bisection algorithm cannot handle merges, then that's a problem with a bad bisection algorithm.

Nor does it make history more readable; it makes history more readable in the same sense that inlining all procedures makes a program more readable (i.e. not really).

Merge commits are an abstraction mechanism. You can view a branch's history as a purely linear history (git log --first-parent) of normal and merge commits, where merge commits represent the merged branches (and where the commit message should summarize the branch changes properly). You can then unfold merge commits and view the branches they represent in such a quasi-linear fashion, too. Repeat as needed.

This can also guide a bisection algorithm (bisect along git log --first-parent, descending down the merge hierarchy as necessary) and is generally more readable if presented properly (that Git doesn't have a tool other than git log --first-parent for this is a different story). Conversely, a fully linearized history is longer and lacks any structure (as I said, it's like manually inlining all procedures in a piece of code).

> Au contraire, rebase can break bisect by creating commits that were never actually tested or compiled. And if your bisection algorithm cannot handle merges, then that's a problem with a bad bisection algorithm.

This is only true if you allow commits that haven't individually passed your CI suite to be pushed. That's insane.

> Nor does it make history more readable; it makes history more readable in the same sense that inlining all procedures makes a program more readable (i.e. not really).

Working at an organization (Google) that for the most part doesn't use branches and where other merges are impossible (every file must have previously been resolved against whatever is at HEAD prior to submitting), I respectfully disagree[0]. A linear history of individually reviewed and validated changes is MUCH easier to deal with than the regularly branched/merged history of my previous gig (which did use git, along with pre-push change review for each change).

> Merge commits are an abstraction mechanism. You can view a branch's history as a purely linear history (git log --first-parent) of normal and merge commits, where merge commits represent the merged branches (and where the commit message should summarize the branch changes properly). You can then unfold merge commits and view the branches they represent in such a quasi-linear fashion, too. Repeat as needed.

They're a leaky abstraction, though. Unless you mutate published history (which I do think is insane) it's inevitable that your nice bundle of merged changes will one day be incomplete: additional changes to make the bundle actually function as intended will be required. These will end up detached from it as some other commit series on HEAD or merged into HEAD. I find it far easier to tie together a series of changes via things like issue tags in the commit than expecting a merge commit to properly bundle things up.

[0]: Mind you, Google also for the most part does not use git -- we do our own thing, and with that history is necessarily linear, but I've found I really do prefer it that way. See: http://www.wired.com/2015/09/google-2-billion-lines-codeand-...

> This is only true if you allow commits that haven't individually passed your CI suite to be pushed. That's insane.

It's also pretty common in open source projects (due to limited CI resources or other reasons). I also note that this is not the only problem: the problem is that these commits never existed in the wild. This can also lead to subtle problems that CI simply can't catch.

> Working at an organization (Google) that for the most part doesn't use branches and where other merges are impossible (every file must have previously been resolved against whatever is at HEAD prior to submitting), I respectfully disagree[0]. A linear history of individually reviewed and validated changes is MUCH easier to deal with than the regularly branched/merged history of my previous gig (which did use git, along with pre-push change review for each change).

Your problem here is that Git does not have a good way of displaying structured history and just throws the raw DAG or a topologically or chronologically sorted list at the user; lack of named branches means that you're largely looking at an anonymous soup of objects with no tools to filter them. Contrast with Bazaar's hierarchical logs (augmented with branch nicknames), for example, if you want to make a serious argument.

> They're a leaky abstraction, though. Unless you mutate published history (which I do think is insane)

Published mutable history is not per se insane. Please familiarize yourself with Darcs/Pijul/Mercurial's changeset evolution. Note that I am not saying this is necessarily a good idea, it's just not per se insane. Not all projects have the same requirements/workflow/number of contributors/etc.

> it's inevitable that your nice bundle of merged changes will one day be incomplete: additional changes to make the bundle actually function as intended will be required. These will end up detached from it as some other commit series on HEAD or merged into HEAD.

Note that when I say that this is an abstraction mechanism, it's not the only abstraction mechanism. This, again, is a Git problem [1], not a VCS problem. This is, for example, where named branches come into play; see in particular Fossil's tagging mechanism (which has little to do with Git-style tags) or Veracity's stamps.

[1] Git is a solid workhorse for a DVCS, but it is a very basic version control system, sort of the lowest common denominator in the DVCS world. This makes it a poor basis for arguments as to what can be done in a VCS.

I thought the -i option was irrelevant to whether the merge commit is present. It's the "putting your commits directly downstream of master" that avoids the merge commit, not the use of the interactive rebase features. Right?
You present a clean history, yes, but not a correct one. The context in which you made your changes, often noted as important, has been lost.
In the general case, I agree; rebases do lose a bit of context. As a code reviewer, however, I find the reason I most often ask for a rebase & squash is that the "context" that I'm asking to lose is simply the author isn't taking the proper time to think about what he or she is doing, and it shows, in commit messages like,

  * I can't spell
  * fix tests, too, lol
  * fixed for real
  * fixed
  * minor touchup
  * Implement feature foobar from issue #123
Once the code review has completed, this history is not going to ever be useful again; worse, it impedes someone looking through the logs for useful history, and other operations like cherry-picking this feature to a branch.
I'd argue that what you're describing shouldn't even be called a rebase, even though you could use "rebase -i" to accomplish it. For example, I just use "reset --soft ; commit" instead.
Presumably each change in the history will be reviewed, though, right? One assumes the point of doing rebase -i to reorganize commits during review is to present a history with MORE and BETTER context -- one which can be more easily reviewed and understood and where changes can be accepted in-order and piecemeal.
I often just use `rebase -i` as a way to squash + fixup commit messages.
"git rebase -i" is essentially how the entire Linux kernel process works - and IMO it works pretty beautifully. The only real downside I've seen is that it can lead to developers submitting code which breaks bisection due to not testing each individual rebased commit - in the kernel, this is somewhat alleviated by the Intel 0day bot which gives us automated build failure feedback, but it's still an issue. The benefits of a nice clean changelog though should not be underestimated.