by wylee 4013 days ago
I agree with you, but only for local commits that haven't been pushed to a shared repo.

Rewriting local history seems no different than rewriting code in your editor.

Rewriting shared history is (almost) always bad.

7 comments

I like "Rewriting local history seems no different than rewriting code in your editor", that's a pretty good analogy I hadn't thought of.

There are a (very) few instances where you'd want to rewrite something pushed to a shared repo. One is if there's a shared understanding that the branch will be rewritten. Examples include git's own "pu" and "next" branches. "pu" is rebased every time it changes, and "next" is rebased after every release. Everyone knows this and knows not to base work off these branches. There's also the occasional "brown paper bag" cleanup, such as when some proprietary information gets into the repository by mistake and all the contributors have to cooperate to get it removed. But all of these require some kind of out-of-band communication.

We've been fine using rebase on already pushed branches. This comes from the understanding that a feature branch belongs to one developer, ever, and that no one else is supposed to work off of it (or at their own peril).

Everyone knows that it's "my branch" and that they're absolutely not supposed to use it for anything until it's merged back into master or whatever authoritative branch.

If you're having a person own a branch, implementing it in their fork would probably make more sense: https://www.atlassian.com/git/tutorials/comparing-workflows

We use the forking model for bigger projects with more developers, and the branching model for smaller projects. It works out very nicely.

Ok, that makes sense... but then why bother pushing the branch in the first place?
For me, it's because I hop between development machines, and pushing/pulling a branch is much easier than the alternative of synchronizing files manually among said machines.

Also, so that if something goes awry with my dev machine for whatever reason, at least my work is saved.

Also, to make it easier for a colleague to review my code before it gets merged into something.

Also, because it means I can use GitHub's PR system instead of doing it on my machine (thus providing some additional record that my code got merged in, and providing an avenue for the merge itself to be reviewed and commented on).

We have a rule that you never go home at night without pushing your work, even if it's garbage. Put it in a super-short-term feature branch if needed, and push that, but don't leave it imprisoned on your machine.
There are people who follow this rule, and there are people who think disk failures only happen to other people.

Few things sting as badly as losing hours or days worth of work.

And there are people who have good backup systems.
It allows builds off of that branch, so you can get test feedback etc. It also acts as sort of a backup or a sync if you switch machines.
Code reviews -- you can create a PR on the pushed code, make fixes in response to the comments, rebase, and re-push.
I work on multiple machines. Pushing my branch up even if it's busted code means I can continue work easily on other computers.
Immediate backup

(I hope I'm not alone in saying this...)

I kinda hope you are, because backup and source control really should be separate functions. Obviously your source control repository should be backed up, and pushing stuff into it acts to create a backup, but you really should have a separate backup system at work as well, to cover unpushed code as well as all the other useful info contained on your computer.
I use it the same way too. I do not really see why backup should be separate from source control as there is no valuable information on my (work) computer apart from the source code, and I never spend more than a few hours without pushing.
Backups of your work computer would close that hours-long window between pushes.
You are not
Does anyone advocate rewriting shared history? Oddly I see this "exception" a lot in reply to this person but I'm not sure I ever read anywhere anyone saying rewriting shared history is a good idea.
I think it's less people saying you should rebase shared history, and more people saying you should rebase without realizing shared history matters. Then some poor confused soul starts always rebasing before pushing/merging, messes up their local history, and does not know how to fix it.

A lot of git is "magic" to many developers, and the way that rebase works is certainly one of the features poorly understood.

Only in extreme circumstances, where something sensitive (such as credentials) or otherwise unwanted (such as other people's copyrighted assets, or .svn directories in repos that were moved from SVN to Git in a ham-fisted manner) was checked into the repository and needs to be removed. Those are the only reasons for rewriting shared history.
My rule of thumb is that rewriting shared history is always, always bad. There may be situations where the proper precautions can mitigate the risk, but I've never seen a good example where it's actually a completely good idea without downsides.
> I agree with you, but only for local commits that haven't been pushed to a shared repo.

Yes, that's why Git doesn't allow you to push rewrites, at least not without '--force'.

> Rewriting shared history is (almost) always bad.

Agreed. The one counterexample that I have is GitHub pull requests. Those are actually branches in your fork, and you do want to rewrite those when you get feedback on a pull request. That makes it easier for the owner of the repo to do the merge later.
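
To make the `--force` point above concrete, here is a rough sketch of what happens when a commit is rewritten after it has been pushed. The temp directory, the "feature" branch name, and the user identity are all made up for the demo, and it assumes a reasonably recent git on the PATH.

```shell
#!/bin/sh
# Sketch: a rewritten branch is refused on a plain push and accepted with --force.
# Every name here (demo dir, shared.git, "feature") is hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"

git init -q --bare shared.git          # stands in for the shared repo
git init -q work
cd work
git config user.email "dev@example.com"
git config user.name "Dev"
git remote add origin "$demo/shared.git"

git checkout -q -b feature
echo one > file.txt
git add file.txt
git commit -q -m "first draft"
git push -q origin feature

# Rewrite the commit that was already pushed (new message, so a new hash).
git commit -q --amend -m "first draft, reworded"

# A plain push is not a fast-forward any more, so git refuses it.
if git push -q origin feature 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Forcing replaces the remote branch with the rewritten one.
git push -q --force origin feature
```

If other people might also push to the branch, `git push --force-with-lease` is the more cautious variant: it only overwrites the remote branch if it is still where your last fetch or push left it.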

Why do you need to rewrite? If a pull request is not completed, you can continue to push it and the PR is updated to pull the latest commit.
I will get pull requests where later commits fix bugs introduced in former commits.

I generally ask people to rewrite such PRs, as I’m not going to pull known buggy commits into master, even if they are followed by fixes. That is just noise.

It might also be that some commits in the PR have changed tabs to spaces or vice versa.

I think the point was: if you have a PR with two commits, you can squash it to a single commit and force push. This will update the PR to just have the single commit. (Similarly with a rebase.)
sorbits' point was in response to:

clinta > Why do you need to rewrite? If a pull request is not completed, you can continue to push it and the PR is updated to pull the latest commit.

sorbits is saying that no, you really should rewrite your PR.

You, hayd, seem to be merely reiterating sorbits' point.
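
For what it's worth, the PR cleanup discussed above (folding a later fix into the commit that introduced the bug) can be done without hand-editing anything. This is only a sketch: the repo, the "fix-typo" branch, and the file names are invented, and it uses `--fixup` plus `--autosquash`, which is one way to do it rather than the way anyone in this thread necessarily works.

```shell
#!/bin/sh
# Sketch: squash a follow-up fix into the commit it repairs before re-pushing a PR.
# Repo, branch, and file names are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q pr-demo
cd pr-demo
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > README
git add README
git commit -q -m "initial commit"
base=$(git rev-parse HEAD)

git checkout -q -b fix-typo            # the branch behind the pull request
echo "helo" > greeting.txt             # the buggy first attempt
git add greeting.txt
git commit -q -m "add greeting"

# Review feedback arrives; record the fix as a fixup of the offending commit.
echo "hello" > greeting.txt
git commit -q -a --fixup=HEAD

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list as is, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

commits_in_pr=$(git rev-list --count "$base"..HEAD)
echo "commits in PR: $commits_in_pr"

# With a real fork you would now update the PR with something like:
#   git push --force origin fix-typo
```

The PR ends up with one commit that never contained the bug, which is the "no known-buggy commits in master" outcome sorbits is asking for.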

Making 'temporary' commits and rewriting local history before pushing to a shared repo has analogs in other revision control systems:

* In Subversion, people track patches using tools like quilt to manage them before actually putting them together into a commit.

* In Mercurial, people use `hg mq` which is like a more featureful version of `git-stash`.

These are basically all ways to track a series of patches prior to 'committing' them into the code base shared with others.

Speaking of `git-stash` I've always thought of `git-stash` as a less featureful version of `git-branch stash`
I don't think I've ever seen anyone advocate rewriting shared history.
I've come across reasons, but they've always been pretty marginal, such as somebody checking in sensitive credentials without realising what they were doing.
I think I would like the ability to edit commit messages for typos without having to force everyone to reset --hard.
The thing is, the commit message is part of the commit, not something separate from it. Irritating as it might be, this is good for traceability.

What I do to avoid that is work on a separate branch, rebase against master, then review the commits on my branch after getting rid of any WIP commits and shuffling them around to make more sense. Finally, I make sure the commit messages are (a) accurate and (b) have no typos. Once I'm satisfied with that, I merge.

I treat merging as a big deal, but not committing.
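
A rough sketch of that branch, rebase, tidy, merge routine, in case it helps. The "topic" branch, the file names, and the commit messages are invented, and the `--no-ff` merge is my own assumption about what treating a merge as a big deal might look like; a fast-forward merge would work just as well.

```shell
#!/bin/sh
# Sketch: work on a branch, rebase onto master, clean up the commits, then merge.
# Branch, file, and message names are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
git -c init.defaultBranch=master init -q repo
cd repo
git config user.email "dev@example.com"
git config user.name "Dev"

echo a > a.txt
git add a.txt
git commit -q -m "initial commit"

# Commit freely on a separate branch, sloppy message and all.
git checkout -q -b topic
echo b > b.txt
git add b.txt
git commit -q -m "WIP"

# Meanwhile master moves on.
git checkout -q master
echo c > c.txt
git add c.txt
git commit -q -m "unrelated change on master"

# Replay the branch on top of the current master.
git checkout -q topic
git rebase -q master

# Review the commits and fix the message before anyone else sees it.
git commit -q --amend -m "Add b.txt"

# Only now merge. --no-ff (an assumption) records the merge as its own commit.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'topic'" topic
```

With more than one commit on the branch, the tidy-up step would be an interactive `git rebase -i master` instead of a single `--amend`.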