by gemma 4573 days ago
No, that advice from the article is fundamentally broken. Outside of the garbage collection system (which by default leaves a commit alone until its reflog entries expire: 30 days for entries not reachable from the current tip, 90 days otherwise), Git doesn't delete committed content. Until then, any commit you "lose" through rebasing, amending, resetting, etc. can be recovered. It's a little more complicated than renaming a directory, sure, but it's important, and it's not something a Git tutorial should ignore.

Git IS safe, and ANYTHING involving changes to history can be undone without resorting to backups. Data loss can occur when you're mucking about with uncommitted changes, but that's a risk in most other version control systems as well.


Surprised to see no one in the comments has mentioned the reflog [0]. It's really very easy.

[0]: http://jscal.es/2013/08/05/seriously-the-reflog-isnt-that-sc...
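
For anyone who hasn't tried it, here is a minimal sketch of a reflog recovery. It assumes a throwaway repository created with `mktemp -d`; the file name, commit messages, and identity are made up for the demo.

```shell
#!/bin/sh
set -e

# Throwaway repository; path, identity, and file names are assumptions for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

lost=$(git rev-parse HEAD)      # remember the commit we are about to "lose"
git reset -q --hard HEAD~1      # the branch now ends at "first"

git reflog                      # the old tip is still listed, as HEAD@{1}
recovered=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$recovered"
git log --oneline               # "second" is back on the branch
```

The same approach should work after a bad amend or rebase: find the entry for the state you want in `git reflog` and reset (or branch) to it.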

I'm not 100% sure this is true, but if it is, it's also a fundamental flaw of Git. There should be a way to remove commits permanently, so that mistakenly checked-in large files or private content can be deleted.

It's also definitely not true with uncommitted changes, including gitignored files.

I still don't see the "fundamental flaw". Unreachable commits are automatically deleted by the garbage collection system, which can also be run manually. Accidental commits with large files or private content can be "modified" (technically copied and rewritten, since individual commits are immutable) with rebase, amend, filter-branch, etc. Those operations make the original commits unreachable, so garbage collection takes care of deleting them.
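
A rough sketch of that sequence, assuming a scratch repository and made-up file names (`notes.txt`, `secret.txt`). It amends away an accidental commit, then expires the reflog and runs the garbage collector by hand instead of waiting for the default expiry.

```shell
#!/bin/sh
set -e

# Scratch repository; path, identity, and file names are assumptions for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "ok" > notes.txt
git add notes.txt
git commit -q -m "base"

echo "more" >> notes.txt
echo "hunter2" > secret.txt             # made-up private content
git add notes.txt secret.txt
git commit -q -m "oops"
bad=$(git rev-parse HEAD)

git rm -q --cached secret.txt           # drop the file from the index only
git commit -q --amend -m "notes update" # a new commit replaces "oops"

# The old commit is unreachable from any branch, but the reflog still keeps it alive.
git cat-file -e "$bad" && echo "old commit still stored"

git reflog expire --expire=now --all    # forget the reflog entries
git gc -q --prune=now                   # delete unreachable objects right away

if git cat-file -e "$bad" 2>/dev/null; then gone=no; else gone=yes; fi
echo "old commit gone: $gone"
```

This only cleans the local repository. If the bad commit was already pushed or fetched by someone else, their copies are unaffected.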

And like I already said above, data loss can occur when you're working with uncommitted changes, just like in most other version control systems. If the content is not under version control (in this case, not in a git commit), it's not safe.

Honestly, you guys should go watch Linus Torvalds' presentation at Google about Git. The entire point, the massive problem he was trying to solve, was preservation and verification of data integrity.

git filter-branch will let you remove content permanently and irrecoverably if you really need to.
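
A hedged sketch of what that can look like, assuming a scratch repository and a made-up file called `big.bin`. Note that the rewrite by itself is not enough: filter-branch keeps backup refs under `refs/original/`, and the reflog still points at the old commits, so both have to go before the garbage collector will actually delete anything.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # silences the warning newer Git versions print

# Scratch repository; path, identity, and file names are assumptions for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo a > keep.txt
head -c 100000 /dev/zero > big.bin      # stand-in for a mistakenly committed large file
git add keep.txt big.bin
git commit -q -m "add files"
echo b >> keep.txt
git commit -q -am "more work"

blob=$(git rev-parse HEAD:big.bin)      # object ID of the file content we want gone

# Rewrite every commit on the current branch, dropping big.bin from each one.
git filter-branch -f --index-filter \
    'git rm -q --cached --ignore-unmatch big.bin' -- HEAD >/dev/null

# Remove the backup refs and reflog entries that still reference the old history.
git for-each-ref --format='%(refname)' refs/original/ | while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then removed=no; else removed=yes; fi
echo "blob removed: $removed"
```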

Regarding uncommitted changes: This is in the same category as forgetting to do your backup before starting to mess around, IMO. I would encourage anyone to simply get used to committing extremely often and just using a quick interactive rebase before pushing.

At the last job where I used git, I'd work in a separate branch, and I started using `git merge --squash` to merge into the main branch to keep the history from getting too difficult to follow. When git merges a bunch of different histories into one, the result is almost impossible to make sense of if people make lots of small commits. I shy away from `git rebase`, because it seems dangerous.

Never fear! "git rebase" isn't nearly as dangerous as many have been led to believe... unless you start rebasing things you've already published/pushed elsewhere. In that case you need to be very proactive about notifying everyone who could possibly have checked out your branch, etc. Otherwise: it certainly takes a little getting used to, but I find that a little one-on-one "mini-mentoring" through the first few rebases helps people immensely, so if you have someone who can help you in person, it might be a lot easier to get comfortable with the process that way.
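
To make that concrete, here is a small sketch of rebasing an unpublished branch and then undoing the rebase. It assumes a scratch repository; the branch name `topic` and the file names are made up.

```shell
#!/bin/sh
set -e

# Scratch repository; path, identity, branch, and file names are assumptions for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > a.txt
git add a.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on Git's defaults

git checkout -q -b topic                # local-only branch, never pushed
echo t > topic.txt
git add topic.txt
git commit -q -m "topic work"
before=$(git rev-parse HEAD)

git checkout -q "$main"
echo m > main.txt
git add main.txt
git commit -q -m "main work"

git checkout -q topic
git rebase -q "$main"                   # replay "topic work" on top of "main work"
after=$(git rev-parse HEAD)             # a new commit ID: the commit was copied, not moved

# Changed your mind? Rebase records the pre-rebase tip in ORIG_HEAD.
git reset -q --hard ORIG_HEAD
undone=$(git rev-parse HEAD)
echo "before=$before after=$after undone=$undone"
```

Since `topic` only ever existed locally, nobody else can be holding the old commits, which is what makes this case the safe one.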