Hacker News
by urschrei 5453 days ago
I would respectfully allow that anyone who can't comprehend what the index is for with the aid of two paragraphs and a diagram should be looking for an alternate career. The index is important. It's one of the great things about Git; without it, you wouldn't be able to create commits from a subset of the difference between your working directory and HEAD. It also has a confusing name, but here we are.
5 comments

> It's one of the great things about Git; without it, you wouldn't be able to create commits from a subset of the difference between your working directory and HEAD.

I never understood that. I can do this just fine even with TortoiseSVN. Just click "commit" and select the files you want to include in this commit. I don't see how I need to keep yet another data structure / piece of "state" in the back of my head for that. I definitely don't see why I have to look for an alternate career because of that.

Am I missing something?

The benefit that the index gets you in such a situation is that sometimes you are working on two unrelated changes at the same time, but they both touch the same file. Git (using the -p flag to the add command) lets you interactively select portions of a file to add to the index.

The interface is pretty simple: it just shows you each small piece of diff to the files you are adding, and you say whether you want to include that bit of diff in the index or leave it in the working directory. Alternately, it can drop you into a view of the diff in the editor, and you can add or modify diff lines as you please.

Honestly though, if you're doing that you should be using two different branches. Which is another thing git is wonderful for.

Yes, you are; SVN doesn't allow you to commit parts of a file -- it works at the file level. Git allows you to commit an arbitrary subset of changed lines from whatever files you've changed in your working copy. It doesn't track changes to files, it tracks changes to content in its tree. See here for a more complete explanation: https://git.wiki.kernel.org/index.php/GitFaq#Why_is_.22git_c...

> Am I missing something?

Yes. In git you can also do 'git add -p', which lets you interactively add pieces of a diff to the index, not just entire files.

Now, you could imagine an interface where you interactively select the pieces of the diff when committing, and I believe some VCSs do this. So, even though you were missing something, you weren't necessarily wrong. :)

But having the index can be useful if you want to build up the things you're going to commit over separate 'git add -p' sessions.
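A small sketch of what building up the index over several sessions can look like. It uses plain `git add` instead of `git add -p` to keep the demo non-interactive, and the file names are made up.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'a\n' > a.txt
printf 'b\n' > b.txt
printf 'c\n' > c.txt
git add .
git commit -q -m "initial"

# Edit all three files.
printf 'a2\n' > a.txt
printf 'b2\n' > b.txt
printf 'c2\n' > c.txt

# First staging session.
git add a.txt
# Some time later, a second session adds more to the same pending commit.
git add b.txt

# Review what the next commit will contain and what will be left behind.
staged=$(git diff --cached --name-only)
unstaged=$(git diff --name-only)
echo "staged:";   echo "$staged"
echo "unstaged:"; echo "$unstaged"

git commit -q -m "update a and b"
```

`git diff --cached` is the part that makes this workable: you can inspect the pending commit at any point before you actually create it.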

> Now, you could imagine an interface where you interactively select the pieces of the diff when committing

There's no need to imagine -- git-gui and git-cola can do this.

http://cola.tuxfamily.org/

Click on a modified file, select specific lines from the diff, right-click, and click on "stage selected lines".

> [W]ithout [the index], you wouldn't be able to create commits from a subset of the difference between your working directory and HEAD.

That's a correct argument for why the index has to be separate from your working directory. But I don't see why the index has to be separate from HEAD ... in my model, every "git add" would automatically be followed by an implicit "git commit --amend". So instead of building up your commit in the index you build it directly onto HEAD.

Since you're remotely sane, of course HEAD is a private branch not a public one (because you're perpetually modifying "history" on it). (And to start a new commit, of course you also need a command which advances HEAD by creating an empty commit.)

To put it another way, the index should be just another branch. The git commands are way too complex because they don't treat it orthogonally.
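The model described above can be approximated with existing commands, as far as I can tell. This is only a sketch: `stage_into_head` is a hypothetical helper, not a git command, and the branch and file names are invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'base\n' > base.txt
git add base.txt
git commit -q -m "base"

# "Start a new commit": an empty commit on a private branch stands in
# for an empty index.
git checkout -q -b private-wip
git commit -q --allow-empty -m "wip: next change"

# Hypothetical helper: every "add" is immediately folded into HEAD.
stage_into_head() {
  git add -- "$@"
  git commit -q --amend --no-edit
}

printf 'one\n' > one.txt
printf 'two\n' > two.txt
printf 'scratch\n' > scratch.txt

stage_into_head one.txt
stage_into_head two.txt

# HEAD now contains exactly what was "added"; scratch.txt was never added.
git diff-tree --no-commit-id --name-only -r HEAD
git status --short
```

The history still has only two commits (base plus the amended one), which is why this only makes sense on a branch nobody else has pulled.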

I appreciate that you should have the option to choose which part of your working directory you commit. But I think the default should be to commit everything, like in SVN or mercurial. If I need to commit selectively in those systems, I'll disable some checkboxes, type in the filenames, build a changelist or use the record extension or something. The point is that I don't need to deal with it until I need it.

Also, it seems to me that git's behaviour encourages broken revisions (i.e. compiler errors, test failures) because your working directory doesn't match what you commit. And broken revisions will interfere with bisecting.

I think committing everything in the working directory is an awful default. Few developers are disciplined enough that the contents of a working directory always make a perfect commit.

With git, how you code is a non-issue. The whole idea is that you're able to worry about commits afterwards. The index is a wonderful tool to help you untangle the mess of code that has not yet been separated into logical pieces. Working with git is not just writing out code and then committing it. Making proper commits requires time, discipline and practice.

Part of the problem might be the mindset that a VCS is supposed to record how the development happens, but if you think about it, that does not make much sense. The actual development process of a feature or even a bugfix is often riddled with experiments, trivial mistakes, sidetracking, and other largely uninteresting issues.

Once you have thought about what is logical to record into the repository and create a commit, you can test it. Git stash allows you to put aside all other work while you run tests, and git commit --amend allows you to fix the commit until it works.
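Roughly, the commit, stash, test, amend loop could look like the sketch below. `run_tests` is a hypothetical stand-in for a real test suite, and the file contents are invented so that the first attempt fails.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'v1\n' > feature.txt
printf 'v1\n' > other.txt
git add .
git commit -q -m "initial"

# Two pieces of work in progress; the feature change has a mistake in it.
printf 'v2-broken\n' > feature.txt
printf 'experiment\n' > other.txt

# Commit only the feature change.
git add feature.txt
git commit -q -m "feature: bump to v2"

# Put everything else aside so the working tree matches the commit.
git stash -q

# Hypothetical test suite: passes only if the committed tree is consistent.
run_tests() { grep -q '^v2$' feature.txt && grep -q '^v1$' other.txt; }

if ! run_tests; then
  # Fix the commit in place, then test again.
  printf 'v2\n' > feature.txt
  git add feature.txt
  git commit -q --amend --no-edit
fi
run_tests && echo "tests pass on the committed tree"

# Bring the unrelated work back.
git stash pop -q
```

The tests run against exactly what was committed, not against whatever else happens to be in the working directory.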

Test your commits, and you will not have broken revisions or unbisectable history.

Yes, this is legitimate criticism of git's philosophy.

However, in practice I have seen those broken revisions in svn just as well. People forget to add untracked files or they skip the final unit test run. Basically, I believe your criticism is theoretical and no problem in practice.

Git requires a bit of a learning curve to use but it's gotten far simpler than subversion for me. I used subversion for years before using git. I refuse to go back.

If you always want to commit everything, use `git add . && git commit -a`.

For me, I could learn git no problem, write wrapper scripts for some things, etc... But I'm not explaining its idiosyncrasies to other people on my team.

It's also vital when merging. Agreed, too bad about the naming problem. Index, cache, staging area... how many names does it need?
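To illustrate the merging point: during a conflict, the index holds several versions of the same path at once, which `git ls-files -u` lists as stages 1 (common ancestor), 2 (ours) and 3 (theirs). A sketch with made-up branch and file names:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'greeting=hello\n' > file.txt
git add file.txt
git commit -q -m "initial"

# Two branches change the same line in different ways.
git checkout -q -b left
printf 'greeting=hi\n' > file.txt
git commit -q -am "left: hi"

git checkout -q -b right HEAD~1
printf 'greeting=hey\n' > file.txt
git commit -q -am "right: hey"

# The merge stops with a conflict, so a non-zero exit is expected here.
git merge left > /dev/null 2>&1 || true

# One index entry per stage for the conflicted path.
git ls-files -u
stages=$(git ls-files -u | awk '{printf "%s", $3}')

# Resolving means writing the final content and adding it, which
# collapses the three stages back into a single normal entry.
printf 'greeting=hi there\n' > file.txt
git add file.txt
git commit -q -m "merge left into right"
```

After the final `git add`, `git ls-files -u` prints nothing, which is how git knows the conflict is resolved.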