Hacker News
by philwelch 5104 days ago
I figured out Git, and I'm not that clever, so don't worry!

The staging area (aka the index) is where you put things before they become a commit. You don't always want to commit all of your changes at once. The index is there so you can commit the changes you want instead of just committing all the changes every time.

`git commit` means "turn the contents of the index into a commit". A commit is a set of changes that logically go together denoting a version of the repository. A commit needs to have a message describing what the commit does. If you just type "git commit", it opens up an editor for you to type in your commit message. If you want to skip the editor, you can just pass in the -m flag, followed by your commit message in quotes.
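For anyone who wants to try this, here is a minimal sketch of staging one file and committing it with `-m`. The temp directory, the file names, and the placeholder identity are all made up for illustration; only the Git commands themselves matter.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "feature" > feature.txt                 # a change we want to commit
echo "scratch notes" > notes.txt             # a change we do not want yet

git add feature.txt                          # put only feature.txt in the index
git commit -q -m "Add feature"               # -m supplies the message, no editor

git status --short                           # prints "?? notes.txt" (still uncommitted)
```

Only what was staged ended up in the commit; `notes.txt` is left alone in the working directory.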

Are you familiar with the shell? Are you comfortable with the concept of a linked list? If not, work on those--even if you don't use Git, it's part of being a better programmer. But if you do understand these things, you can fairly easily get a very deep understanding of Git.


That doesn't explain the fundamental part. Why is there a staging area in the first place?

Place yourself in the shoes of a Subversion user. The workflow is very simple and intuitive given that:

  1. the repository is a (remote) place where my project is stored
  2. the local copy is where I modify my project

Then, a commit is just pushing your modifications to the repository where other people can go get them.

Now, with git you don't have a remote repository. Your local copy is itself a repository. Think about this for a second. Then...

If a commit doesn't push my changes to a remote repository, why do I care?

Well, if a commit allows me to have a local history of changes that I may not want to have in the remote repository, why do I need to stage my changes? Why don't I just commit them and be done with that? Aren't the files themselves a staging area? Isn't this all redundant?

These are the problems a non-Git user faces. Why do I care about this complexity? In what way does this make the process of making my changes accessible to others easier?

Answer: it doesn't.

Regular people are usually perfectly happy with their other VCS solutions. The ones who want them to see the light and start using Git for all its benefits must think about what makes Git useful and explain that.

It doesn't matter that Git is important for large distributed projects. Most people aren't a part of large distributed projects.

No tutorial that I can remember does this. Not a single one.

> Why is there a staging area in the first place?

Because you don't always want to commit every change you've made or every new file you've added.

> If a commit doesn't push my changes to a remote repository, why do I care?

If I have my code in a working state, I'd like to save that "version" somewhere, so I can make a whole bunch of changes without worrying about whether or not I can get back to a working state. This is true even if no one else in the world has to read my code. This is the entire rationale of version control in the first place! But if I can do it just on my own individual changes, that means I can go back to a closer savepoint and not have to play the whole level over again if I screw up ;)
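A rough sketch of the savepoint idea, assuming a throwaway repository and a made-up file `app.txt`: commit the working state, break something, then throw the breakage away.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "working" > app.txt
git add app.txt
git commit -q -m "Known good state"          # this is the savepoint

echo "broken experiment" > app.txt           # make a mess
git checkout -- app.txt                      # discard the uncommitted change

cat app.txt                                  # prints "working"
```

Nothing here touches a remote; the savepoint lives entirely on your own machine.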

> Well, if a commit allows me to have a local history of changes that I may not want to have in the remote repository, why do I need to stage my changes? Why don't I just commit them and be done with that? Aren't the files themselves a staging area? Isn't this all redundant?

No, because you still don't always want to commit every change you've made or every new file you've added!

Maybe you want to commit while you have Vim open but you don't want to add a bunch of garbage .swp files to your repo. Maybe you did two or three different things that aren't related, so you want them to show up as two or three different commits in the history.
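Something like the following is what that looks like in practice. It's only a sketch: the file names are invented, and the `.swp` file is an empty stand-in for a real editor swap file.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "parser fix" > parser.txt               # first unrelated change
echo "docs tweak" > docs.txt                 # second unrelated change
touch .parser.txt.swp                        # stand-in for an editor swap file

git add parser.txt                           # stage and commit the first change
git commit -q -m "Fix parser"
git add docs.txt                             # then the second, separately
git commit -q -m "Tweak docs"

git log --format=%s                          # two commits, newest first
git status --short                           # prints "?? .parser.txt.swp"
```

The swap file never gets near the history because it was never staged.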

From the perspective of a Subversion or Perforce user, it's not something you really think about because it's not even an option that you have. You effectively don't even have version control on your own machine. Your company has version control, but you don't. And in my personal experience as a Subversion or Perforce user, I frequently feel lost at sea in that environment because it's virtually impossible for me to reliably do things like:

1. Get back to a working state newer than the one I checked out of the repository after making lots of changes everywhere.

2. Make completely unrelated code changes at the same time without mixing the changes together. Maybe one change is blocking on a code review. Maybe I found an unrelated bug and want to fix it separately from whatever other changes I'm making. Maybe I'm second-guessing a certain feature addition and want to put it on ice while I do other things. Maybe someone reported a bug and I want to fix it separately from what I happen to be working on at the time. Whatever the reason, it can live in its own branch and I can come back to it later. I don't have to create duplicate workspaces in my file system, Git just manages it for me.

3. Turn a large, complicated code change into a series of smaller changes, each with its own diff and description which I can review more easily.
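As a sketch of point 2, here is one way to put a change on ice in its own branch and get back to the working state. The branch name `experiment` and the file `app.txt` are made up, and the default branch name is looked up rather than assumed, since it varies between setups.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Working state"
base="$(git symbolic-ref --short HEAD)"      # name of the default branch

git checkout -q -b experiment                # new branch for the risky idea
echo "risky idea" >> app.txt
git commit -q -a -m "Try risky idea"         # -a stages tracked, modified files

git checkout -q "$base"                      # back to the working state
cat app.txt                                  # prints "v1"
```

The risky change is still there on `experiment` whenever you want to come back to it, and no second workspace was needed.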

These are things I do every day. I would do them if I shared my code with a small team or with the entire world. I would do them if I shared my code with no one at all. Git isn't just for large distributed projects. It's for decoupling version control from version sharing or version verification.

In practice, the purpose of something like Subversion or Perforce isn't to help you as a programmer, it's to help the canonical owner of the code you're working on do certain things, like roll back to past versions or set policies that your code has to pass code review or something before you can "check it in". Git handles all that too--some of it more effectively--and it has the added feature that even you, the programmer, can get the benefits of version control.

> No tutorial that I can remember does this. Not a single one.

The purpose of a tutorial is to help someone learn how to use Git. If you're not interested in learning how to use Git, why are you reading tutorials? It would be much easier to just start a flame war on Hacker News and wait for someone knowledgeable to respond. I guess you figured that out yourself.

What disturbs me about that workflow is that your commits have literally never been tested because the staging area contents aren't accessible as a working copy. I for one am adamant about not littering my history with all my crap that didn't run, so I much prefer to commit the mixed work and then rewrite history to tease out and regress the independent changes. I pretty much always test and commit my workspace as-is, treating the staging area as an unfortunately visible implementation detail. It'd be easier if I could stash some but not all of my changes to get them out of my workspace temporarily, but this hasn't bugged me enough to figure out how to implement that.

> It'd be easier if I could stash some but not all of my changes to get them out of my workspace temporarily, but this hasn't bugged me enough to figure out how to implement that.

Here, let me help you, from the examples section of `git help stash`:

"Testing partial commits You can use git stash save --keep-index when you want to make two or more commits out of the changes in the work tree, and you want to test each change before committing:

               # ... hack hack hack ...
               $ git add --patch foo            # add just first part to the index
               $ git stash save --keep-index    # save all other changes to the stash
               $ edit/build/test first part
               $ git commit -m 'First part'     # commit fully tested change"
A nice thing about Git's Swiss army knife nature: someone else has likely run into most problems you encounter, and has added the solution to Git porcelain.

Thanks for the very genteel RTFM. I think you've pointed that out before, I just mischaracterized the process as quarantining the changes I don't want yet rather than rescuing the changes I do and forgot the mechanics.
> It'd be easier if I could stash some but not all of my changes to get them out of my workspace temporarily

It's literally called "git stash".

Aside from that, you can squash commits together after-the-fact. If it makes it any easier, you can squash a whole branch together all at once before pushing it out to other people. No one need be the wiser. The intermediate commits can just be temporary savepoints for your personal convenience.
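One way to do the squash-a-whole-branch version is `git merge --squash`; interactive rebase is another, and this is only a sketch of the first. The branch name `feature`, the file names, and the commit messages are invented.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
base="$(git symbolic-ref --short HEAD)"      # default branch name varies by setup

git checkout -q -b feature
for step in one two three; do                # three temporary savepoints
  echo "$step" >> feature.txt
  git add feature.txt
  git commit -q -m "WIP: $step"
done

git checkout -q "$base"
git merge -q --squash feature                # stage the combined changes, no commit yet
git commit -q -m "Add feature"               # one tidy commit for everyone else

git rev-list --count HEAD                    # prints 2 (Base + Add feature)
```

The three WIP commits still exist on `feature` until you delete it, but the branch you would push only shows the single squashed commit.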

The point isn't about me reading tutorials to learn Git.

The point is about convincing people to use Git by pointing them to tutorials and have them come back with blank stares and "why do I need this exactly?" questions.

I use Git for just my own projects, and it's just me. I barely use branches either, and I come from SVN.

I use TortoiseSVN; I'm not sure if you do. When I want to commit in SVN, I bring up the commit tool, select the files I want to commit, write a message in the box, and click commit. But I have to do it all in one go. If I forget something, I can't just close the commit window: I'd have to copy out the message to paste back in and reselect all the files.

With Git, the staging area is the same as this commit box. It's just a bit more stretched out. Instead of selecting files and committing them on the spot, you add files to the staging area and then commit the staging area. It's a different way of doing it, but I've found it much better for myself. Instead of doing it all in one go, I can add to staging and keep working. Usually I keep open all the files that have changed so I remember which ones to add.

Regarding the remote repository, you can still have one. Instead of "commit" being the last action that sends your work to the remote repo, make sure "git push" is. I use Bitbucket for this, so I know it's "safe" in case of computer death or something.
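Roughly, that looks like the sketch below. A local bare repository stands in for the hosted one here so the example runs without a network; with a real host you would pass its URL to `git remote add` instead. The paths and file names are made up.

```shell
# Two throwaway directories (hypothetical): a working repo and a stand-in "remote"
work="$(mktemp -d)"
remote="$(mktemp -d)"
git init -q --bare "$remote"                 # bare repo playing the role of the host

cd "$work"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"              # local only so far

git remote add origin "$remote"              # a real setup would use the host's URL
git push -q origin HEAD                      # now the commit also lives in the remote
```

Commit as often as you like locally; nothing leaves your machine until the push.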

I have one more project using SVN and I want to move it to Git, just for me; I don't work in a team or anything like that. I like the staging area: it feels "lighter", commits feel much less drastic, and diffs/logs are MUCH faster since it's all on your computer. Also, `git add -p` lets you pick individual hunks to craft your own commits.

There's a good reason that developers love git: it helps to organise code for _themselves_. Centralised VCSs are there to help you share your code with _other_ people.

Here are some problems I encounter every day that git solves and SVN, for instance, doesn't.

1) I've been working on a feature since this morning but now there's a bug in production. I need to drop what I'm doing and come back to it when the bug is fixed.

2) I've implemented a feature but before I share it I want to do some refactoring to clean it up. If I break something during the refactoring I want to be able to get back to the version that was working.

3) I'm working on integrating an old web API (SOAP hell) and I want to make a note of the workaround which is spread across several files. My tests aren't passing yet though so I don't want to share any code yet.
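Scenario 1 can be handled with `git stash`, among other ways (a separate branch would work too). This is only a sketch: the file names and messages are invented, and the "production bug" is just a second file.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Release"

echo "half-finished feature" >> app.txt      # this morning's uncommitted work
git stash                                    # set it aside; working tree is clean again

echo "bugfix" > fix.txt                      # deal with the production bug
git add fix.txt
git commit -q -m "Fix production bug"

git stash pop                                # bring the feature work back
git status --short                           # prints " M app.txt"
```

The bug fix is committed on its own, and the half-finished feature is back in the working directory exactly as it was left.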

These are some examples of scenarios where git fits with _your_ workflow. This is where git really shines, not in the sharing of code (where it does also improve on SVN), but the ways in which it allows you to organise the pieces of code you're working on throughout your day.

Unfortunately you fell right into the trap he said everyone trying to explain Git does. The first paragraph is complete nonsense unless you already understand it.

You first call it a staging area and then you clarify with an "index," but you never clarify what an index is?! The second paragraph doesn't improve the situation much.

Ultimately I think there is a language breakdown here. People who explain Git seem to be unable to do so without using Git-language, and it is very difficult to understand the Git-language without understanding how it is used.

So it is a chicken/egg problem. You somehow need to know the language to understand Git, but to understand the language you need to understand Git.

PS - I know you're trying to help; it is just one of those things where I am not even sure it can be explained in that way.

I'm not saying it's an "index" to clarify what "staging area" means, I'm saying "index" is a synonym for "staging area" so if he runs into the word "index" somewhere, he knows it's just a different word for the same concept. For Christ's sake, there's nothing magic about the words, they're just names for something. If you didn't know what a dog was, and I said "a dog (aka a canine) is a domesticated animal related to a wolf", I'm not expecting the word "canine" to add anything to the explanation, I'm just throwing it in there in case you run into someone else who says "canine" instead of "dog", just so you know that in both cases they're referring to the same type of thing.

In any case, a staging area only makes sense in the context of trying to create a commit, and I think I did usefully define what a commit is. Maybe that part should have come first.

I suppose you have to know the general idea of a Version Control System (VCS) first before you can understand how Git does version control. I'll try my hand at it.

A Version Control System is a software system that saves different versions of your source code (really, it can save any file, but it works best with source code). The system also lets you revert to an old version of your code, so you can try something radical out on your code knowing that you have a safe version tucked away in your version control system.

A 'commit' is when you tell the VCS to take a "snapshot" of your files and then save that in its system. Git implements commits with a three-location system: the working tree (the files on your computer), the staging area or index (Git's record of what will go into the next commit), and the repository itself (where all the committed versions are saved). When you stage a file, you tell Git to copy that file's current contents into the index, marking them to be saved in the next version. After you have told Git what to save (by "staging" it), you can commit (with 'git commit') to save that snapshot of your staged files. Later on, you can return to the snapshot through Git commands.
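One detail worth checking for yourself: staging records the file's content at the moment of `git add`, not just its name. The sketch below edits a file after staging it and then commits; the file name and contents are made up.

```shell
# Throwaway repository (hypothetical; created in a temp directory)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "first draft" > story.txt
git add story.txt                            # the index now holds "first draft"
echo "second draft" > story.txt              # the working copy moves on, the index does not

git commit -q -m "Save first draft"          # commits what is in the index

git show HEAD:story.txt                      # prints "first draft"
cat story.txt                                # prints "second draft"
git status --short                           # prints " M story.txt"
```

So all three places can hold a different version of the same file at once, which is exactly what makes partial commits possible.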

Hope this helps!