by karaterobot 1758 days ago
I read most of this long article, and I found it useful, but:

It's unsurprising that people's mental model of git is incorrect. Git is not something people study at a conceptual level, it's something they learn recipes for in order to work on some project. Recipes like "how do I save all this work I just did" and "oh shit, everything is hosed, please give me a magic spell I can paste into my terminal to fix it".

I don't really blame people, since git itself does nothing to teach you how it works. Git is the definition of something you have to deal with in order to do something more important to you. Some people want to dig deep and understand how the system works: it's nice to sit near that person and ask them for help sometimes.

Saying "you should really understand more about git" is like saying "you should really study the tax code, it's important and it affects you whether you like it or not." True, but deeply irrelevant!


I think it's the other way around. The fact that git does not provide a clean analogous way to intuitively interact with it just demonstrates that the git interface is horribly broken.

This is not essential complexity, it's just bad design that stuck.

Take a look at https://gitless.com/

If you just look at a summary of the commands, you will have an accurate mental model of what's going on:

    gl init - create an empty repo or create one from an existing remote repo
    gl status - show status of the repo
    gl track - start tracking changes to files
    gl untrack - stop tracking changes to files
    gl diff - show changes to files
    gl commit - record changes in the local repo
    gl checkout - checkout committed versions of files
    gl history - show commit history
    gl branch - list, create, edit or delete branches
    gl switch - switch branches
    gl tag - list, create, or delete tags
    gl merge - merge the divergent changes of one branch onto another
    gl fuse - fuse the divergent changes of one branch onto another
    gl resolve - mark files with conflicts as resolved
    gl publish - publish commits upstream
    gl remote - list, create, edit or delete remotes

To me this clearly demonstrates that the problem isn't that people aren't learning git, it's that git is bad to learn. Stash + Index + Working Tree isn't the right abstraction to present to people. Just say there is a working tree, and tracked and untracked files and snapshots. Done. Branches aren't particular commits but particular working trees on top of particular commits.

Working on a feature and want to look at the main branch, but not ready to commit the changes yet? Well just switch to the main branch, then switch back and pick up where you started. No need to know about an additional data structure called the stash.
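To make the "branches are working trees on top of commits" idea concrete, here is a toy sketch in Python. It is not how gitless is implemented; the class and method names (`BranchLocalWorkspace`, `edit`, `switch`, `pending`) are made up for illustration, and it only models the one behaviour described above: uncommitted changes stay with the branch they were made on.

```python
class BranchLocalWorkspace:
    """Toy model: every branch owns its own uncommitted working-tree changes."""

    def __init__(self):
        self.current = "main"
        # branch name -> {path: content} of changes not yet committed there
        self.uncommitted = {"main": {}}

    def edit(self, path, content):
        # Edits are recorded against the branch that is currently checked out.
        self.uncommitted[self.current][path] = content

    def switch(self, branch):
        # Switching never discards or moves changes; it only changes which
        # branch's working tree we are looking at.
        self.uncommitted.setdefault(branch, {})
        self.current = branch

    def pending(self):
        return dict(self.uncommitted[self.current])


ws = BranchLocalWorkspace()
ws.switch("feature")
ws.edit("app.py", "half-finished work")

ws.switch("main")        # look at main without committing or stashing
on_main = ws.pending()

ws.switch("feature")     # come back and pick up where we left off
on_feature = ws.pending()
```

In this model `on_main` is empty and `on_feature` still holds the half-finished edit, with no separate stash structure involved.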

Unfortunately this did not pick up enough steam. And because a lot of tools expose concepts from git's broken interface you have to learn the git interface anyway...

Having used `gitless` a while ago as my main interface I strongly disagree. Having a distinction between my working tree and things I'm actually considering to commit is a luxury you only really start to miss when it's gone. IMO gitless makes it way too easy to commit too much. Also its "feature" of keeping uncommitted changes local to the branch is just weird. If I want to make a branch-specific change, I create a commit. This has the big advantage that it actually forces the user to add a message saying what the change is about, so if something else comes up I know what was going on when coming back to it later. It's not like this has to be a formal commit message, after all the commit can be dropped again later. Otherwise you end up being surprised by old experiments when switching to branches you haven't used in a while. If I just switch branches then the most likely reason for that is that I want to move the changes.

Interesting. I haven't met many who have. Your two use cases basically never arose for me. If I switch back to a branch and there's random stuff there, then I can just revert easily. So it's an extra operation at a different time to get there. The other use case, switching branches temporarily, where it's one less command, is more important to me though. The crucial thing though is that both behaviours can be accessed easily but we are dealing with one less data store/stateful thing, because we don't need the stash.

As for the first point, fine grained control for what goes into a commit, that's definitely a power user feature, but an important one of course. Again there are ways to achieve this without introducing new state (the index), for example by allowing you to amend the last commit.

I wouldn't claim that gitless is a 100% complete git replacement for expert users. It just shows that git has way too much state exposed to users, and has confusing commands to make that state interact. Obviously we all learned git and use it successfully, so it's obviously not broken or anything, it's just worse than it could be (and the constant chorus of "it's so simple, just a DAG!" is a bit grating if you have to teach beginners regularly).

The gitless authors did do some research with users that backs up the claim that this is conceptually easier to use:

https://spderosso.github.io/oopsla16.pdf

> As for the first point, fine grained control for what goes into a commit, that's definitely a power user feature, but an important one of course.

It's not a power-user feature, and it shouldn't be considered one. It should be taught as a standard part of any workflow: before committing, look at the changes you're about to add, and use hunk-staging features (e.g. trivial using Magit) to stage and commit unrelated changes separately.

For example, did you clean up some comments and docstrings while you were adding a new feature? Commit those improvements separately, so that if you need to revert the feature commit later, the improvements won't also be reverted. It also makes reviewing much easier, as each commit or patch, having its own purpose, can easily be reviewed separately, and attention can be focused on parts that need changing.

> Again there are ways to achieve this without introducing new state (the index), for example by allowing you to amend the last commit.

Amending a commit does not serve the same purpose as staging files and hunks separately into the index.

It's my impression that few git users understand the value of the index, because few of them use porcelains that expose its power in simple ways. If I had only "git add -p" to use, I might not, either. But Magit is, well, like its name implies, like magic.

gitless has the --partial flag that allows you to commit parts of files interactively.

And your workflow of gradually building up an index of (parts of) files can be achieved by partial/amendable commits. You simply iteratively/interactively add files and partial files to your latest commit until you're done. Instead of building up the index and then committing it, you just build up the commit directly.

This also means you can interact with the "in progress commit" in the same way as with all other commits.

There is no need for having an index to realize what you want.

Another minor point: Your workflow _is_ a power user workflow in my world. Out of twenty people that have reason to use git, one has use for this workflow.

It seems we roughly agree that there is a lot of scope for improving git though. I looked at magit and it looks nice. It exposes all the moving parts in a user interface. I would prefer to just have fewer moving parts, but if they are there it's sensible to make them obvious (and it puts to rest the idea that all you need to understand is that the git data structure is a DAG...)

    gl merge - merge the divergent changes of one branch onto another
    gl fuse - fuse the divergent changes of one branch onto another
Good while it lasted though
Yup. That was exactly the point at which the commenter's promise of "just look at a summary of the commands, you will have an accurate mental model of what's going on" broke down for me.
In the same vein as my sibling comment (which I agree with, so I won't repeat it): I regularly commit just specific files. I actually configure every GUI I use that comes with git integration NOT to auto-add and similar nuisances. I use the command line, and in probably 90% of cases a git commit -a is what I do. Another 5% is git add on the entire directory tree I am in, and the other 5% is specifically picking what to commit. I'm all for UIs doing auto-add and the commit -a equivalent by default. But do not take that ability away from me!

The list you provide sounded great until it came to gl switch. Why is there one specific operation for a branch that is NOT done via gl branch?

I don't understand what fuse is supposed to do from this at all. No idea whatsoever. Merge I get and anyone who has worked with any other versioning tool does conceptually.

Rebase most people seem to have a problem with but the abstract concept really isn't that hard. Just like cherry pick isn't really hard but somehow people have trouble with it. Though conceptually it really isn't hard either.

What really helped me the most with git was the realization that it's just a tree of commits with a bunch of labels. Labels have different types so to speak, like branch or tag, remote branches being special in a way etc. And obviously various commands can interact with these labels. Like a fetch updates the remote labels and moves them around on my local copy.
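A minimal sketch of that mental model, in Python rather than real git internals. Commits link to their parents, and branches, tags and remote-tracking branches are just named pointers into that structure. Everything here (`Repo`, `fetch`, the `heads/` and `remotes/` naming, the commit ids `c1` and `c2`) is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    cid: str
    parents: tuple = ()
    message: str = ""


@dataclass
class Repo:
    commits: dict = field(default_factory=dict)  # commit id -> Commit
    labels: dict = field(default_factory=dict)   # label name -> commit id

    def add_commit(self, cid, parents=(), message=""):
        self.commits[cid] = Commit(cid, tuple(parents), message)
        return cid

    def history(self, label):
        """Follow first-parent links from the commit a label points at."""
        out, cur = [], self.labels[label]
        while cur is not None:
            out.append(cur)
            parents = self.commits[cur].parents
            cur = parents[0] if parents else None
        return out


def fetch(local, remote, remote_name="origin"):
    """Copy missing commits, then move only the remote-tracking labels."""
    for cid, commit in remote.commits.items():
        local.commits.setdefault(cid, commit)
    for name, cid in remote.labels.items():
        if name.startswith("heads/"):
            short = name[len("heads/"):]
            local.labels[f"remotes/{remote_name}/{short}"] = cid


remote = Repo()
remote.add_commit("c1", message="initial")
remote.add_commit("c2", ["c1"], "feature")
remote.labels["heads/main"] = "c2"

local = Repo()
local.add_commit("c1", message="initial")
local.labels["heads/main"] = "c1"   # a branch label
local.labels["tags/v1.0"] = "c1"    # a tag label

fetch(local, remote)
```

After the `fetch`, the local `heads/main` label still points at `c1`; only `remotes/origin/main` has moved to `c2`, which is the "moves the remote labels around on my local copy" behaviour described above.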

Did you actually look at the page the list is from? This isn't some sketch, it's a fully implemented way of working with git repos that supports everything you ask for. Committing just specific files is done in gitless via

    gl commit a.foo b.bar
committing all but some files is done with

    gl commit -e a.foo b.bar
gl commit -p allows you to interactively commit parts of files.

gl doesn't take any abilities away (it's just git under the hood after all), it just exposes the abilities in sane ways.

If you actually look at the homepage of gitless you will also immediately see what fuse does:

https://gitless.com/#gl-fuse

I believe that by reading that one, not very long page, most people (including non-programmers) can use gl correctly most of the time. This is not the case for git.

BTW, gl branch is for creating/deleting branches, gl switch is for switching your working tree from one branch to another. These are very different things, why should they be under the same command?

For git, the last paragraph is a necessary but in no way sufficient step towards using it proficiently. Gitless is actually much closer to realizing that vision.

Seriously, people need to go back and teach beginners git to realize how bad it is. We have internalized so much of the bad design decisions in git that we don't notice them anymore.

No, I specifically did not go to the gitless site, because apparently I was supposed to understand it just from the list given in the post. Which isn't true.

I understand what gl branch "is supposed to do" but I don't see why gl switch is its own command given the other reasoning presented for why gl "is better".

I would say it is different. Probably very workable. Completely intuitive and the only reasonable way to do version control? Definitely not.

To me it's very very natural that git checkout will check out any commit I give it. How I specify that commit is up to me. It could be the commit hash. It could be a text label. That text label might on a logical level be a branch. Or a tag. Why do I need to switch branches with a special switch command when checkout handles this perfectly well?

Everything is perfectly logical after you get used to it enough.

And no, my post did not say that you will understand the details of every command in the list from just looking at the list. It only said that the list demonstrates that a much more coherent, less stateful, simply better UI is possible. That you can not fully explain the difference of fuse and merge in one sentence summaries is not a counterexample to that.

I said you will have an accurate mental model of what's going on. From the summaries you can tell all state you interact with: Working Tree, Commits, Track/Untrack status. That's it. That mental model is perfectly sufficient to accurately predict what most anything will do. And crucially the explanations and mechanisms to achieve all the workflows people asked for in this comment thread can be achieved with these ingredients just fine.

It's what the "Git is easy! You just need to understand that it's a DAG!" crowd pretends git already is.

I hate the inconsistency that stash brings, but gitless is useless since it destroys the primary use case for stashing: I start making changes and then realise I'm working on the wrong branch. Git's solution to the problem is awful, but it's better than nothing.
> just demonstrates that the git interface is horribly broken

This is HN criticism #94238 on the terrible git CLI.

Okay, sure.

Would you kindly post your superior git CLI? Or at least the outline of it?

---

Snark aside, Git's popularity is not an accident. Bitbucket supported Mercurial too.

> Would you kindly post your superior git CLI? Or at least the outline of it?

You are literally replying to a comment that describes a possible better CLI for git...

I suspect it was added in an edit in response to this comment. Dad is downvoted!
Actually no. I didn't edit the original post... shrugs
git has quite an inconsistent cli, this is well covered in "master git". And yes, I said it without proposing a better one.
> "master git"

This is a great demonstration of git's inconsistencies.

I maintain git stacks up well against other similarly mature/complex software (Nginx, AWS, Java), but it's a wonderful read nonetheless.

(And holy hell what a hard thing to search for...can't find the link.)

https://stevelosh.com/blog/2013/04/git-koans/

The trick is to remember it's called "git koans"

Thanks.
personal opinion, if you're a software engineer that can't be bothered to learn git I'm not sure that I respect you as a professional
> I don't really blame people, since git itself does nothing to teach you how it works. Git is the definition of something you have to deal with in order to do something more important to you. Some people want to dig deep and understand how the system works: it's nice to sit near that person and ask them for help sometimes.

The official git handbook, freely available on the official git-scm site, is not terribly long, and explains the internals on a conceptual level quite well.

I think the problem is that most people learning git land on some WordPress site of someone trying to flog a condensed and uninsightful shortcut to getting started with git for ad clicks, which only involves a series of commands without explaining the effects of those commands. This, combined with people's expectation that an SCM should take no thought whatsoever, causes most people who use git on a day-to-day basis to not really understand it at all.

Git needs to be introduced as a powerful data structure, kind of like how SQL is not a DB. Imagine someone explaining SQL without ever referring to the DB tables, rows and fields... Only talking about git commits is like only talking about the result of a single query. You must understand the data structure to easily use the interface; otherwise the interface will be very confusing or you will be limited to "recipes"... After that you are just learning new variations on how to manipulate and navigate that structure (yes, the graph), and from this perspective people's complaints about the historical inconsistencies we have to put up with in git porcelain are moot.

I was going to write a blog post conveying my mental model of what git is (having had one too many conversations along the lines of "no, git is not a ledger of diffs").

So, I started reading through <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects> again to make sure I didn't have anything wrong.

But now there's no point in writing a blog post. Maybe I'll write one that just links to <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects>.

It even has nice diagrams, which I think are essential for this kind of thing.
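For anyone who wants to poke at it without reading the whole chapter, here is a rough Python sketch of how an object id is derived in the classic SHA-1 object format: the id is the SHA-1 of a small header (`type`, a space, the body length, a NUL byte) followed by the body. The function name `git_object_id` is mine, not git's, and this ignores repositories using the newer SHA-256 format.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object as "<type> <size>\\0<body>" with SHA-1."""
    header = f"{obj_type} {len(body)}".encode("ascii")
    return hashlib.sha1(header + b"\x00" + body).hexdigest()


empty_blob = git_object_id("blob", b"")
hello_blob = git_object_id("blob", b"hello\n")

print(empty_blob)
print(hello_blob)
```

The id printed for the empty blob is the familiar `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the same value `git hash-object` reports for an empty file. Identical content always hashes to the same id, which is why git is better thought of as a content-addressed store of snapshots than as a ledger of diffs.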

The tax code is a completely inscrutable mess but git's internal model is one of the most simple and elegant structures in modern computer science. It's just covered over with utterly stupid commands and terminology that obscures the beauty of the underlying architecture.

I used to despise git because it was so hard to learn. Then as an exercise I started writing my own code to read and write its underlying files and it finally dawned on me how simple the whole thing was.
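As a hedged illustration of how little is involved, here is a Python sketch that writes and reads a "loose" object the way the classic on-disk format stores it: a zlib-compressed `type size\0body` payload, filed under `objects/` by the first two hex digits of its SHA-1. The helper names are my own, the directory is a throwaway temp dir rather than a real repository, and packfiles (where most objects in a real repo end up) are not handled at all.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(git_dir: Path, obj_type: str, body: bytes) -> str:
    # Payload is "<type> <size>\0<body>"; the object id is its SHA-1.
    store = obj_type.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\x00" + body
    oid = hashlib.sha1(store).hexdigest()
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


def read_loose_object(git_dir: Path, oid: str):
    raw = zlib.decompress((git_dir / "objects" / oid[:2] / oid[2:]).read_bytes())
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"   # stand-in directory, not a real repo
    oid = write_loose_object(git_dir, "blob", b"what is up, doc?\n")
    obj_type, body = read_loose_object(git_dir, oid)
```

Trees and commits use the same envelope with a different type string and body, so the reader above works for them unchanged; only parsing the body differs.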

Git's a very unusual piece of software; it's mind-bogglingly useful, the basic data structures and algorithms are perfectly matched to its job, and it has a UI that's a train wreck.

In a literal sense, sure, git _the tool_ doesn't do much, though I think this is slowly improving as it evolves. For example, there is an experimental `git switch` command[1] under development to provide a simpler interface for changing branches. For me, the biggest leap in developing my own mental model was reading Scott Chacon's book Pro Git, and that is now available online for free on the official Git website[2].

[1]: http://git-scm.com/docs/git-switch

[2]: http://git-scm.com/book/en/v2

idk man, if you're a software engineer I think the onus is on you. There are plenty of great and free resources, like the pro git book. Every month there's a thread where a bunch of people come in and bemoan how git is complicated blah blah. Every month lots of people point out that git is much easier to use if you just bother to conceptually learn about its internals.

it's like coming into a forum for accountants where people bitch about having to learn tax code. please...

I use this when I want to teach someone Git: https://gist.github.com/nicowilliams/a6e5c9131767364ce2f4b39...
I find that a good introduction.

All operations on a repository involve adding commits and/or manipulating the name resolution table.

It may be simplified, but that statement alone, taken in context, is worth its weight in gold.
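Here is a small Python sketch of that statement, under the assumption that we ignore the working tree and index entirely. Every operation below either appends a commit to an append-only store or rewrites an entry in a name table, and nothing ever edits an existing commit. The class, method names and commit ids (`c1`, `c2`, ...) are invented for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Commit:
    parent: "str | None"
    snapshot: str


@dataclass
class Repo:
    commits: dict = field(default_factory=dict)  # commit id -> Commit (append-only)
    names: dict = field(default_factory=dict)    # the "name resolution table"
    _ids: count = field(default_factory=lambda: count(1))

    def _add(self, parent, snapshot):
        cid = f"c{next(self._ids)}"
        self.commits[cid] = Commit(parent, snapshot)
        return cid

    def commit(self, branch, snapshot):
        # Add a commit, then move the branch name to it.
        self.names[branch] = self._add(self.names.get(branch), snapshot)

    def branch(self, new, start):
        # Pure name-table change: no commit is created.
        self.names[new] = self.names[start]

    def amend(self, branch, snapshot):
        # "Amending" adds a sibling commit and re-points the name;
        # the old commit is left untouched.
        old = self.commits[self.names[branch]]
        self.names[branch] = self._add(old.parent, snapshot)

    def reset(self, branch, target):
        # Pure name-table change.
        self.names[branch] = target


repo = Repo()
repo.commit("main", "v1")        # adds c1
repo.commit("main", "v2")        # adds c2
repo.branch("topic", "main")     # topic -> c2
repo.amend("main", "v2-fixed")   # adds c3 (parent c1), main -> c3
repo.reset("topic", "c1")        # topic -> c1
```

At the end `c2` still exists with its original contents even though no name points at it, which is roughly why "lost" commits in git can usually be recovered.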

Thanks!

It's simplified, but really, not that much.

>Some people want to dig deep and understand how the system works

I'd say that's definitely the case but also a problem.

Sophisticated users mixed with people who just want to do a few simple things is a bad combination. I seem to remember that ClearCase had the same issues.