by throwawaygh 1864 days ago
Git is used almost exclusively in a centralized fashion. The use patterns really aren't all that different from SVN, even if having a local copy is kinda nice. But, like you said, having a local copy isn't really the same thing as decentralized. "master" is still a thing.

More importantly, the next big shift in version control technology is/will be back toward centralization.

You already see this with eg Google Docs. Far inferior to Word, but preferred by many because of the free and highly functional live-multi-collaborator-editing feature.

Someday we'll look back and wonder why merge hell lasted so long into the age of ubiquitous gigabit internet. Not that there won't sometimes be merges, but they'll be far rarer.


Git is very much used in a distributed fashion. Eg, on github.

Right now, I'm working on one of the forks of a dead commercial project.

Our fork in turn has 61 forks right now, some of which may diverge further and have multiple people working on them, and which at some point may or may not contribute things upstream.

I think it is worth thinking about systems which do not have (or need) a single source of truth (e.g. git) vs. systems which include a mechanism for an arbitrary number of nodes arriving at a single source of truth (e.g. blockchain).

I agree that there are often many versions of a git repo at different places and the system gets much of its utility from that quality. But this strength actually comes from git eschewing the idea of a central truth. You can have one or more remote git repos with different sets of commits. You can freely integrate whatever changes you want. That flexibility allows the free movement and sharing of code, but it is key to that movement that the system does not force a single idea of truth.
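To make that concrete, here's a rough toy sketch in Python. It is not how git stores anything internally; the `Repo` class, the repo names, and the commit ids are all made up for illustration. It only shows the idea that two repos can hold different sets of commits and that either side can pull from the other whenever it likes.

```python
# Toy model only (hypothetical names, not git's real object format):
# a repo is a dict of commit id -> tuple of parent ids, plus branch tips.
class Repo:
    def __init__(self, name):
        self.name = name
        self.commits = {}   # commit id -> tuple of parent ids
        self.branches = {}  # branch name -> commit id of the tip

    def commit(self, branch, commit_id):
        """Add a commit on top of the current tip of `branch`."""
        parent = self.branches.get(branch)
        self.commits[commit_id] = (parent,) if parent else ()
        self.branches[branch] = commit_id

    def fetch(self, other, branch):
        """Copy every commit reachable from other's branch tip that we lack."""
        stack = [other.branches[branch]]
        copied = []
        while stack:
            cid = stack.pop()
            if cid in self.commits:
                continue  # already have this commit and its ancestors
            self.commits[cid] = other.commits[cid]
            copied.append(cid)
            stack.extend(other.commits[cid])
        return sorted(copied)


upstream = Repo("upstream")
upstream.commit("main", "a1")
upstream.commit("main", "a2")

fork = Repo("fork")
fork.fetch(upstream, "main")
fork.branches["main"] = upstream.branches["main"]
fork.commit("main", "f1")      # the fork diverges

upstream.commit("main", "a3")  # upstream moves on independently

# Neither side is "the truth": each holds commits the other lacks.
only_in_fork = set(fork.commits) - set(upstream.commits)
only_in_upstream = set(upstream.commits) - set(fork.commits)
```

Nothing in the model marks `upstream` as special. Calling `upstream.fetch(fork, "main")` works exactly the same way as the fetch in the other direction; which repo counts as "the" repo is a social decision, not something the data structure enforces.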

The blockchain allows new and old nodes to participate in a process of agreeing on a central truth. This is actually very cool from a technical perspective, but I think it's pretty rare that we want it in a technical system. Most things, like git, benefit from the ability to branch when needed and use social organization to handle centralization (e.g. Linux development centralizes on the Linux kernel git because the kernel development community has agreed to use that particular git; there are no protocol requirements to do so, and various branches are independently created in various places to the benefit of all).

But not in the blockchain sense. All those forks aren't independently contributing to a single source of truth in a peer-to-peer manner.

In practice, those forks generally serve one of two purposes. Either they're for working separately on changes that you intend to submit upstream to the agreed-upon central repository, or you're intending to legitimately fork the project and create your own new central repository that's relatively independent of the original.

Despite the fact that, thanks to our industry's love for overloading technical jargon, we happen to use the word "distributed" to describe both use cases, they're really quite different in practice.

They kind of are. I mean, the hierarchical structure is mostly fictional. In git any fork is as valid as any other. This is very much unlike any SVN setup. Any of those forks could conceivably become the main one, if the former official repo died, or somebody just started developing their fork faster.
That's what I'm getting at, though. Even though Git does support a non-hierarchical organization structure, nobody actually uses it that way.

Almost as if not every tool in search of a problem eventually finds one.

But they do. Not 100% of the time, certainly, but it very much happens.

Eg, under SVN, people would check out our project, do some work and then either need commit rights or create a patch.

Under git, people have full power to clone the entire thing, explore the entire history, rewrite whatever they want to, easily collaborate with other people with the same interests, and then maybe submit it all upstream.

So for instance right now in Vircadia we're having a big project of redoing the scripting engine. This work could conceivably take months, and thanks to git it doesn't have to happen in the main repo. It can happen between the people interested in that part of the code, where even several people can collaborate on a gigantic PR that would hopefully get merged in the end. SVN doesn't allow for that kind of workflow.

Yes, the high level view is still centralized, but the ability to break away from the centralization to do something big is very helpful and important. Even if it's not the dominant way of working.

> and then maybe submit it all upstream

Right. That's typically the ultimate goal, because...

> the high level view is still centralized

Period. Just because you don't interact with the parent node in the organizational structure for a while doesn't mean it temporarily ceases to exist.

I think mumblemumble hit the nail on the head -- this isn't really distributed in the same sense as, e.g., blockchain. There is no peer-to-peer distributed ledger, no consensus mechanism, etc.

In addition to what they said, I'll also point out something else that's perhaps even more important -- you just measured "number of forks" in terms of the number of forks on github (as opposed to eg the number of people who have git cloned the repo).

A peer-to-peer distributed ledger isn't needed for decentralization. It's just how Bitcoin does it.

Also, I used github and its number of forks because it's the number I can easily work with. I have no clue how many copies of our tree are floating out there, nor is there a way of finding out.

> A peer-to-peer distributed ledger isn't needed for decentralization.

You can chuck blockchain out of the conversation like that. But then we'd no longer be having a conversation about blockchain.

But we aren't. We're talking about git. We're having a digression from the main topic because git is kinda tangentially related.
The very notion that you are a fork, and that you have forks, implies a hierarchical sense of authority. And I think that's the thing that confuses a lot of these debates: "Distributed" can mean "independent nodes", and it can mean "hierarchical authority".

Almost everybody is OK with the latter, and that's the only issue that blockchain meaningfully addresses.

Github is not an example of decentralization. Github is the center hub for the vast majority of git-managed projects. The fact that you can fork a repo is not the kind of distributed use we're talking about. Github is still the "default" host for those forked repos, and the identity provider for people working on most software projects. That's centralization.
Github is a completely optional thing though. If it were to die tomorrow, everyone still has the code locally, and code sharing is still very much possible.
That is also not what is meant by decentralization in the blockchain sense. A decentralized use of git would be a graph with no central hub: each developer syncing their repo with one or more other developers directly, not with a central "source-of-truth" repo. In reality if Github disappeared everyone would just find a new host to centralize on. Furthermore Github creates lock-in by hosting data that actually would not survive any migration attempts: issue tracking and pull request discussions.
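For what it's worth, the hub-vs-no-hub distinction is easy to state as a graph check. A minimal sketch in Python, assuming you could somehow list who syncs with whom (the `find_hub` function and the names `origin`, `alice`, `bob`, `carol` are hypothetical, purely for illustration):

```python
def find_hub(sync_pairs):
    """Return the one node that appears in every sync pair, or None.

    sync_pairs: iterable of (a, b) tuples meaning a and b sync directly.
    In a star topology, a single node takes part in every sync.
    """
    pairs = [frozenset(p) for p in sync_pairs]
    if not pairs:
        return None
    candidates = set(pairs[0])
    for p in pairs[1:]:
        candidates &= p
    # With only one pair, both ends qualify, so there is no unique hub.
    return next(iter(candidates)) if len(candidates) == 1 else None


# Everyone pushes and pulls through one shared repo: centralized.
star = [("origin", "alice"), ("origin", "bob"), ("origin", "carol")]

# Developers sync with each other directly: no central hub.
mesh = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]

hub_of_star = find_hub(star)
hub_of_mesh = find_hub(mesh)
```

Here `find_hub(star)` comes back with `"origin"` and `find_hub(mesh)` comes back with `None`. The tool permits both shapes; the claim above is just that nearly every real project looks like the first one.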
You were the person who wrote "Eg, on github."
Yes, it just didn't come out quite right.

The main reason to mention github at all is that it gives me stats. There's probably a bunch more repositories out there just from people doing git clone, but I can't count them.

Also, while github helps, it's not a critical part of the whole thing. If it disappeared it wouldn't be a critical problem. Everybody would still have the source, and could figure out a way to reconnect.

Bit of a nitpick but even in a team with a central origin, I think git is still mostly used in a decentralized way. Try to use Perforce on spotty internet vs git. There's a big difference.

Merge is a fundamental issue across all source control. It's not going away.

> You already see this with eg Google Docs. Far inferior to Word, but preferred by many because of the free and highly functional live-multi-collaborator-editing feature.

Only for those in (roughly) the same time zone, which appears to be less often the case today.

I've seen a lot of different VCSes, all the way back to RCS. I don't want to go back to the awful scheme of a versioned virtual filesystem like ClearCase's. You can pry git from my cold, dead fingers.