by derefr 1864 days ago
I’ve said it before, and I’ll say it again:

It’s because “blockchain” software is a thing that already exists, but its superclass, “distributed log of signed proofs” software, isn’t a thing that already exists (except in the form of the subclass, blockchain software.)

There are tons of use-cases for which a “distributed log of signed proofs” is the perfect fit. Have you got an architecture where nodes independently make “stuff” and want to publish it “somewhere” for other nodes to find — where the other nodes have some independent criterion they can apply to validate published “stuff” to decide whether to accept or ignore it for their own use? For example, have you got a sharded data warehouse, where each shard is publishing its own write-ahead log segments for replicas of that shard to use? Well, add a “distributed log of signed proofs”, and now your closed data warehouse becomes an open network where anyone can have their own “shard”, and anyone can choose to replicate from a “shard” they trust.

(Yes, DHTs and/or gossip networks are sort of like this — but neither provides durability or linearization. If you want a new node to be able to look up historical “stuff” from publisher-nodes that aren’t online any more, then you need durability; and if you want your “stuff” validation algorithm to — at least within a sliding window — reject duplicate “stuff” from the log, then you need linearization.)
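
To make the "sliding window" point concrete, here is a minimal sketch of duplicate rejection, assuming every node reads the log in the same linearized order. The class name `SlidingWindowDedup` and the idea of a per-item `item_id` are hypothetical, made up for illustration. Without an agreed order, two nodes running this same code could disagree about which copy was "first".

```python
from collections import deque

class SlidingWindowDedup:
    """Reject an item if its id was seen among the last `window` accepted items."""

    def __init__(self, window: int):
        self.window = window
        self.recent = deque()  # accepted ids, oldest first
        self.seen = set()      # same ids, for O(1) membership checks

    def accept(self, item_id: str) -> bool:
        if item_id in self.seen:
            return False       # duplicate within the window
        self.recent.append(item_id)
        self.seen.add(item_id)
        if len(self.recent) > self.window:
            # Slide the window: forget the oldest accepted id.
            self.seen.discard(self.recent.popleft())
        return True
```

Once an id falls out of the window it is accepted again, which is the "at least within a sliding window" caveat.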

It just so happens that all the software that exists, that offers the “distributed log of signed proofs” guarantee, also offers the additional guarantee of all the nodes doing the validation of proofs serially, reducing over each new proof against an accumulator of an existing global consensus state, to build a new global consensus state, where proofs can only be valid relative to a specific “base” global consensus state. We call such proofs-relative-to-a-state “blocks”, and we call the resulting system “a blockchain.”

Most of these use-cases don’t need that additional guarantee. It doesn’t get them anything, and it costs a lot (e.g. in the inability to concurrently validate proofs; in the requirement to keep a forever-growing durable representation of global consensus state around on disk; etc.)
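
A rough sketch of the difference, under made-up assumptions: proofs are `(publisher, delta)` pairs, and "global consensus state" is a dict of balances that may not go negative. None of these names come from real blockchain software. Standalone validation looks at one proof at a time, while the blockchain-style fold depends on everything that came before it.

```python
from functools import reduce

# Hypothetical proofs as (publisher, delta) pairs; the names are illustrative.
proofs = [("shard-a", 5), ("shard-a", -3), ("shard-b", 4)]

def valid_standalone(proof):
    """Log-of-proofs style: each proof is checked on its own, in any order."""
    _publisher, delta = proof
    return isinstance(delta, int)

def apply_to_state(state, proof):
    """Blockchain style: a proof is only valid relative to the prior state."""
    publisher, delta = proof
    balance = state.get(publisher, 0) + delta
    if balance < 0:
        raise ValueError(f"{publisher} would go negative")
    return {**state, publisher: balance}

def fold(ordered_proofs):
    """Serially reduce proofs over an accumulator; None if any step is invalid."""
    try:
        return reduce(apply_to_state, ordered_proofs, {})
    except ValueError:
        return None

accepted = [p for p in proofs if valid_standalone(p)]  # order-independent
in_order = fold(proofs)                                # valid in this order
swapped = fold([proofs[1], proofs[0], proofs[2]])      # same proofs, now invalid
```

The standalone checks could run concurrently and keep no state around. The fold has to run in order and carry the accumulator, which is the cost described above.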

But the people building these systems are generally practical engineers, who “buy” blockchain software (and just ignore the features they don’t need), rather than attempting to “build” their own “distributed log of signed proofs” software with no known antecedent.


I am coming to think that it's precisely the "distributed" bit that nobody really wants.

For example, distributed version control systems are often held up as an example of some form of precedent for blockchain-type technologies. But I've never actually seen a truly distributed (in the blockchain sense) deployment of Git. It's technically possible, but it just doesn't seem to happen.

Similarly, I'm not sure people actually want immutability. They want the ability to edit history; they just want it to not be an everyday thing. In a discussion about the relative merits of different distributed version control systems, someone invariably points out that the thing Git has that makes it more usable in practice than any of the others is that it allows you to rewrite history. In the repositories I manage, I even mandate it, in the form of requiring people to rebase before merging into the main branch so that we can linearize history. As a salty former Mercurial user, I used to do the opposite and ban the practice, until I realized that rebasing and squashing are more practical in the long run. I'm trying to run a software project, not an episode of Hoarders.

Distributed immutable ledger is astonishingly relevant in a variety of security and financial (and financial security) areas, but good luck explaining to the non-tech execs why they should fund projects using that rather than FOMO on “but blockchain!!!”

For purposes of talking w/ execs or boards, there are something like seven yes/no reasonably explainable properties of chains of blocks that toggled some ways give you alt coins and in one particular other way give you a fantastically high performance distributed ledger a trusted authority can keep an eye on — you can have your cake and eat it too if you aren’t being a coin.

Businesses often want that outcome, but they verbalize what they want as “bLoCkChAiN!!! to the moon!!!” and it’s tech’s job to say wait, what are you trying to do?

Quite probably, they actually could benefit from something like QLDB:

https://aws.amazon.com/qldb/

> Distributed immutable ledger is astonishingly relevant in a variety of security and financial (and financial security) areas

I'm not so sure about that. IANAL, but I suspect that, at least under US law, a distributed immutable ledger would actually be illegal in many cases. The entire legal environment is set up around the idea that there is a system of record, and that system of record has a single custodian, and that custodian is not just responsible for tending to it, but also someone to whom you can appeal (or sue) for remediation if something goes wrong.

The immutable bit is also often incompatible. There are laws and contracts out there laying out cases where data needs to be deleted - not reverted, not being flagged as no longer relevant, actually deleted - from the record. In the US, the Fair Credit Reporting Act is probably the most familiar example, but there are others.

Yes, we can say that it hasn't taken off because people just don't understand it because it's a complex technical topic. But we should also consider the possibility that the business environment into which we are trying to insinuate ourselves is a complex technical topic, too. And also watch out for Chesterton's Fence.

I think you're talking about different use-cases than the person you're replying to.

A distributed immutable ledger has many uses other than recording voluntary transactions, or even recording things to do with specific people.

For example, such a ledger can be used to create a tamper-proof security-camera footage log. Just hashes of exported video files, locked into a chain at time of export. You can redact the videos themselves (i.e. make all copies of the referenced video unavailable), but you can't change the hash, and so there's no party you can collude with to substitute one video for another. Even if you're a state actor. You either have the videos — which can be proven to be the right videos — or you don't; but you'll never be able to present the wrong videos.
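
A minimal sketch of the hash-chaining part, assuming SHA-256 and an in-memory list standing in for the ledger. The function names and the all-zero genesis value are illustrative choices. The multi-party replication that stops someone from quietly rewriting the whole chain is not modeled here; this only shows why a swapped video or an edited entry is detectable.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash: str, video_digest: str) -> str:
    """Bind a video digest to everything logged before it."""
    return hashlib.sha256((prev_hash + video_digest).encode()).hexdigest()

def append(log: list, video_bytes: bytes) -> None:
    """Record the hash of an exported video file at time of export."""
    video_digest = hashlib.sha256(video_bytes).hexdigest()
    prev = log[-1]["entry"] if log else GENESIS
    log.append({"video": video_digest, "entry": entry_hash(prev, video_digest)})

def verify_chain(log: list) -> bool:
    """Recompute every link; any edited entry breaks the chain from there on."""
    prev = GENESIS
    for e in log:
        if e["entry"] != entry_hash(prev, e["video"]):
            return False
        prev = e["entry"]
    return True

def matches(log: list, index: int, video_bytes: bytes) -> bool:
    """Check whether a presented video is the one that was logged."""
    return log[index]["video"] == hashlib.sha256(video_bytes).hexdigest()
```

Note the log holds only digests, so the videos themselves can still be redacted without touching it.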

Or, in the same vein, a chain-of-custody log for the contents of a safety deposit box at a bank. Any time someone opens the box, an entry is automatically appended to the log saying what authorization (e.g. access card) was used to open the box. Once again, the fact that the log is distributed on a wider multi-party-controlled system, makes it impossible (or at least impractical) for the bank itself to tamper with the logs to steal something from your box.

These are "finance" / "security" / "financial security" use-cases. But they're not PII. There's no point at which any of this data would ever legally require redaction or purging, because it doesn't relate to a specific client profile. It relates either to metadata of public-point-of-view sensory data capture; or it relates to employee actions against customer accounts, where the mapping back to an individual isn't given in the public log but rather exists in a private database.

> For example, such a ledger can be used to create a tamper-proof security-camera footage log.

What's the use-case here - who are the untrustworthy individuals that society needs to protect itself against? Societal trust is currently rooted in people - it will not be switched over to machines/distributed ledgers any time soon. This is why people can write affidavits/get sworn in to say "That video's legit" under pain of perjury. Frankly, there's little money in turning over trust to a blockchain when there is an individual/organization that can be interrogated. It's not perfect, but trying to perfect it has (evidently) diminishing returns

That example is not about positive proof (i.e. proving that real evidence is real); it's about negative proof (i.e. proving that fake evidence is fake.) It's about having a way to figure out that the expert defending the evidence has been bribed, and is saying what the prosecution wants them to say.

You can't charge someone with perjury if you can't prove the faked evidence is faked—which is why so few people get charged with perjury. Any threat of perjury with no discriminatory proof mechanism to back it up, is toothless, and experts will treat such threats with exactly the respect they deserve. (Look at the Japanese court system if you don't believe me.)

> For example, distributed version control systems are often held up as an example of some form of precedent for blockchain-type technologies. But I've never actually seen a truly distributed (in the blockchain sense) deployment of Git. It's technically possible, but it just doesn't seem to happen.

The truth is, git provides only a narrow set of capabilities we use for development. Git being decentralized is insufficient. That is why the world uses github (and to lesser degrees bitbucket & gitlab): they provide the collaborative experience, they provide regulation/control/flow across the distributed systems.

There is ongoing & active work to build many of these social protocols ("Pull requests", code comments, issues, &c) in a distributed fashion, under the ForgeFed[1] project. Popular git workspace applications such as Gitea are working towards implementations.

It's been long overdue, but we're filling in the gaps, to make a distributed git possible & interesting. Historically, one of the few & only successful models of distributed/decentralized development has been the Linux kernel itself, which has stuck to using patches sent by email to coordinate the distributed work. But that status quo will soon be changing, or at least, there will be other options, than github or email.

[1] https://forgefed.peers.community/

>"distributed" bit that nobody really wants

Distributed adds complexity, but it is pretty much essential for illegal activities. That is why it is needed for cryptocurrencies, which would otherwise be shut down for not KYCing everything, and for BitTorrent, which would be shut down for copyright infringement. Otherwise, centralized is usually easier.

Git is used almost exclusively in a centralized fashion. The use patterns really aren't all that different from SVN, even if having a local copy is kinda nice. But, like you said, having a local copy isn't really the same thing as decentralized. "master" is still a thing.

More importantly, the next big shift in version control technology is/will be back toward centralization.

You already see this with eg Google Docs. Far inferior to Word, but preferred by many because of the free and highly functional live-multi-collaborator-editing feature.

Someday we'll look back and wonder why merge hell lasted so long into the age of ubiquitous gigabit internet. Not that there won't sometimes be merges, but far rarer.

Git is very much used in a distributed fashion. Eg, on github.

Right now, I'm working on one of the forks of a dead commercial project.

Our fork in turn has 61 forks right now, some of which may diverge further and have multiple people working on them, which at some point may or may not contribute things upstream.

I think it is worth distinguishing systems which do not have (or need) a single source of truth (e.g. git) from systems which include a mechanism for an arbitrary number of nodes to arrive at a single source of truth (e.g. blockchain).

I agree that there are often many versions of a git repo at different places and the system gets much of its utility from that quality. But this strength actually comes from git eschewing the idea of a central truth. You can have one or more remote git repos with different sets of commits. You can freely integrate whatever changes you want. That flexibility allows the free movement and sharing of code, but it is key to that movement that the system does not force a single idea of truth.

The blockchain allows new and old nodes to participate in a process of agreeing on a central truth. This is actually very cool from a technical perspective, but I think it's pretty rare that we want it in a technical system. Most things, like git, benefit from the ability to branch when needed and use social organization to handle centralization (e.g., Linux development centralizes on the Linux kernel git because the kernel development community has agreed to use that particular git; there are no protocol requirements to do so, and various branches are independently created in various places to the benefit of all).

But not in the blockchain sense. All those forks aren't independently contributing to a single source of truth in a peer-to-peer manner.

In practice, those forks generally serve one of two purposes. Either they're for working separately on changes that you intend to submit upstream to the agreed-upon central repository, or you're intending to legitimately fork the project and create your own new central repository that's relatively independent of the original.

Despite the fact that, thanks to our industry's love for overloading technical jargon, we happen to use the word "distributed" to describe both use cases, they're really quite different in practice.

They kind of are. I mean, the hierarchical structure is mostly fictional. In git any fork is as valid as any other. This is very much unlike any SVN setup. Any of those forks could conceivably become the main one, if the former official repo died, or somebody just started developing their fork faster.

That's what I'm getting at, though. Even though Git does support a non-hierarchical organization structure, nobody actually uses it that way.

Almost as if not every tool in search of a problem eventually finds one.

I think mumblemumble hit the nail on the head -- this isn't really distributed in the same sense as, e.g., blockchain. There is no peer-to-peer distributed ledger, no consensus mechanism, etc.

In addition to what they said, I'll also point out something else that's perhaps even more important -- you just measured "number of forks" in terms of the number of forks on github (as opposed to eg the number of people who have git cloned the repo).

A peer-to-peer distributed ledger isn't needed for decentralization. It's just how Bitcoin does it.

Also, I used github and its number of forks because it's the number I can easily work with. I have no clue how many copies of our tree are floating out there, nor is there a way of finding out.

> A peer-to-peer distributed ledger isn't needed for decentralization.

You can chuck blockchain out of the conversation like that. But then we'd no longer be having a conversation about blockchain.

The very notion that you are a fork, and that you have forks, implies a hierarchical sense of authority. And I think that's the thing that confuses a lot of these debates: "Distributed" can mean "independent nodes", and it can mean "hierarchical authority".

Almost everybody is OK with the latter, and that's the only issue that blockchain meaningfully addresses.

Github is not an example of decentralization. Github is the center hub for the vast majority of git-managed projects. The fact that you can fork a repo is not the kind of distributed use we're talking about. Github is still the "default" host for those forked repos, and the identity provider for people working on most software projects. That's centralization.

Github is a completely optional thing though. If it were to die tomorrow, everyone still has the code locally, and code sharing is still very much possible.

That is also not what is meant by decentralization in the blockchain sense. A decentralized use of git would be a graph with no central hub: each developer syncing their repo with one or more other developers directly, not with a central "source-of-truth" repo. In reality if Github disappeared everyone would just find a new host to centralize on. Furthermore Github creates lock-in by hosting data that actually would not survive any migration attempts: issue tracking and pull request discussions.

You were the person who wrote "Eg, on github."

Bit of a nitpick but even in a team with a central origin, I think git is still mostly used in a decentralized way. Try to use Perforce on spotty internet vs git. There's a big difference.

Merge is a fundamental issue across all source control. It's not going away.

> You already see this with eg Google Docs. Far inferior to Word, but preferred by many because of the free and highly functional live-multi-collaborator-editing feature.

Only for those in (roughly) the same time zone, which appears to be less often the case today.

I've seen a lot of different VCS, all the way back to RCS. I don't want to go back to the awful scheme of versioned virtual filesystem like the one of ClearCase. You can pry git from my cold, dead fingers.