by trog 571 days ago
Is a Merkle tree needed, or is good old basic double-entry accounting in a central database sufficient? If a distributed ledger is not a key requirement, then it seems like a waste of time.

A Merkle tree is there to prevent tampering, not bad accounting practices.
It only prevents tampering if the cost of generating hashes is extremely high.

Internally in your company you're not going to spend millions of dollars a year on GPU compute just to replace a database.

"Prevents tampering" lacks specificity. Git is a blockchain that prevents tampering in some respects, but you can still force push if you have that privilege. What is important is to understand what the guarantees are.
? If I use something like BLAKE3 (which is super fast and emits gobs of good bits) and encode a node with, say, 512 bits of the hash, are you claiming that I am somehow vulnerable to tampering because the hash function is fast? What is the probable number of attempts needed to forge a document D' that hashes to the very same hash? And if the document is structured per a standard format, you have even fewer degrees of freedom in forging a fake. So yes, a Merkle tree definitely can provide very strong guarantees against tampering.
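To make the point above concrete, here is a minimal sketch of tamper detection with a Merkle root. It assumes SHA-256 from the Python standard library as a stand-in for BLAKE3 (which is not in the stdlib); the ledger entries, the `merkle_root` helper, and the tree shape are made up for illustration. For an ideal n-bit hash, forging a D' with the same hash as a given D (a second preimage) takes on the order of 2^n attempts, however fast each hash evaluation is.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256, used here only as a stdlib stand-in for BLAKE3."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash leaves pairwise up to a single root (illustrative tree shape)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Prefix leaves with 0x00 and interior nodes with 0x01 so that a leaf
    # can never be confused with an interior node (idea taken from RFC 6962).
    level = [h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(b"\x01" + level[i] + level[i + 1]))
            else:
                # Odd node out: promote it unchanged to the next level.
                nxt.append(level[i])
        level = nxt
    return level[0]


# Hypothetical ledger entries.
ledger = [
    b"debit:cash:100",
    b"credit:revenue:100",
    b"debit:rent:40",
    b"credit:cash:40",
]
root = merkle_root(ledger)

# Change a single entry and recompute.
tampered = list(ledger)
tampered[2] = b"debit:rent:4"
tampered_root = merkle_root(tampered)

print(root.hex())
print(tampered_root.hex())
print("tampering detected:", root != tampered_root)
```

Anyone holding the original root notices the change. What the root cannot do by itself is stop someone with write access from publishing a new root, which is the "what are the guarantees" question raised upthread.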
Fwiw, increasing the BLAKE3 output size beyond 256 bits doesn't add security, because the internal "chaining values" are still 256 bits regardless of the final output length. But 256 bits of security should be enough for any practical purpose.
Good to know. But does that also mean that, e.g., splitting the full output into n 256-bit chunks would mean there is correlation between the chunks? (I always assumed one could grab any number of bits, from anywhere, in a cryptographic hash.)
You can take as many bytes from the output stream as you want, and they should all be indistinguishable from random to someone who can't guess the input. (Similar to how each of the bytes of a SHA-256 hash should appear independently random. I don't think that's a formal design goal in the SHA-2 spec, but in practice we'd be very surprised and worried if that property didn't hold.) But for example in the catastrophic case where someone found a collision in the default 256-bit BLAKE3 output, they would probably be able to construct colliding outputs of unlimited length with little additional effort.
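A small sketch of the "output stream" idea, under one loud assumption: it uses SHAKE-256 from Python's `hashlib` as a stand-in extendable-output function, because BLAKE3 is not in the standard library. SHAKE-256 is a different construction from BLAKE3, so this only illustrates the shared behaviour that a shorter output is a prefix of a longer one.

```python
import hashlib

msg = b"example input"

# Ask the same XOF for 32 bytes and for 96 bytes of output.
short = hashlib.shake_256(msg).digest(32)
stream = hashlib.shake_256(msg).digest(96)

# The longer output simply continues the shorter one.
print("prefix property holds:", stream[:32] == short)

# Split the stream into 256-bit chunks, as asked about above.
chunks = [stream[i:i + 32] for i in range(0, len(stream), 32)]
for chunk in chunks:
    print(chunk.hex())
```

Each chunk looks unrelated to the others, but they are all derived from the same internal state, which is why requesting more output does not raise the security level.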
Certificate transparency logs achieve tamper-resistance without expensive hashes.
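For context, the mechanism such logs rely on is a Merkle inclusion proof: a verifier holding only the root can check that one entry is in the log from a handful of sibling hashes. Below is a minimal sketch, not the exact RFC 6962 algorithm; it assumes SHA-256, borrows the RFC's 0x00/0x01 leaf/node prefixes, and uses a simplified tree shape. All function names and log entries are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(leaf: bytes) -> bytes:
    return h(b"\x00" + leaf)


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaf hashes first, root last."""
    levels = [[leaf_hash(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = []
        for i in range(0, len(cur), 2):
            if i + 1 < len(cur):
                nxt.append(node_hash(cur[i], cur[i + 1]))
            else:
                nxt.append(cur[i])  # odd node promoted unchanged
        levels.append(nxt)
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # a promoted node has no sibling here
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling hashes."""
    acc = leaf_hash(leaf)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


entries = [b"cert-a", b"cert-b", b"cert-c", b"cert-d", b"cert-e"]
levels = build_levels(entries)
root = levels[-1][0]

proof = inclusion_proof(levels, 2)
print("cert-c included:", verify(b"cert-c", proof, root))
print("forged entry accepted:", verify(b"cert-x", proof, root))
```

Nothing here depends on hashing being slow. The proof is a few hashes long and cheap to check; the guarantee comes from collision and preimage resistance.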
Write-once, read-many (WORM) drives also prevent tampering. Not everything needs crypto.
In a distributed setting where someone like me may wish to join the party late and receive a non-forged copy, it’s important. The crypto is there to stand in for an authority.
> In a distributed setting where someone like me may wish to join the party late and receive a non-forged copy, it’s important. The crypto is there to stand in for an authority.

Yeh, but that's kinda my point: if your primary use case is not "needs to be distributed" then there's almost never a benefit, because there is always a trusted authority and the benefits of centralisation outweigh (massively, IMO) any benefit you get from a blockchain approach.

100% agreed there. A central authority can just sign stuff. Merkle trees can still be very valuable for integrity and synchronization management, but burning a bunch of energy to bogo-search nonces is silly if the writer (or federated writers) can be cryptographic authorities.
We launched Fireproof earlier this month on HN. It’s a tamperproof Merkle CRDT in TypeScript, with an object storage backend for portability.

See our Show HN: https://news.ycombinator.com/item?id=42184362

We’ve seen interest from trading groups for edge collaboration, so multi-user apps can run on-site without cloud latency.

What disrespectful marketing. We don’t care that you use Merkle trees because that’s irrelevant. I guess I can add Fireproof to my big list of sketchy products to avoid. It’s embarrassing.
I figured the responses would be more interesting. Questions about CRDT guarantees etc.

Perhaps worth seeding the convo with a remark about finality.

While your intentions may have been around discussion, I don’t want to be marketed to when I’m trying to understand something unrelated. I have a business degree so I intimately understand that HN is technically free and it’s nice to get free eyeballs, but we are people too. I’m so much more than a credit card number, yet you’ve reduced me to a user acquisition in the most insulting way possible.

Perhaps instead of your ideas, it’s worth seeding your own personal makeup with a firm statement of ethics??

Are you the kind of person who will hijack conversations to promote your product? Or do you have integrity?

Just purely out of concern for your business, do you have a cofounder who could handle marketing for you? If so, consider letting her have complete control over that function. It’s genuinely sad to see a founder squander goodwill on shitty marketing.

In founder mode, I pretty much only think about these data structures. So I am (admittedly) not that sensitive to how it comes across.

Spam would be raising the topic on unrelated posts. This is a context where I can find people who get it. The biggest single thing we need now is critical feedback on the tech from folks who understand the area. You’re right, I probably should have raised the questions about mergeability and finality without referencing other discussions.

Because I don’t want to spam, I didn’t link externally, just to conversation on HN. As a reader I often follow links like this because I’m here to learn about new projects and where the people who make them think they’ll be useful.

ps I emailed the address in your profile, I have a feeling you are right about something here and I want to explore.

> Spam would be raising the topic on unrelated posts.

I think you need to reread the conversation, because you did post your marketing comment while ignoring the context, making your comment unrelated.

If you want it distilled down from my perspective, it went something like this:

> Trog: Doubts about the necessity of Merkle trees. Looking for a conversation about the pros and cons of Merkle trees and double-entry accounting.

> You: Look at our product. Incidentally it uses Merkle trees, but I am not going to mention anything about their use. No mention of pros and cons of Merkle trees. No mention of double-entry accounting.

This doesn't address the question in any way except to note that you also use Merkle trees. Do you reply to any comment mentioning TypeScript with a link to your Show HN post as well?
Sorry, but your post came off as blatant advertising. There is no need to link to your company announcement just because it benefits you.
Thanks y'all -- feedback taken. If I were saying it again I'd say something like:

Merkle proofs are rad b/c they build causal consistency into the protocol. But there are lots of ways to find agreement about the latest operation in distributed systems. I've built an engine using deterministic merge -- if anyone wants to help with lowest common ancestor algorithms it's all Apache/MIT.
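For readers unfamiliar with the lowest-common-ancestor step in a merge, here is a rough sketch over a small history DAG. It is not Fireproof's algorithm; the graph layout (a dict mapping each node to its parents), the node names, and both helper functions are hypothetical, and the approach is the naive one rather than an optimised one.

```python
def ancestors(dag: dict[str, list[str]], node: str) -> set[str]:
    """All nodes reachable by following parent links, including node itself."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(dag.get(current, []))
    return seen


def lowest_common_ancestors(dag: dict[str, list[str]], a: str, b: str) -> set[str]:
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(dag, a) & ancestors(dag, b)
    return {
        c
        for c in common
        if not any(c != d and c in ancestors(dag, d) for d in common)
    }


# Hypothetical history: two peers diverge after "x".
history = {
    "genesis": [],
    "x": ["genesis"],
    "a": ["x"],
    "b": ["x"],
}

print(lowest_common_ancestors(history, "a", "b"))  # {'x'}
```

In a Merkle DAG the node names would be content hashes, so two peers that compute the same ancestor set are also agreeing on the exact history beneath it. Criss-cross histories can yield more than one lowest common ancestor, which is why the function returns a set.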

While deterministic merge with an immutable storage medium is compelling, it doesn't solve the finality problem -- when is an offline peer too out-of-date to reconcile? This mirrors the transaction problem -- we all need to agree. This brings the question I'm curious about to the forefront: can a Merkle CRDT use a Calvin/Raft-like agreement protocol to provide strong finality guarantees and the ability to commit snapshots globally?

Apologies for the noise.