Hacker News
Ask HN: Is there an immutable and decentralized database out there?
5 points by ncodes 3693 days ago
Hi Guys,

I am in need of a database that is decentralized and would allow one or more privileged users to perform only INSERT and READ operations. This means data stored in it would be replicated across all nodes and be immutable. Immutable in the sense that the principal cannot delete or edit a record, even though the principal is the reason the database exists. Anyone can join the network, but only one or more privileged users can perform operations.

The reason I need this is that I am looking for a non-blockchain-based model for achieving data integrity and ensuring that a centralized entity cannot mutate data secretly.

Do you think this is feasible? Are there any existing open source solutions out there?

5 comments

I believe you're describing what is typically referred to as an "append only" datastore.

The biggest name in the game is Google BigQuery. From their docs:

"BigQuery tables are append-only. The query language does not currently support either updating or deleting data. In order to update or delete data, you must delete the table, then recreate the table with new data. Alternatively, you could write a query that modifies the data and specify a new results table."

There are other databases, like Datomic, which are less popular. Typical use cases are log storage, so search around for databases meant for logging, and I'm sure you'll find a lot more.

Thanks for your suggestion. The fact that data written to BigQuery can still be modified is not so great for my case. I want to be able to make data immutable forever. Nobody should be able to modify or delete it. Malicious alterations should not affect other nodes holding replicated data.
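
To make the "alterations should not affect other nodes" requirement concrete, here is a minimal sketch of how replicas could spot a copy that was changed behind their backs: each node hashes its whole log and the odd one out is flagged. The node names, the record layout, and the simple majority rule are all assumptions for illustration, not how any particular database does it.

```python
import hashlib
import json
from collections import Counter


def log_digest(records):
    """Return a SHA-256 digest over the whole ordered list of records."""
    h = hashlib.sha256()
    for record in records:
        # Canonical JSON so that equal records always hash identically.
        h.update(json.dumps(record, sort_keys=True).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def find_divergent_replicas(replicas):
    """Given {node_name: records}, return nodes whose digest differs from the majority."""
    digests = {name: log_digest(records) for name, records in replicas.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


# Hypothetical three-node network; node-c has silently edited record 2.
original = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
replicas = {
    "node-a": list(original),
    "node-b": list(original),
    "node-c": [{"id": 1, "value": "a"}, {"id": 2, "value": "tampered"}],
}
divergent = find_divergent_replicas(replicas)
```

This only detects a minority of altered copies. It says nothing about who is allowed to append in the first place, and it fails if most nodes collude.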
I remember watching a talk about Apache Samza [1] that tries to envision a database model that would be truly append-only. It is based on Apache Kafka, so it would satisfy your "distributed" and "immutable forever" requirements.

The talk was really interesting, but I haven't used Samza yet, so I am not sure how mature it is for use as the canonical data store.

As my college professor would put it, "I don't think these have found their Ullman yet," and if you look at the companies using it now [2], it seems to be mostly stream processing and data analytics.

Another thing is, I really don't know how Kafka handles checking data authenticity, because in my mind there is not that big of a difference between a malicious alteration and a malicious append.

If you then use something like a CRDT [3] or DDD-style aggregates [4] on top of your immutable data, your end users would still see their view of the data mutate.

What immutability would mostly give you is a log of all the changes and a simple way to restore the data. And most mutable databases give you that capability as well.

[1] http://www.confluent.io/blog/turning-the-database-inside-out...
[2] https://cwiki.apache.org/confluence/display/SAMZA/Powered+By
[3] https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...
[4] https://en.wikipedia.org/wiki/Domain-driven_design#Building_...
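
The point about views still mutating can be shown with a small sketch. It assumes a hypothetical `Event` record and a last-writer-wins fold, which is about the simplest merge rule in the CRDT family (timestamp ties are broken by log order here, which a real CRDT would not rely on). The log only ever grows, yet the value a user sees for a key changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable entry in the append-only log (hypothetical layout)."""
    timestamp: int
    key: str
    value: str


def current_view(events):
    """Fold the log into key -> value, where the latest timestamp wins."""
    latest = {}
    for e in events:
        seen = latest.get(e.key)
        if seen is None or e.timestamp >= seen.timestamp:
            latest[e.key] = e
    return {k: e.value for k, e in latest.items()}


# The log is a tuple: "appending" builds a longer tuple, old entries are untouched.
log = (Event(1, "email", "old@example.com"),)
view_before = current_view(log)

log = log + (Event(2, "email", "new@example.com"),)
view_after = current_view(log)
```

Both events are still in the log, so the history is recoverable, but `view_after` no longer shows the old address. A malicious append is therefore enough to change what readers see.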

BigQuery is great for reads and analytics. It's one of the best products I have used. But it may not suit high-frequency inserts (it is not a transactional DB). Also, the insert-only "limitation" may not exist in the future.
IPFS, with a chain of objects, each signed with one of a set of whitelisted keys. You'd need a bit of logic on each node to pull the latest objects (maybe using IPNS).
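
A rough sketch of the verification logic each node would need for such a chain, without touching IPFS itself. Big assumptions: a plain SHA-256 of the object body stands in for the IPFS content address, and HMAC-SHA256 stands in for a real public-key signature (with HMAC the verifier holds the same secret as the writer, which a real deployment would avoid by using something like Ed25519). The writer names and helper functions are made up for the example.

```python
import hashlib
import hmac
import json

# ASSUMPTION: demo secrets; HMAC is only a stand-in for public-key signatures.
WHITELIST = {"alice": b"alice-demo-secret", "bob": b"bob-demo-secret"}


def _body_bytes(prev, author, payload):
    """Canonical bytes that get hashed and authenticated."""
    body = {"prev": prev, "author": author, "payload": payload}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def make_object(prev, author, key, payload):
    """Build a chain object that links to the id of the previous object."""
    body = _body_bytes(prev, author, payload)
    return {
        "prev": prev,
        "author": author,
        "payload": payload,
        "id": hashlib.sha256(body).hexdigest(),  # stand-in for a content address
        "tag": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify_chain(objects, whitelist):
    """Accept only if every object links correctly and has a whitelisted author."""
    prev = None
    for obj in objects:
        key = whitelist.get(obj["author"])
        if key is None or obj["prev"] != prev:
            return False
        body = _body_bytes(obj["prev"], obj["author"], obj["payload"])
        expected = hmac.new(key, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, obj["tag"]):
            return False
        if obj["id"] != hashlib.sha256(body).hexdigest():
            return False
        prev = obj["id"]
    return True


first = make_object(None, "alice", WHITELIST["alice"], "record 1")
second = make_object(first["id"], "bob", WHITELIST["bob"], "record 2")
chain = [first, second]

# An append from a key that is not whitelisted, and an edit to an old object.
forged = make_object(second["id"], "mallory", b"mallory-secret", "record 3")
edited = [dict(first, payload="edited"), second]
```

Here `verify_chain(chain, WHITELIST)` accepts the chain, while the forged append and the edited first object are both rejected. Pulling the latest head (the IPNS part) is left out entirely.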
IPFS looks quite interesting. I will study it more to see whether it would play an important part in solving my problem. It appears there isn't an exact open source solution out there.
Any distributed database with support for user roles and access control will fulfill your requirements.
The distributed databases I have looked into have a user who grants roles/access to every other user. This means someone has to be trusted with full privileges. A solution where the access/role is baked into the software could be ideal.
You want the database to be its own admin?
Actually, I want no admin.

This might sound a little crazy, but I am trying to build a system where an arbitrary number of people (I don't even need to know them) maintain nodes that hold my data. The software managing my data should only allow me to append to it and read from it.

This kind of database would make it impossible for me to alter data once it is appended.
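
As a sketch of what "only append and read" means at the interface level, here is a hypothetical `AppendOnlyLog` class with no update or delete methods. This is only the shape of the API: in a single Python process the owner can still reach into the object, so the actual guarantee would have to come from independent nodes running the software.

```python
import copy


class AppendOnlyLog:
    """Exposes only append() and read(); there is deliberately no update or delete."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a private copy of the record and return its position."""
        self._records.append(copy.deepcopy(record))
        return len(self._records) - 1

    def read(self, index):
        """Return a copy, so callers cannot edit stored data through it."""
        return copy.deepcopy(self._records[index])

    def __len__(self):
        return len(self._records)


log = AppendOnlyLog()
position = log.append({"amount": 10})

fetched = log.read(position)
fetched["amount"] = 999  # changes only the caller's copy, not the stored record
```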

Can you say more about why you're avoiding the blockchain?
I am not exactly avoiding the blockchain. There isn't any production-ready database that I am aware of that provides immutability like the blockchain and an advanced querying system like those of NoSQL database systems.

I looked at BlockchainDB, which solves my immutability requirement, but I don't think it's production-ready yet, and if I host one myself, I can mutate the data by destroying my BlockchainDB machine.

> and ensuring that a centralized entity cannot mutate data secretly

Hello paranoia.

Justified paranoia :)