Wow - great to see more work in this area. And interesting to see that the Swiss Government is supporting it.
We’ve been working on a graph/document database with these kinds of collaborative revision-control features over at TerminusDB (https://github.com/terminusdb/terminusdb). I think it is the wave of the future. And interestingly we also came out of a European Commission-backed project.
Thanks a lot. I heard about TerminusDB in the past, and the ability to run queries is interesting. That's something Condensation doesn't provide, since it works on the client side and, at least for now, there are no tools to do it on the store.
I would be very happy to keep in touch, and you'll have to tell me the secret to getting these European funds.
I'm afraid the European monies are all spent! That was in a former life, when we were a university project called ALIGNED. But it got us off the ground at least.
Very happy to stay in touch! We have a Discord server if you are interested.
I'm not sure I've understood it correctly. It's a distributed database that syncs without conflict. So, cool for collaboration tools. What other things would you do with it?
Yes, it excels at synchronization. You could just keep a synchronized server as your backup, or let each user of an application have their own server that is synchronized with the others (e.g., for a smart lock system). That's pretty useful for privacy, or if you have connectivity problems, like in a mesh network with interruptions.
For the app itself, it's really about getting end-to-end encryption and being able to use the app while offline without losing data.
Interesting one. I think the difference is that you don't really control where your data is, it's like IPFS? I am surprised they managed to have queries; I will dig deeper into that, thanks.
It's just a db, so your data is where the db is running. You could have a central db storing your data and your local client could sync periodically, or you can have a p2p architecture if you want. I use Noms as an embedded db in my project, where syncing happens between p2p nodes.
How do data synchronization and conflict resolution work in CondensationDB? I expected to see something about OT or CRDT since it says it can be used for building collaborative applications such as Google Docs (something like Docs is not possible without OT or CRDT).
Yes exactly, it's based on CRDTs, and the strategy is to mark the entries with a timestamp to figure out which ones are the latest. An object may contain many entries, and when they are read by the client they are just compared one by one to find the union, or the latest version.
The beauty of it is that the algorithm decides how many entries to put in each object, to ensure that only the data that has changed is sent over the network and compared on the other client. That's why we call it Condensation.
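To make the "union, or the latest version" idea concrete, here is a minimal sketch of a timestamp-based (last-writer-wins) merge. It is not Condensation's actual code: the `Entries` layout, the `merge` function, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical layout (an assumption, not Condensation's format):
# key -> (timestamp, value)
Entries = Dict[str, Tuple[int, str]]


def merge(local: Entries, remote: Entries) -> Entries:
    """Return the union of both sides; on a shared key, the newest entry wins."""
    merged = dict(local)
    for key, (timestamp, value) in remote.items():
        # Comparing (timestamp, value) tuples breaks timestamp ties by value,
        # so both replicas pick the same winner whatever the merge order.
        if key not in merged or (timestamp, value) > merged[key]:
            merged[key] = (timestamp, value)
    return merged


alice = {"title": (1, "Draft"), "body": (5, "hello")}
bob = {"title": (3, "Final"), "owner": (2, "alice")}

# Merging in either direction gives the same state.
print(merge(alice, bob) == merge(bob, alice))
```

Because the merge is commutative and idempotent, two clients can exchange entries in any order, or more than once, and still end up with the same result.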
So, it's like a giant "grow only set" shared between as many nodes as you want and sending diffs on the network? I find it difficult to understand the possible applications* and the link with the article in the readme (https://www.inkandswitch.com/local-first.html).
*: I see that it could be used as a backbone for an end-to-end encrypted messaging system, but what would it change for me if I were reading a remote API, running a multi-instance web app or parsing CSVs to train ML models on them?
It's not really grow-only: the data is immutable, but it expires at some point according to the rules you want to implement.
The data is not shared in its entirety on the network; each user owns their data and chooses which server to store it on. It's really like the email system.
As for the article, it really matches its conclusion: you could build Google Docs, an IoT system, or anything else, and you would inherit powerful synchronization, encryption, and offline mode, so you can satisfy the 7 principles the author of the article laid out.
You could just use a cloud to store your data in bulk, and pass it through a local server to make sure it is not compromised.
Ah! I see where your inspiration is coming from! It's really interesting when you think of it that way!
I think there is a true need for a decentralized immutable data store (basically get/put/list operations), and it could serve a vast number of use cases. It enables simple algorithmic memoization, complete reproducibility of ML models, and dataset shareability. The only problem is we lack practical solutions. Maybe Condensation could help, or maybe not. In any case, it's good to see more alternatives to traditional datastores that are not "git for data".
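As a rough illustration of the get/put/list idea and why it makes memoization simple, here is a small in-memory sketch. It assumes content addressing by SHA-256 hash, which is my choice for the example and not something the thread specifies; `ImmutableStore` and `memoized` are hypothetical names, not part of Condensation or any other project mentioned here.

```python
import hashlib
from typing import Callable, Dict, List


class ImmutableStore:
    """Hypothetical in-memory store; an object's key is the hash of its bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Same bytes always map to the same key, so nothing is ever overwritten.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list(self) -> List[str]:
        return sorted(self._objects)


def memoized(store: ImmutableStore, cache: Dict[str, str],
             fn: Callable[[bytes], bytes], input_key: str) -> str:
    """Run fn on a stored object once; later calls reuse the stored result key."""
    if input_key not in cache:
        cache[input_key] = store.put(fn(store.get(input_key)))
    return cache[input_key]


store = ImmutableStore()
raw = store.put(b"a,b\n1,2\n")
cache: Dict[str, str] = {}
upper = memoized(store, cache, bytes.upper, raw)
print(store.get(upper))
```

Since a key pins down the exact bytes, recording the input key and the function is enough to reproduce or skip a computation, which is the property that helps with reproducible ML pipelines.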
Hey, can you send us an email? I can connect you to Thomas for the installation of the latest version. It will be a good exercise to guide you through.
Basically you have a document with all the references to objects, and if you remove a reference, the object will be deleted after a certain timeout (you can set it for your specific case).
Not yet planned, but it's definitely something we want to have. We onboard everyone who wants to port the code, so I hope someone will come with this idea soon.
Interesting. Would love to try this out, but I generally try to avoid Java for personal side projects. Is the plan to make a JavaScript client or a full JavaScript port?
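The reference-plus-timeout behaviour described above could be sketched roughly like this. It is only a toy model under my own assumptions: `ExpiringStore` and its methods are hypothetical names, and the clock is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List, Set


class ExpiringStore:
    """Toy model: objects survive while referenced, then expire after a timeout."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.objects: Dict[str, bytes] = {}
        self.references: Set[str] = set()  # stands in for the reference document
        self.unreferenced_since: Dict[str, float] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.references.add(key)
        self.unreferenced_since.pop(key, None)

    def unreference(self, key: str, now: float) -> None:
        self.references.discard(key)
        if key in self.objects:
            # Remember when the object first lost its reference.
            self.unreferenced_since.setdefault(key, now)

    def collect(self, now: float) -> List[str]:
        """Delete objects unreferenced for at least `timeout`; return their keys."""
        expired = [key for key, since in self.unreferenced_since.items()
                   if now - since >= self.timeout]
        for key in expired:
            del self.objects[key]
            del self.unreferenced_since[key]
        return expired


store = ExpiringStore(timeout=10)
store.put("a", b"x")
store.unreference("a", now=0)
print(store.collect(now=5), store.collect(now=10))
```

The grace period gives other clients time to finish reading an object before it disappears.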
Our approach to distribution is to use delta encoding and succinct data structures. We borrowed a fair few ideas from Git. You might be interested in reading our storage layer white paper: https://github.com/terminusdb/terminusdb/blob/master/docs/wh...
Conflict-free merges sound fantastic - not an easy road! Good luck.