by sbazerque 1745 days ago
I agree with the author on the merits of the file abstraction, but I think the concept should be updated for networked devices. We need file formats that support both offline usage and seamless sync over the network.

For example, here I use a merkle DAG-based file format to represent CRDT-like types:

https://www.hyperhyperspace.org

The resulting abstraction can be universally looked up using a hash (or a short sequence of words), and can be modified offline and synchronized flawlessly. It's still WIP (for example, you still can't export it to an actual file, hehe).
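
A minimal sketch of the general idea, not Hyperhyperspace's actual format or API: each operation on a CRDT-like set is stored as a node whose identity is the hash of its content plus the hashes of its parents. `MerkleStore`, `content_hash` and the payload layout are hypothetical names chosen for illustration.

```python
import hashlib
import json


def content_hash(node: dict) -> str:
    """Hash a node via a canonical JSON encoding (sorted keys, no whitespace)."""
    encoded = json.dumps(node, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class MerkleStore:
    """Hypothetical content-addressed store holding a DAG of operations."""

    def __init__(self):
        self.nodes = {}  # hash -> node

    def put(self, payload: dict, parents=()) -> str:
        # A node names its parents by hash, so its own hash covers its history.
        node = {"payload": payload, "parents": sorted(parents)}
        node_hash = content_hash(node)
        self.nodes[node_hash] = node
        return node_hash

    def get(self, node_hash: str) -> dict:
        return self.nodes[node_hash]

    def merge(self, other: "MerkleStore") -> None:
        # Equal hashes mean equal content, so sync is a plain union of nodes.
        self.nodes.update(other.nodes)

    def heads(self) -> list:
        """Nodes that no other node references: the current frontier of the DAG."""
        referenced = {p for node in self.nodes.values() for p in node["parents"]}
        return sorted(h for h in self.nodes if h not in referenced)


# Two replicas share a root, diverge while offline, then sync both ways.
replica_a = MerkleStore()
root = replica_a.put({"op": "create", "type": "set"})
replica_b = MerkleStore()
replica_b.merge(replica_a)

add_x = replica_a.put({"op": "add", "value": "x"}, [root])
add_y = replica_b.put({"op": "add", "value": "y"}, [root])

replica_a.merge(replica_b)
replica_b.merge(replica_a)
```

After the two merges both replicas hold the same nodes, and the two concurrent additions show up as two heads that a set type can combine without conflict.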

1 comments

I am interested in the distributed computing/P2P space and study it.

Developers don't want to rewrite their data models to fit a strange API or an unfamiliar model. I don't want to pollute my domain objects with MutableReferences or extend HashedObject. So while I think the ideas in Hyperhyperspace are good, the programming model needs revisiting if you want people to actually use it. You need to use plain old objects and spider them for references.

ORMs don't require you to use MutableReferences; they let you use your actual data model as is.
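
"Spidering" plain objects could look roughly like the sketch below. It assumes the domain model is written as ordinary Python dataclasses; `Author`, `Post` and `spider` are hypothetical names, and this is not how Hyperhyperspace or any particular ORM does it.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class Author:
    name: str


@dataclass
class Post:
    title: str
    author: Author
    tags: list = field(default_factory=list)


def spider(obj, seen=None) -> list:
    """Return every dataclass instance reachable from obj, each exactly once."""
    if seen is None:
        seen = {}  # id(instance) -> instance; dicts keep discovery order
    _walk(obj, seen)
    return list(seen.values())


def _walk(obj, seen) -> None:
    if is_dataclass(obj) and not isinstance(obj, type):
        if id(obj) in seen:  # shared reference or cycle: already visited
            return
        seen[id(obj)] = obj
        for f in fields(obj):
            _walk(getattr(obj, f.name), seen)
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            _walk(item, seen)
    elif isinstance(obj, dict):
        for value in obj.values():
            _walk(value, seen)


alice = Author("alice")
first = Post("Hello", alice, ["intro"])
second = Post("Again", alice)
found = spider([first, second])
```

The domain classes carry no storage-specific base class or wrapper type; the storage layer discovers that both posts share one `Author` by walking the object graph.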

If we can get the backend storage to be P2P then any application outside the backend storage can be written traditionally.

I am personally interested in distributed SQL databases and have written a simple one that could, in theory, be used as a P2P application backend with distributed joins.

I think a combination of event sourcing and CRDTs could be used to provide arbitrary synchronisation and merging between peers, with bad actors handled through a web of trust.
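
One way this combination might look, as a sketch under stated assumptions: each peer keeps a set of immutable events, merging is a set union (so it is commutative and idempotent), and state is rebuilt by replaying the events in a deterministic order while skipping untrusted authors. The `Event` fields, the last-writer-wins rule and the flat `trusted` set are all illustrative choices, not taken from any project mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str   # assumed globally unique
    author: str
    timestamp: int  # assumed logical (Lamport-style) clock value
    key: str
    value: str


def merge_logs(*logs) -> frozenset:
    """Union of event sets: the order and number of merges do not matter."""
    merged = set()
    for log in logs:
        merged |= set(log)
    return frozenset(merged)


def replay(events, trusted) -> dict:
    """Rebuild state with last-writer-wins per key, ignoring untrusted authors."""
    state = {}
    ordered = sorted(events, key=lambda e: (e.timestamp, e.author, e.event_id))
    for event in ordered:
        if event.author in trusted:
            state[event.key] = event.value
    return state


peer_a = {
    Event("a1", "alice", 1, "title", "Draft"),
    Event("a2", "alice", 3, "title", "Final"),
}
peer_b = {
    Event("b1", "bob", 2, "title", "Bob's edit"),
    Event("m1", "mallory", 9, "title", "spam"),
}

merged = merge_logs(peer_a, peer_b)
state = replay(merged, trusted={"alice", "bob"})
```

Because the untrusted event is filtered at replay time rather than deleted, a peer that later changes its trust settings can recompute a different view from the same log.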

I have started a discussion of P2P storage backends on Infinity family, which is a community of inventors (I can provide invites). I have mentioned Hyperhyperspace there.

https://0oo.li/intent/74001/distributed-data-storage#1630701...

Thank you for your feedback!

There are other projects that take an approach similar to what you're describing (Automerge and its derivatives, mainly).

I guess there is a distinction between distributed databases and permission-less p2p applications. Having a data center where you can trust the compute nodes running your distributed database is very different than having random untrusted peers over the net. But maybe data validation can be done in a less intrusive way, something like what Automerge does for JSON merging (maybe in a typed setting?).

That said, Hyperhyperspace is still a work in progress. Once we have a proper library of container types, programming will be a bit more intuitive. Or so I hope, at least.

I think you're laying some very useful foundations, and I look forward to experimenting with it or with ideas based on it. I also look forward to seeing what you decide to build and will be following your project.

I see you're going for a zero-trust model where you have baked identities into the data structure, which is very cool.

We have a discussion about this problem on Infinity.

https://0oo.li/method/60001/community-managed-software-secur...

One idea is to strictly review all distributed apps and have them signed by a central party.

If you want to run arbitrary, untrusted source code, you need an interpreter like Ethereum/Bitcoin or browser JavaScript.

I like the approach where you can exclude nodes or identities you don't like, which is the federated approach Mastodon and Matrix take. But it doesn't really scale if actors can mass-produce identities to spam the network.

I liked what Freenet's FMS (Freenet messaging system) did, which is a web of trust. Your visibility of content is based on a transitive trust setting on an identity. So if you knew a good person and trusted them, you could still see good content.
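
A rough sketch of transitive trust, to make the idea concrete. This is not FMS's actual algorithm: the 0-to-1 edge weights, the rule that trust multiplies along a path, the hop limit and the visibility threshold are all assumptions made for illustration.

```python
def transitive_trust(graph: dict, me: str, max_depth: int = 3) -> dict:
    """Best trust score per identity, using paths of at most max_depth hops.

    graph maps an identity to {peer: direct trust in [0, 1]}.
    A path's score is the product of its edge weights; the best path wins.
    """
    best = {me: 1.0}
    frontier = {me: 1.0}
    for _ in range(max_depth):
        next_frontier = {}
        for identity, score in frontier.items():
            for peer, weight in graph.get(identity, {}).items():
                candidate = score * weight
                if candidate > best.get(peer, 0.0):
                    best[peer] = candidate
                    next_frontier[peer] = candidate
        frontier = next_frontier
    return best


def is_visible(trust: dict, author: str, threshold: float = 0.5) -> bool:
    """Content is shown only if its author's derived trust meets the threshold."""
    return trust.get(author, 0.0) >= threshold


graph = {
    "me": {"alice": 0.9},
    "alice": {"bob": 0.8},
    "bob": {"carol": 0.5},
    # Mass-produced identities can vouch for each other, but nobody I trust
    # vouches for them, so they never become reachable from "me".
    "mallory": {"sock1": 1.0, "sock2": 1.0},
    "sock1": {"mallory": 1.0},
}

trust = transitive_trust(graph, "me")
```

In this sketch spam identities get no trust unless someone on a trusted path vouches for them, which is the property that plain blocklists lack when identities are cheap to create.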

Reddit solves the problem by having a 'new' section, which is where any user's content goes, but it's not the default view. HN does the same. Good content gets manually audited.

I am really excited by P2P software and Braid. I think the problem is that running information systems over a P2P network is not solved. The Automerge, Yjs and Hyperhyperspace projects each provide parts of a data storage layer. They either need to hitch a ride on the browser, like Hyperhyperspace with WebRTC, or use something dedicated such as a DHT like Kademlia.

The day I can host an information system's backend and frontend in a distributed way, and others can help host it with me, will be success to me.
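
For context on the DHT option: Kademlia assigns nodes and keys IDs from the same space and measures closeness with XOR, so a key is stored on the nodes whose IDs are nearest to it. The sketch below shows only that distance metric, with no networking or routing tables; deriving IDs from names with SHA-1 and the helper names are assumptions for illustration.

```python
import hashlib


def dht_id(name: str) -> int:
    """Map a peer name or content key to a 160-bit integer ID."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_peers(key: str, peers: list, k: int = 2) -> list:
    """The k peers whose IDs are nearest to the key's ID."""
    target = dht_id(key)
    return sorted(peers, key=lambda p: xor_distance(dht_id(p), target))[:k]


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
holders = closest_peers("some-document", peers, k=2)
```

Any peer can compute the same `holders` list for a key locally, which is what lets a lookup converge without a central index.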

So we have problems with distributed storage and distributed compute.

You might get replies on Infinity to your comment. If you join the following Telegram groups, you will get notifications.

https://t.me/en0oo and https://t.me/oo0oio

This invite link doubles as an explanation of the site. https://0oo.li/accounts/signup/?invite=0x3fFca3853B7eb67A5a8...

And we are planning to make the site peer to peer at some point when the ontology is concrete.