by zmw 2842 days ago
I checked out Beaker Browser, and apparently it's based on the Dat project [1], which seems to be very similar to IPFS. It follows that, just like with IPFS, you can't throw random things onto the network and expect them to stick; you need to pay someone for hosting and bandwidth (that someone could be yourself) to have them pinned, and in order to have them available worldwide at all times you still need to pay for a CDN of sorts. The Linux box in your closet, or worse, your laptop that sometimes goes offline, just won't cut it. Eventually it's just another protocol to copy stuff around, where stuff originates from various servers (your browser basically embeds a server, capable of serving stuff), with the possible benefit that popular stuff may be p2p'ed (but if you're a business you probably can't rely on that anyway). I fail to see how it's radically different.

(Also, I'm not even sure how you could p2p private user data, unless you expect everyone to carry around one or more YubiKeys, or implant chips into fingers or something; plus all devices need to buy into that. But I haven't given that much thought.)

[1] https://datproject.org/


Some things in p2p hypermedia (dat) that aren't possible with http/s:

* You can generate domains freely using pubkeys and without coordinating with other devices, therefore enabling the browser to generate new sites at-will and to fork existing sites

* Integrity checks & signatures within the protocol, which enable multiple untrusted peers to 'host'. This also means the protocol scales horizontally to meet demand.

* Versioned URLs

* Protocol methods to read site listings and the revision history

* Offline writes which sync to the network asynchronously

* Standard Web APIs for reading, writing, and watching the files on Websites from the browser. This means the dat network can be used as the primary data store for apps. It's a networked data store, so you can build multi-user applications with dat and client-side JS alone.

I'm probably forgetting some. You do still need devices which provide uptime, but they can be abstracted into the background and effectively act as dumb/thin CDNs. And, if you don't want to do that, it is still possible to use your home device as the primary host, which isn't very easy with HTTP.
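
To make the "multiple untrusted peers" point concrete, here is a minimal sketch of why integrity checks let anyone host the data. It is a simplification and an assumption on my part: dat itself uses signed hash trees, while this just checks a plain SHA-256 digest that the reader already trusts. The `Peer` type and `fetch_verified` are hypothetical names, not part of any dat API.

```python
import hashlib
from typing import Callable, Iterable, Optional

# Hypothetical stand-in for a network peer: anything that returns bytes.
Peer = Callable[[], bytes]


def fetch_verified(expected_sha256: str, peers: Iterable[Peer]) -> Optional[bytes]:
    """Return the first peer response whose SHA-256 matches, else None."""
    for peer in peers:
        data = peer()
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data  # safe to use regardless of which peer served it
    return None


original = b"<h1>hello from my site</h1>"
digest = hashlib.sha256(original).hexdigest()  # obtained from a trusted source


def honest_peer() -> bytes:
    return original


def tampering_peer() -> bytes:
    return b"<h1>hello from an impostor</h1>"


# The tampered copy is rejected; the honest copy is accepted.
result = fetch_verified(digest, [tampering_peer, honest_peer])
```

Because the check depends only on the bytes, adding more peers adds capacity without adding trust, which is the horizontal-scaling claim in the list above.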

This is a very interesting topic, thanks for working on it and answering questions here.

The first concern I had/have is about security. If everybody runs their own server, isn't this a security nightmare waiting to happen?

I understand from the presentation that these websites won't run PHP or other server-side scripts, which at least takes some concern away.

Tara also showed how easy it was to copy a website. While pretty cool, that is also a nightmare scenario for most companies. If your competitors can clone your websites and pretend to be you, how do users know whose data they are looking at?

Not OP, but I believe when it comes to website copies, you can identify which one you are actually using by the url. So if someone makes a copy of dat://mylocalbank.com, their url would be just the hash (e.g. dat://c6740...)

Thanks for the list.

> You can generate domains freely using pubkeys and without coordinating with other devices, therefore enabling the browser to generate new sites at-will and to fork existing sites.

Not entirely sure what you mean:

- We can generate HTTP sites at will (all you need is an IP address);

- We have existing protocols for mirroring sites (not implemented universally, but nor is dat://);

- When you talk about pubkeys with coordination, there are obvious problems like the last paragraph of my original comment, right? Again, I'm probably misinterpreting what you're saying.

> Integrity checks & signatures within the protocol which enables multiple untrusted peers to 'host'.

Basically subresource integrity? Granted, with this protocol you can in theory retrieve objects from any peers (provided that they actually want to cache/pin your objects), not just the ones behind a revproxy/load balancer, so that's a potential win from decentralization.
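
For reference, this is roughly what the subresource integrity comparison amounts to. An SRI `integrity` value is the hash algorithm name, a dash, and the base64-encoded digest of the resource. The helper names `sri_sha384` and `matches` are made up for this sketch.

```python
import base64
import hashlib


def sri_sha384(body: bytes) -> str:
    """Build an SRI-style integrity string for the given bytes."""
    digest = hashlib.sha384(body).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def matches(body: bytes, integrity: str) -> bool:
    """True if the body hashes to the expected integrity value."""
    return sri_sha384(body) == integrity


script = b"console.log('hi');\n"
integrity = sri_sha384(script)  # what a page would pin in its markup

ok = matches(script, integrity)                       # untouched copy
tampered = matches(script + b"evil();", integrity)    # modified copy
```

The difference being discussed is where the check lives: with SRI the page author opts in per resource, while in dat (per the parent comment) it is part of the protocol for everything.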

> Versioned URLs

We can have that over HTTP, but usually it's not economical to host old stuff. In this case, someone still needs to pin the old stuff, no? I can see that client side snapshots could be more standardized, but we do have WARC with HTTP.

(EDIT: on second thought, it's much easier to implement on the "server"-side too.)
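
A rough sketch of why versioning is cheap on the "server" side when the store is append-only: every write creates a new version, and old versions stay readable for as long as someone keeps them. `VersionedSite` is a hypothetical class for illustration, not dat's actual data structure, and it copies full snapshots for simplicity where a real system would store only the changes.

```python
from typing import Dict, List, Optional


class VersionedSite:
    """Append-only history: each write produces a new version number."""

    def __init__(self) -> None:
        # Version 0 is the empty site.
        self._history: List[Dict[str, bytes]] = [{}]

    @property
    def version(self) -> int:
        return len(self._history) - 1

    def write(self, path: str, body: bytes) -> int:
        snapshot = dict(self._history[-1])  # copy the previous state
        snapshot[path] = body
        self._history.append(snapshot)
        return self.version

    def read(self, path: str, version: Optional[int] = None) -> bytes:
        # Default to the latest version when none is requested.
        index = self.version if version is None else version
        return self._history[index][path]


site = VersionedSite()
v1 = site.write("/index.html", b"first draft")
v2 = site.write("/index.html", b"second draft")

latest = site.read("/index.html")
old = site.read("/index.html", version=v1)
```

The pinning concern still applies: an old version is only retrievable if some peer still holds that part of the history.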

> Protocol methods to read site listings and the revision history

> Standard Web APIs for reading, writing, and watching the files on Websites from the browser.

You can build that on top of HTTP too.

My takeaway is it's simply a higher-level protocol than HTTP, so it's unfair to compare it to HTTP. Are there potential benefits from being decentralized? Yes. But most of what you listed comes from being designed as a higher-level protocol.

> We can generate HTTP sites at will (all you need is an IP address);

That's not really so easy from a consumer device with a dynamic IP.

> - When you talk about pubkeys with coordination, there are obvious problems like the last paragraph of my original comment, right? Again, I'm probably misinterpreting what you're saying.

You do need to manage keys and pair devices, yeah.

> My takeaway is it's simply a higher-level protocol than HTTP, so it's unfair to compare it to HTTP. Are there potential benefits from being decentralized? Yes. But most of what you listed comes from being designed as a higher-level protocol.

The broader concept of Beaker is to improve on the Web, and we do that by making it possible to author sites without having to set up or manage a server.

Decentralization is a second-order effect. Any apps that use dat for the user profile & data will be storing & publishing that data via the user's device. Those apps will also be able to move some/all of their business logic clientside, because they're just using Web APIs to read & write. Add to that the forkability of sites, and you can see why this can be decentralizing: it moves more of the Web app stack into the client-side where hopefully it'll be easier for users to control.

> Decentralization is a second-order effect.

I see, I was looking at it backwards.

It's not a higher level of HTTP, it's more like "let's use torrents instead of HTTP because they are distributed and scale better". But the web is more than HTTP: it's DNS and email and logins and all of that stuff. It all scales poorly, and it can all be improved with distribution. Let's not replace HTTP with torrents, let's replace it all with distributed stuff.

As an example, you talk about needing a special device to manage keys, which presents problems. It centralises your identity to your YubiKey (instead of email): lose your YubiKey and you lose your identity. What if it gets wet, crushed, or corrupted? You're fucked. Instead we encrypt the key and distribute it across the net; if a copy is deleted or corrupted there are other copies, and it's available to you anywhere, anytime. Currently your identity is centralised to your email, so if your email goes down you lose your identity. If it's distributed and a copy goes down, you just use it like normal.

Distribution solves pretty much all the problems centralisation creates, it's just really complicated so we generally don't bother.

> It's not a higher level of HTTP

Of course it's not a higher level of HTTP, I never said that. I said higher level than HTTP. HTTP is just a stateless transport protocol, so of course dat is higher level, and as I said, many of the benefits described can be built on top of HTTP (and have been, just not standardized or not widespread).

> it all scales poorly, it can all be improved with distribution

Pretty sure it all does NOT scale poorly, as has been proven over the past thirty years. What's being solved here is not a problem of scale. "It can all be improved with distribution" is very hand-wavy and doesn't really say anything. DNS and many other protocols are already distributed, btw.

> Instead we encrypt the key and distribute it across the net, ..., if it's distributed and a copy goes down you just use it like normal.

There are two kinds of crypto, symmetric key and public key. Symmetric key is out of the window right away. For public key crypto, you always need a secret key, and that has to be prior knowledge, not something negotiated on the fly. Prior knowledge has to be kept somewhere and presumably synced if you need it elsewhere, and it definitely can be lost. "Distributed secret keys solving everything" sounds like nonsense to me; there's always a secret key that is the starting point (call it the master key, if that makes more sense) and can't be distributed.

The fundamental difference is source independence. It doesn't matter where the data is: as long as someone has it pinned, you'll be able to access it.

That indeed is a fundamental difference. But on second thought I got confused. Source independence and content addressability are nice and all, but we don't build static websites that always have the same hashes; "Ubuntu Server 18.04.1 ISO" could be ipfs://<static_hash>, but even "latest Ubuntu Server 18.04.x ISO" couldn't be that. You still need to query the origin server (or client, whatever you call it), the central authority, to get those addresses. So, frequently changing websites/webapps don't benefit from this; they may even be penalized by the overhead. Only aggressively cacheable objects could benefit, but the vast majority of those probably won't be popular enough to be cached/pinned by peers anyway, so you still end up getting whatever you need from the origin server (or paid CDNs).
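
The distinction in the paragraph above can be sketched in a few lines: a content address is fixed by the bytes, so anything like "latest" needs a separate mutable pointer that some authority updates. The `blobs` and `names` dictionaries, the helper functions, and the placeholder ISO contents are all made up for illustration.

```python
import hashlib
from typing import Dict

blobs: Dict[str, bytes] = {}  # content-addressed store: hash -> bytes
names: Dict[str, str] = {}    # mutable pointers: name -> current hash


def put(body: bytes) -> str:
    """Store bytes under their own SHA-256 digest and return that address."""
    address = hashlib.sha256(body).hexdigest()
    blobs[address] = body
    return address


def publish(name: str, body: bytes) -> str:
    """Store new content and repoint the name at it."""
    address = put(body)
    names[name] = address  # in a real system only the name's owner may do this
    return address


def resolve(name: str) -> bytes:
    """Look up the pointer first, then fetch the content it names."""
    return blobs[names[name]]


first = publish("ubuntu-server-latest", b"placeholder bytes for 18.04.1")
second = publish("ubuntu-server-latest", b"placeholder bytes for 18.04.2")

current = resolve("ubuntu-server-latest")
```

Any peer can serve `blobs`, but `names` is the part that still needs an origin or some other authority, which is the objection being raised here.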

Btw, I skimmed through Beaker docs, and it seems they resolve names through DNS (what else can they do) and even use HTTP for discovery.

I'd say that most websites are static enough to be pinned. With the others, the main problem is content determinism. If the same website renders differently on different platforms, the renderings will have different hashes. The only reliable way to store them is in "unrendered" form, which prevents the inclusion of external resources, something that most single-page interactive websites rely on.

Naming is a consensus problem. The key here is having the freedom of choice between trusted providers. The central source could be provided by a single cryptographic key, by many keys, m-of-n schemes or other arbitrary contracts, even in P2P form.
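
The comment doesn't say which m-of-n scheme is meant. One classic construction is Shamir's secret sharing, sketched below as an illustration only: a secret is split into n shares so that any m of them recover it. The function names and the choice of prime are assumptions for this sketch, and it is not hardened for real key material.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this
Share = Tuple[int, int]  # (x, y) point on the polynomial


def split(secret: int, threshold: int, count: int) -> List[Share]:
    """Split secret into `count` shares; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= count:
        raise ValueError("need 1 <= threshold <= count")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, count + 1)]


def recover(points: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


key = 123456789
shares = split(key, threshold=3, count=5)
recovered = recover(shares[1:4])  # any three of the five shares
```

`pow(den, -1, PRIME)` needs Python 3.8 or later. Share holders still have to decide whom to hand shares to, so this moves the trust question rather than removing it.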

I'm really interested in what kind of user interfaces the Beaker people come up with when it comes to their "editable cloned websites" (forks).

Most websites; maybe.

Most popular websites; unlikely. Even HN isn't static enough to be pinned considering there is a new comment about once a minute or so.

What is pinned would be the content that doesn't change. It may mean the site architecture would need to be changed to accommodate one of these decentralized models.

Just want to point out that Beaker uses dat, not ipfs, so its sites are pubkey-addressed and therefore mutable.

You are right, a p2p web won't solve the barrier to entry, but web hosting costs $20 a year, so it's not much of a barrier.

The real cost is scale. $20 a year will cover a few thousand users, but if you want Google's scale it will cost you in bandwidth and complexity. p2p, like torrents, radically reduces the cost of bandwidth by distributing it, but more importantly it reduces complexity by standardising it.

Once the complexity is standardised, budget web hosting can provide Google scale for dirt cheap, and there are millions of budget hosting companies, too many to shut them all down, which gives you censorship resistance.