Hacker News
by tzpardi 3652 days ago
I am sorry, but centralized network bootstrapping is a well-known and unsolved problem of decentralized networks.

http://ryandoyle.net/assets/papers/Distributed_Bootstrapping...

http://grothoff.org/christian/dasp2p.pdf

http://www.net.in.tum.de/fileadmin/TUM/NET/NET-2014-08-1/NET...

You say: "i.e. changing hardcoded bootstrapping data every N minutes, so new users could bootstrap from different nodes"

The fundamental issue is: changing the data where? A node that wants to connect to the peer network obviously cannot obtain the information from the peer network itself (as the node is not yet connected), so where does it obtain the information from? Currently all decentralized applications, including my system Streembit, obtain the information from a centralized source, which contradicts decentralization. If a web service or another centralized application provides the new node with the list of existing listening/connected nodes, then the solution is surely not decentralized. Government agencies or cyber-criminals only need to attack the hardcoded, listening seed nodes, and then the network is down and cannot come back until a new list of hardcoded seed nodes is published via the application source code or via other channels.

As I said above, Satoshi's original idea of obtaining the seed info from IRC was the closest to decentralization, but since IRC is itself centralized, I am sure you can see how far that is from decentralized bootstrapping.

We have solutions for local decentralized networks, such as mDNS and UDP multicasting, which work at the protocol level, and at Streembit we are investigating IPv6 anycasting to solve the problem on global networks.
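To make the local-network case concrete, here is a minimal sketch of UDP multicast discovery. The group address, port, magic string and message layout are all made up for illustration and are not Streembit's actual protocol.

```python
import json
import socket
import struct

# Hypothetical values chosen for this sketch; not taken from any real project.
GROUP = "239.255.42.99"   # administratively scoped IPv4 multicast address
PORT = 45454
MAGIC = "p2p-discover-v1"


def encode_announce(node_id: str, listen_port: int) -> bytes:
    """Build the datagram a listening node multicasts to announce itself."""
    return json.dumps({"magic": MAGIC, "id": node_id, "port": listen_port}).encode()


def decode_announce(data: bytes, sender_ip: str):
    """Return (node_id, ip, port) for a valid announcement, else None."""
    try:
        msg = json.loads(data.decode())
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    if not isinstance(msg, dict) or msg.get("magic") != MAGIC:
        return None
    port = msg.get("port")
    if type(port) is not int or not 0 < port < 65536:
        return None
    return (str(msg.get("id")), sender_ip, port)


def announce(node_id: str, listen_port: int) -> None:
    """Send one announcement to the local multicast group."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as s:
        # TTL 1 keeps the datagram on the local network segment.
        s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        s.sendto(encode_announce(node_id, listen_port), (GROUP, PORT))


def listen_once(timeout: float = 5.0):
    """Join the group and return the first valid announcement.

    Returns None if no datagram arrives within `timeout` seconds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("", PORT))
        mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
        s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        s.settimeout(timeout)
        try:
            while True:
                data, (ip, _) = s.recvfrom(1024)
                peer = decode_announce(data, ip)
                if peer is not None:
                    return peer
        except socket.timeout:
            return None
```

No seed list is involved here: a new node only needs to know the group address and port, which are part of the protocol rather than a list of machines somebody operates. The catch is that multicast traffic like this is normally not routed beyond the local network.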

<<< A bigger problem is how software is developed. The development itself is centralized, but even worse >>>

I disagree. Open source software can be forked and then you can adopt whatever democratic development methods and governance you want.

2 comments

> The fundamental issue is: changing the data where? A node that wants to connect to the peer network obviously cannot obtain the information from the peer network itself (as the node is not yet connected), so where does it obtain the information from?

Ok, you are assuming that the node already has the binary somehow. But that's not the case. In the real world we have to ship binaries to users. And this is where you can put your different bootstrapping data for different users.

You can go a lot farther: let users generate a binary distribution of the software to share with each other and hardcode bootstrapping data there obtained from a running network by that user.
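A rough sketch of what I mean, assuming the running node can list its currently connected peers as (host, port) pairs. The function name and the JSON layout are hypothetical; the output would be bundled with the distribution that the user shares.

```python
import json
import random
import time


def export_bootstrap(connected_peers, max_entries=8, rng=None):
    """Pick a random subset of currently connected peers for a new bundle.

    connected_peers: iterable of (host, port) tuples known to the running node.
    Returns a JSON string to ship alongside the binary (hypothetical format).
    """
    rng = rng or random.Random()
    peers = sorted(set(connected_peers))          # drop duplicates
    chosen = rng.sample(peers, min(max_entries, len(peers)))
    return json.dumps({
        "generated_at": int(time.time()),
        "seeds": [{"host": h, "port": p} for h, p in chosen],
    }, indent=2)


if __name__ == "__main__":
    # Example peers use documentation-only addresses.
    print(export_bootstrap([("192.0.2.10", 8905), ("198.51.100.7", 8905)]))
```

Every user who shares the software this way hands out a different entry point into the network, so there is no single list to attack.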

> Open source software can be forked

The majority of users are not going to do that; they just don't have the skills. At best there will be a few popular distributions of the same "decentralized" software, with the majority of installations controlled by a few entities. At worst, just one centralized entity will control every installation.

<<< Ok, you are assuming that the node already has the binary somehow. >>>

Well, I am not assuming anything. I am talking about a fundamental problem of decentralized networks: the current bootstrapping of networks is centralized. Please refer to the quoted papers; there are many other research papers as well that describe this existing problem.

<<< But that's not the case. In the real world we have to ship binaries to users. And this is where you can put your different bootstrapping data for different users. >>>

You are misunderstanding the problem. The point is

a) we should never hardcode the seed information into the application and consequently ship it with the binary. If we rely on the current approach, then you always connect to the seeds of the ETH foundation, the Bitcoin foundation and, in the case of Streembit, my company's seeds, which contradicts decentralization.

b) we should have a protocol level solution for entity discovery instead of an application level solution such as embedding the seed info in the source and then compiling it into the app. When I say protocol level solution I refer to mDNS and UDP multicasting, which work just fine on local networks, and I am proposing IPv6 anycast for entity discovery on global networks.

The simple truth is that Bitcoin, Ethereum and all cryptocurrencies conveniently ignore this issue. The companies, lead developers, foundations or whoever runs the show maintain the seed nodes, but such a solution is surely not decentralized.

> we should have a protocol level solution

No! This is a problem that solves itself once you solve the binary distribution problem. But none of the papers you refer to addresses the problem of centralized binary distribution; they jump right to the protocol level for some reason. It doesn't work like that.

I don't want to be condescending and I apologize if I sound like that, but I think you totally misunderstand what the problem is.

What difference does it make if you disseminate a fundamental design problem (i.e. the bootstrapping of the network is centralized) with a different type of binary distribution? Whichever method you use for binary distribution, it will deliver the very same existing problem. Again, please refer to the quoted papers to understand why centralized network bootstrapping is an issue.

On the note of binary distribution, yes that is an issue as well and it would be nice to have a decentralized binary distribution, but again, that is an entirely different problem. BTW, I think our application Streembit can be used for decentralized binary distribution as well.

I think it's the other way around. You don't want to see that bootstrapping depends on the binary to be present on the node somehow and for some reason you think that solving bootstrapping makes sense even if it depends on another completely unsolved problem. It doesn't. You solve the first problem and only after that move on to the one that depends on this problem being solved.
<<< you think that solving bootstrapping makes sense even if it depends on another completely unsolved problem >>>

No. The problem of centralized bootstrapping, which is an existing and real issue of decentralized networks (see the quoted papers and my explanation above), does not depend on the problem of "binary distribution". In fact it has nothing to do with "binary distribution". If a user builds the software from source, and so avoids any "binary distribution", the user still has the problem of centralized network bootstrapping.

It seems you don't understand that the problem of centralized bootstrapping is a generic information technology problem for all users, regardless of the method of "binary distribution", if any. That's fine, it was a good discussion, but I am exiting this debate, which is becoming meaningless now :-) Thanks for sharing your view :-)

Could you please explain why it doesn't work like that?
It's very common in distributed systems to push problems between different levels and ignore them, instead of addressing them.
Is a fallback to IP scanning a valid alternative to centralised bootstrapping? Let's say you have connected to 20 peers during a session; you could use that list as a starting point for your scanning, because if there is one IP providing the service, there might be more IPs close by that also provide it.
What you suggest with regard to the 20 previous IPs is called seed caching, and yes, that is what many, probably most, P2P networks do. It works most of the time on large networks, but it is obviously not the solution, because there is no guarantee that any of the previously active 20, 50 or 100 nodes is still listening at a later time. Also, such caching and scanning does not solve the fundamental problem of knowing the seed IPs at the very first connection, before the cache has been populated for the first time. The question is how we solve this problem without the hardcoded list of IPs that P2P software usually ships in its source code.
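Seed caching with a fallback can be sketched roughly as below. This is an illustration only: `try_connect` stands in for whatever handshake the real software performs, and the seed address is a documentation-only placeholder.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Peer = Tuple[str, int]

# Placeholder seed list (TEST-NET address); real software ships its own.
HARDCODED_SEEDS: List[Peer] = [("192.0.2.10", 8905)]


def bootstrap(cache: Iterable[Peer],
              try_connect: Callable[[Peer], bool],
              seeds: Iterable[Peer] = HARDCODED_SEEDS) -> Optional[Peer]:
    """Try cached peers first, then fall back to the shipped seed list.

    Returns the first peer that accepts a connection, or None if none does.
    """
    for peer in list(cache) + list(seeds):
        if try_connect(peer):
            return peer
    return None


def remember(cache: List[Peer], peer: Peer, limit: int = 20) -> List[Peer]:
    """Move a peer to the front of the cache, keeping at most `limit` entries."""
    return ([peer] + [p for p in cache if p != peer])[:limit]
```

On the very first run the cache is empty, so the loop goes straight to `HARDCODED_SEEDS`. That fallback is exactly the centralized part that caching does not remove.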

Other techniques and several research papers suggest that scanning a wide range of IPs, guessing which could be a connected node, is an option as well. In theory it certainly works: if you scan the whole internet then sooner or later you will find a connected node, and in theory it does solve the problem of centralized bootstrapping, but all these papers also agree that such scanning could be terribly inefficient.
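A back-of-the-envelope estimate shows the scale. If addresses are probed uniformly at random without repetition, the expected number of probes until the first listening node is (N + 1) / (K + 1) for N addresses and K listening nodes. The node counts below are made-up examples, and the sketch assumes IPv4 and a single known port.

```python
def expected_probes(address_space: int, listening_nodes: int) -> float:
    """Expected probes until the first hit, sampling without replacement."""
    if listening_nodes <= 0:
        raise ValueError("need at least one listening node")
    return (address_space + 1) / (listening_nodes + 1)


IPV4_ADDRESSES = 2 ** 32

if __name__ == "__main__":
    # Hypothetical network sizes, for illustration only.
    for nodes in (100, 10_000, 1_000_000):
        print(nodes, round(expected_probes(IPV4_ADDRESSES, nodes)))
```

A small network on IPv4 needs tens of millions of probes on average before the first hit, and with the IPv6 address space (2^128) the same formula gives numbers that rule out blind scanning altogether.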

As I mentioned, mDNS and UDP multicasting work fine on local networks, and we are experimenting with IPv6 anycast on global networks to solve this problem.