Hacker News new | ask | show | jobs
by rhuber 1630 days ago
(coauthor of Nebula)

We briefly considered building something atop Wireguard in the early days of Nebula, but decided not to do so because of scaling. Wireguard's protocol necessitates that all nodes have existing keypairs for each other ahead of time. At Slack's scale, that means every time a fresh node is launched, you would have to tell 50,000 other nodes it exists.

Obviously you can smarten this up and tell only hosts it might talk to. But this adds complexity. Using PKI eliminates this key distribution problem and means that you don't have the same scaling limitations as something built on WG.

Wireguard is a very very good VPN, but I cannot imagine trying to run something on the scale of tens of thousands of nodes when you need such complex coordination systems to exchange keys/trust, especially in a dynamic environment where nodes are coming and going all the time.

I totally get that it solvable overall, but Slack has had 4 years of nearly perfect uptime on Nebula, whilst using it to pass >95% of all backend traffic. These considerations may seem simple to address, but there are fundamentals that mattered and led us to writing Nebula. We didn't want to create something new, but to do what Slack needed, we had to.