by solarisos 124 days ago
The shift from managed DERP to decentralized Peer Relays is a massive win for self-hosters with difficult NAT situations. I’m curious if this significantly reduces Tailscale's own egress costs or if the primary goal was just improving latency for users who can't establish a direct WireGuard tunnel. Either way, removing the 'hassle' of setting up a custom DERP server is a great UX improvement.

Alex from Tailscale here... We’re users just like you, and we felt this pain point ourselves. The good news is that Peer Relays were able to build on a lot of the existing subnet router and exit node plumbing, so it wasn’t a huge engineering lift to bring to life.

We also have plenty of customers running in restrictive NAT environments (AWS being a common example), where direct WireGuard tunnels just aren’t always possible. In those cases, something like Peer Relays is essential for Tailscale to perform the way larger deployments expect.

So yes, it improves latency and UX for self-hosters, but it also helps us support more complex production environments without requiring folks to run and manage custom DERP infrastructure.

Thanks for the context, Alex. It’s interesting to hear that the engineering lift was lighter by leveraging the exit node/subnet router plumbing—that’s a clever use of existing primitives.

The point about AWS NAT restrictions is a big one. I think a lot of people underestimate how often 'enterprise-grade' networking actually becomes a bottleneck for direct P2P. Moving that burden away from custom DERP management makes the 'it just works' magic of Tailscale feel much more sustainable for small teams.

We’ve had issues with the centralized DERPs just blackholing traffic when we start up ephemeral nodes in CI. This happens even though we ensure that all important peers can establish direct connections to each other. There is still some bootstrapping that happens before both peers negotiate.

That said, it’s been almost a year since the last incident, and it’s been rock solid for the last few months. I’m sure using these new peer relays will make it far less likely to ever happen again. :hacks away:

That ephemeral node bootstrap issue is a classic 'edge case' that becomes a nightmare in CI. It makes sense that centralized DERP might struggle with the sheer churn of nodes popping in and out of existence. Using a Peer Relay that lives permanently on your internal net as the 'anchor' for those CI nodes seems like it would solve that race condition entirely.
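
For what it's worth, here is a minimal sketch of how a CI job could guard against that kind of bootstrap race regardless of which relay is in the path: poll a reachability check until it passes or a deadline expires, and only then start the real work. Everything here is hypothetical and not from Tailscale: `wait_until_reachable` and the `probe` callback are names made up for illustration, and how you actually test reachability is left to the caller.

```python
import time
from typing import Callable


def wait_until_reachable(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    `probe` is a hypothetical, caller-supplied check (an assumption of this
    sketch), e.g. "can this CI node reach the peer it depends on yet?".
    Returns True as soon as the probe succeeds, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            # Give up instead of letting the job send traffic into a black hole.
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in probe that succeeds on the third attempt.
    attempts = {"n": 0}

    def fake_probe() -> bool:
        attempts["n"] += 1
        return attempts["n"] >= 3

    ok = wait_until_reachable(fake_probe, timeout_s=5.0, interval_s=0.1)
    print("reachable" if ok else "timed out")
```

In a real pipeline the probe would be whatever check you trust for your setup, and the job would fail fast when the helper returns False rather than proceeding on a half-established connection.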