Playing Battleships over BGP (2018)

	Playing Battleships over BGP (2018) (blog.benjojo.co.uk)
	124 points by tcard 1760 days ago

4 comments

throw0101a 1760 days ago

> For a protocol that was produced on two napkins in 1989 [...]

I'm not sure I'd want to deal with a protocol that can't be explained on a napkin or two. UTF-8 was designed on a diner placemat:

* https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt

benjojo12 1760 days ago

Author of the post here. Ask me almost anything, I guess(?)

kjrose 1760 days ago

If you could replace BGP globally, instantly, with no problems, what would you replace it with?

benjojo12 1759 days ago

(Keeping in mind that replacing BGP is about as hard as replacing SMTP, and thus might not be worth it)

Honestly, the issue that exists with BGP is not the protocol. The issue is trust, and that is not something a different protocol would instantly fix.

One issue with the internet as a whole is that seemingly simple questions are actually hard. The one that is slowly being fixed with RPKI is "Who actually owns this IP address?" Knowing this, we can build better filters against direct (origin AS != owner AS) hijacks.
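
To make the origin check concrete, here is a minimal sketch of RPKI route origin validation in the style of RFC 6811. The `Roa` type, the `validate_origin` function, and the sample prefixes and AS numbers (all taken from documentation ranges) are illustrative assumptions, not anything from the post or from a real validator.

```python
import ipaddress
from typing import List, NamedTuple


class Roa(NamedTuple):
    """A simplified ROA: who may originate a prefix, and how specific it may be."""
    prefix: str       # e.g. "192.0.2.0/24"
    max_length: int   # longest prefix length the owner allows to be announced
    asn: int          # AS number authorised to originate the prefix


def validate_origin(route_prefix: str, origin_asn: int, roas: List[Roa]) -> str:
    """Classify an announcement as "valid", "invalid" or "not-found"."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so skip those.
        if roa_net.version != route.version:
            continue
        if not route.subnet_of(roa_net):
            continue
        covered = True  # at least one ROA covers this route
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered but no matching ROA means a likely hijack or misconfiguration.
    return "invalid" if covered else "not-found"


# Hypothetical ROA set using documentation prefixes and AS numbers.
roas = [Roa("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, roas))     # owner announces
print(validate_origin("192.0.2.0/24", 64501, roas))     # origin AS != owner AS
print(validate_origin("198.51.100.0/24", 64500, roas))  # no ROA covers it
```

A filter built on this would typically drop "invalid" routes and keep "not-found" ones, since much of the address space has no ROA at all.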

However, the next question, which has no solution yet, is "Who is allowed to carry this route/transit this data?" This is going to be unbelievably hard to solve with certainty. There is some question of whether a PKI solution (BGPSEC) could be deployed. However, you will also hit the next issue.

The BGP table is massive: 1M+ routes stored on machines with reasonably long lifetimes. It does not help that, in terms of computing power, these machines are in general very slow. A multi-Tbit/s router may only have a 2014-era laptop CPU powering it. So computing anything 1M times quickly is a massive ask, and when links go down, it is reasonable to expect fast recompute/reconvergence times.
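
A rough back-of-envelope sketch of why "1M times" hurts. Only the 1M route count comes from the comment above; the signatures-per-route figure and the verifications-per-second figure are made-up placeholders, not measurements of any real router.

```python
def full_table_seconds(routes: int, work_per_route: float, ops_per_second: float) -> float:
    """Time to redo some per-route computation across the whole table."""
    return routes * work_per_route / ops_per_second


ROUTES = 1_000_000            # "1M+ routes", from the comment above
SIGS_PER_ROUTE = 4            # hypothetical: one signature per AS hop on the path
VERIFIES_PER_SECOND = 10_000  # hypothetical control-plane CPU throughput

seconds = full_table_seconds(ROUTES, SIGS_PER_ROUTE, VERIFIES_PER_SECOND)
print(f"{seconds:.0f} s to re-verify the full table")  # 400 s with these placeholders
```

Plug in your own numbers; the point is only that any per-route cost gets multiplied by a million before reconvergence can finish.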

Fixing BGP is not an easy issue. Anyone who is telling you so is either fraudulent or does not understand the sheer scale/scope of the issues attached to the protocol.

convolvatron 1759 days ago

it is if you relax the constraint that the providers keep the legacy allocations and can advertise whatever the hell they want

Steve Deering had a really nice proposal on geographic addressing that would make pki sufficiently performant by using hierarchical assignments

makeworld 1759 days ago

Have you seen Yggdrasil? It provides an alternate routing idea, among other things.

https://yggdrasil-network.github.io/

pyvpx 1760 days ago

and keep IPv[4|6]?

moffkalast 1760 days ago

IPv9 is where it's at.

scratchadams 1760 days ago

no question, just a selfish request for more blog posts please

bmsleight_ 1760 days ago

Considering yesterday's events, how do you test non-live?

tg180 1760 days ago

Maybe dn42.eu?

> Experiment with routing technology

> Participating in dn42 is primarily useful for learning routing technologies such as BGP, using a reasonably large network (> 1500 AS, > 1700 prefixes).

> Since dn42 is very similar to the Internet, it can be used as a hands-on testing ground for new ideas, or simply to learn real networking stuff that you probably can't do on the Internet (BGP multihoming, transit). The biggest advantage when compared to the Internet: if you break something in the network, you won't have any big network operator yelling angrily at you.

benjojo12 1760 days ago

Who said I tested non-live?

The actual beta builds/sanity checks were done just with two VMs peered with each other, but the live internet one was done in one take (and never again, at least by me)

benjojo12 1760 days ago

To add on, BGP very much has a "meme" status of being scary and dangerous, where any touching will break YouTube etc. [Mostly perpetuated by infosec circles]

It's really not the 2000s anymore, BGP is mostly safe and filtered. There are still improvements to be made (I've even written on the blog about them), but one person's immense fuck ups are far less likely to cause issues now that IRR filters and prefix limits exist.
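
For anyone unfamiliar with those two safeguards, here is a toy sketch of how they combine on a single session. The function and exception names are invented for illustration, the allow-list is an exact-match simplification of what real IRR-derived filters do, and real routers differ on whether the limit counts received or accepted prefixes.

```python
import ipaddress
from typing import Iterable, List


class PrefixLimitExceeded(Exception):
    """Raised when a peer sends more prefixes than agreed; the session is dropped."""


def filter_announcements(announced: Iterable[str],
                         irr_allowed: Iterable[str],
                         max_prefixes: int) -> List[str]:
    """Accept only prefixes on the IRR-derived allow-list, up to a hard limit."""
    allowed = {ipaddress.ip_network(p) for p in irr_allowed}
    accepted = []
    for count, prefix in enumerate(announced, start=1):
        # Prefix limit: a sudden flood (e.g. a full-table leak) kills the session.
        if count > max_prefixes:
            raise PrefixLimitExceeded(f"peer sent more than {max_prefixes} prefixes")
        net = ipaddress.ip_network(prefix)
        # IRR filter: silently ignore anything the peer never registered.
        if net in allowed:
            accepted.append(str(net))
    return accepted


# Hypothetical peer registered one prefix but also announces someone else's.
print(filter_announcements(["192.0.2.0/24", "203.0.113.0/24"],
                           irr_allowed=["192.0.2.0/24"],
                           max_prefixes=10))
```

The unregistered prefix is dropped without affecting the registered one, and a peer that leaks far more routes than expected gets its session torn down instead of propagating them.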

JadeNB 1760 days ago

> It's really not the 2000s anymore, BGP is mostly safe and filtered. There are still improvements to be made (I've even written on the blog about them), but one person's immense fuck ups are far less likely to cause issues now that IRR filters and prefix limits exist.

Any non-maliciously designed protocol probably can be used safely, but surely yesterday's events show that it is still eminently possible to use BGP dangerously?

benjojo12 1759 days ago

What part of yesterday was showing that it was possible to use BGP dangerously?

If you are certain of this argument, then your master electric switch is dangerous because you could switch off the power to your house.

HideousKojima 1759 days ago

What happened yesterday was (or appears to be) Facebook screwing up their own routing and DNS, not anyone else's. They didn't take down routing for any IPs and domains they didn't own. I can't imagine any other protocol making a mistake like FB's impossible.

INTPenis 1760 days ago

So how should these problems be mitigated? Have separate infrastructure for critical services or staging BGP or what?

midasuni 1759 days ago

It seems that the main problem Facebook had in restoring service was the lack of a completely separate out-of-band management network.

If my network (way smaller than FB's, but with a way lower budget) goes down, I can get in via another ISP and WireGuard into the OOB network, which is completely separate from the in-band management.

Not every access switch is on OOB, but the core ones and a few critical devices are.

efitz 1760 days ago

This was very cool and also IMO very irresponsible.

dt3ft 1760 days ago

So this is why Facebook went down, eh? ;)
