by beckler 2071 days ago
I was working at GE when AWS bought GE's 3.0.0.0/8 block, and it caused a massive headache. GE's network is gigantic, and there are a ridiculous number of assets. They didn't give much notice about the sale, so they shimmed all of the routing internally, but lots of services still pointed to the block.

There supposedly was an agreement that Amazon wouldn't start provisioning them for a certain amount of time, but whatever amount of time was specified, they either didn't honor it, or it wasn't enough time.

We were also moving assets over to AWS, and all of these things going on simultaneously caused what we called the three-pocalypse.

We would occasionally run across issues with external sites or newly provisioned lambdas that were on Amazon's new 3.0.0.0/8 block, but we couldn't reach them because internally those IP addresses didn't exist.

At the same time, they would open up a small block to allow access to those external sites, and then some internal service would no longer respond. Repeat ad nauseam. It was also compounded by the fact that there are countless teams in GE, and not everyone knew who had made which changes.
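To illustrate the failure mode described above, here is a minimal sketch of longest-prefix-match routing. It is not GE's actual setup: the routing table, the next-hop labels, and both addresses are hypothetical, chosen only to show how opening a more specific block for external access can pull an internal service's traffic out with it.

```python
import ipaddress

def longest_prefix_match(routes, destination):
    """Return the (network, next_hop) pair with the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda pair: pair[0].prefixlen)

# Hypothetical internal routing table: the whole /8 still points inward.
routes = [(ipaddress.ip_network("3.0.0.0/8"), "internal")]

# Hypothetical addresses, for illustration only.
external_site = "3.5.10.20"     # stands in for a newly provisioned AWS host
internal_service = "3.5.200.7"  # stands in for a legacy internal service

# Before any change, the external site is routed inward and is unreachable.
before_site = longest_prefix_match(routes, external_site)[1]

# Someone opens a small block so the external site becomes reachable.
routes.append((ipaddress.ip_network("3.5.0.0/16"), "internet"))

after_site = longest_prefix_match(routes, external_site)[1]
after_service = longest_prefix_match(routes, internal_service)[1]

print(before_site, after_site, after_service)
```

The more specific /16 wins over the /8 for every address inside it, so the internal service that happens to live in the same range now gets sent to the internet as well.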


I worked for NBCU and used to manage 3.3, 3.54, 3.23, and about 5 other 3.x /16s. We had some design policies created by an elderly architect who was a bit lazy, so instead of VLSM for point-to-point links we would use /24s. Ahhh, the good old days.
This is what I think about when I'm told I should use /64s for point-to-point links in v6.
Some say that he is still doing that today.
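For a sense of scale on the /24-per-link policy mentioned above, here is a rough sketch comparing it with the /30s that VLSM would allow. The 3.3.0.0/16 block is just one of the /16s named in the comment, used as an example; the helper names are made up for illustration.

```python
import ipaddress

def p2p_links_per_block(block, link_prefix):
    """Count how many subnets of the given prefix length fit inside a block."""
    network = ipaddress.ip_network(block)
    return 2 ** (link_prefix - network.prefixlen)

def unused_hosts_on_link(prefix_len):
    """Usable host addresses left idle on a two-endpoint link (prefixes up to /30)."""
    link = ipaddress.ip_network(f"0.0.0.0/{prefix_len}")
    usable = link.num_addresses - 2  # drop the network and broadcast addresses
    return usable - 2                # a point-to-point link needs only two endpoints

block = "3.3.0.0/16"  # example block only

links_24 = p2p_links_per_block(block, 24)
links_30 = p2p_links_per_block(block, 30)

print(links_24, unused_hosts_on_link(24))  # 256 links, 252 idle hosts per link
print(links_30, unused_hosts_on_link(30))  # 16384 links, 0 idle hosts per link
```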
Sounds like the perfect time to migrate to IPv6!
IPv6 on AWS ECS is a pain, and impossible if you use their recommended and dedicated command-line tool.
This has always seemed deeply suspicious to me. It's like AWS is trying to prop up the value of its speculation on IPv4 addresses.
Or merely lazy. Why spend time/$$$ supporting IPv6 when you have plenty of IPv4 addresses?
They're buying up taxi medallions and investing in ride sharing.
Well, given that they keep on buying more IPv4, just supporting IPv6 better seems like the far cheaper option. That's what's so weird!
Truth be told, IPv6 still doesn't work. I've stayed in plenty of hotels and used plenty of airport Wi-Fi networks that only had IPv4 routing. Until it's universal, it can't be considered a solution.
Fortunately public facing systems are capable of being dual stacked so that won't be a problem. There's no reason internal networks can't be v6 only. Microsoft's already done it.
I know they have plans:

https://teamarin.net/2019/04/03/microsoft-works-toward-ipv6-...

But AFAIK that is still aspirational. I would be pleasantly shocked and impressed if they had already achieved that.

Back up about 20 years and you could say the same thing about the (IPv4) Internet as a whole. I stayed in plenty of hotels which had zip for connectivity back then, but here we are now.

Things change...

On a similar note, AWS purchased some chunks of the 18.0.0.0/8 block from MIT, much to my dismay. It was so much fun having a class A network back in school, and getting static IPv4 addresses for projects was easy.

On a separate note I had hoped for some time that an AWS 18.x.x.x address would be useful to get access to journal papers but I tried and that sadly didn't work :(

I worked for HP back in the late '80s/early '90s, and it always seemed insane that they numbered everything internally with their 15.*/8 block, especially as you weren't even allowed out on the public Internet at the time. You had to get a manager to sign a paper to get access to a SOCKS proxy.
The same thing happened at my university. When I graduated 10 years ago, they were using their IP block for their internal network, which couldn't even connect to the internet without a Squid proxy. My wife did her master's degree there last year and they were still doing it. I can't help but think how wasteful it is. Is there any benefit to doing that, other than legacy baggage?
This is how the Internet is supposed to work, with unique addresses.
Why? If you have a private network, like HP did, why would you want public IPs on it?
A big reason is mergers (something HP has a lot of experience with); merging two 10/8 networks is a mess but if they have unique IPs it's easier.

Also, I think the concept of a "private network" is inflexible and in some sense a premature optimization. If you use unique IPs you can decide on a subnet or even host basis what is exposed to the Internet and what isn't.
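A quick sketch of why the merger case above gets messy, using Python's standard `ipaddress` module. All of the subnet lists are hypothetical; the point is only that two companies carving up 10/8 independently can collide, while globally unique blocks cannot.

```python
import ipaddress

def find_overlaps(subnets_a, subnets_b):
    """Return every pair of subnets (one from each side) that share addresses."""
    a = [ipaddress.ip_network(s) for s in subnets_a]
    b = [ipaddress.ip_network(s) for s in subnets_b]
    return [(x, y) for x in a for y in b if x.overlaps(y)]

# Hypothetical allocations: both companies carved up 10/8 on their own.
company_a = ["10.1.0.0/16", "10.20.0.0/24"]
company_b = ["10.1.128.0/17", "10.30.0.0/16"]
conflicts = find_overlaps(company_a, company_b)

# Hypothetical allocations out of globally unique blocks.
unique_a = ["15.1.0.0/16"]
unique_b = ["3.3.0.0/16"]
no_conflicts = find_overlaps(unique_a, unique_b)

for left, right in conflicts:
    print(f"{left} collides with {right}")
print(len(no_conflicts), "collisions between the unique blocks")
```

Every pair that shows up in `conflicts` means renumbering or NAT on one side before the two networks can be joined.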

Ok, I'm sold! Where can I get 8 /24s to use at my (small) work? :-)
> They either didn't honor it, or it wasn't enough time.

Knowing what they say about both companies, it was probably both reasons at the same time :-))

Sounds like very bad IPAM. It was probably a good move to sell the block, since the sale exposed all the places where the network was not managed well and brought in a decent sum. At least that's how it seems from the few sentences you have written.

Also, in 2018 GE could have been ready for IPv6, but that would have required not only an existing IPAM but also some proper leadership, which, based on your words, GE doesn't seem to have.

What's IPAM? Until about five years ago my F100 company barely used more than an Excel sheet on a network drive for our public /16 + RFC1918, and they contacted the network security team to ask what addresses were used in the firewall NAT tables to determine what was free to use.
Something like https://www.globalservices.bt.com/en/solutions/products/diam...

Used it at a previous org I worked for. Pretty nice - it can crawl your routers, build up your existing network allocations then help you analyze/optimize them.

Stupid powerful - we used it mostly for decentralized DNS management (!!) but after a while I finally got the network and security folks to realize what the system could do and the spreadsheets started to finally go away.

Speaking of decentralized management it has robust hierarchical role-based security - we had thousands of site admins managing DNS and eventually IP addresses for their individual sites, but also smoothly maintaining overall command-control. A very cool system.

IP Address Management systems. Similar to SolarWinds Orion.

We used that particular product extensively at my previous company. If you stole an IP without putting it into Orion, there was a job that would enumerate/update the info, if possible.

If the job failed, you and your boss would get an angry email.

This was in a company probably much smaller than F500.
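For anyone still wondering what an IPAM does at its core, here is a toy sketch. It is not how Orion or any other real product works: the function names, the allocation records, and the "observed" list are all hypothetical, and the subnet is a documentation range.

```python
import ipaddress

def free_addresses(subnet, allocated):
    """List host addresses in the subnet that have no allocation record."""
    network = ipaddress.ip_network(subnet)
    used = {ipaddress.ip_address(a) for a in allocated}
    return [host for host in network.hosts() if host not in used]

def unrecorded(observed, allocated):
    """Addresses seen on the network that nobody recorded (the 'stolen' IPs)."""
    seen = {ipaddress.ip_address(a) for a in observed}
    known = {ipaddress.ip_address(a) for a in allocated}
    return sorted(seen - known)

subnet = "192.0.2.0/29"  # documentation range, example only
allocated = ["192.0.2.1", "192.0.2.2", "192.0.2.5"]  # what the records say
observed = ["192.0.2.1", "192.0.2.4"]                # what a scan might find

free = free_addresses(subnet, allocated)
rogue = unrecorded(observed, allocated)

print([str(a) for a in free])
print([str(a) for a in rogue])
```

The first list replaces asking the firewall team what is free; the second is roughly what would trigger the angry email.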

Ouch. That does indeed sound painful. I hope GE at least got a big pile of cash from the sale :-p
Can you describe any specific issue that was not caused by your own misconfiguration or a hardcoded list of address ranges in a major third-party service?
The situation was internal to the company. Neither cause you described was at fault, as per the post you responded to. Side effects of a massive bureaucracy are lack of information retention and communication, miscommunication, and inaction (in decision-making as well as technical configuration).