IPv6 Is A Disaster (but we can fix it) (matduggan.com)
43 points by mduggles 1045 days ago
8 comments

My IPv6 philosophy:

If any "new" computer technology has been around even half as long as IPv6 ( https://en.wikipedia.org/wiki/IPv6_deployment#Major_mileston... ), with even a tenth of the "you gotta start using this!" push from the Big Boys - and yet still is very widely avoided/resisted, and the older-tech alternative commands a price premium due to widespread demand...gosh, that "new" technology must absolutely suck, eh?

I don't think it's fair to blame the technology. The problem is that the computing industry has changed.

The thing that IPv6 would enable (direct end-to-end connectivity) is now seen as a negative by an industry that has since pivoted to rent-seeking, walled gardens and restricting users' potential. The industry is now even legally making money on many things that would've been considered outright malware just a decade ago.

People being able to host things themselves, or local-first apps that communicate directly without the need for any middlemen is a negative for the industry. The industry wants there to be a technical need for a middleman, so they can provide that and seek rent over it.

There is no user-level demand for IPv6 because the industry is no longer making any apps/devices/services that would take advantage of end-to-end connectivity (even if it was available now - let's say in a hypothetical world where IPv6 adoption is 100%) since it's more profitable not to, so as a result there is no pressure on ISPs to offer it.

I think that it's fair to blame the specification. I don't think the problem is that the industry has changed, I think it's that there's a huge amount of friction and headache for shifting to it. I think that's because it was too large of a change all at once, combined with the initial specification having some real problems (that did eventually get mitigated in later specs).

I suspect that if IPv6 limited itself to just increasing the size of IP addresses, IPv4 would largely be a distant memory by now.

"IPv4 with bigger addresses" would never have been backwards compatible and would always have been a compatibility break and would always have a slow rollout.

The proposals that seemed backwards compatible were just aggressive CGNAT consolidating even more power in the hands of IPv4 address owners. That doesn't seem like a sustainable fix in the long run.

> "IPv4 with bigger addresses" would never have been backwards compatible and would always have been a compatibility break

True, but it would limit that break to a single thing. That's much easier to deal with than the whole basket of things that IPv6 brings with it.

Compatibility breaks are always headaches. There's not "just broke a 'single' thing" when it comes to compatibility breaks. That is why strict semver suggests a major bump no matter how "small" a compatibility break appears to the developer. There is no such thing as a "simple" compatibility break to downstream users.

In general, despite the complex vocabulary about most of it, in many ways IPv6 is simpler than IPv4. Its header has fewer fields. Its QoL/QoS fields aren't accidental hacks on top of old debugging fields but intentionally designed fields for that very purpose. SLAAC is a simpler protocol than DHCPv4, though the algorithm sounds more complex at first. (DHCPv6 is basically as complex, but fewer devices and fewer subnets should need DHCPv6 in the first place.) Much of the "basket of things" that IPv6 brings with it are designed to remove complexity that has concreted around IPv4.
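To make the "SLAAC sounds more complex than it is" point concrete, here is a minimal sketch of the classic modified EUI-64 derivation, where a host builds its own address from the advertised /64 prefix and its MAC address. The prefix and MAC below are made-up documentation values, and the function names are hypothetical. Many current operating systems use randomized or stable-opaque interface identifiers instead, so treat this as an illustration of the original idea, not of what any particular host does today.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Return the modified EUI-64 interface identifier for a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe
    # between the two halves of the MAC.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine an advertised /64 prefix with the EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))


# Hypothetical values: a documentation prefix and an invented MAC address.
addr = slaac_address("2001:db8:1:2::/64", "00:11:22:33:44:55")
print(addr)
```

No server has to hand out or remember the address; the router only advertises the prefix.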

They ripped the bandaid completely off with the backwards compatibility break that they made with IPv6, and apparently a lot of people loved the cute stickers they had applied on top of the bandaid. But at this point it is probably better for the skin below to heal without the bandaid than to continue to sticker and bandaid over that and let all that unnecessary glue fester in place. (To push such a metaphor almost to its breaking place.)

> The problem is that the computing industry has changed.

Nah, the problem is ipv6 has been designed by a committee for a lot of enterprise-ish features so the hobbyists have taken a look and postponed setting it up internally for when they have absolutely no choice.

I've asked for simple ipv6 tutorials in discussions on HN and elsewhere and whatever I got pointed at was always longer than the article we're discussing and incomplete.

Basic setup of ipv4 for a home network can be explained over just one pint. Looks like you need two barrels for ipv6.

Yeah right. The summary of the first page already throws around like 4-5 acronyms that each require reading a separate documentation.

And that's only for configuring your router, not your local network...

Okay then please provide me the level of documentation you are looking for, but for an IPv4 network. Sounds wonderful, I'd love to share it with new hires.

> I've asked for simple ipv6 tutorials

There really does seem to be a lack of good documentation about all of this. The docs that I've seen appear to be aimed at actual network engineers, or are so incomplete as to not be worthwhile.

I would be much less stressed by all of this if I could find something good that sits between those two extremes.

A part of me, though, suspects that the reason there is no "middle ground" documentation is that it's not possible -- that IPv6 is too complex for that. Not saying that's the actual reality, but it has the whiff of it.

All networking is complex.

I asked the other guy this, but I'll also ask you. Please provide me the level of documentation you are looking for, but for an IPv4 network. If you have some grand tutorial that explains it as easily as you make it out to be, then I truly would love to see it, I will include it in my onboarding documentation at work.

Because I understand both IPv4 and IPv6, and do not consider IPv6 the more complex protocol by any measure. I suspect your "whiff" is more a bias towards what you are comfortable with, rather than a true reflection of IPv6's complexity.

> Because I understand both IPv4 and IPv6, and do not consider IPv6 the more complex protocol by any measure.

You mentioned "new hires" while I mentioned hobbyists. You're talking about a business network where people are paid to do it, I'm talking about home networks and home labs.

You're basically confirming my statement that IPv6 was designed for enterprise needs?

> The industry is now even legally making money on many things that would've been considered outright malware just a decade ago.

Sounds a bit over the top. Can you name some examples?

10 years ago if you made software that uses all kinds of lies and dark patterns to get access to a user's contacts list, uploads it to your server and then you did data mining on it, people would be concerned and consider the software malicious.

Likewise with analytics - tracking every single action you do in an app (along with generic metadata such as IP addresses - which often leaks your general location and your relationship with anyone on the same network since you'd be sharing the IP address with them) would have been considered spyware.

When there were talks of tracking people for ad targeting in the early days of the internet people (rightfully) freaked out, even though that tracking was really primitive by today's standards.

All of those things are now considered legitimate and are routinely done.

I remember how often DoubleClick were the villains of tracking and privacy over-reach on early Slashdot, and then Google bought DoubleClick and became worse than DoubleClick ever were as top Slashdot villains and yet Google is still often called the heroes in the adtech space. (Though that sea is somewhat changing again as even more mainstream media catches up to tracking prevention.) It remains such a profound reversal to me.

We also used to have a name for malware that injected ads into your computer: adware. Now ads are just part of Windows.

Indeed. When I look at how far things have fallen in this regard, it gets very hard to feel positive or optimistic about the internet.

And the human race. The biggest revolution in communication since Gutenberg, and in less than half a century it's been used almost exclusively for evil purposes.

Instagram, for one.

But no one has pushed.

Consumer routers still suck regarding IPv6. Last time I tried setting up IPv6 on my wan I got a /128 which is utterly incompetent.

No service wants to cut off access to the ipv4 customers so they've made things just work. We have only recently hit address exhaustion (relative to IPv6 age).

No one wants to jump first and there is no government mandate for support.

I don't know who you think the big boys are, but it's not like Google or Meta have throttled ipv4 services or put banners on their site warning users they are on a legacy protocol.

A /128 on the WAN is normal. Addresses assigned by DHCPv6 (which is commonly used by ISPs for WAN address assignment) are assigned as /128.

The important part is the delegated prefix, which you normally get via DHCPv6-PD and should be at least a /56.
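A rough illustration of why the delegated prefix matters more than the /128 on the WAN interface, using Python's standard ipaddress module. Both prefixes below are hypothetical documentation values, not anything an actual ISP assigned.

```python
import ipaddress

# Hypothetical values: the WAN address is a single /128, while the
# delegated prefix (as DHCPv6-PD would hand out) is a /56.
wan = ipaddress.IPv6Network("2001:db8:ffff::1/128")
delegated = ipaddress.IPv6Network("2001:db8:ab00::/56")

# The router carves the delegation into /64s, one per LAN segment.
lan_subnets = list(delegated.subnets(new_prefix=64))

print(wan.num_addresses)       # addresses covered by the WAN /128
print(len(lan_subnets))        # number of /64 LANs available
print(lan_subnets[0], lan_subnets[-1])
```

So a /56 leaves room for 256 separate /64 networks behind the router, each usable with SLAAC.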

A big part of it is how much actually needs to change, and work properly, before you can actually rely on IPv6 the way you can on IPv4.

I recall reading about Facebook's internal IPv6 migration for their data centers and the problems they ran into. The two that stuck out the most to me and which I still remember details for are:

1. The PHP function developers used to convert an IPv4 address into an integer to store into a database or something. It didn't work in IPv6, meaning the code broke badly when it was considering an IPv6 host. They kept asking developers to stop using it, but new code kept getting added which used it. Eventually the decision was made that 'we've warned you enough, so from now on we're just going to go ahead and let your code break'.

2. Their networking hardware wasn't extensively tested for an IPv6 single-stack deployment. Turns out that when presented with a BGP advertisement that contained only IPv6 routes and no IPv4 routes, their switches would immediately crash. How did they discover this? By sending out a BGP advertisement that contained only IPv6 routes and no IPv4 routes, crashing every rack switch in the data center. There's no reason why this should happen; it's not a violation of the BGP spec or anything, it's just a bug for a case that no one tested on the vendor's end because none of their customers tried to do that.
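On the first point above, a small sketch of why "store the IP as an integer" code tends to break. This is Python rather than PHP, and it is not Facebook's code; it only shows that an IPv4 address fits in 32 bits while an IPv6 address needs 128, which overflows any 32-bit or 64-bit integer column. The helper name is made up.

```python
import ipaddress


def ip_to_int(text: str) -> int:
    """Convert either an IPv4 or an IPv6 address string to an integer."""
    return int(ipaddress.ip_address(text))


v4 = ip_to_int("192.0.2.1")
v6 = ip_to_int("2001:db8::1")

print(v4 < 2**32)   # an IPv4 address fits in an unsigned 32-bit field
print(v6 < 2**64)   # an IPv6 address does not even fit in 64 bits

# A family-agnostic alternative is to store the packed bytes:
# 4 bytes for IPv4, 16 bytes for IPv6.
packed_v6 = ipaddress.ip_address("2001:db8::1").packed
print(len(packed_v6))
```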

So it's not that IPv6 sucks; it's actually pretty great in a lot of ways. The problem is that everything else sucks; people don't bother to support things, they don't consider it in their code, they don't test for it, they don't bother to roll out support because no one else has done so either so why bother, and so on.

In the end, IPv6 is 'avoided' because of the problems that Facebook ran into, or that OP ran into, and is 'resisted' because it's extra work that they could put into doing other things with their limited time and engineering work.

To be clear, dual-stack deployments work great; my ISP recently did a trial of a dual-stack deployment which I was lucky enough to participate in, and it was almost completely transparent. My Unifi gateway picked up the address and handed addresses out to clients internally, clients used IPv6 where appropriate and fell back to IPv4 when necessary, and everything worked completely transparently.

TL;DR IPv6 isn't the problem, the industry is the problem.

I am not buying that even at Facebook there's enough developers to keep adding ip address lookup calls in amounts that make a difference.
Well...okay, some good points, +1.

OTOH - for people making real-world decisions, the difference between "$Networking_Technology sucks" and "~all available implementations of $Networking_Technology suck" is pretty meaningless.

One way to think about the timescale is the rollout time versus the expected usage amount of time. IPv6 was also a bit of a "science fiction project". I remember a lot of the early hype for IPv6 was that it was "IP for the whole solar system" or even sometimes "galactic IP". The architects of IPv6 were clear that they were hoping for something like a 1000 years or more of addresses and usage, even expecting nearly every device on the planet (and maybe the whole solar system) needing an IPv6 address and knowing how fast things like the "Internet of Things" were coming down the horizon.

If it is built to last a 1000 years or more, ~25% of internet traffic by the end of the first 30 years isn't a terrible rollout curve. That's 3% of its expected lifetime. (Assuming the 1000 years clock started 30 years ago. It's even believable the 1000 years clock starts closer to 90% rollout of IPv6 than to IPv6 announcements 30 years ago. IPv4 address scarcity wasn't truly felt until decades into IPv4 usage, though it was an academic concern.)

Many Internet projects have always run on different timescales compared to the vastly faster timescales people associate with software and even hardware generations. (In part because these rollouts happen across software and hardware generations.) The IP protocol is so deeply fundamental to that, it is somewhat "civilization defining". It would probably be a lot scarier if an upgrade to something fundamental like IP was an "overnight success". Civilizations are built on the timescales of decades and lifetimes. It should not be a surprise that IPv6 rollout has happened on such timescales. We mostly can only hope that the IPv6 architects were as smart as they hoped they were in preparing for the deep, unknown future of the internet, because they knew they were working on a civilization-timescale tool. (Which is exactly why IPv6 is not just "IPv4 with bigger addresses".)

> If it is built to last a 1000 years or more, ~25% of internet traffic by the end of the first 30 years isn't a terrible rollout curve.

I don't think that measuring the rollout curve relative to the expected lifetime of the thing is reasonable (or at least, useful), though. In terms of something like this, measuring it relative to when IPv4 is simply no longer feasible is better. And that time is very, very near.

Since the main problem was address space, they should've just expanded it. Let everyone keep their old v4 addresses (with 0-padding), focus on the protocol upgrade, and give new users longer addresses for cheaper. You wouldn't even need DNS changes initially. Instead, v6 became a whole new thing with additional goals like removing NAT (which I'm not even convinced is a good idea), so of course there'd be way more friction.

Like, I said this elsewhere, Cloudflare public DNS is 1.1.1.1. If I switch to ipv6, I get to use 2606:4700:4700::1111. You telling me that's an upgrade?

I agree. Instead of just making v4 addresses bigger (and related services around the protocol as well - ICMP, DNS, ...) a committee spawned jack of all trades, master of none IPv6 incompatible with current IPv4 stack.
IPv6 is compatible with IPv4; there are millions of devices with only an IPv6 address that work just fine.
No, I only have a v4 address with my ISP and cannot use a v6-only device at home.
IPv6 is backwards compatible to IPv4 but not the other way around. If you have a solution how to address a 128bit IPv6 address with the 32bits available on v4 I'm sure many people are eager to talk to you. It's just not possible.
With a 6-to-4 gateway then yeah. But at that point you're using v4.
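The asymmetry argued over above can be shown with plain arithmetic. A sketch using Python's ipaddress module: any IPv4 address can be embedded in the IPv6 space (here via the IPv4-mapped form, ::ffff:0:0/96), but a native IPv6 address has no 32-bit representation. The helper name is made up, and the IPv4-mapped form is only a representation, not by itself a way to route packets between the two protocols.

```python
import ipaddress


def embed_v4(v4_text: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the IPv4-mapped IPv6 range ::ffff:0:0/96."""
    v4 = ipaddress.IPv4Address(v4_text)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


mapped = embed_v4("192.0.2.1")
print(mapped.ipv4_mapped)          # the original IPv4 address is recoverable

native = ipaddress.IPv6Address("2606:4700:4700::1111")
print(native.ipv4_mapped)          # None: not an embedded IPv4 address
print(int(native) < 2**32)         # False: it cannot fit in 32 bits
```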
Yes.

Also, any change to the protocol was going to be a massive shift regarding network hardware. It wasn't ever possible to slap a few more bytes onto the address.

If you're going to make a monumental shift, why not do it right?

> If you're going to make a monumental shift, why not do it right?

yes, that was the argument from the very beginning, and it's not without merit. I disagree with it, because it's making a monumental shift into one that is even more monumental and increases resistance to making the change at all.

But who knows? Both sides of this argument are just speculating.

It would still be a much smaller shift if they focused on only expanding the address, and I guess removing packet fragmenting while they're already changing the fields. Why not do it right, cause at least it gets done that way.

There have been other huge migrations pulled off in networking. HTTP->HTTPS is the first that comes to mind. Extra layer of security, no other changes. Browsers slowly made users more and more wary of unsecured sites, and it became easier for site admins to obtain SSL certs. Once plain HTTP was finally made uncommon, versions of SSL/TLS still got upgraded slowly. They also avoided making it too flexible and turning into a fragmented mess like email or XMPP, i.e. browsers strongly avoid self-signed certs and started banning old versions.

Yep. It's as if they didn't understand that ~nobody would want to leave the great old IPv4 Club (packed with all your friends, and everyone who you'd want to meet) for the shoddy new IPv6 Club (with ~zero people there, probably nobody you know, and good luck trying to get a waiter's attention). And also pay for memberships in both clubs during the vague "eventually" transition period.
> Cloudflare public DNS is 1.1.1.1. If I switch to ipv6, I get to use 2606:4700:4700::1111. You telling me that's an upgrade?

The concept of vanity IPv4 addresses was invented in 2009, when Google acquired 8.8.8.0/24 from Level3. This is an emergent feature of a small, densely packed address space. IPv6 had existed for a decade (EDIT: not two decades) by that point, so you can't really blame the designers.

Sprint controls 2600::, probably by accident, but they're not doing anything interesting with it.

That's true, but even the less memorable v4 addresses are easier to deal with and nicer on the eyes. And on a LAN with a NAT, you typically get memorable addresses like 192.168.1.2.

Maybe the bigger issue was trying to get rid of NAT. People don't want every local network device to have a public IP and have to trust that the router's v6 firewall will do its job.

> People don't want every local network device to have a public IP

I absolutely don't want this. But as I understand it, I can avoid this by assigning my machines the IPv6 nonroutable addresses fe80::/64. They're the equivalent of 192.168.* and 10.*
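For anyone unsure which range is which, a quick sketch with Python's ipaddress module. It reports fe80:: addresses as link-local (the same category as IPv4's 169.254.x.x), and addresses under fc00::/7 as unique-local, which is the closer counterpart to 192.168.* and 10.*. The describe helper and the sample addresses are made up for illustration.

```python
import ipaddress

# Unique Local Addresses live under fc00::/7.
ULA = ipaddress.ip_network("fc00::/7")


def describe(text: str) -> str:
    """Return a coarse label for an IPv4 or IPv6 address."""
    addr = ipaddress.ip_address(text)
    if addr.is_link_local:
        return "link-local"
    if addr.version == 6 and addr in ULA:
        return "unique-local"
    if addr.is_private:
        return "private"
    return "other"


for sample in ("fe80::1", "169.254.10.20", "fd12:3456:789a::1",
               "192.168.1.2", "10.0.0.5", "2606:4700:4700::1111"):
    print(sample, "->", describe(sample))
```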

Same as the firewall, it's fine if it's done right. But does every machine get link-local v6 addresses by default? My Mac is set to "automatic," which I assume asks the router. Even if I use link-local, does every router (even crappy ones) respect the no-forward rule? This is along with several other aspects of my network changing to use v6.

Meanwhile, if someone sends a v4 packet with TCP port 22 to my router, it can't tell where to forward it even if it wanted to. It takes effort to do that, namely a port forwarding config.

> But does every machine get link-local v6 addresses by default?

If you use DHCP, then I think you can configure that. What I have in mind is to assign static IPs to all of my fixed machines anyway, and use DHCP to assign IPs to transient machines. Not sure if that's reasonable, but it's my current thinking.

> does every router (even crappy ones) respect the no-forward rule?

There may be broken ones, but it doesn't matter so much because your ISP won't route such addresses regardless.

NAT is a bandage over a crippling of proper network behavior. You trust your port forwarding isn't illicitly opening itself, no? Then you can trust a default deny inbound policy on IPv6.
My port forwarding would have to actively try to allow traffic to my host. It doesn't even know where to forward to. And like it or not, NAT has momentum. Getting rid of NAT would be a big migration in and of itself.
This is actually wrong, and dangerously so. Your router knows perfectly well where to forward any given packet to: it forwards it to the IP that's in the packet's "destination IP" header.

If a connection comes into your router with the destination IP set to one of your LAN machines, NAT will not stop the connection.

There's no reason to be using NAT to protect yourself from inbound connections, because that's not a thing NAT even does in the first place. It often makes things actively worse even, by making it easier to port scan for your servers and by giving you a false sense of security.
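The "default deny inbound" policy mentioned in this exchange can be pictured with a toy model. This is a deliberately simplified sketch with made-up class and method names: it has no timeouts, no TCP state tracking and no ICMPv6 handling, and it is not how any real router is implemented. The point is only that the protection comes from the filtering rule, which involves no address translation at all.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    inside_addr: str
    inside_port: int
    outside_addr: str
    outside_port: int


class DefaultDenyInbound:
    """Toy stateful filter: inbound is allowed only as a reply to outbound."""

    def __init__(self) -> None:
        self._established = set()

    def outbound(self, src: str, sport: int, dst: str, dport: int) -> bool:
        # Outbound traffic is always allowed and remembered.
        self._established.add(Flow(src, sport, dst, dport))
        return True

    def inbound(self, src: str, sport: int, dst: str, dport: int) -> bool:
        # Allow only packets that mirror a remembered outbound flow.
        return Flow(dst, dport, src, sport) in self._established


fw = DefaultDenyInbound()
fw.outbound("2001:db8:1::10", 50000, "2001:db8:ffff::1", 443)

reply_ok = fw.inbound("2001:db8:ffff::1", 443, "2001:db8:1::10", 50000)
unsolicited_ok = fw.inbound("2001:db8:ffff::2", 40000, "2001:db8:1::10", 22)
print(reply_ok, unsolicited_ok)
```

The LAN host keeps its own globally unique address throughout; the unsolicited connection to port 22 is dropped anyway.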

NAT66 exists, it just isn't a necessity in IPv6. There are also private IPv6 networks.

They are called Unique Local Addresses (ULA) and are in the range fd00::/8.

Which itself is so much better than RFC 1918 addresses. If you need private, non-Internet routable addresses, then you generate a random one. In the event two private networks need to communicate over VPN, for example, there is no clash.
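A minimal sketch of "generate a random one": build a /48 under fd00::/8 with a random 40-bit Global ID. RFC 4193 describes a hash-based way of deriving the Global ID; this sketch swaps in the operating system's random source instead, and the function name is made up.

```python
import ipaddress
import secrets


def random_ula_prefix() -> ipaddress.IPv6Network:
    """Return a random /48 Unique Local Address prefix under fd00::/8."""
    global_id = secrets.randbits(40)
    # Layout of the top 48 bits: 8 bits of 0xfd, then the 40-bit Global ID.
    # The remaining 80 bits (16-bit subnet + 64-bit interface ID) stay zero.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


site_a = random_ula_prefix()
site_b = random_ula_prefix()
print(site_a, site_b)

# Each /48 still contains 65536 /64 subnets for internal use.
print(2 ** (64 - site_a.prefixlen))
```

Two sites that each pick a prefix this way collide with probability 2^-40, which is why joining them over a VPN later rarely needs renumbering.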

You can number a LAN with fd00::/64 and use IPv6 NAT to reach the internet, so the addresses are even shorter than 192.168.1.x. It's just not commonly done that way.
For an IPv6 advocate, this guy sure set up a lot of NAT. And while he claimed he's going IPv6-only, he set up public access via IPv4. That won't convince anyone to switch or upgrade.

I understand that he's building a usable service and just trying to git 'er done, but it's a lot of hacks, so I'm glad they're documented in this here blogpost.

I hope that he can continually probe the edges to find out when real IPv6 support becomes available, and can gradually remove the hacks for a purer experience.

The nice thing about NAT64 is you only NAT when you're talking to a v4-only client; otherwise you still have pure v6. This leaves no hacks to remove for a pure IPv6 experience; it just means you can have single-stack IPv6 devices instead of needing to dual-stack or wait for the entire rest of the world to configure IPv6 too.

I.e. it allows you to push IPv4 to your internet edge only in a way that doesn't downgrade anything about actual IPv6 capable connections. For a single server that probably does seem pretty silly but once you have multiple it can make more sense.
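To show how an IPv6-only host can name an IPv4-only destination, here is a sketch of the address mapping used with NAT64, assuming the well-known prefix 64:ff9b::/96 from RFC 6052. Whether the blog author's setup uses that prefix or a network-specific one is not stated, so the prefix is an assumption, and the function names are made up. In practice a DNS64 resolver usually does this synthesis for the client.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix. Deployments may use another /96.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize(v4_text: str, prefix=WELL_KNOWN_PREFIX) -> ipaddress.IPv6Address:
    """Place the 32-bit IPv4 address in the low bits of the /96 prefix."""
    v4 = ipaddress.IPv4Address(v4_text)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract(v6_text: str, prefix=WELL_KNOWN_PREFIX):
    """Recover the IPv4 address, or None if the address is native IPv6."""
    v6 = ipaddress.IPv6Address(v6_text)
    if v6 not in prefix:
        return None
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


synthetic = synthesize("192.0.2.1")
print(synthetic)
print(extract(str(synthetic)))
print(extract("2001:db8::1"))   # native IPv6: nothing to translate
```

Only traffic to addresses inside the /96 ever reaches the translator; everything else stays native IPv6 end to end.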

When the day comes that I have to shift to IPv6, I think I still want to NAT, though. I could be (and probably am) misunderstanding things, but I don't see how I can eliminate my need for it.

What I want it for is so that I can have services exposed through my domain name, but operated on different internal servers.

What you're asking for here is port forwards/DNAT, i.e. applying NAT to redirect an inbound connection. When people say "NAT", they're usually talking about SNAT/MASQUERADE, i.e. NATing outbound connections.

If you want to NAT inbound connections, you can do it without NATing outbound connections. Essentially: you don't need to NAT, you just need to port forward.

Honestly, I think you should just suck it up and use different hostnames for different services, because running all of your services on one IP is really bad for security since it makes it much easier to enumerate every service you're running -- it only takes scanning 65k ports on one IP to find them all, rather than 65k ports on 2^64 IPs. That's the difference between megabytes and yottabytes of port scan traffic.
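A back-of-the-envelope check of the "megabytes versus yottabytes" figure. The 60 bytes per probe is an assumption (roughly one small TCP SYN on the wire), not a number from the comment, and the scan is assumed to send exactly one probe per port.

```python
PORTS_PER_HOST = 2**16        # TCP ports on a single address
HOSTS_IN_SLASH_64 = 2**64     # addresses in one /64
BYTES_PER_PROBE = 60          # assumption: approximate size of one SYN probe

one_host_bytes = PORTS_PER_HOST * BYTES_PER_PROBE
whole_subnet_bytes = one_host_bytes * HOSTS_IN_SLASH_64

print(f"one address: {one_host_bytes / 1e6:.1f} MB")
print(f"whole /64:   {whole_subnet_bytes / 1e24:.1f} YB")
```

Under those assumptions a single address costs a few megabytes to scan, while an exhaustive scan of a /64 lands in the tens of yottabytes.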

If you NATed outbound connections to also come from this IP then things get even worse because every outbound connection any of your machines make would immediately inform the server of the IP needed to make an inbound connection to you. That's a completely unnecessary security sacrifice.

But if you're gonna do it, you can do just that, without trying to run the network on some local IP range too.

It's not uncommon to load-balance or NAT over fe80:: link-local addresses for internal service delivery for your use case. Some services are nicer, allowing DNS SRV records or the like, but many require the middle-man.

Some ask "why bother with IPv6 if I'm still going to do that then?" and generally the two key advantages are the fe80:: address co-exists with the unique public address of each box so you don't need outbound masquerade NAT pools and the fe80:: address space is enormous+interface specific so you don't have to worry about unique internal space or conflicts with other networks. Or, if you have a static IPv6 assignment in a more "proper" hosted deployment instead of a dynamic home deployment, you can of course just do stateless NAT to the public addresses without worrying about IP scarcity.

There are some services that have a different resource record in DNS like email (MX) or use SRV records like XMPP, but otherwise, yes, you are forced to either use a different hostname (like www. and ftp. and mail.) or do it port-based with NAT like you used to.
At that point you just use a much simpler reverse proxy, I think? NATs have to be stateful to operate, but in IPv6 you can do a lot with simpler stateless reverse proxies.
NATs in this inbound scenario are most often stateless 1:1 static mappings of the port.
For several services, you're probably right. But not all of them. I think.
What would you suggest they have done?

I feel like excluding IPv4 folks is a large reason why IPv6 continues to fail. I feel like this is a pretty good compromise between pushing IPv6 and not being an IPv6 hermit in the IPv6 desert.

At this point, I feel like the onus should be on IPv4-only clients to adapt to an IPv6 world, by enabling proxies and translators that enable them to access IPv6 sites until their support comes up to speed.

This could be done on the ISP/enterprise level, but it is more counterproductive to tell IPv6 adopters and promoters that we need to bend over backwards and hack in NAT and purchase/rent/lease public IPv4 addresses, when this is not our problem anymore.

I feel like the more juicy services that are IPv6-accessible-only, the more it will drive consumer demand, and will light a fire under people who are responsible to update the support and ensure that IPv6 works, even when IPv4 doesn't.

This is sort of happening. Consumer "demand" is already showing IPv6-first usage, in part because the (non-evil/braindead) consumer ISPs, to avoid CGNAT scenarios, have been moving to IPv6-first or IPv6-only with NAT64 gateways. This is especially the case with US mobile carriers, who are generally some of the largest ISPs at this point by volume of US consumer traffic.

It's mostly the Enterprise level that has failed to get the message and is failing the IPv6 internet. Even just the examples in this article: It makes zero sense that GitHub still has no AAAA records (and is increasingly slow and lethargic on mobile carriers via NAT64 gateways; it is not just that their mobile app is only so-so, it's also their networking is slow). It makes zero sense that Docker put its AAAA records on weird secondary domains instead of their main domains.

Now that all of the major cloud providers are charging for IPv4 address space on a per-hour scale that might see reflection in bottom lines in IT budgets, maybe there will be a fire finally lit under Enterprises to consider using more and better IPv6.

I would suggest letting v4 users keep their existing v4 addresses when going to v6.
I mean my intention was to go pure, hence finding the Docker IPv6 registry and doing the IPv6 stuff with the bridge interface.

My hope is folks know of workarounds and I’ll do them and update the post.

> And while he claimed he's going IPv6-only, he set up public access via IPv4. That won't convince anyone to switch or upgrade.

This is the major flaw. Sites can't go ipv6-only. In reality we will have parallel ipv4/6 networks in place until the wheels fall off

It depends on the country. I have been to places where IPv4 is the default choice. I have also visited countries where IPv6 is standard practice. As a consumer, you really do not see much difference; IPv6 works surprisingly well, if not better than IPv4.
Most US mobile carriers are IPv6-first or IPv6-only with big NAT64 gateways. One of those countries these days is the consumer parts of the US.
I think if every hosting company that charges for ipv4 addresses offered complimentary (or at least much cheaper) NAT64, then paying for an ipv4 would be only needed for ingress traffic, and ipv4 addresses could be dropped for most VMs.
I have Verizon FIOS and my area STILL does not have ipv6.
"The writing is on the wall" because cloud providers charge a few dollars a month for IPv4?
to be short: most stuff didn't work because they cling to their ipv4 address.
I wonder if he disabled IPv4 support in the kernel and stacks, or if he simply removed the IPv4 address from network interfaces and A records from DNS?
If the host has no v4 it switches to v6 if possible. Disabling the support is not needed
Very ambiguous reply. Which host? The remote or the local? What do you mean "has no v4"? No A record? No v4 address on an interface?

Which hardware and OS are you describing? Clearly the blog post illustrated some examples where switching to v6 was not happening, so it seems to contradict your comment right off the bat. There are many implementations of dual-stack IPv4/v6. In fact, they are more divergent than IPv4 implementations, because the latter often derives from the BSD-RENO codebase, while IPv6 was introduced after Linux became King, so Microsoft, Apple, and Linux (and lots of router/firewall vendors) have ostensibly developed IPv6 stacks separately, some being more open than others. They're not all going to work the same way with fallbacks/failovers.

If (a host trying to reach a remote service) has (no route to the address indicated by the A record) then it will try a AAAA record if one exists.

> Which hardware and OS are you describing?

This is how it's supposed to work on all OSes; on any recent BSD (excepting perhaps Apple?) or Linux setup, it should work this way.

> Clearly the blog post illustrated some examples where switching to v6 was not happening

In those situations it was for connecting to services that do not advertise a AAAA record.

No, actually it does not work that way. You have it backwards, first of all: IPv6 AAAA record is queried first, then the connection is attempted, and then a fallback may happen to an A record and an IPv4 connection attempted.

https://www.rfc-editor.org/rfc/rfc5220.txt

I'm not sure why you specified "if a host has no route to the address", because that's a very specific and transient failure. Furthermore, the dual-stack handling necessarily happens in the application, so this is not an OS or kernel-level decision, it will be subject to each individual app's behavior: https://issues.apache.org/jira/browse/SERF-190
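The "try IPv6 first, fall back to IPv4 on failure" behavior described in this comment can be sketched at the application level. This is a simplified illustration with made-up function names: real resolvers order candidates with the default address selection rules (RFC 6724), and browsers race the two families (Happy Eyeballs, RFC 8305) rather than trying them strictly in sequence. The connect step is injected as a callable so the logic can be shown without touching the network.

```python
import socket


def prefer_ipv6(candidates):
    """Stable sort that puts AF_INET6 candidates ahead of everything else."""
    return sorted(candidates, key=lambda c: 0 if c[0] == socket.AF_INET6 else 1)


def connect_with_fallback(candidates, connect):
    """Try each (family, sockaddr) in preference order until one succeeds."""
    last_error = None
    for family, sockaddr in prefer_ipv6(candidates):
        try:
            return connect(family, sockaddr)
        except OSError as exc:
            # Any connection failure, not just "no route to host",
            # moves on to the next candidate.
            last_error = exc
    raise last_error or OSError("no candidate addresses")


# Demo with a fake connector that simulates broken IPv6 connectivity.
attempts = []


def fake_connect(family, sockaddr):
    attempts.append(family)
    if family == socket.AF_INET6:
        raise OSError("no route to host")
    return "connected over IPv4"


candidates = [
    (socket.AF_INET, ("192.0.2.1", 443)),
    (socket.AF_INET6, ("2001:db8::1", 443, 0, 0)),
]
result = connect_with_fallback(candidates, fake_connect)
print(result, attempts)
```

With real sockets, the candidates would come from socket.getaddrinfo(host, port, type=socket.SOCK_STREAM); Python's own socket.create_connection loops over those results in a similar way. Since each application has to implement (or inherit) a loop like this, behavior can differ from app to app.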

As you can see from RFC5220, IPv6 is preferred over IPv4, unless an option is set to swap those around. Of course, certain configurations can confound this preference order, such as ULA IPv6.

"No route to host", as should be obvious, is only one of many errors that could prevent an app from establishing an IPv6 connection. It would seem that they should handle most failures as an occasion to fall back to IPv4, unless configured not to.