by dijit 2141 days ago
You've had a few replies so I guess mine will be lost to the aether.

NAT vs Direct addressing is an interesting topic, because we've gotten so used to working around the issues inherent in NAT that we take them as a sort of given. I'll lay them out here:

1) The actual NAT state table in your router is much slower than a simple bit-map firewall lookup. This will show up as a bit of latency on every new connection.

2) The state table can get full. When that happens some connection needs to be evicted. For web technologies this won't look too bad: maybe a websocket connection gets closed and reconnects in the background. But if you're streaming something over raw TCP then that's annoying. Basically it makes your internet connection just that little bit less stable.

3) UPnP exists to try to mitigate the p2p issues with NAT, but does a poor job. Take, for instance, a video game with VOIP (consoles are notorious for this): centralising and muxing everyone's audio is expensive, so it's more useful to help people build peer meshes. So "NAT punching" is the normal way to go, but of course that doesn't always work, so you get weird tutorials on "how to port forward" when in reality this shouldn't be needed; a stateful firewall would be enough if not for NAT. Some guides even suggest putting your devices in the DMZ with direct port forwards on every port from the internet[!!]

https://www.denofgeek.com/games/how-to-change-nat-type-on-ps...


> The state table can get full. When that happens some connection needs to be evicted.

This would be so much more convincing with some numbers to show it actually does happen in reality, especially at a rate that's comparable to other random connection drop-outs.

The most common symptom of this is someone mentioning that their home 'router' regularly needs reboots to keep working well. Excluding memory leaks, it's frequently the state table running out of space and connections going sideways as a result.

This is hard for individuals to see, but put a fair bit of load on a home consumer 'router' and, presuming you can get enough access to it to watch resources, you'll see it run out.

This is one of the things that better home network devices do: have sufficient RAM to handle a big state table, and manage it well.

IPv6 completely sidesteps this by not even needing a state table because no NAT.

> IPv6 completely sidesteps this by not even needing a state table because no NAT.

You may have forgotten that a stateful firewall tracking inbound and outbound connections still needs memory for a state table, and that still applies in IPv6.

Now it also needs 8x more memory per entry, as the addresses have gone from 2x 32bit to 2x 128bit.

There's almost certainly more data in each entry than just the IP addresses, so it won't be 8x. NAT also requires a second set of entries to track the NAT session, which further equalizes it.

Absolutely. A state is protocol, ports, addresses, timers, counters and more. QoS/DSCP, firewall marks and other things add to the fun.

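To put rough numbers on that, here is a small sketch of per-entry size. The address widths are fixed by the protocols (32 vs 128 bits), but `OTHER_BYTES` is a made-up placeholder for ports, timers, counters and the rest; real implementations will differ.

```python
# Rough per-entry size of a connection state for IPv4 vs IPv6.
ADDR_BYTES_V4 = 4    # one 32-bit IPv4 address
ADDR_BYTES_V6 = 16   # one 128-bit IPv6 address
OTHER_BYTES = 64     # ASSUMPTION: hypothetical size of ports, timers, counters, marks

def entry_bytes(addr_bytes: int, other_bytes: int = OTHER_BYTES) -> int:
    """Size of one state entry holding a source and a destination address."""
    return 2 * addr_bytes + other_bytes

# Growth of the two address fields alone.
address_ratio = (2 * ADDR_BYTES_V6) / (2 * ADDR_BYTES_V4)
# Growth of the whole entry once the non-address fields are included.
entry_ratio = entry_bytes(ADDR_BYTES_V6) / entry_bytes(ADDR_BYTES_V4)

print(f"addresses alone: {address_ratio:.1f}x")
print(f"whole entry (with {OTHER_BYTES} other bytes): {entry_ratio:.2f}x")
```

The address fields by themselves grow by a factor of 4 (8 bytes to 32 bytes), and the more non-address data an entry carries, the closer the whole-entry ratio gets to 1.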
Makes sense if this happens, but does this actually happen to you? I've heard vague and rather dubious third-hand stories along these lines, but I've never actually encountered a router that needs rebooting to keep working well.

This actually seems bizarre to me now that I think more about it. The routers I've seen allow something like a few hundred thousand established connections over like a ~week. Say 300,000 over 3 days. To exhaust this you'd need to establish on average one new connection every single second (300000/3/24/60/60 ≈ 1), continuously for a week, while also timing out on every single one of them silently. Surely a normal person wouldn't exhaust such a table?
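The same back-of-the-envelope figure as a quick script. The 300,000 slots and the 3-day window are the rough numbers from the paragraph above, not measurements of any particular router.

```python
# Average rate of new connections needed to fill a state table
# if nothing is ever removed during the window.
TABLE_SLOTS = 300_000   # rough figure from the comment, not a measured value
WINDOW_DAYS = 3         # rough figure from the comment, not a measured value

def fill_rate_per_second(slots: int, days: float) -> float:
    """New connections per second needed to use up `slots` entries in `days`."""
    seconds = days * 24 * 60 * 60
    return slots / seconds

rate = fill_rate_per_second(TABLE_SLOTS, WINDOW_DAYS)
print(f"{rate:.2f} new connections per second")
```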

Exhausted NAT state tables are extremely common. Evictions happen silently, and the assertion that a reboot is required is for other reasons which I think are likely unrelated.

Professionally I run one (two, actually) of those annoying 'always online video games' and state drops in low-quality routers are the most common cause of VOIP drops.

It seems like most router firmware has some kind of intelligent sensing software to see if there's a lot of traffic going over a state and then attempts to avoid evicting it. But for VOIP, which can sometimes be silent, or for a person not moving around in a game (and thus sending/receiving very few and very tiny updates), the drops can be seen.
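A toy sketch of why the quiet flow is the one that loses. It assumes a plain least-recently-used policy with a hard limit; I don't know what any real firmware actually does, and `StateTable` and the flow names are made up for illustration.

```python
from collections import OrderedDict

class StateTable:
    """Toy connection-tracking table: hard limit plus least-recently-used eviction.

    ASSUMPTION: real routers may use timers, adaptive timeouts or other
    policies; LRU is used here only to illustrate the effect.
    """

    def __init__(self, max_states: int) -> None:
        self.max_states = max_states
        self._states: "OrderedDict[str, int]" = OrderedDict()  # flow -> packet count
        self.evicted: list = []

    def packet(self, flow: str) -> None:
        """Record one packet on `flow`, evicting the stalest state if full."""
        if flow in self._states:
            self._states[flow] += 1
            self._states.move_to_end(flow)  # recently active flows are kept longest
            return
        if len(self._states) >= self.max_states:
            victim, _ = self._states.popitem(last=False)  # oldest activity goes first
            self.evicted.append(victim)
        self._states[flow] = 1

    def __contains__(self, flow: str) -> bool:
        return flow in self._states

table = StateTable(max_states=3)
table.packet("voip")              # VOIP stream sends one packet, then goes silent
for i in range(5):
    table.packet("download")      # busy flow keeps refreshing its state
    table.packet(f"web-{i}")      # short-lived web connections keep arriving

print("voip still tracked:", "voip" in table)
print("download still tracked:", "download" in table)
print("evicted in order:", table.evicted)
```

The busy download survives every round, while the silent VOIP state is the first to be evicted even though the call is still up from the user's point of view.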

Now, you want concrete evidence, and unfortunately the kinds of routers most people have (say, a Virgin Hub 3.0, which is based on the Touchstone TG2492[0]) do not lend themselves to being monitored well.

We're in some luck though, as I happen to run something immeasurably more powerful: a PfSense branded NetGate APU2[1]

PfSense absolutely /loves/ letting you know how it feels, and if we assume that I'm a "normal" user (I have 1 laptop, 1 phone and an Apple Watch as the only devices on my network right now and I'm just browsing like normal), then we have some measure of how much memory a state table really consumes.

My state table currently contains a mere 170 states (according to iftop), but it's not really hurting my memory:

> 6% of 4030 MiB

Yet I can see that some states have been forcefully closed, despite there being lots of RAM available to store them (these statistics were reset yesterday):

   state-mismatch                       748            0.0/s

In general the state table is very busy:

  State Table                          Total             Rate
    current entries                      152               
    searches                        90040931          338.1/s
    inserts                           437333            1.6/s
    removals                          437181            1.6/s

It's worth noting that this device is forcefully configuring itself to hit a max of 403000 states total:

  states        hard limit   403000

So it's not "memory" like you suggest, but since doing NAT translation on every single packet is CPU intensive, states can be dropped if the table can't keep up.
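As a sanity check, the counters quoted above are at least self-consistent. A minimal sketch using only the figures pasted in this comment (the variable names are mine):

```python
# Figures copied from the state table output quoted above.
inserts = 437_333
removals = 437_181
hard_limit = 403_000

# States inserted but not yet removed should match "current entries".
live_states = inserts - removals
# Share of the hard limit that is actually in use.
utilisation = live_states / hard_limit

print("live states:", live_states)
print(f"share of hard limit in use: {utilisation:.4%}")
```

Inserts minus removals gives 152, which matches the "current entries" line, and that is a tiny fraction of the 403000 hard limit.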

[0]: 256MB of ram reserved for the state table it seems: https://deviwiki.com/wiki/Virgin_Media_Super_Hub_3

[1]: 4G of general purpose ram: https://www.firewallhardware.it/en/apu2-3nic/

Thanks for sharing. While I have a hard time grasping your usage (why in the world are 3 devices opening 1.6 connections every second?), it's not really relevant as your own data shows state tables don't get exhausted, right? Your table only has 152 entries, which is quite a far cry from exhausting its 403,000 slots.

This is fairly normal and common, especially if you browse without aggressive ad blocking.

I routinely see a single ad impression make over 20-50 connections outbound, and repeatedly close and reopen or randomly open new ones for various reasons, the most common being some form of "anti ad fraud" tracking that repeatedly polls to get an average or median latency, new connections and requests firing on every mouse move, etc.

Would also be entirely unsurprised if phones that had free mobile games and equivalent were polling and sending stuff like location data every minute.

My point is that even when I don't quite run out, something is dropping states.

The hard limit is just one imposed by the OS; it doesn't seem to matter that I have absurd amounts of free memory, or that the kernel is quite content with loading up hundreds of thousands of states: they still get dropped.

And like I said, my hardware and software platform is many dozens of times more advanced than what most people are using at home.

As for the usage, it's easily explained: every single website I open, all of the things that website asks my browser to pull in, every DNS request, every NTP update and every 'ping' to see if the device is online counts as a new state.