by wtallis 3669 days ago
Your numbers are ridiculous. There's a huge gulf between buffering millions of packets per switch port before a single drop, and a 1 in 1000 drop probability. You're also assuming that the drops are indiscriminate when a refusal to consider AQM and fair queuing is what led Arista to this absurdity in the first place, and you're presuming that latencies would still be astronomical in a world without massive queues.

A 10GbE network in a datacenter without bufferbloat would have RTTs orders of magnitude smaller than the 100ms queuing delay Arista considers acceptable; the effects of a congestion event would be ancient history by the time Arista's queues could drain. Even outside the datacenter, 100ms is a pretty long time for most connections in a managed-queue world. A congestion event on a device using fq_codel won't kill your DNS request or TCP handshake; it'll slow down an established flow and if you're using ECN you won't even lose a packet. It's only in a DDoS-like scenario of thousands of unresponsive connections (such as TCPs with a large initial window) beginning to transmit simultaneously that you'd see some flows getting unfairly penalized, but things would equalize within a few RTTs if the traffic was real TCP and not a true DDoS. You only see it take minutes for a download's throughput to recover if you're going over multiple satellite links or through a severely bloated queue.
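To put rough numbers on that gap, here is a back-of-the-envelope sketch of what a 100ms queue at 10GbE line rate actually holds. The frame sizes (1538 and 84 bytes on the wire, including preamble and inter-frame gap) and the 100µs datacenter RTT are illustrative assumptions of mine, not figures from Arista, and the function names are made up for this example.

```python
# Back-of-the-envelope: how much data sits in a queue that takes
# 100 ms to drain at 10GbE line rate. Integer arithmetic throughout.

LINK_BPS = 10_000_000_000      # 10GbE line rate, bits per second
QUEUE_DELAY_US = 100_000       # 100 ms of queuing delay, in microseconds

# Assumed on-wire frame sizes (payload + Ethernet header + FCS
# + preamble + inter-frame gap). Illustrative values only.
FULL_FRAME_BYTES = 1538        # 1500-byte MTU frame
MIN_FRAME_BYTES = 84           # 64-byte minimum frame

# Assumed unloaded datacenter RTT; not a measured value.
ASSUMED_RTT_US = 100


def queue_bytes(link_bps: int, delay_us: int) -> int:
    """Bytes that must be buffered to add delay_us of delay at link_bps."""
    return link_bps * delay_us // 8_000_000


def packets_in_queue(link_bps: int, delay_us: int, wire_bytes: int) -> int:
    """Whole frames of wire_bytes each that fit in that backlog."""
    return queue_bytes(link_bps, delay_us) // wire_bytes


def delay_in_rtts(delay_us: int, rtt_us: int) -> int:
    """Queuing delay expressed as a multiple of the unloaded RTT."""
    return delay_us // rtt_us


if __name__ == "__main__":
    print("backlog bytes:", queue_bytes(LINK_BPS, QUEUE_DELAY_US))
    print("full-size frames:",
          packets_in_queue(LINK_BPS, QUEUE_DELAY_US, FULL_FRAME_BYTES))
    print("minimum-size frames:",
          packets_in_queue(LINK_BPS, QUEUE_DELAY_US, MIN_FRAME_BYTES))
    print("delay in RTTs:", delay_in_rtts(QUEUE_DELAY_US, ASSUMED_RTT_US))
```

Under those assumptions the backlog is 125 MB per port: about 81 thousand full-size frames, or roughly 1.49 million minimum-size frames, and the queue takes on the order of a thousand unloaded RTTs to drain.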

1 comment

Okay, make it one in a million. You will still be able to tell, and you will still see the phenomena I'm talking about.