| > disconnect all long-lived TCP sockets in an availability zone in unison I don't know what this means, but it sounds ridiculous. This would cause havoc with any sort of persistent tunnel or stateful connection, such as most database clients. Do you perhaps mean this just happens at ingress? That is much more believable and not as big of a deal. > office networks of large customers would do the same thing. Sounds like a personal problem. In all seriousness, your clients should handly any sort of network disconnect gracefully. It's foolish to assume TCP connections are durable, or to assume that you won't be hit by a thundering herd. Maybe I'm old fashioned but TCP hasn't changed much over the years, none of these problems are novel to me, it's well-trodden ground and there are many simple techniques to building durable clients. Also, all of the things you mention also affect plain old HTTP, especially HTTP2. There shouldn't be a significant difference in how you treat them, other than the fact you cannot assume they're all short lived connections. |
For us -- and I think this is common -- the persistent WebSocket connection allowed a set of assumptions around the shared state of the client and server that would have to be re-negotiated when reconnecting. The fact that this renegotiation was non-trivial was a major driver in selecting WebSockets in the first place. With HTTP, regardless of HTTP2 or QUIC, your application protocol very much is set up to re-negotiate things on a per-request basis. And so the issues I list don't tend to affect HTTP-based applications.