by ggreer 4053 days ago
A big problem with one-thread-per-connection is that you open yourself to slowloris-type DoS attacks.[1] Normal load (and even extreme load) is fine, but a few malicious clients can use up all of your threads and take down your server.

This is touched upon in the slides you linked to. On slide 62 (SMTP server) a point says, "Server spends a lot of time waiting for the next command (like many milliseconds)." A malicious client could send bytes very slowly, using up a thread for a much longer period of time. If the client has an async architecture, it can open multiple slow connections with little overhead. The asymmetry in resource usage can be quite staggering.

1. http://en.wikipedia.org/wiki/Slowloris_(software)


You seem to be imagining a case where you only allocate a small fixed thread-pool and when it runs out you just stop and wait. I think the slide deck is advocating that you just keep allocating more threads.

I'm talking about hitting OS or resource limits. Let's say a server is configured to time-out requests after 2 minutes. A malicious client could do something like...

Every second:

1. Open 40 connections to the server.

2. For all open connections, send one byte.

Repeat indefinitely.

Steady state would be reached at 4,800 open connections. At 1 byte of actual data per second per connection, data plus TCP overhead would use around 200KB/s of bandwidth. The server would have to run 4,800 threads to handle this load. Depending on memory usage per thread, this could exhaust the server's RAM.
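
For anyone who wants to check the arithmetic, here is a rough sketch of where those numbers come from. The 40-byte header figure is my assumption (IPv4 plus TCP, no options), and link-layer framing and ACK traffic are ignored, so treat the bandwidth as a ballpark only.

```python
# Back-of-the-envelope model of the example attack above.
# Assumption: 20 bytes IPv4 + 20 bytes TCP per packet, no options,
# no link-layer framing, no ACKs counted.

def steady_state_connections(new_per_second: int, timeout_seconds: int) -> int:
    """Connections open once new arrivals and server timeouts balance out."""
    return new_per_second * timeout_seconds


def bandwidth_bytes_per_second(connections: int, payload_bytes: int = 1,
                               header_bytes: int = 40) -> int:
    """Bytes per second if every connection sends one small packet per second."""
    return connections * (payload_bytes + header_bytes)


connections = steady_state_connections(new_per_second=40, timeout_seconds=120)
bandwidth = bandwidth_bytes_per_second(connections)
print(connections)       # 4800
print(bandwidth / 1000)  # 196.8, i.e. roughly 200 KB/s
```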

There are ways to mitigate this simple example attack, but the only way to defend against more sophisticated variants is to break the one-thread-per-connection relationship.
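
As a minimal sketch of what breaking that relationship can look like, here is a single-threaded echo server using Python's asyncio with a deadline per command line. The timeout, the line cap, and the echo behaviour are all placeholders I picked for illustration, not anything from the slides, and a real server would need more than this (connection caps, per-IP limits, and so on).

```python
import asyncio

MAX_LINE = 1024  # hypothetical cap on command length, in bytes


def make_handler(read_timeout: float):
    """Build a connection handler that allows `read_timeout` seconds per line."""

    async def handle_client(reader, writer):
        try:
            while True:
                # The deadline covers the whole line, so trickling one byte
                # per second does not keep the connection alive.
                line = await asyncio.wait_for(reader.readline(), read_timeout)
                if not line:  # client closed the connection
                    break
                writer.write(line)  # echo stands in for real command handling
                await writer.drain()
        except (asyncio.TimeoutError, ValueError, ConnectionError):
            pass  # too slow, line too long, or reset: drop the client
        finally:
            writer.close()

    return handle_client


async def demo() -> list:
    """Run one well-behaved client and one stalled client against the server."""
    server = await asyncio.start_server(
        make_handler(read_timeout=0.2), host="127.0.0.1", port=0, limit=MAX_LINE)
    port = server.sockets[0].getsockname()[1]
    results = []

    # Well-behaved client: sends a complete line and gets it echoed back.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"HELO example\n")
    await writer.drain()
    results.append(await reader.readline())

    # Slow client: sends a partial line and stalls until the server hangs up.
    slow_reader, slow_writer = await asyncio.open_connection("127.0.0.1", port)
    slow_writer.write(b"H")
    await slow_writer.drain()
    results.append(await slow_reader.read())  # empty once the server closes

    for w in (writer, slow_writer):
        w.close()
    server.close()
    await server.wait_closed()
    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Each idle connection here costs a coroutine and some buffers rather than a whole thread, and the stalled client is dropped when its deadline expires. Whether that is actually cheaper than a thread in a given system is exactly the kind of thing that needs measuring.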

What I am truly missing is a good benchmark comparing async and sync. Everybody seems to say that async is best, but I don't see much evidence. For example, how would 4800 threads exhaust the server's RAM when the thread stack size can be as small as 48kB? That's around 230MB of memory.
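
Spelling out that estimate, here is a quick sketch. The 48kB figure is from the comment above (treated as 48 KiB here); the 8 MiB comparison is my assumption, based on a common Linux default stack limit, and it is reserved address space rather than memory that is actually touched.

```python
def stack_memory_bytes(threads: int, stack_kib: int) -> int:
    """Total stack reservation for `threads` threads with `stack_kib` KiB each."""
    return threads * stack_kib * 1024


small = stack_memory_bytes(threads=4800, stack_kib=48)
print(small)                 # 235929600
print(round(small / 2**20))  # 225 (MiB)

# Assumed 8 MiB default stack, for comparison. This is mostly virtual
# address space; only the pages a thread touches become resident.
default = stack_memory_bytes(threads=4800, stack_kib=8 * 1024)
print(default / 2**30)       # 37.5 (GiB)
```

So the answer depends heavily on how the stack size is configured. In Python, for instance, threading.stack_size() can request a smaller stack for new threads, subject to platform minimums.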

I'm not saying that the threaded approach is better, but almost everyone comes up with some theoretical statement and nobody seems to care about finding hard evidence.

You are right to distrust these claims. The reality is that threads can be significantly faster than async -- async code has to do a lot of bookkeeping and that bookkeeping has overhead. OTOH, threads have their own kind of overhead that can also be bad.

The slide deck that bysin linked above is pretty good:

http://www.mailinator.com/tymaPaulMultithreaded.pdf

This is by Paul Tyma, who at the time worked on Google's Java infrastructure team with Josh Bloch and other people who know what they're doing. Apparently he found threads to be faster in a number of benchmarks.

Ultimately which is actually faster will always depend on your use case. Unfortunately this means that general benchmarks aren't all that useful; you need to benchmark your system. And you aren't going to write your whole system both ways in order to find out which is faster. So probably you should just choose the style you're more comfortable with.

Async is kind of like libertarianism: It works pretty well in some cases, pretty poorly in others, but it has a contingent of fans who think they've discovered some magic solution to all problems and if you disagree then you must just not understand and you need to be educated.

(Note: The code I've been writing lately is heavily async, FWIW.)

Why is 4800 threads a problem, and 4800 heap-allocated callbacks not a problem? Are you assuming a thread consumes significantly more memory than the state you'd need to allocate in the async case? This isn't necessarily true.