by eklitzke 1475 days ago
I agree, and this article seems pretty misinformed. Creating and managing threads on Linux is extremely cheap, especially when a lot of them are idle, and a lot of big companies (Google, Facebook, Amazon) run huge C++ applications with thousands of threads without trouble. I also think a lot of people who don't work on these problems at these kinds of companies assume that it must be incredibly difficult to write and debug code like this, but that's not really true. For one thing, the tricky parts are generally abstracted away so that regular engineers don't have to think much about threading and concurrency issues. And when such issues do come up, tsan and lock annotations[1] will catch 99.9% of them in testing and make it easy to understand why things are breaking.
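
To get a rough feel for the "idle threads are cheap" claim, here is a minimal sketch in Python rather than C++, so it measures interpreter overhead on top of the kernel's and is not a benchmark of the setups described above. The helper names `spawn_idle_threads` and `shut_down` and the thread count of 256 are made up for illustration.

```python
import threading
import time


def spawn_idle_threads(n):
    """Start n threads that just block on an Event; return them with the creation time."""
    release = threading.Event()
    started = time.perf_counter()
    threads = [threading.Thread(target=release.wait) for _ in range(n)]
    for t in threads:
        t.start()
    elapsed = time.perf_counter() - started
    return threads, release, elapsed


def shut_down(threads, release):
    """Wake every idle thread and wait for all of them to exit."""
    release.set()
    for t in threads:
        t.join()


if __name__ == "__main__":
    # 256 is an arbitrary illustrative count; raise it to see how creation time scales.
    threads, release, elapsed = spawn_idle_threads(256)
    print(f"started {len(threads)} idle threads in {elapsed:.3f}s")
    shut_down(threads, release)
```

While the threads sit in `release.wait`, they are blocked and consume no CPU time, which is the idle case the comment refers to.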

In the real world here are the kinds of problems that people at Google etc. care about when it comes to performance or scalability issues with hugely concurrent programs:

  - Noisy neighbor problems from other threads messing with your TLB and L1 cache
  - High cost of context switches
  - Unpredictable scheduling/priority inversion in the scheduler

The first problem isn't actually made any better by using async coroutines or green threads/fibers: if you switch to another coroutine or fiber and it does something naughty (e.g. munmaps memory, which will cause a TLB shootdown), it's going to degrade performance for your unrelated coroutine/fiber.

The second and third problems can be solved in some cases by things like fibers and userspace scheduling, but this is a fairly advanced topic and "just use async" is definitely not the solution. If you're interested in learning more about how these problems are actually solved at Google, for example, I recommend [2] and [3].

[1] https://abseil.io/docs/cpp/guides/synchronization#thread-ann...
[2] https://www.youtube.com/watch?v=KXuZi9aeGTw
[3] https://storage.googleapis.com/pub-tools-public-publication-...

2 comments

> - Noisy neighbor problems from other threads messing with your TLB and L1 cache

Switching between threads within the same process doesn't require a TLB or L1 cache flush. Not sure if you were implying this, just wanted to point that out.

> - High cost of context switches

Userspace schedulers (like Rust's tokio) do make context switching cheaper. However, most of the context switching in a web server is due to blocking I/O, and the most expensive part of the switch, entering the kernel, is already accounted for by the I/O request. Kernel context switching is unlikely to be your bottleneck.
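
One way to see that blocking I/O is where these switches come from is to watch the voluntary context switch counter around some blocking reads. This is only a sketch, assuming a Unix system where `getrusage` reports `ru_nvcsw`; the function names and the pipe-plus-writer-thread setup are illustrative, and the counter covers the whole process, so the writer's sleeps are included too.

```python
import os
import resource
import threading
import time


def voluntary_switches():
    """Voluntary context switches charged to this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw


def blocked_read_switches(rounds=20):
    """Do `rounds` blocking one-byte pipe reads; return the change in the counter."""
    r, w = os.pipe()

    def writer():
        for _ in range(rounds):
            time.sleep(0.001)  # make the reader wait in the kernel first
            os.write(w, b"x")

    t = threading.Thread(target=writer)
    before = voluntary_switches()
    t.start()
    for _ in range(rounds):
        os.read(r, 1)  # blocks until the writer delivers a byte
    t.join()
    after = voluntary_switches()
    os.close(r)
    os.close(w)
    return after - before


if __name__ == "__main__":
    print("voluntary context switches during blocking reads:", blocked_read_switches())
```

Each blocked `os.read` already had to enter the kernel to issue the read, which is the cost the comment says is paid regardless of how the switch itself is implemented.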

> Unpredictable scheduling/priority inversion in the scheduler

This can definitely be an issue at scale, but a general-purpose async scheduler of the kind most people use is unlikely to be any better.

As another data point, I have one Firefox window right now:

    $ ps -eLf | grep firefox | wc -l
    569
    $
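
For a per-process count without the `ps` pipeline (where `grep firefox` can also match its own process and add to the total), the same information is available on Linux under `/proc/<pid>/task`, which has one entry per thread. A small sketch, with `thread_count` as a made-up helper name:

```python
import os
import threading


def thread_count(pid="self"):
    """Number of kernel threads in a process, read from /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))


if __name__ == "__main__":
    release = threading.Event()
    workers = [threading.Thread(target=release.wait) for _ in range(3)]
    print("threads before:", thread_count())
    for t in workers:
        t.start()
    print("threads after starting 3 idle workers:", thread_count())
    release.set()
    for t in workers:
        t.join()
```

Passing a numeric pid instead of the default `"self"` inspects another process, subject to the usual permissions on `/proc`.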