by chc4 680 days ago
> There's obviously no lock

What? The threadpool has a shared mutex accessed by each worker thread, which is used for pushing work on heartbeat and for dequeuing work. https://github.com/judofyr/spice/blob/167deba6e4d319f96d9d67...

"Adding more threads to the system will not make your program any slower" is an insane claim to be making for any lightweight task scheduling system, and trivially not true looking at this: if you have 1000 threads as workers and a degenerate tree program, then each worker will take turns serializing itself through the mutex. Things like these are massive red flags to anyone reading the README.

1 comments

The README covers this. Its argument is persuasive. If your point is that the constant is badly tuned for theoretical 1000-core machines that don't exist, I'm not sure I care. A 100ns stall at most every 100us that becomes more likely as you approach multiple hundreds of cores is hardly a disaster. In the context of the comment I replied to, the difference between 8 and 16 workers is literally zero, as the wakeups are spaced so the locks will never conflict.
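
To put rough numbers on that, here is a back-of-envelope sketch. It takes the 100ns and 100us figures above at face value and assumes (my simplification, not something checked against the actual code) that each worker takes the shared lock once per heartbeat and that wakeups are spread evenly across the interval. The function names are made up for illustration.

```python
# Figures quoted in the comment above; treated here as assumptions.
HEARTBEAT_NS = 100_000  # one heartbeat per worker every 100us
LOCK_HOLD_NS = 100      # time the shared lock is held per heartbeat


def lock_utilization(workers):
    """Fraction of wall-clock time the shared lock is held, assuming
    every worker takes it exactly once per heartbeat interval."""
    return workers * LOCK_HOLD_NS / HEARTBEAT_NS


def staggered_wakeups_overlap(workers):
    """With wakeups spaced evenly over the interval, two lock holds can
    only collide once a worker's time slot is shorter than the hold."""
    slot_ns = HEARTBEAT_NS / workers
    return slot_ns < LOCK_HOLD_NS


for n in (8, 16, 1000, 2000):
    print(n, lock_utilization(n), staggered_wakeups_overlap(n))
```

Under those assumptions the lock is busy about 0.8% of the time at 8 workers and 1.6% at 16, and evenly spaced wakeups cannot overlap until you go past roughly 1000 workers, which is where the "1000 cores" figure comes from.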

Actually, if you somehow did have a 32k-core machine with memory magically uniform enough for microthreading to be sensible on it, I don't think it's even hard to extend the algorithm to handle that. Just put the workers on a 3D torus and only share orthogonally, i.e. with the neighbors along each axis. It means you don't have perfect work sharing, but I'm also pretty sure it doesn't matter.
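
A minimal sketch of what the torus idea could look like, purely as illustration. It assumes "share orthogonally" means each worker only exchanges work with the six workers one step away along each axis, with wraparound. The `torus_neighbors` helper and the 32x32x32 layout (32^3 = 32768) are hypothetical and not part of Spice.

```python
def torus_neighbors(worker_id, dims):
    """Return the ids of the workers one step away along each axis of a
    3D torus with wraparound. Ids are laid out as x + X*(y + Y*z).
    Assumes every dimension is at least 3, so the six neighbors are
    distinct."""
    x_dim, y_dim, z_dim = dims
    x = worker_id % x_dim
    y = (worker_id // x_dim) % y_dim
    z = worker_id // (x_dim * y_dim)

    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coord = [x, y, z]
            coord[axis] = (coord[axis] + step) % size  # wrap around the edge
            neighbors.append(coord[0] + x_dim * (coord[1] + y_dim * coord[2]))
    return neighbors


# 32 * 32 * 32 = 32768 workers, each sharing with only 6 others.
print(torus_neighbors(0, (32, 32, 32)))
```

Each worker would then contend with at most six others for any given lock, regardless of the total worker count, at the cost of work taking several hops to spread across the machine.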