by DanWaterworth 5528 days ago
I too doubt that they have anything new with regard to parallelizing imperative programs. If the only way to use all the cores is to write functional code or to parallelize manually, then they don't really have a significant advantage over FPGA coprocessors. I do agree with them, however, when they say that this approach is better than creating lots of general-purpose cores.

What no one has mentioned is that 4000 stacks is a lot of memory.
Not if they're done as split/segmented stacks (http://gcc.gnu.org/wiki/SplitStacks). Basically, each thread gets a collection of 4KB stack pages instead of one large up-front allocation, and the stack grows as needed. It costs a few instructions per function entry/exit, but the overall cost is negligible, and it lets you run thousands of coroutines without issues.
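To make the split-stack idea concrete, here is a toy Python model, not how GCC actually implements it. The class name `SegmentedStack`, its methods, and the bookkeeping are all hypothetical; the only detail taken from the comment above is the 4KB segment size. The check in `enter` stands in for the "few instructions per function entry" that decide whether a new segment is needed.

```python
SEGMENT_SIZE = 4 * 1024  # 4KB segments, as described in the comment above


class SegmentedStack:
    """Toy model of a split stack: fixed-size segments allocated on demand."""

    def __init__(self):
        self.segments = [bytearray(SEGMENT_SIZE)]  # start with a single segment
        self.used = 0      # bytes used in the newest segment
        self.frames = []   # (segment count, used) saved at each function entry

    def enter(self, frame_size):
        """Function entry: grow by one segment if the frame does not fit."""
        if frame_size > SEGMENT_SIZE:
            raise ValueError("frame larger than a segment (not handled in this sketch)")
        self.frames.append((len(self.segments), self.used))
        if self.used + frame_size > SEGMENT_SIZE:
            self.segments.append(bytearray(SEGMENT_SIZE))
            self.used = 0
        self.used += frame_size

    def leave(self):
        """Function exit: restore the saved state and release unused segments."""
        count, used = self.frames.pop()
        del self.segments[count:]
        self.used = used

    def footprint(self):
        """Bytes currently held by this stack."""
        return len(self.segments) * SEGMENT_SIZE
```

Under this model, a coroutine that never calls deeply holds one 4KB segment, so 4000 idle coroutines would hold 4000 * 4096 bytes (about 16MB) rather than 4000 full-size stacks. A real implementation also has to deal with frames larger than a segment and with calls into code that was not compiled for split stacks, which this sketch ignores.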
If you allocate the stacks contiguously using mmap, then memory is only used as it is accessed. That's not the problem. The problem is that 4000 concurrent non-trivial threads are a resource hog no matter how the stacks are allocated.
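A rough sketch of the mmap point, assuming a Unix-like system with demand paging (Linux, for example). The 64KB per-stack size and the helper names are made up for illustration; real thread stacks are usually reserved much larger. The code reserves one contiguous anonymous mapping for 4000 stacks and then touches only the top page of each, since stacks typically grow downward.

```python
import mmap

STACK_SIZE = 64 * 1024  # assumed per-stack reservation, chosen only for this sketch
NUM_STACKS = 4000
PAGE = mmap.PAGESIZE


def reserve_stacks(num_stacks=NUM_STACKS, stack_size=STACK_SIZE):
    """Reserve one contiguous anonymous mapping big enough for all stacks."""
    return mmap.mmap(
        -1,
        num_stacks * stack_size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )


def touch_top_page(region, index, stack_size=STACK_SIZE):
    """Write one byte at the highest address of stack `index` and return its offset."""
    top = (index + 1) * stack_size - 1
    region[top] = 1
    return top


def touched_bytes(num_touched_pages):
    """Estimate of memory actually backed by pages after the touches."""
    return num_touched_pages * PAGE
```

Calling `reserve_stacks()` and then `touch_top_page(region, i)` for every stack reserves 4000 * 64KB of address space, but only about one page per stack should end up backed by physical memory. That supports the first half of the comment; it says nothing about the scheduling and cache cost of 4000 busy threads, which is the actual objection.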