Wouldn't you use exactly one thread per CPU for this sort of brute-force, embarrassingly parallel, CPU-intensive computation? You should be context switching approximately never, which makes coroutine vs. thread moot.
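To make the one-worker-per-CPU layout concrete, here is a minimal sketch in Python. It assumes processes stand in for threads, since CPython's GIL keeps threads from running CPU-bound code in parallel. The prime-counting workload and the names `count_primes`, `chunk_ranges`, and `parallel_count_primes` are hypothetical, chosen only for illustration, and are not taken from the benchmark under discussion.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately brute force)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        d = 2
        is_prime = True
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def chunk_ranges(total, workers):
    """Split [0, total) into `workers` contiguous, near-equal half-open ranges."""
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_count_primes(limit, workers=None):
    """Run one worker per CPU, each handling a single chunk from start to finish."""
    # Assumption: os.cpu_count() is a good enough proxy for usable cores.
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunk_ranges(limit, workers)))


if __name__ == "__main__":
    print(parallel_count_primes(200_000))
```

Each worker gets one chunk and runs it to completion, so there is nothing to switch between unless something else on the machine competes for the cores. Equal-sized ranges are not equal work here, because larger numbers cost more to test, so a real run would probably want smaller chunks or a work queue.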
The point was more to see what the overhead is for CPU-bound tasks that do need to switch, not to suggest that this is the best approach for this particular task. I just happened to have the threaded version available, so I thought it made a good point of comparison.