For those curious, this implementation is based on a recent line of research called "heartbeat scheduling" which amortizes the overheads of creating parallelism, essentially accomplishing a kind of dynamic automatic granularity control.
Oh this is super interesting. I was only aware of the two first while writing Spice. I’ll definitely look into the two last as well. Thanks for sharing!