Wouldn't it make a lot of sense to just use PySpark with RDDs? Latency would be relatively high, but it would bypass the GIL and is also more modern.
In my experience, PySpark is much flakier and more annoying than doing parallel computing with more 'Python native' tools. It only really makes sense once you've outgrown small clusters and really need huge infrastructure.
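
As a rough illustration of the 'Python native' route, here is a minimal sketch using the standard library's `concurrent.futures.ProcessPoolExecutor`. Each worker is a separate process with its own interpreter and its own GIL, so CPU-bound work can run in parallel on one machine. The function names and the prime-counting task are made up for this example, and this is only one of several such tools, not necessarily the one meant above.

```python
import math
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound toy task (hypothetical): count primes below `limit` by trial division."""
    return sum(
        1
        for n in range(2, limit)
        if all(n % d for d in range(2, math.isqrt(n) + 1))
    )


def count_primes_parallel(limits, workers=2):
    """Run count_primes for each limit in a pool of worker processes."""
    # Assumption for this sketch: prefer the "fork" start method where the
    # platform offers it, otherwise fall back to the platform default.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print(count_primes_parallel([10, 100, 1000]))  # [4, 25, 168]
```

Nothing here needs a cluster manager or a JVM, which is most of the appeal at small scale. The trade-off is that it stops at one machine, which is roughly the point where something like Spark starts to earn its overhead.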