I would say that a properly configured Tortoise is better, but that comes with the massive caveat that Tortoise:
1 - Is a real pain to get 'working right'; it's not even remotely batteries-included
and, more importantly:
2 - Is incredibly slow. I've been turning Heart of Darkness into an audiobook as a unit test and it takes ~30m per paragraph, on average. Add to that the occasional hiccup where a block gets rendered badly (Tortoise occasionally 'drops out' of its selected voice), and Tortoise only really works if you have a ton of compute and, even then, don't mind waiting forever.
It wasn't clear to me how Voicebox compares, though.