I don't think the purpose of these systems is to fool humans. Tasks that test whether a system can fool people are simply a good way to evaluate the performance of a system. If a speech synthesizer fools people into thinking a real person is speaking, that means the synthesis is really good. You might say it's not important that synthesized speech sounds perfectly human, but our speech perception evolved to be optimal for human speech, so it's likely that any deviation from that makes the signal harder to process.