> Indian accents are narrative and not conversational in nature
What are some examples of how these things differ? I've been exploring Hindi recently, but find that I'm learning some pretty stuffy speech from Snell's books.
The way I think about realistic conversational speech is that if you get a phone call, you should not be able to tell whether it is an AI or a human just based on the voice. For English and even some Asian languages like Chinese, this has already happened.
If you are a non-Hindi speaker and want to understand the difference, then I might find it difficult to explain :P But whatever you are learning, if you start practicing with a native speaker, I am sure you will easily surpass the SoTA Hindi TTS models.
Non-conversational example: https://www.youtube.com/watch?v=ayYk3XkP0ts&t=22s&ab_channel...
You can listen to this and easily tell that it's AI-generated speech. However, it works very well for dubbing etc.