Hacker News
by nshm 1228 days ago
In an area as actively developed as TTS/ASR, there is a high chance that a custom solution would fit your needs much better. The feature set of TTS is actually pretty large and hard to combine in a single ML model. No free lunch, you know.

For example, if you are looking for a singing voice, they might suggest an adapted model that is good specifically for singing.

The testing process is also not very straightforward: you need to understand what to test and how to test it properly. For example, some of their voices might be better for questions, some for news.

You'd better talk to them.