Hacker News
by bazlan 674 days ago
As someone who has worked in TTS for over 4 years now, I can tell you that evaluation is the most difficult aspect of generative audio ML.

How will this actually check that the models are performing well, compared with just listening to the output?

1 comment

We're focused on end-to-end evals that cover function-call accuracy, style, tone, and latency in the conversations between our sims and your voice agent. We're less focused on pure TTS evals at the moment!
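
To make two of those metrics concrete, here is a rough sketch of how function-call accuracy and latency might be scored over one simulated conversation. This is only an illustration under assumptions, not the actual implementation: the `Turn` schema, its field names, and both helper functions are hypothetical, and style and tone are left out because they need a subjective or model-based judge.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Turn:
    """One agent turn in a simulated conversation (hypothetical schema)."""
    expected_call: Optional[str]  # function the agent should have called, or None
    actual_call: Optional[str]    # function the agent actually called, or None
    latency_ms: float             # delay before the agent responded


def function_call_accuracy(turns: List[Turn]) -> float:
    """Fraction of turns whose actual call matches the expected call."""
    if not turns:
        return 0.0
    correct = sum(t.expected_call == t.actual_call for t in turns)
    return correct / len(turns)


def median_latency_ms(turns: List[Turn]) -> float:
    """Median response latency across turns, in milliseconds."""
    return median(t.latency_ms for t in turns)


# Made-up example data, for illustration only.
turns = [
    Turn("book", "book", 400.0),
    Turn(None, None, 600.0),        # no call expected, none made: counts as correct
    Turn("cancel", "book", 800.0),  # wrong function called
    Turn("lookup", "lookup", 1000.0),
]
accuracy = function_call_accuracy(turns)
latency = median_latency_ms(turns)
```

A real harness would likely also compare call arguments, not just function names, and report tail latency (for example p95) alongside the median.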