Show HN: Speech-1, TTS to reduce AI call drop-rate (metavoice.io)
4 points by vatsalaggarwal 464 days ago
We’re releasing access to Speech-1, our conversational speech model designed specifically for customer phone calls (8 kHz telephony) to reduce call drop rates.

Try it: https://tts.metavoice.io
Samples: https://www.metavoice.io

Voice AI calls today face >30% call drop rates with current synthetic voices. We want to make this tend to 0 by making AI speech far more human-like and engaging.

Current synthetic voices are overly perfect and don’t sound like the typical human phone agent. They have audiobook-like narration and erratic emotions (e.g. an over-enthusiastic “Got it!”), and they lack natural-sounding imperfections (like breaths, hmm, umm). They also can’t reliably speak numbers, spellings, emails, etc., and often produce chipmunk-like sped-up speech.

Speech-1 fixes these problems!

We’re also releasing a research preview of Speech-1.1. It’s trained using ProsodyAlign, a novel post-training technique to more closely capture human-like speech patterns – pauses, stress, and how the voice goes up/down in tone on different words. However, it is currently less stable than Speech-1.

For more details, read our blog post: https://www.metavoice.io/blog-posts/conversational-speech-mo...

1 comment

Really impressed by the quality of the voice!