by oidar 462 days ago
ElevenLabs is the only one offering speech-to-speech generation where the intonation, prosody, and timing are kept intact. This allows one expressive voice actor to slip into many other voices.

OpenAI’s Realtime speech-to-speech is far superior to ElevenLabs’.
ElevenLabs and OpenAI mean completely different things by “speech to speech”.

ElevenLabs’ version takes speech audio as input and maps it to new speech audio that sounds as if a different speaker said it, but with the exact same intonation.

OpenAI’s is an end-to-end multimodal conversational model that listens to a user speaking and responds in audio.

I see now. Thank you for clarifying. I thought this was about ElevenLabs’ Conversational API.