| HN Mirror


	by oidar 462 days ago
	ElevenLabs is the only one offering speech-to-speech generation where the intonation, prosody, and timing are kept intact. This allows one expressive voice actor to slip into many other voices.

1 comments

goshx 462 days ago

OpenAI’s Realtime speech to speech is far superior to ElevenLabs.

noahlt 462 days ago

What ElevenLabs and OpenAI call “speech to speech” are completely different.

ElevenLabs’ version takes speech audio as input and maps it to new speech audio that sounds as if a different speaker said it, but with the exact same intonation.

OpenAI’s is an end-to-end multimodal conversational model that listens to a user speaking and responds in audio.

goshx 461 days ago

I see now. Thank you for clarifying. I thought this was about the ElevenLabs Conversational API.
