

by nojs 835 days ago

This matches my experience doing it with Elixir/OpenAI/ElevenLabs as well.

Depending on the application it’s also possible to fire the whole thing off pre-emptively, and then use the early response unless later context explicitly invalidates it.
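A rough sketch of that pre-emptive idea in Python with asyncio, not tied to any particular stack. Everything named here is hypothetical: `generate_reply` stands in for the LLM/TTS pipeline, and `invalidates` is a placeholder rule for deciding that later context makes the early answer stale.

```python
import asyncio
from typing import Awaitable


async def generate_reply(transcript: str) -> str:
    """Hypothetical stand-in for the slow LLM + TTS pipeline."""
    await asyncio.sleep(0.01)
    return f"reply to: {transcript}"


def invalidates(early: str, final: str) -> bool:
    """Placeholder rule (an assumption): any change to the transcript
    makes the early reply stale. A real check would be application-specific."""
    return early.strip() != final.strip()


async def respond_speculatively(early: str, final_transcript: Awaitable[str]) -> str:
    # Start generating from the early transcript before the user has finished.
    speculative = asyncio.create_task(generate_reply(early))
    final = await final_transcript
    if invalidates(early, final):
        # Later context contradicts the early guess: discard it and redo the work.
        speculative.cancel()
        return await generate_reply(final)
    # Early guess still holds, so its head start is pure latency savings.
    return await speculative


def run_demo(early: str, final: str) -> str:
    """Drive one turn with a simulated delay before the final transcript arrives."""
    async def main() -> str:
        async def wait_for_final() -> str:
            await asyncio.sleep(0.02)
            return final
        return await respond_speculatively(early, wait_for_final())
    return asyncio.run(main())
```

When the final transcript matches the early one, the reply was already in flight while the user was still talking; otherwise the cost is one wasted call.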

Another cool trick to get around TTS latency is to maintain an audio cache keyed by semantic meaning, and get the LLM to choose from the cache. This saves high TTS API costs too.
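Roughly what the cache could look like, as a minimal Python sketch under stated assumptions: `fake_tts` and `fake_llm_choose` are hypothetical stand-ins for the real TTS and LLM calls, and the keys ("confirm_booking", "goodbye") are made-up labels for the meaning of each clip.

```python
from typing import Callable, Dict, List, Optional


class AudioCache:
    """Audio clips keyed by a short label for what the clip means."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize
        self._texts: Dict[str, str] = {}
        self._clips: Dict[str, bytes] = {}
        self.tts_calls = 0  # counts how often the (paid) TTS call is made

    def register(self, key: str, text: str) -> None:
        self._texts[key] = text

    def keys(self) -> List[str]:
        return sorted(self._texts)

    def get(self, key: str) -> bytes:
        # Synthesize on first use only; later requests reuse the stored clip.
        if key not in self._clips:
            self.tts_calls += 1
            self._clips[key] = self._synthesize(self._texts[key])
        return self._clips[key]


def fake_tts(text: str) -> bytes:
    """Hypothetical TTS stand-in: returns bytes instead of real audio."""
    return text.encode("utf-8")


def fake_llm_choose(user_text: str, keys: List[str]) -> Optional[str]:
    """Hypothetical LLM stand-in: picks a cache key, or None if nothing fits."""
    lowered = user_text.lower()
    if "book" in lowered and "confirm_booking" in keys:
        return "confirm_booking"
    if "thank" in lowered and "goodbye" in keys:
        return "goodbye"
    return None


def reply_audio(cache: AudioCache, user_text: str) -> Optional[bytes]:
    key = fake_llm_choose(user_text, cache.keys())
    # None means no cached meaning fits; the caller falls back to live TTS.
    return cache.get(key) if key is not None else None
```

The LLM only has to emit a key rather than free text, so a cache hit skips the TTS round trip entirely.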

1 comment

Dowwie 835 days ago

Appointment scheduling seems like an ideal consumer of cached audio responses, but how can segments be concatenated into a natural-sounding response?
