by nojs
789 days ago
This matches my experience doing it with Elixir/OpenAI/ElevenLabs as well. Depending on the application, it’s also possible to fire off the whole pipeline pre-emptively and use the early response unless later context explicitly invalidates it. Another cool trick to get around TTS latency is to maintain an audio cache keyed by semantic meaning and have the LLM choose from the cache. This also cuts down on expensive TTS API calls.
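
For anyone curious what the audio cache could look like, here's a rough Python sketch (not my actual Elixir code). Everything named here is made up for illustration: `SemanticAudioCache`, `respond`, `fake_tts` and `fake_choose_intent` are stand-ins, and in a real setup the chooser would be an LLM call that is shown the existing cache keys and asked to either pick one or propose a new key plus the text to speak.

```python
from typing import Callable, Dict, List, Optional, Tuple


class SemanticAudioCache:
    """Audio clips keyed by a short intent label instead of exact wording."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize      # hypothetical TTS function: text -> audio bytes
        self._clips: Dict[str, bytes] = {}
        self.tts_calls = 0                 # counts how often we actually paid for TTS

    def keys(self) -> List[str]:
        """Intent labels the chooser is allowed to pick from."""
        return sorted(self._clips)

    def get(self, intent: str) -> Optional[bytes]:
        return self._clips.get(intent)

    def add(self, intent: str, text: str) -> bytes:
        """Synthesize once and remember the clip under its intent label."""
        self.tts_calls += 1
        audio = self._synthesize(text)
        self._clips[intent] = audio
        return audio


# The chooser returns (intent_label, text_to_speak_if_not_cached).
Chooser = Callable[[str, List[str]], Tuple[str, str]]


def respond(cache: SemanticAudioCache, choose_intent: Chooser, user_text: str) -> bytes:
    """Return audio for the reply, hitting TTS only on a cache miss."""
    intent, text = choose_intent(user_text, cache.keys())
    cached = cache.get(intent)
    if cached is not None:
        return cached                      # no TTS round trip, no TTS cost
    return cache.add(intent, text)


def fake_tts(text: str) -> bytes:
    """Stand-in for a real TTS API call (assumption: returns raw audio bytes)."""
    return text.encode("utf-8")


def fake_choose_intent(user_text: str, known_intents: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM; a real one would see known_intents in its prompt."""
    lowered = user_text.lower()
    if "hello" in lowered:
        return "greeting", "Hi there, how can I help?"
    if "bye" in lowered:
        return "farewell", "Goodbye!"
    return "fallback", "Sorry, could you repeat that?"


if __name__ == "__main__":
    cache = SemanticAudioCache(fake_tts)
    respond(cache, fake_choose_intent, "hello")
    respond(cache, fake_choose_intent, "Hello again!")  # same intent, served from cache
    print(cache.tts_calls, cache.keys())
```

The point is that two differently worded turns can map to the same key, so the second one skips TTS entirely. How coarse to make the keys is a judgment call: too coarse and the replies sound canned, too fine and the cache rarely hits.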