Hacker News
by kuandriy 151 days ago
The end-to-end speech-to-speech claim is interesting, especially since it avoids the ASR→LLM→TTS pipeline, which is where most of the latency and error compounding happens.
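
Rough arithmetic to show what I mean by compounding. This is only a sketch: the per-stage latencies and accuracies are made-up numbers, not measurements of any real system, and it assumes the stages run strictly one after another with independent errors (streaming pipelines overlap stages, so real latency can be lower).

```python
from math import prod

# Hypothetical per-stage figures, chosen only to illustrate the arithmetic.
STAGES = {
    "ASR": {"latency_ms": 300, "accuracy": 0.95},
    "LLM": {"latency_ms": 500, "accuracy": 0.95},
    "TTS": {"latency_ms": 200, "accuracy": 0.95},
}


def cascade_latency_ms(stages):
    """Latencies add when each stage waits for the previous one to finish."""
    return sum(stage["latency_ms"] for stage in stages.values())


def cascade_accuracy(stages):
    """With independent errors, the chance every stage is right is the product."""
    return prod(stage["accuracy"] for stage in stages.values())


print(cascade_latency_ms(STAGES))          # 1000
print(round(cascade_accuracy(STAGES), 3))  # 0.857
```

So under those assumptions, three stages that are each 95% accurate give roughly 86% end to end, and the delays stack. An end-to-end model has one stage to pay for instead of three, which is why the claim matters if it holds up.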