by PhilippGille 146 days ago
OP asked:

> Is anyone doing true end-to-end speech models locally (streaming audio out), or is the SOTA still “streaming ASR + LLM + streaming TTS” glued together?

Your setup is the latter, not the former.