Hacker News new | ask | show | jobs
by JohnTheNerd 884 days ago
it is almost entirely the LLM. I can see this in action by typing a response on my computer instead of using my phone/watch, which bypasses Whisper and Piper entirely.

your approach would work, but I really like the creativity of having the LLM generate the whole thing. it feels much less robotic. 8 seconds is bad, but not quite unusable.