by koljab
402 days ago
That's a great question! My first implementation triggered an interruption on voice activity after echo cancellation, but it still had far too many false positives. I switched to using the incoming realtime transcription as the trigger. That adds a bit of latency, but the much better accuracy makes up for it. Edit: just realized the irony, but it really is a good question lol
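A minimal sketch of the difference between the two triggers, not the project's actual code. `InterruptionGate`, `min_words`, and `should_interrupt` are hypothetical names, and the partial transcript is assumed to arrive as a plain string from whatever realtime speech-to-text is in use. The idea is that a cough or leftover echo can trip voice activity detection but yields no recognized words, so it never triggers an interruption.

```python
from dataclasses import dataclass


@dataclass
class InterruptionGate:
    """Hypothetical gate: interrupt only once real words have been transcribed."""

    min_words: int = 1  # assumed threshold; higher means fewer false positives, more latency

    def should_interrupt(self, partial_transcript: str) -> bool:
        # Count only tokens containing a letter or digit, so stray
        # punctuation or an empty transcript never counts as speech.
        words = [
            token
            for token in partial_transcript.split()
            if any(ch.isalnum() for ch in token)
        ]
        return len(words) >= self.min_words


if __name__ == "__main__":
    gate = InterruptionGate(min_words=2)
    # Voice activity fired (e.g. a cough), but nothing was transcribed.
    print(gate.should_interrupt(""))
    # The user actually started talking.
    print(gate.should_interrupt("wait hold on"))
```

Raising `min_words` makes the tradeoff described above explicit: the gate waits for more transcribed words before interrupting, which costs some latency and cuts false positives.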
|
Also, it took me longer than I care to admit to get your irony reference. Well done.
Edit: Just to expand on that in case it was not clear, this would be the ideal case I think:
LLM: You're going to want to start by installing XYZ, then you
Human: Ahh, right
LLM: (pauses briefly, makes sure there is nothing more coming, and checks whether the reply is a follow-up question/response or just active listening)
LLM: ...Then you will want to...
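The "ideal case" above could be roughed out like this. It is only a sketch under loud assumptions: the word list, the `max_tokens` limit, and the names `BACKCHANNEL_WORDS` and `next_action` are all hypothetical, and a real system would more likely ask a classifier or the LLM itself rather than match keywords. After the short pause, the transcribed utterance is checked: if it is just active listening, the assistant resumes; otherwise it yields the floor.

```python
import re

# Hypothetical list of words that signal active listening rather than a real turn.
BACKCHANNEL_WORDS = {
    "ah", "ahh", "right", "ok", "okay", "yeah", "yep",
    "mhm", "uh", "huh", "hmm", "sure", "i", "see", "got", "it",
}


def next_action(user_utterance: str, max_tokens: int = 4) -> str:
    """Return "resume" if the utterance looks like a backchannel, else "yield"."""
    tokens = re.findall(r"[a-z']+", user_utterance.lower())
    if not tokens:
        # Nothing intelligible was said, so keep talking.
        return "resume"
    is_short = len(tokens) <= max_tokens
    all_backchannel = all(token in BACKCHANNEL_WORDS for token in tokens)
    return "resume" if is_short and all_backchannel else "yield"


if __name__ == "__main__":
    print(next_action("Ahh, right"))            # the example from the dialogue
    print(next_action("Wait, which version?"))  # a real follow-up question
```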