by ericmcer
43 days ago
You actually don’t want it to stop immediately, because people say things like “hm” or “yeah” during machine output. Or maybe you say “no” to someone next to you and don’t want that to interrupt the output. To interrupt confidently, I would want to assert that the user has been speaking for longer than some threshold N. You could do other things, like parsing a streaming transcription for keywords, but in general it feels like bad UX to me to cut output the second input is detected. Letting the user talk for 1-2s gives a much stronger signal, and it isn’t too weird for someone to keep talking for 1.5s after you start.
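
For what it's worth, here is a rough sketch of the "speaking for > N" check. It assumes you already get a per-frame speech/no-speech decision from some VAD. The names `BargeInDetector`, `min_speech_s` and `max_gap_s` are made up for illustration, and the gap tolerance is my own addition so a short breath doesn't reset the timer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BargeInDetector:
    """Hypothetical helper: fire only after sustained user speech."""

    min_speech_s: float = 1.5  # N: how long the user must keep talking
    max_gap_s: float = 0.3     # assumed tolerance: shorter pauses keep the timer running
    _speech_start: Optional[float] = field(default=None, init=False, repr=False)
    _last_speech: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, is_speech: bool, now: float) -> bool:
        """Feed one VAD decision with its timestamp in seconds.

        Returns True once the machine output should be interrupted.
        """
        if not is_speech:
            # Forget the current utterance only after a long enough silence.
            if self._last_speech is not None and now - self._last_speech > self.max_gap_s:
                self._speech_start = None
                self._last_speech = None
            return False
        if self._speech_start is None:
            self._speech_start = now
        self._last_speech = now
        return now - self._speech_start >= self.min_speech_s
```

With 100 ms frames, a quick “hm” (a few speech frames followed by silence) never reaches the threshold, while someone who keeps talking trips it about 1.5s in. Keyword spotting on a streaming transcript could sit alongside this as a second trigger, but that part is not shown here.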