Hacker News
by tdj 1598 days ago
I built a similar app using a Kaldi nnet3 model running embedded. It was so responsive that our demo to an SVP went sideways: when he gave a query, the app responded almost immediately after his sentence ended. The SVP did not realize it had already responded, because the expectation for voice interaction systems was that an answer takes 2-5 seconds, so he got the impression that the system was not working properly.

So, moral of the story: if you do too good a job of making a fast speech engine, especially for multi-turn dialogues, add some delay so it resembles human dialogue more.
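
A rough sketch of what "add some delay" could look like in Python, assuming a latency floor measured from the end of the utterance. The names `padding_delay` and `respond_with_floor`, the `recognize`/`speak` callables, and the 0.6 second default are all made up for illustration; none of this is the original app's code, and the right floor would have to be tuned with real users.

```python
import time


def padding_delay(elapsed_s: float, min_latency_s: float = 0.6) -> float:
    """Extra seconds to wait so the reply arrives no sooner than min_latency_s.

    The 0.6 s default is an assumed placeholder, not a measured value.
    """
    return max(0.0, min_latency_s - elapsed_s)


def respond_with_floor(recognize, speak, audio, min_latency_s=0.6,
                       sleep=time.sleep, clock=time.monotonic):
    """Run a (hypothetical) recognizer, then pad the response time up to a floor."""
    start = clock()                      # utterance has just ended
    text = recognize(audio)              # fast embedded engine returns quickly
    sleep(padding_delay(clock() - start, min_latency_s))
    speak(text)                          # reply only after the floor has passed
    return text
```

If the engine is already slower than the floor, `padding_delay` returns zero and nothing is added, so the padding only kicks in when the system is "too fast".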