Hacker News
by woodson 412 days ago
Human-to-human conversational patterns depend heavily on culture and context. That sounds like stating the obvious, but developers regularly disregard it and then wonder why things feel unnatural to users. The “median delay” may not be the most useful thing to look at.

To learn more appropriate delays, it can be useful to find a proxy measure that predicts when a response can or should be given. For example, look at how Kyutai used the change in perplexity of a text translation model's predictions to develop simultaneous speech-to-speech translation (https://github.com/kyutai-labs/hibiki).
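
To make the proxy idea concrete, here is a rough Python sketch. It is not Hibiki's actual code, only an illustration of the general idea: track the perplexity of a model's predictions as more input arrives and treat a sharp drop as a hint that a response could be given. The function names, the `min_drop` threshold, and the log-probability values are all hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a prediction from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def response_points(perplexities, min_drop):
    """Indices i where perplexity fell by at least min_drop versus step i - 1."""
    return [
        i
        for i in range(1, len(perplexities))
        if perplexities[i - 1] - perplexities[i] >= min_drop
    ]


# Made-up data: log-probs of the model's predicted continuation
# after each successive chunk of input has been heard.
logprobs_per_step = [
    [-2.0, -2.0],  # little context, model is unsure
    [-2.0, -1.8],  # still unsure
    [-0.5, -0.5],  # enough context, model becomes confident
    [-0.4, -0.6],  # no further gain
]

ppl = [perplexity(lp) for lp in logprobs_per_step]
points = response_points(ppl, min_drop=2.0)
print(points)
```

With these invented numbers only step 2 is flagged, since that is where the model's uncertainty collapses. In practice the threshold would have to be tuned per language and context, which is the point of the comment above.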