by TaylorAlexander 920 days ago
Yeah, it seems the notion of time isn't conceptually built into current systems. You could pick a fixed time constant like 0.1 seconds or 1 second, but that's clearly missing something more fundamental.

I think if the same LLM were trained on audio and video input instead of text, and produced audio output, including silence tokens, then the notion of time would get "built in". Audio continuation without translation to text has been shown to work. Mixing it with text is also possible. But all this would require a massive network that may be difficult even for the world's biggest companies to train and serve at any kind of scale. So it's more of an engineering problem than a theoretical one imho.

Also imho, until the context/memory problem is fully solved, we won't really see the AI as having any kind of agency. But continuous, low-latency interaction would certainly feel like a step towards that.