Hacker News
by webnrrd2k 38 days ago
This is more me thinking out loud than a fully formed theory, but I suspect that a really large context might be useful when LLMs control more physical things. The huge context could be used to help encode the huge amount of implicit knowledge that ~4 billion years of evolution has crammed into our bodies. Plus all the junk we learned growing up, too. Stuff like vision processing, object permanence, all the unstated common-sense stuff humans are good at. Right now LLMs are used mostly for textual or data-processing tasks, but they will do more physical stuff, too.

It seems far more likely that it would all get baked into the LLM during training, but maybe it will turn out to be really useful to train up a "generic robot controller LLM" and pass in a huge number of tokens to better optimize it.

1 comment

But even if the context is big, attention still has to sift through all of it.
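To make the "sifting" point concrete, here is a rough sketch in plain Python, assuming standard dense softmax attention (real models may use sparse or other variants that change the numbers). The function names `attention_weights` and `score_count` are made up for illustration, not taken from any library.

```python
import math

def attention_weights(query, keys):
    """Softmax over scaled dot products of one query against every key."""
    d = len(query)
    # One score per key: the query is compared against the whole context.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score_count(context_len):
    """Query-key scores in full (non-causal) dense self-attention."""
    # Assumption: every token attends to every token, so n * n scores.
    return context_len * context_len
```

Under that assumption, doubling the context quadruples the number of scores, and since the weights always sum to 1, a longer context also means each token tends to get a thinner slice unless its score clearly stands out.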