|
|
|
|
|
by HarHarVeryFunny
752 days ago
|
|
A true AGI would learn to manipulate it's environment to achieve it's goals, but obviously we are not there yet. An LLM has no goals - it's just a machine optimized to minimize training errors, although I suppose you could view this as an innate hard-coded goal of minimizing next word error (relative to training set), in same way we might say a machine-like insect has some "goals". Of course RLHF provides a longer time span (entire response vs next word) error to minimize, but I doubt training volume is enough for the model to internally model a goal of manipulating the listener as opposed to just favoring surface forms of response. |
|
Perhaps at some point LLMs will start to evolve from the prompt->response model into something more asynchronous and with some activity happening in the background too.