by HarHarVeryFunny
823 days ago
The fundamental difference is that humans do learn, permanently (eventually at least), from prediction feedback, however this works. I'm not convinced that STM is necessarily involved in this particular learning process (maybe just for episodic memories?), but it makes no difference - we do learn from the feedback. An LLM can perform one-shot in-context learning, which in conversational mode will include (up to the context limit) feedback from its actions (output), but this is never learned permanently.
The problem with LLMs not permanently learning from the feedback to their own actions is that it means they will never learn new skills - they are doomed to only learn what they were pre-trained with, which isn't going to include the skills of any specific job unless that specific on-the-job experience of when to do something, or avoid doing it, were made a part of it. The training data for this does not exist - it's not the millions of lines of code on GitHub or the bug fixes/solutions suggested on Stack Overflow - what would be needed would be the inner thoughts (predictions) of developers as they tackled a variety of tasks and were presented with various outcomes (feedback) continuously throughout the software development cycle (or equivalent for any other job/skill one might want them to acquire).
It's hard to see how OpenAI or anyone else could provide this on-the-job training to an LLM, even if they let it loose in a programming playground where it could generate the training dataset. How fast would the context fill with compiler/linker errors, debugger output, program output, etc.? Once the context was full you'd have to pre-train on that (very slow, taking months, and expensive) before it could build on that experience. Days of human experience would take years to acquire. Maybe they could train it to write crud apps or some other low-hanging fruit, but it's hard to see this ever becoming the general-purpose "AI programmer" some people think is around the corner.
The programming challenges of any specialized domain or task would require training for that domain - it just doesn't scale. You really need each individual deployed instance of an LLM/AI to be able to learn by itself - continuously and incrementally - to get the on-the-job training for any given use.
|
Are you sure? I think "Open"AI uses the chat transcripts to help the next training run?
> they are doomed to only learn what they were pre-trained with
Fine-tuning.
> The training data for this does not exist
What does "this" refer to? Have you read the Voyager paper? (https://arxiv.org/abs/2305.16291) Any lesson learnt and stored in its skill library could be used for fine-tuning or for the next training run of a base model.
> what would be needed would be the inner thoughts (predictions) of developers as they tackled a variety of tasks and were presented with various outcomes (feedback) continuously throughout the software development cycle
Copilot gets to watch people figure stuff out - there's no reason that couldn't be used for the next version. Not only does it not need to read minds, but people go out of their way to write comments or chat messages to tell it what they think is going on and how to improve its code.
> Days of human experience would take years to acquire
And once learnt, that skill will never age, never get bored, never take annual leave, never go to the kids' football games, never die. It can be replicated as many millions of times as necessary.
> they could train it to write crud apps
To be fair, a lot of computer code is crud apps. But instead of learning it in one language, now it can do it in every language that existed on Stack Overflow the day before its training run.