Every article on this subject is the same:

1. Deny there's hype
2. Start engaging in hype

LLMs are just a tool. Yes, "just". They're good at certain tasks. Mainly ones that have a lot of training data available and don't require precise output. Images, writing, boilerplate code, etc. For any other task their usefulness falls off sharply.

> What it and its contemporaries have set loose in such a short period of time, and what is to follow on their heels tomorrow, is the real crux of the matter.

What is the idea that LLMs will get significantly better based on? That the technology finally got good enough to be useful is great, but past performance is not a predictor of future performance. Predictions need to be based on the nature of the technology in question.

1. There's only so much usable training data available out there and I suspect most of it is already being used. Once a programming LLM is trained on the code on GitHub it has basically seen it all.
2. They are incapable of learning anything. Not even trivial concepts. Therefore there's no real way for them to build up their understanding step-by-step like a human would.

1a) say it's incapable of thinking/reasoning/learning/understanding/etc
1b) argue about its ability to think/reason/learn/understand/etc
In this case:
> They are incapable of learning anything
ChatGPT seems to have learned about plenty of things before 2021. If you ask it about concepts like copyright or gravity or communism, or ask it to "explain quantum computing in simple terms", as the landing page suggests, it displays a fairly good understanding of these concepts.
Human memory is divided into short-term and long-term (along with a bunch more concepts). It's true that ChatGPT lacks the ability to commit information from its short-term memory (i.e. "chats") into its long-term memory via inference, but GPT-4-Mawr, fine-tuned or re-trained on text including a new concept you invented, would display some level of understanding (depending on how much text you had to fine-tune it on).
On a per-chat basis, ChatGPT is able to know things, up to the limits of the context window, and it's entirely fair to point out that a) the context window is fairly limited (8000 tokens, or 32k paid) and b) beyond that, it forgets what it was told. Some people are described as having the memory of a goldfish, and that description applies to ChatGPT. If you can fit your concept into the context window, ChatGPT-4 is able to do some reasoning about it. Thus, it's able to learn and understand new things, but forgets them just as quickly.
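To make the goldfish point concrete, here's a rough sketch of a chat with a fixed token budget. Everything in it is an assumption for illustration: `ChatContext`, `count_tokens` and `remembers` are made-up names, the word-count "tokenizer" is a crude stand-in for a real subword tokenizer, and dropping the oldest message first is just one plausible policy, not a description of how ChatGPT actually manages its window.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Hypothetical stand-in: one "token" per whitespace-separated word.
    # Real models use subword tokenizers, so real counts would differ.
    return len(text.split())


class ChatContext:
    """Toy per-chat memory with a fixed token budget (assumed policy)."""

    def __init__(self, max_tokens: int = 8000):
        self.max_tokens = max_tokens
        self.messages = deque()
        self.used = 0

    def add(self, message: str) -> None:
        self.messages.append(message)
        self.used += count_tokens(message)
        # Over budget: forget the oldest messages first,
        # but always keep the message that was just added.
        while self.used > self.max_tokens and len(self.messages) > 1:
            dropped = self.messages.popleft()
            self.used -= count_tokens(dropped)

    def remembers(self, phrase: str) -> bool:
        # "Knowing" something here just means it is still in the window.
        return any(phrase in m for m in self.messages)


# Tiny budget so the forgetting is easy to see.
ctx = ChatContext(max_tokens=12)
ctx.add("a florp is a blue cat")          # the new concept you invented
ctx.add("tell me about florps")
still_known = ctx.remembers("blue cat")   # definition still fits in the window
ctx.add("now some unrelated long chatter here")
forgotten = not ctx.remembers("blue cat") # definition was pushed out
```

With the tiny budget above, the invented concept is available right after you introduce it and gone once enough later text has pushed it out of the window, which is the "learns it, then forgets it just as quickly" behaviour in miniature.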
Humans sleep every day or so, and during that time the brain gets to do things that it isn't able to do while awake, and it's been suggested that sleep is important for integrating the day's activities and thus learning of advanced concepts. Assuming that's true, then we're just looking at different timescales.
If ChatGPT "sleeps" every couple of months, and learns everything it was told since the last time it was trained, it's just operating on different timescales than humans do.