by Jack000
917 days ago
I think a lot of the confusion about whether LLMs can think stems from the fact that LLMs are purely models of language and solve intelligence as a kind of accidental side effect. The real problem an LLM is trying to solve is to build a model that can enumerate all meaningful sequences of words. On the face of it, this is an insane way of approaching the problem of intelligence. There's a huge difference between a model of language and an intelligent agent that uses language to communicate. What LLMs show is that the hardest problem, how to get emergent capabilities at scale from huge quantities of data, is solved. To get more human-like thinking, all that is needed is to find the right pre-training task, one that more closely aligns with agentic behavior. This is still a huge problem, but it's an engineering problem and not one of linguistic theory or philosophy.
|
Think about what is contained (explicitly and implicitly) in all the text we can feed a model. It's not just language, but a projection of the world as humans see it.
GPT-3.5 Turbo Instruct can play valid chess at about 1800 Elo, no doubt because of the chess games recorded in PGN in the training set. Does chess suddenly become a language ability because it was expressed in text? No.