by iNic
366 days ago
The "mere token prediction" comment is wrong, but I don't think any of the other comments really explained why. Next-token prediction is not what the AI does; it is its goal. Dismissing it on that basis is like calling soccer a boring sport having only ever seen the final scores. The important thing about LLMs is that they can internally represent many different complex ideas efficiently and coherently! This makes them an incredible starting point for further training. Nowadays no LLM you interact with is a pure next-token predictor anymore: they have all gone through various stages of RL so that they actually do what we want them to do. I really feel the magic when looking at the "circuit" work by Anthropic. It shows that these models have some internal processing / thinking that is complex and clever.

The Transformer circuits[0] suggest that this representation is not coherent at all.
[0] https://transformer-circuits.pub