by parineum
225 days ago
I'm not sure what claim you're disputing or making with this. What more are LLMs than statistical inference machines? I don't know that I'd assert that's all they are with confidence, but all the configuration options I can play with during generation (Top K, Top P, Temperature, etc.) are ways to _not_ select the most likely next token, which leads me to believe that they are, in fact, just statistical inference machines.
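For anyone who hasn't looked at what those knobs do, here is a rough sketch of temperature, Top K, and Top P applied to a list of next-token scores. It's a toy, not any particular library's implementation: the function name `sample_next_token`, its parameters, and the example scores are all made up for illustration, and real inference stacks may apply these filters in a different order.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Pick a token index from raw scores (hypothetical helper, for illustration)."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (always the top token).
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature rescales the scores before softmax: <1 sharpens, >1 flattens.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top K: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # Sample from the survivors, weighted by their probabilities.
    weights = [probs[i] for i in ranked]
    return rng.choices(ranked, weights=weights, k=1)[0]

# Example: three candidate tokens with made-up scores.
scores = [2.0, 1.0, 0.1]
greedy = sample_next_token(scores, temperature=0)
sampled = sample_next_token(scores, temperature=0.8, top_k=2, rng=random.Random(0))
```

With `temperature=0` this always returns the highest-scoring token. With the other settings it draws from a reshaped or truncated version of the same distribution, so a less likely token can come out.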
|
It's not an argument - it's a dismissal. It's a boneheaded refusal to think on the matter in any depth, or consider any of the implications.
The main reason to say "LLMs are just next token predictors" is to stop thinking about all the inconvenient things. Things like "how the fuck does training on piles of text make machines that can write new short stories" or "why is a big fat pile of matrix multiplications better at solving unseen math problems than I am".