by Chabsff
221 days ago
> They do it by iteratively predicting the next token.

You don't know that. It's how the LLM presents, not how it does things. That's what I mean by it being the interface. There's only ever one word that comes out of your mouth at a time, but we don't conclude that humans only think one word at a time. Who's to say the machine doesn't plan out the full sentence and output just the next token?

I don't know either, fwiw, and that's my main point.

There's a lot to criticize about LLMs and, believe it or not, I am a huge detractor of their use in most contexts. But this is a bad criticism of them. And it bugs me a lot because the really important problems with them are broadly ignored by this low-effort, ill-thought-out offhand dismissal.
Yes. We know that LLMs can be trained by predicting the next token. This is a fact. You can look up the research papers and the open-source training code.
I can't work it out: are you advocating a conspiracy theory that these models are trained with some elusive secret and that the researchers are lying to you?
Being trained by predicting one token at a time is also not a criticism??! It is just a factually correct description...
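
For anyone following along, here is roughly what "trained by predicting the next token" means as an objective. This is a minimal sketch, not anyone's actual training code: `next_token_loss` and `uniform_model` are names made up for illustration, and the "model" is a stand-in that ignores its context.

```python
import math

def next_token_loss(probs_fn, tokens):
    """Average cross-entropy of predicting tokens[i + 1] from tokens[:i + 1]."""
    total = 0.0
    for i in range(len(tokens) - 1):
        context, target = tokens[: i + 1], tokens[i + 1]
        p = probs_fn(context)[target]  # probability assigned to the true next token
        total += -math.log(p)
    return total / (len(tokens) - 1)

def uniform_model(context, vocab=("a", "b", "c", "d")):
    # Hypothetical stand-in for a real network: same distribution for any context.
    return {tok: 1.0 / len(vocab) for tok in vocab}

loss = next_token_loss(uniform_model, ["a", "b", "c", "a"])
print(loss)
```

A real LLM replaces `uniform_model` with a neural network and minimizes this loss by gradient descent.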