by svara
311 days ago
This is a perfectly fine line of argument imo, but the GP didn't say that. LLM research is trying out a lot of different things that move away from training purely on next-token prediction, and I buy the argument that not doing anything else would be limiting. Even so, the model is still fundamentally a next-token predictor.