Hacker News
by nyoncore, 871 days ago
Isn't it obvious that, since LLMs are trained to predict the next word, they do better at that than at predicting the previous one?
1 comment
frotaur, 871 days ago
The paper mentions that the LLMs predicting the previous token are in fact pre-trained that way, so each direction is evaluated on the task it was trained for. The difference is therefore not obvious.
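To make the point concrete, here is a minimal sketch of what "pre-trained to predict the previous token" can look like. It assumes the backward objective is set up by reversing the token order, which is one natural way to do it and not necessarily the paper's exact setup. The function names and the toy sentence are made up for illustration.

```python
def next_token_pairs(tokens):
    """Build (context, target) pairs for next-token prediction."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def previous_token_pairs(tokens):
    """Build pairs for previous-token prediction.

    Assumption: predicting the previous token is treated as ordinary
    next-token prediction on the reversed sequence.
    """
    return next_token_pairs(tokens[::-1])


tokens = ["the", "cat", "sat"]  # toy example, not real training data
forward = next_token_pairs(tokens)
backward = previous_token_pairs(tokens)
```

Under this assumption both models see the same data and the same kind of objective, only in opposite order, so neither direction gets an advantage from the training procedure itself.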