HN Mirror



	by nyoncore 871 days ago
	Isn't it obvious that, since LLMs are trained to predict the next word, they do better at that than at predicting the previous one?

1 comment

frotaur 871 days ago

The paper mentions that the LLMs predicting the previous token are in fact pre-trained to do exactly that, so the difference is not obvious: each direction gets a model trained for its own task.
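To make the symmetry concrete, here is a rough sketch of how training examples for the two directions could be built from the same text. This is only an illustration, not the paper's actual pipeline: the function names `next_token_pairs` and `previous_token_pairs` are hypothetical, and real training uses a tokenizer and batched tensors rather than word lists.

```python
def next_token_pairs(tokens):
    """Forward direction: predict tokens[i] from everything before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def previous_token_pairs(tokens):
    """Backward direction: predict tokens[i] from everything after it."""
    return [(tokens[i + 1:], tokens[i]) for i in range(len(tokens) - 2, -1, -1)]


# Toy "tokenization" by words; an assumption made purely for illustration.
tokens = ["the", "cat", "sat"]

forward = next_token_pairs(tokens)
backward = previous_token_pairs(tokens)

for context, target in forward:
    print("forward ", context, "->", target)
for context, target in backward:
    print("backward", context, "->", target)
```

Both directions get the same number of examples from the same text, so neither model has a built-in advantage from the training setup alone.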
