| HN Mirror



	by colechristensen 23 days ago
	No, training a state-of-the-art model involves training on something on the order of 10 trillion tokens. We're talking about a step that updates the weights based on, say, between 10k and 1M tokens.
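To put those two figures side by side, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above and assumes, purely for illustration, that every token is seen exactly once and every step consumes the same number of tokens; the names `steps_needed`, `TOTAL_TRAINING_TOKENS`, and so on are made up for this example.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the comment.
# Assumption (not from the comment): one pass over the data, fixed tokens per step.

TOTAL_TRAINING_TOKENS = 10 * 10**12   # "on the order of 10 trillion tokens"
TOKENS_PER_STEP_LOW = 10_000          # "10k" tokens per weight update
TOKENS_PER_STEP_HIGH = 1_000_000      # "1M" tokens per weight update


def steps_needed(total_tokens: int, tokens_per_step: int) -> int:
    """Number of weight-update steps needed to consume total_tokens."""
    return total_tokens // tokens_per_step


# Smaller steps mean more of them, so the low per-step figure gives the maximum.
max_steps = steps_needed(TOTAL_TRAINING_TOKENS, TOKENS_PER_STEP_LOW)
min_steps = steps_needed(TOTAL_TRAINING_TOKENS, TOKENS_PER_STEP_HIGH)

print(f"A full training run is roughly {min_steps:,} to {max_steps:,} such steps.")
```

Under those assumptions a single step covers somewhere between one ten-millionth and one billionth of the full run, which is the gap the comment is pointing at.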

1 comment

I learned something. Thank you!