


	by gwern 1535 days ago
	Humans produce an astonishing amount of text if you consider all the source code, research data, social media, emails, etc., and project out a decade or two; there are also multimodal and RL data to consider as sources of 'tokens', such as visual tokens, which are ~infinite in supply. Text is great, but there is no reason to train only on text. It's just a good starting point. But the real question you should be asking is: where would you get the compute to train a model that needs 216T tokens?