| HN Mirror


	by sigmoid10 714 days ago
	LLMs actually scale extremely well just by throwing compute at them. That's the whole reason they took off. Training a bigger model, training it longer, or increasing the dataset all work more or less equally well. Now that we've pretty much saturated the dataset component (at least for human-written text), everyone throws their compute at bigger models or more epochs.
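
	To make the trade-off concrete, here is a rough sketch assuming a parametric loss of the form L(N, D) = E + A/N^alpha + B/D^beta and the common rule of thumb of about 6*N*D training FLOPs. The constants and the function names `predicted_loss` and `training_flops` are illustrative assumptions, not something from the comment above, and the real curves depend on the model family and data.

```python
import math

# Illustrative constants only (assumed, shaped like published scaling-law fits).
E = 1.69                 # irreducible loss floor
A, ALPHA = 406.4, 0.34   # model-size term
B, BETA = 410.7, 0.28    # dataset-size term


def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Assumed parametric loss: floor + model-size term + data-size term."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def training_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb training cost: roughly 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens


# Baseline run, then two ways to spend twice the compute.
base = predicted_loss(1e9, 2e10)
bigger_model = predicted_loss(2e9, 2e10)   # double the parameters
more_tokens = predicted_loss(1e9, 4e10)    # double the tokens seen

print(f"baseline:     {base:.3f}")
print(f"bigger model: {bigger_model:.3f}")
print(f"more tokens:  {more_tokens:.3f}")
```

	Both doubled runs cost the same under the 6*N*D estimate, and both lower the predicted loss. Note that `n_tokens` here means tokens seen during training; once fresh text runs out, extra epochs repeat data, which this simple formula does not model.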