HN Mirror



	by arilotter 605 days ago
	This specific model is only trained on 100 billion tokens, so it's not SOTA by any means, but we've got designs on larger training runs later :)