| HN Mirror


	by gunalx 30 days ago
	The largest problem is actually the available training data. They have already run experiments with different sub-10B models, both fine-tuned and trained fully from scratch. And last I checked, the models trained fully from scratch captured the language better.