| HN Mirror

	by cbuskilla 2277 days ago
	Sure! It is the 90M-parameter model, and they trained models up to almost 10B parameters, so I guess it gets better with size (I didn't try; way too expensive). And I agree about the ALICE derivatives: Mitsuku is nice without doing anything fancy.