| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by daxfohl 848 days ago
	Seems like it could be useful for resizing the networks, no? Start with ChatGPT 4 then release an open version of it with much fewer parameters. Or maybe some metaparameter that mucks with the sizes during training produces better results. Start large to get a baseline, then reduce size to increase coherence and learning speed, then scale up again once that is maxed out.