| HN Mirror

	by GaggiX 741 days ago
	GPT-3 was 175B parameters and it's very bad compared to the much smaller models we have nowadays; the data and the compute play a giant role. Also, I doubt you would need to train a model further after you have trained it on absolutely everything (but we are very far from that).