| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by saulpw 58 days ago
	Presumably $100m to train the 70B model? I think you're assuming that the author meant you can take an existing 70B model and run it in 16GB. But it stands to reason that "no loss in capability" means it had to be trained under those constraints.