| HN Mirror

	by sahil_chaudhary 1188 days ago
	All included, it costs under $70 for the 13B model. I'm training the 65B now, so I'll report what that costs.

2 comments

For the 65B fine tune, did you add another A100 node? Or just drop batch size?

Any chance you’re up for sharing the training parameters?

Dropping the batch size

Please do! Also please include how you’re calculating the costs.