| HN Mirror

	by CuriouslyC 1217 days ago
	A 7B-parameter model can run on a 16+ GB GPU in fp16, and a 14B model can run on 16+ GB if quantized to int8. 14B at fp16 and 30B at int8 will require one of the 48 GB cards (they need less than 48 GB, but hardware mostly jumps from 24 GB to 48 GB).
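	For anyone who wants to check the arithmetic, here is a rough sketch of the rule of thumb behind those numbers. It assumes 2 bytes per parameter for fp16 and 1 byte for int8, counts weights only (activations, KV cache and framework overhead come on top, hence the "16+"), and uses a made-up helper name, `weight_gb`.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1}


def weight_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    # billions of params * 1e9 * bytes per param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[dtype]


# The four cases mentioned in the comment above.
for size, dtype in [(7, "fp16"), (14, "int8"), (14, "fp16"), (30, "int8")]:
    print(f"{size}B @ {dtype}: ~{weight_gb(size, dtype):.0f} GB of weights")
```

	The first two cases come out around 14 GB and the last two around 28 to 30 GB, which is why the latter overshoot a 24 GB card and land on the 48 GB tier.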

2 comments

Requirements could be reduced with something like DeepSpeed or ColossalAI (or even just simple hacks to move bits to system RAM more aggressively).
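To make the "move bits to RAM" idea concrete, here is a toy sketch of a placement plan: keep as many layers on the GPU as fit within a budget and park the rest in system RAM. This is not how DeepSpeed or ColossalAI do it internally; `plan_offload`, the 40 equal-sized layers and the 20 GB budget are all hypothetical numbers picked for illustration.

```python
def plan_offload(layer_gb: list[float], gpu_budget_gb: float) -> tuple[list[int], list[int]]:
    """Greedy split: put each layer on the GPU if it still fits, otherwise in CPU RAM."""
    on_gpu: list[int] = []
    on_cpu: list[int] = []
    used = 0.0
    for i, size in enumerate(layer_gb):
        if used + size <= gpu_budget_gb:
            on_gpu.append(i)
            used += size
        else:
            on_cpu.append(i)
    return on_gpu, on_cpu


# Hypothetical: ~28 GB of fp16 weights split into 40 equal layers,
# with 20 GB of a 24 GB card reserved for weights.
layers = [0.7] * 40
gpu_layers, cpu_layers = plan_offload(layers, gpu_budget_gb=20.0)
print(f"{len(gpu_layers)} layers on GPU, {len(cpu_layers)} offloaded to RAM")
```

The offloaded layers have to be copied over to the GPU whenever they are needed, so this trades speed for fitting into less VRAM.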

thanks