by blablablub 1216 days ago
So what hardware do we need to run this model?

A 7 billion parameter model can run on 16+ GB GPUs as fp16 (roughly 14 GB of weights), and a 14 billion parameter model can run on 16+ GB if quantized to int8. 14B at fp16 and 30B at int8 will require one of the 48 GB cards (they need less than that, but hardware mostly goes 24 -> 48).
Requirements could be reduced with something like DeepSpeed or ColossalAI (or even just simple hacks to offload parts of the model to CPU RAM more aggressively).
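For anyone who wants to redo the arithmetic, here is a rough sketch of where those numbers come from. It counts the weights only (parameters times bytes per parameter) and ignores activations, KV cache, and framework overhead, so real usage will be somewhat higher. The function names and the list of card sizes are my own assumptions for illustration, not anything from the model's docs.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(params_billion, dtype):
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 params * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billion * bytes_per_param.
    """
    return params_billion * BYTES_PER_PARAM[dtype]


def smallest_card_gb(required_gb, cards=(16, 24, 48, 80)):
    """Pick the smallest card size strictly larger than the weight footprint.

    The card sizes are an assumed list of common VRAM tiers (hypothetical
    default); returns None if nothing in the list is large enough.
    """
    for card in sorted(cards):
        if required_gb < card:
            return card
    return None


if __name__ == "__main__":
    for params, dtype in [(7, "fp16"), (14, "int8"), (14, "fp16"), (30, "int8")]:
        need = weight_memory_gb(params, dtype)
        card = smallest_card_gb(need)
        print(f"{params}B @ {dtype}: ~{need} GB of weights, smallest card: {card} GB")
```

The 14B fp16 and 30B int8 cases come out at about 28 GB and 30 GB, which is why they skip past the 24 GB cards and land on 48 GB.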
thanks