by fnbr 743 days ago
The rule of thumb is roughly 44 GB: most models are trained in bf16, which takes 16 bits (2 bytes) per parameter, and 22B parameters × 2 bytes = 44 GB. You need a bit more for activations, so maybe 50 GB?

You need enough RAM and HBM (GPU RAM), so it’s a constraint on both.
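
For anyone who wants to redo the 44 GB arithmetic for another model size, here is a rough sketch. It assumes 22B parameters (the size discussed in this thread) and decimal gigabytes (1 GB = 1e9 bytes); the function name is just illustrative, and it only counts the weights, not activations.

```python
def weight_memory_gb(n_params: float, bits_per_param: int = 16) -> float:
    """Memory needed for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    return n_params * bytes_per_param / 1e9


# Assumption: a 22B-parameter model stored in bf16 (16 bits per parameter).
weights_gb = weight_memory_gb(22e9, bits_per_param=16)
print(f"weights only: {weights_gb:.0f} GB")
```

The "maybe 50 GB" figure then amounts to leaving about 6 GB of headroom on top of the weights. That headroom is a guess, not something this calculation derives.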


Which GPU can I buy to run this model? Can it run on a consumer RTX 3090, or does it need a custom GPU?
A 3090 or 4090 will be able to run quantized 22B models.

Though realistically, for code completion smaller models will be better due to speed.
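
A quick sanity check on the "quantized 22B fits on a 3090 or 4090" claim, as a sketch under a few assumptions: both cards have 24 GB of VRAM, the model has 22B parameters, and only the weights are counted (no KV cache, activations, or per-block quantization metadata, all of which add to the real footprint). The helper names are made up for illustration.

```python
VRAM_GB = 24      # RTX 3090 and RTX 4090 both have 24 GB of VRAM
N_PARAMS = 22e9   # assumption: the 22B model discussed in the thread


def weights_gb(bits: int, n_params: float = N_PARAMS) -> float:
    """Weights-only footprint in decimal GB for a given bit width."""
    return n_params * bits / 8 / 1e9


def fits(bits: int, vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone are smaller than the card's VRAM."""
    return weights_gb(bits) < vram_gb


for bits in (16, 8, 4):
    verdict = "fits" if fits(bits) else "does not fit"
    print(f"{bits}-bit: {weights_gb(bits):.0f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this estimate 8-bit leaves only about 2 GB to spare, so in practice a 4-bit or similar quantization is the more comfortable choice on a 24 GB card.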

Easy..
Most GPUs still use GDDR I'm pretty sure, not HBM. Do you mean VRAM?