by MacsHeadroom
1191 days ago
LLaMA doesn't require any system RAM to run. It needs only a minimal amount of system RAM to load the model into VRAM and to compile the 4-bit quantized weights. But if you use pre-quantized weights (get them from HuggingFace or a friend), then all you really need for 65B is ~32GB of VRAM and maybe around 2GB of system RAM. (It's 30B that needs 20GB of VRAM.)
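
For anyone who wants to sanity-check those VRAM numbers, here is a rough back-of-the-envelope sketch. It assumes the nominal parameter counts (30 and 65 billion), exactly 4 bits per weight, and 1 GB = 10^9 bytes. It counts the weights only, so activations, the attention cache, and any quantization metadata are left out, which is presumably part of why the 30B figure quoted above is higher than the raw weight size. The function name `weight_memory_gb` is just made up for illustration.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no extra space for scales, zero points, or runtime buffers.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Nominal model sizes, in billions of parameters (assumed, not measured).
    for name, size in [("30B", 30), ("65B", 65)]:
        print(f"{name}: ~{weight_memory_gb(size):.1f} GB for 4-bit weights")
```

With these assumptions the 65B weights come out to about 32.5 GB, which lines up with the ~32GB figure, and the 30B weights to about 15 GB before any runtime overhead.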