Hacker News
by spion 1201 days ago
llama.cpp needs about 40GB of RAM for the 65B model (that low because of int4 quantization)

RamNeeded(other_size) ~= 40GB * other_size/65B
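
As a rough sketch, the rule of thumb above can be written as a small Python helper. The function name `ram_needed_gb` and the sizes in the loop are only illustrative assumptions, not anything taken from llama.cpp itself; the one data point used is the 40GB / 65B figure from this comment.

```python
def ram_needed_gb(params_billions, ref_ram_gb=40.0, ref_params_billions=65.0):
    """Linearly scale the 40GB-for-65B reference point to another model size.

    Hypothetical helper: it assumes the same int4 quantization and that
    memory grows in proportion to parameter count.
    """
    return ref_ram_gb * params_billions / ref_params_billions


if __name__ == "__main__":
    # Example sizes, in billions of parameters (chosen for illustration).
    for size in (7, 13, 30, 65):
        print(f"{size}B: ~{ram_needed_gb(size):.1f} GB")
```

This is just the proportionality from the formula, so treat the output as a ballpark estimate rather than a measured requirement.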