by FLT8 1167 days ago
Vicuna-13B loads and idles at ~26GB RAM usage on an M1 Max with 64GB. When answering questions, that grows to around 75GB, and yes, you can feel it (and the machine) slow down significantly once it starts hitting swap. Realistically, I think you'd want to stick to the 7B model on a 32GB machine (even if you could get the weight deltas to apply correctly).
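For what it's worth, the ~26GB idle figure lines up with a back-of-envelope estimate, assuming the weights are held as 16-bit floats (2 bytes per parameter). That precision is my assumption, not something I checked. A rough sketch, where `weight_memory_gb` is just an illustrative helper and not part of any library:

```python
def weight_memory_gb(n_params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in decimal GB.

    Assumes every parameter is stored at the same width (2 bytes = fp16).
    Ignores the KV cache, activations, and framework overhead, which is
    where the extra usage during generation would come from.
    """
    return n_params_billion * 1e9 * bytes_per_param / 1e9


for name, size in [("7B", 7), ("13B", 13)]:
    fp16 = weight_memory_gb(size)
    fp32 = weight_memory_gb(size, bytes_per_param=4)
    print(f"{name}: ~{fp16:.0f} GB at fp16, ~{fp32:.0f} GB at fp32")
```

By this estimate the 7B weights come to roughly 14GB at fp16, which leaves some headroom on a 32GB machine, while the 13B weights alone would take up most of it before any generation overhead.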