by qingcharles 591 days ago
It depends on which models you want to run. I'd get a GPU with as much VRAM as possible. Once a model no longer fits in VRAM, the rest spills over into your system RAM, which is much slower.
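
As a rough way to see whether a given model fits, here's a back-of-the-envelope sketch. It assumes the weights take roughly (parameter count x bits per weight / 8) bytes, plus a fixed allowance for context and runtime overhead. The function names and the 2 GiB default are made up for illustration, not taken from any tool, and real usage varies with context length and runtime.

```python
def weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM.

    overhead_gib is a placeholder guess for context cache and runtime
    buffers; it is an assumption, not a measured value.
    """
    needed = weights_gib(params_billions, bits_per_weight) + overhead_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # Hypothetical cases: (billions of parameters, bits per weight, VRAM in GiB)
    for params, bits, vram in [(7, 4, 8), (7, 16, 24), (70, 4, 24)]:
        verdict = "fits" if fits_in_vram(params, bits, vram) else "spills to system RAM"
        print(f"{params}B @ {bits}-bit on {vram} GiB: "
              f"~{weights_gib(params, bits):.1f} GiB weights, {verdict}")
```

Treat the result as a first filter only. If the estimate lands close to your card's VRAM, assume it won't fit comfortably.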

Some good info here if you dig around:

https://www.reddit.com/r/LocalLLaMA/