by Rastonbury 580 days ago
GPU VRAM is currently the bottleneck. Check out r/LocalLLaMA for benchmarks and calculators that estimate roughly which models fit on which cards.
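For a rough idea of what those calculators do, here is a back-of-the-envelope sketch. It assumes the weights dominate memory use (parameter count times bits per weight) and adds a flat 20% allowance for KV cache and activations. That 20% figure and the function names are my own assumptions for illustration, not taken from any particular calculator; real usage depends on context length, batch size, and the runtime.

```python
def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float,
                     overhead_fraction: float = 0.2) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    overhead_fraction is an assumed flat allowance for KV cache,
    activations, and runtime buffers; it is a guess, not a measurement.
    """
    # Bytes needed just to hold the weights at the given precision.
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


def fits(params_billions: float,
         bits_per_weight: float,
         card_vram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """True if the estimate is within the card's VRAM."""
    needed = estimate_vram_gb(params_billions, bits_per_weight, overhead_fraction)
    return needed <= card_vram_gb


if __name__ == "__main__":
    # Hypothetical combinations: (params in billions, bits per weight, card GB).
    for params, bits, card in [(7, 4, 8), (7, 16, 8), (70, 4, 24)]:
        need = estimate_vram_gb(params, bits)
        verdict = "fits" if fits(params, bits, card) else "does not fit"
        print(f"{params}B @ {bits}-bit: ~{need:.1f} GB, {verdict} in {card} GB")
```

Treat the result as a first filter only. Quantization formats carry their own per-block overhead and long contexts can push the KV cache well past a flat percentage, so check real benchmarks before buying a card.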