by memorydial 497 days ago
If you're planning to run GenAI locally, the biggest factor is VRAM, not just GPU power. Even smaller LLMs (e.g., 13B-30B) need a lot of VRAM to run efficiently, and 8GB is borderline for anything beyond toy use. If you're serious about local GenAI, a second-hand RTX 3090 (24GB VRAM) or a 4070 Ti Super (16GB VRAM; the plain 4070 Ti has 12GB) would be a much better investment than an NPU, which currently isn't widely supported for local inference.
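
For a rough sense of why VRAM dominates, here is a back-of-the-envelope sketch. It assumes weight memory is roughly parameter count times bits per weight, plus a flat allowance for KV cache and runtime buffers. The 2 GiB overhead figure and the function names are my own illustrative assumptions, not measured values; real usage varies with context length, quantization format, and runtime.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """Rough check: weights plus an assumed flat overhead must fit in VRAM.

    overhead_gib is a guessed allowance for KV cache and runtime buffers.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for params in (13, 30):
        for bits in (4, 16):
            for vram in (8, 16, 24):
                verdict = "fits" if fits(params, bits, vram) else "does not fit"
                print(f"{params}B @ {bits}-bit on {vram} GiB: {verdict} "
                      f"(weights ~{weight_gib(params, bits):.1f} GiB)")
```

Under these assumptions a 13B model at 4-bit needs about 6 GiB for weights alone, which is why 8GB cards end up borderline once any overhead is added, while 24GB leaves room for a 4-bit 30B model.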