Hacker News
by jkbl 1300 days ago
Yeah, but that's only true when you run a single model for yourself. Running a service like this needs more VRAM. It currently loads 6 models per GPU, and I think I have enough VRAM left to add even more.