Hacker News new | ask | show | jobs
by mhitza 63 days ago
For sure I was running on autopilot with that reply. Though in Q4 I would expect it to fit, as 24B-A4B Gemma model without CPU offloading got up to 18GB of VRAM usage