by dannyw 806 days ago
You can QLoRA decent models on 24GB VRAM. There are also optimised kernels, like Unsloth's, that are really VRAM-efficient and good for hobbyists.

Yes, but I still don't think you'll be able to run Mixtral 8x22B with 16GB VRAM, or QLoRA it, even with Unsloth. It's much bigger than the original Mixtral 8x7B.
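
For a rough sense of the gap, here is a back-of-the-envelope sketch of the memory needed for the weights alone at 4-bit quantization. The parameter counts (about 46.7B total for Mixtral 8x7B and about 141B total for Mixtral 8x22B) are approximate publicly reported figures, and `weight_memory_gb` is a hypothetical helper, not part of any library. It ignores activations, the KV cache, LoRA adapters, optimizer state and quantization overhead, so real usage would be higher.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = total_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Approximate total parameter counts in billions (assumed, see note above).
MODELS = {
    "Mixtral 8x7B": 46.7,
    "Mixtral 8x22B": 141.0,
}

VRAM_GB = 16  # the budget mentioned in the comment above

for name, params_b in MODELS.items():
    gb = weight_memory_gb(params_b, bits_per_param=4)
    verdict = "fits" if gb <= VRAM_GB else "does not fit"
    print(f"{name}: ~{gb:.1f} GB of 4-bit weights, {verdict} in {VRAM_GB} GB")
```

Under these assumptions the 4-bit weights of the 8x22B model alone come to several times 16GB, before any training overhead is counted.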