by kwerk 819 days ago
I have two 3090s and it runs fine with `ollama run mixtral`, although OP definitely meant Mistral, given the 7B note.

`ollama run mixtral` defaults to a quantized version (4-bit, IIRC). I'd guess that's why it fits on two 3090s.
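
Rough back-of-envelope for why that works, as a sketch rather than a measurement. It assumes about 46.7B total parameters for Mixtral 8x7B and 24 GB of VRAM per 3090, and counts weights only (no KV cache, activations, or runtime overhead; real 4-bit formats also store scales, so they land a bit above 4 bits per weight). `weight_memory_gb` is just a helper name made up for this example.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: approximate total parameter count for Mixtral 8x7B.
MIXTRAL_PARAMS = 46.7e9
# An RTX 3090 has 24 GB of VRAM; the setup discussed here has two of them.
VRAM_PER_3090_GB = 24
total_vram_gb = 2 * VRAM_PER_3090_GB

for label, bits in [("fp16", 16), ("4-bit", 4)]:
    need = weight_memory_gb(MIXTRAL_PARAMS, bits)
    fits = need < total_vram_gb
    print(f"{label}: ~{need:.0f} GB of weights, fits in {total_vram_gb} GB: {fits}")
```

So under those assumptions the fp16 weights alone (roughly 93 GB) wouldn't fit in 48 GB, while the 4-bit weights (roughly 23 GB) leave plenty of headroom.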