Hacker News
by
_ea1k
819 days ago
ollama run mixtral defaults to a quantized version (4-bit, IIRC). I'd guess that's why it fits on two 3090s.
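A rough back-of-the-envelope check of that guess, as a sketch only. It assumes Mixtral 8x7B has about 46.7B total parameters and that each 3090 has 24 GB of VRAM. It counts weights only and ignores the KV cache, activations, and quantization overhead such as per-block scales, so real usage will be somewhat higher. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumptions, not measurements:
MIXTRAL_PARAMS = 46.7e9   # approximate total parameter count of Mixtral 8x7B
TOTAL_VRAM_GB = 2 * 24    # two RTX 3090 cards at 24 GB each

for label, bits in [("fp16", 16), ("4-bit", 4)]:
    gb = weight_memory_gb(MIXTRAL_PARAMS, bits)
    verdict = "fits" if gb < TOTAL_VRAM_GB else "does not fit"
    print(f"{label}: ~{gb:.1f} GB of weights, {verdict} in {TOTAL_VRAM_GB} GB")
```

Under these assumptions the fp16 weights alone are well over 48 GB, while the 4-bit weights come in at roughly half of it, which leaves headroom for context and runtime overhead.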