by MacsHeadroom 928 days ago
The answer is yes if you have a 24GB GPU. Just wait for 4-bit quantization.

Or watch Tim Dettmers, who is releasing code to run Mixtral 8x7B in just 4GB of RAM.
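
For a rough sense of why 4-bit is the threshold for a 24GB card, here is a back-of-the-envelope sketch. It assumes Mixtral 8x7B has roughly 46.7B total parameters (the figure is my assumption, not something stated above) and counts weights only, so KV cache, activations, and per-block quantization metadata would all come on top. The `weight_gb` helper is just an illustrative name.

```python
# Assumption: ~46.7B total parameters for Mixtral 8x7B (all experts counted).
TOTAL_PARAMS = 46.7e9


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Weights-only footprint in decimal GB; ignores KV cache and overhead."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: ~{weight_gb(TOTAL_PARAMS, bits):.1f} GB")
```

Under those assumptions the 4-bit weights land just under 24 GB, so it would be a tight fit with little room left for context.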