Hacker News
by MacsHeadroom, 928 days ago
The answer is yes if you have a 24GB GPU. Just wait for 4-bit quantization.
Or watch Tim Dettmers, who is releasing code to run Mixtral 8x7B in just 4GB of RAM.
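
For a rough sense of why 4-bit is the threshold for a 24GB card, here is a back-of-the-envelope sketch. It assumes roughly 46.7 billion total parameters for Mixtral 8x7B (an assumed figure, not stated in the comment above) and counts only the raw weights. Real quantization formats add some overhead for scales and similar metadata, and inference also needs room for activations and the KV cache, so treat the numbers as a lower bound. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Counts raw weights only: no quantization metadata, activations, or KV cache.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: approximate total parameter count for Mixtral 8x7B.
MIXTRAL_PARAMS = 46.7e9

if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(MIXTRAL_PARAMS, bits)
        print(f"{bits}-bit weights: about {gb:.1f} GB")
```

Under that assumption the 4-bit weights come out a little under 24 GB, which is why the fit is tight, while 8-bit and 16-bit are well out of reach for a single 24GB GPU. Getting down to something like 4GB would need more aggressive compression or offloading than plain 4-bit weights; this sketch does not model that.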