Hacker News
by deaux 130 days ago
And that's at unusable speeds - it takes about triple that amount to run it decently fast at int4.
Now, as the other replies say, you should very likely run a quantized version anyway.