Hacker News
by mark_l_watson, 941 days ago
Another data point: I can (barely) run a 4-bit quantized 30B model on a Mac Mini with 32 GB of on-chip memory, but it runs slowly (a little less than 10 tokens/second).
13B and 7B models run easily and much faster.
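
For a rough sense of why 30B is the tight one, here is a back-of-the-envelope sketch of the memory needed for the weights alone. The function name `weight_memory_gb` is made up for illustration, and the estimate assumes exactly 4 bits per weight. Real quantized formats also store scaling data, and inference needs extra room for the KV cache and the OS, so actual usage will be higher than these numbers.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Hypothetical helper: counts only the raw quantized weights, not
    quantization scales, the KV cache, or runtime overhead.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # The three model sizes mentioned above, all at 4-bit quantization.
    for size in (7, 13, 30):
        print(f"{size}B @ 4-bit: ~{weight_memory_gb(size, 4):.1f} GB of weights")
```

Under those assumptions the 30B weights take about 15 GB, roughly half of a 32 GB machine before any overhead, while 13B and 7B come in around 6.5 GB and 3.5 GB.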