|
|
|
|
|
by zellyn
422 days ago
|
|
Exactly. You want to come close to maxing out your RAM for model+context. I've run Gemma on a 64GB M1 and it was pretty okay, although that was before the Quantization-Aware Training version released last week, so it might be even better now. |
|