by SushiHippie
941 days ago
As the model is very small, you should be able to run any quantization level on an M-series MacBook with at least 16GB of RAM.
The best one speed/quality-wise will probably be Q6_K, as there is not much difference in quality compared to Q8, but it will definitely be faster. I haven't tried this one specifically, but I always run 7B-parameter models on an M2 Pro with Q6_K or Q4_K_M (depending on how fast I want it). See also this table in the readme, which states that Q8 only needs ~10GB of RAM:
https://huggingface.co/TheBloke/MonadGPT-GGUF?text=Hey+my+na...
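
For a rough sanity check of those numbers, here is a back-of-the-envelope sketch in Python. The bits-per-weight values are approximate figures I'm assuming for the llama.cpp quantization types, the 2.5 GB runtime overhead is my own assumption, and the function names are made up for illustration, so treat the output as a ballpark estimate rather than a measurement.

```python
# Approximate bits per weight for each quantization type.
# These are assumed ballpark values; real files vary by model architecture.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}


def estimate_weights_gb(n_params: float, quant: str) -> float:
    """Rough size of the quantized weights in GB (1 GB = 10^9 bytes)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def fits_in_ram(n_params: float, quant: str, ram_gb: float,
                overhead_gb: float = 2.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in ram_gb.

    overhead_gb is a guess covering context and runtime buffers.
    """
    return estimate_weights_gb(n_params, quant) + overhead_gb <= ram_gb


if __name__ == "__main__":
    n_params = 7e9  # a nominal 7B-parameter model
    for quant in BITS_PER_WEIGHT:
        size = estimate_weights_gb(n_params, quant)
        ok = fits_in_ram(n_params, quant, ram_gb=16)
        print(f"{quant}: ~{size:.1f} GB of weights, fits in 16 GB: {ok}")
```

This ignores that macOS and other apps also need memory, so on a 16GB machine the real headroom is smaller than the estimate suggests.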