by spmurrayzzz
421 days ago
You can just use llama.cpp instead (which is what ollama uses under the hood via bindings). Just make sure you're on commit `d3bd719` or newer. I normally use this with NVIDIA/CUDA, but I tested it on my MBP and haven't had any speed issues so far. Alternatively, LMStudio has MLX support you can use as well.
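
If you build from a local checkout, one way to confirm it already contains that commit is to ask git whether `d3bd719` is an ancestor of `HEAD`. A rough Python sketch, assuming `git` is on your PATH and the clone is not shallow; the `includes_commit` helper and the `./llama.cpp` path are made-up names for illustration, not anything from llama.cpp itself:

```python
import subprocess

REQUIRED_COMMIT = "d3bd719"  # the commit named in the comment above


def includes_commit(repo_dir, commit=REQUIRED_COMMIT, run=subprocess.run):
    """Return True if `commit` is an ancestor of HEAD in the checkout at repo_dir.

    `run` is injectable only so the function can be exercised without a real repo.
    """
    # `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B,
    # 1 when it is not, and with another code on errors (e.g. unknown commit).
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, "HEAD"],
        capture_output=True,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.decode(errors="replace").strip())
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical path to a local llama.cpp clone.
    print(includes_commit("./llama.cpp"))
```

If git does not know the commit at all (for example in a shallow clone), the helper raises instead of guessing, so a `False` result always means the checkout is genuinely older.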