


	by spmurrayzzz 421 days ago
	You can just use llama.cpp instead (which is what ollama uses under the hood via bindings). Just make sure you're on commit `d3bd719` or newer. I normally use this with NVIDIA/CUDA, but I tested it on my MBP and haven't had any speed issues so far. Alternatively, LM Studio has MLX support you can use as well.
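
If you build llama.cpp from a local git clone, one way to confirm the checkout is at `d3bd719` or newer is to ask git whether that commit is an ancestor of your current `HEAD`. This is only a rough sketch: the helper name `includes_commit` and the `repo_dir` path are made up for illustration, and it assumes `git` is on your PATH and the clone has enough history to resolve the short hash.

```python
import subprocess


def includes_commit(repo_dir: str, commit: str = "d3bd719", ref: str = "HEAD") -> bool:
    """Return True if `commit` is an ancestor of (or the same as) `ref`.

    Hypothetical helper, not part of llama.cpp. It relies on
    `git merge-base --is-ancestor`, which exits with 0 when the first
    commit is an ancestor of the second, 1 when it is not, and another
    non-zero status on errors such as an unknown commit.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Anything else means git could not answer (bad path, unknown commit, ...).
    raise RuntimeError(result.stderr.decode(errors="replace").strip())
```

Calling `includes_commit("path/to/llama.cpp")` (the path is a placeholder for wherever your clone lives) returns `True` when the checkout already contains the commit. If it returns `False`, pulling and rebuilding should get you there.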