Hacker News
by thomspoon 3 days ago
Either Ollama or omlx; both are pretty dang performant. Omlx lets you run Claude Code locally, though, as long as you bootstrap it with the right model.
2 comments

Omlx is really nice, thanks for the recommendation!
Why would you need Omlx? For a speedup?
It has an extra KV cache on SSD, and lots more options to tweak. There's experimental TurboQuant and multi-token prediction support.