Hacker News new | ask | show | jobs
by anon373839 50 days ago
I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.
1 comments

and it horribly kernel panics when it is running for too long due to Apple does not give a sh over mlx, see list of issues: https://github.com/Harperbot/metal-guard#landed-here-searchi...