by srigi 406 days ago
Can you add a recent build of llama.cpp (arm64) to the results pool? I'm really interested in comparing mlx to llama.cpp, but setting up mlx seems too difficult for me to do by myself.

I ran them again several times to make sure the results were fair. My previous runs also had a different 30B model loaded in the background that I forgot about.

LM Studio is an easy way to use both mlx and llama.cpp.

anemll [0]: ~9.3 tok/sec

mlx [1]: ~50 tok/sec

gguf (llama.cpp b5219) [2]: ~41 tok/sec

[0] https://huggingface.co/anemll/anemll-DeepSeekR1-8B-ctx1024_0...

[1] https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Lla...

[2] (8bit) https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-...
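
To put the three numbers side by side, here is a rough sketch that expresses each one as a multiple of the llama.cpp figure. The values are the approximate tok/sec quoted above; the dictionary keys and the `relative_throughput` helper are just illustrative names, not part of any of these tools. Only the gguf build is marked 8bit and the other links are truncated, so treat the ratios as ballpark rather than a like-for-like comparison.

```python
# Approximate throughput figures quoted in the comment above (tokens/sec).
results = {
    "anemll": 9.3,
    "mlx": 50.0,
    "gguf (llama.cpp b5219)": 41.0,
}


def relative_throughput(results, baseline):
    """Return each backend's throughput as a multiple of the baseline's."""
    base = results[baseline]
    return {name: tps / base for name, tps in results.items()}


# Use the llama.cpp run as the baseline (an arbitrary choice for illustration).
ratios = relative_throughput(results, "gguf (llama.cpp b5219)")

# Print fastest first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x")
```

On these figures mlx comes out at about 1.22x llama.cpp, and anemll at about 0.23x.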

Thank you very much.