by sanchitmonga22
94 days ago
Respectfully, the benchmarks show it is different. MetalRT and mlx-lm use the exact same model files, with identical 4-bit MLX weights. That makes it a pure engine-to-engine comparison:

LLM decode: MetalRT is 1.10-1.19x faster across all models tested

STT: 70s audio in 101ms vs 463ms (4.6x faster)

TTS: 178ms vs 493ms (2.8x faster)

mlx-lm is built on MLX, a general-purpose array computation framework that also supports inference. MetalRT is purpose-built for inference only. That focus is where the performance gap comes from.

You can reproduce these numbers yourself: rcli bench runs the same benchmarks we published. Full methodology: https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-e...

Yes, MetalRT is closed-source. We're transparent about that. The performance difference is the reason it exists.
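For anyone who wants to check the arithmetic, the STT and TTS speedup figures follow directly from the quoted latencies. Below is a minimal sketch; the `speedup` helper and variable names are illustrative assumptions and not part of rcli, MetalRT, or mlx-lm.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Latencies quoted in the comment, in milliseconds (mlx-lm vs MetalRT).
stt = speedup(baseline_ms=463, candidate_ms=101)
tts = speedup(baseline_ms=493, candidate_ms=178)

# Real-time factor for STT: processing time divided by audio duration,
# both in seconds (101 ms of compute for 70 s of audio).
stt_rtf = 0.101 / 70.0

print(f"STT speedup: {stt:.1f}x")
print(f"TTS speedup: {tts:.1f}x")
print(f"STT real-time factor: {stt_rtf:.4f}")
```

This only checks that the ratios are consistent with the stated timings; it does not reproduce the measurements themselves.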