by wmertens 17 days ago
What do you mean? In the whitepaper they say that the original can't run on an iPhone 17 at all, and on an M4 the Bonsai version runs 5.6x faster than the original.

This quantization cuts memory and compute requirements by roughly an order of magnitude, so how can it be slower?

And all that while retaining quality.