by pickettd 705 days ago
The MLC Android APK is updated frequently and ships with recent models built in. A Samsung S24+ can comfortably run 7-8B models at reasonable speeds (10ish tokens/sec).

https://llm.mlc.ai/docs/deploy/android.html
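
For a rough sense of what those numbers mean in practice, here is a back-of-envelope sketch. The 4-bit quantization and the 300-token reply length are my assumptions (the comment does not say which quantization was used), and the function names are made up for illustration. It only counts the weights, not the KV cache or runtime overhead.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def generation_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to generate num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_sec


if __name__ == "__main__":
    # Assumed bit widths: 4-bit (quantized) vs 16-bit (unquantized half precision).
    for params in (7, 8):
        for bits in (4, 16):
            size = weight_memory_gib(params, bits)
            print(f"{params}B @ {bits}-bit: ~{size:.1f} GiB of weights")

    # Assumed reply length of 300 tokens at the ~10 tokens/sec mentioned above.
    print(f"300-token reply: ~{generation_seconds(300, 10):.0f} s")
```

Under these assumptions, a 7-8B model's weights come to a few GiB at 4 bits per weight versus well over 10 GiB at 16 bits, and a medium-length reply takes about half a minute at 10 tokens/sec.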