by bloody-crow 896 days ago
I don't normally play games or do anything too resource-demanding on my phone. Pro models typically have more memory than non-Pro models, and running LLMs on-device might be the only scenario where that realistically makes a difference for me.
1 comment

Smaller ~3B LLMs (like phi-2) work fine on newer non-Pro models at full context length. Running 7B models, even on the 8GB iPhone 15 Pro and Pro Max, means cutting the context length to 1k tokens or fewer, because the KV cache for the full context won't fit in memory on these devices.
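
For a rough sense of why the KV cache is the limiting factor, here is a back-of-the-envelope sketch. The model shape is an assumption on my part (a Llama-2-7B-style layout with 32 layers, 32 KV heads, head dimension 128, and fp16 cache values), not something measured on a phone, and `kv_cache_bytes` is just an illustrative helper. Models that use grouped-query attention have fewer KV heads and a proportionally smaller cache.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_value=2):
    """Bytes needed to hold cached keys and values for n_tokens across all layers."""
    # Factor of 2: each layer stores one tensor of keys and one of values.
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value


# Assumed Llama-2-7B-style shape (hypothetical for this thread, fp16 cache).
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

for n_tokens in (4096, 1024):
    gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, n_tokens) / 2**30
    print(f"{n_tokens} tokens: {gib:.2f} GiB")
```

Under those assumptions a 4096-token cache comes to 2 GiB and a 1024-token cache to 0.5 GiB. That sits on top of the weights (roughly 3.5 GB for 7B parameters at 4 bits each), and an app generally can't use all of the phone's 8GB, so trimming the context is the easy place to recover memory.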