by woadwarrior01 894 days ago
Smaller 3B LLMs (like phi-2) work fine on newer non-Pro models at full context length. Running 7B models, even on the 8GB iPhone 15 Pro and Pro Max, means cutting the context length to 1k tokens or fewer, because the KV cache for the full context length won't fit in memory on these devices.
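For a rough sense of why the KV cache dominates, here's a back-of-the-envelope sketch. The model shape is an assumption on my part (a Llama-2-7B-like layout: 32 layers, 32 KV heads, head dimension 128, fp16 cache), and `kv_cache_bytes` is just an illustrative helper, not anything from a real runtime. Models with grouped-query attention or a quantized cache will come out smaller.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size for one sequence.

    The leading factor of 2 covers keys and values, each stored per layer,
    per KV head, per token.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Assumed Llama-2-7B-like shape (not taken from the comment above):
# 32 layers, 32 KV heads, head_dim 128, fp16 cache (2 bytes per element).
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

for ctx in (4096, 1024):
    size_mib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, ctx) / 2**20
    print(f"context {ctx}: {size_mib:.0f} MiB")
```

Under those assumptions the cache is about 2 GiB at 4096 tokens versus 512 MiB at 1024 tokens, and that sits on top of the model weights and whatever memory the OS will actually let a single app use.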