by saltwounds 248 days ago
OpenAI and Anthropic's real moat is hardware. For local LLMs, context length and hardware performance are the limiting factors. Qwen3 4B with a 32,768-token context window is great, until the window begins filling up and performance drops quickly.

I use local models when possible. MCPs work well, but the large amount of context they inject makes switching to an online provider a no-brainer.