My super uninformed theory is that local LLMs will trail foundation models by about 2 years for practical use.
For example, right now a lot of work is going into improving tool calling and agentic workflows, and tool calling only started popping up for local LLMs around the end of 2023.
This is putting aside the standard benchmarks, which get "benchmaxxed" by local LLMs: the numbers look impressive, but the models rarely meet expectations when used with OpenCode. In theory Qwen3.5-397B-A17B should be nearly a Sonnet 4.6 model, but it is not.
You can run Qwen3.5-35B-A3B on 32GB of RAM, sure. But 'Claude Code' performance, by which I assume he means Sonnet or Opus level models in 2026, is likely a few years away from being runnable locally (on reasonable hardware).