|
|
|
|
|
by dajonker
151 days ago
|
|
Great, I've been experimenting with OpenCode and running local 30B-A3B models on llama.cpp (4 bit) on a 32 GB GPU so there's plenty of VRAM left for 128k context. So far Qwen3-coder gives the me best results. Nemotron 3 Nano is supposed to benchmark better but it doesn't really show for the kind of work I throw at it, mostly "write tests for this and that method which are not covered yet". Will give this a try once someone has quantized it in ~4 bit GGUF. Codex is notably higher quality but also has me waiting forever. Hopefully these small models get better and better, not just at benchmarks. |
|
- Tool calling doesn't work properly with OpenCode
- It repeats itself very quickly. This is addressed in the Unsloth guide and can be "fixed" by setting --dry-multiplier to 1.1 or higher
- It makes a lot of spelling errors such as replacing class/file name characters with "1". Or when I asked it to check AGENTS.md it tried to open AGANTS.md
I tried both the Q4_K_XL and Q5_K_XL quantizations and they both suffer from these issues.