

by roscas 311 days ago

In my experience, qwen3-coder is way better. I only keep gpt-oss:20b installed to run a few more tests. When I give each model a program and ask for a summary of what it does, qwen3 just works in a few seconds, while I cancelled gpt-oss after 5 minutes of it doing nothing.

So I just use qwen3. Fast and great output. If for some reason I don't get what I need, I might use search engines or Perplexity.

I have a 10GB 3080 and a Ryzen 3600X with 32GB of RAM.

Qwen3-coder is amazing. Best I've used so far.

5 comments

lvl155 311 days ago

Qwen3 coder 480B is quite good and on par with Sonnet 4. It’s the first time I realized the Chinese models are probably going to eclipse US-based models pretty soon, at least for coding.


indigodaddy 311 days ago

Where do you use Qwen3 480B from? I'm not even seeing it on OpenRouter. EDIT: never mind, OpenRouter is just calling it qwen3-coder; when I click for more info it shows it's Qwen3-Coder-480B-A35B-Instruct. And it's one of their free models. Nice.


tough 311 days ago

Cerebras Code (both the subscription and the API) has it.


faangguyindia 311 days ago

What edit format do you use with Qwen? https://aider.chat/docs/more/edit-formats.html

diff is failing for me. Or do you guys use whole?


cpursley 311 days ago

That might be a stretch; maybe Sonnet 3.5. But it is pretty impressive, as is Kimi on opencode.


mhitza 311 days ago

I've been using gpt-oss-20b lightly, and what I've found is that with smaller (single-sentence) prompts it was easy to make it loop infinitely. Since I'm running it with llama.cpp, I set a small repetition penalty and haven't encountered those issues since. (I use it a couple of times a day to analyze diffs, so I might have just gotten lucky.)
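
For reference, a minimal sketch of what setting a repetition penalty might look like as a llama.cpp command line. The comment does not say which value was used, so the 1.1 penalty, the model filename, and the helper name `build_llama_cli_command` are all assumptions for illustration.

```python
import shlex


def build_llama_cli_command(model_path, prompt, repeat_penalty=1.1):
    """Build an argument list for llama.cpp's llama-cli.

    The default penalty of 1.1 is an illustrative guess, not the
    value the commenter used.
    """
    return [
        "llama-cli",
        "-m", model_path,                         # path to a GGUF model file
        "-p", prompt,                             # the prompt text
        "--repeat-penalty", str(repeat_penalty),  # values above 1.0 discourage repeats
    ]


# Hypothetical model filename; substitute whatever file you actually have.
cmd = build_llama_cli_command("gpt-oss-20b.gguf", "Summarize this diff.")
print(shlex.join(cmd))
```

A penalty of 1.0 disables the effect, so a "small" penalty presumably means something only slightly above that.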


nicolaslem 311 days ago

I had the same issue with other models, where they would loop repeating the same character, sentence or paragraph indefinitely. Turns out the context size some tools set by default is 2k, which is way too small.
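
The comment does not name the tools involved. As one illustration, assuming Ollama and its `/api/generate` endpoint, the context window can be raised per request through the `num_ctx` option. The model tag, the 8192 value, and the helper name `build_generate_request` are assumptions, and the sketch only builds the request body rather than sending it.

```python
import json


def build_generate_request(model, prompt, num_ctx=8192):
    """Build a JSON-serializable body for Ollama's /api/generate.

    num_ctx sets the context window in tokens; 8192 is an
    illustrative choice, not a recommendation from the thread.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # ask for a single response object
        "options": {"num_ctx": num_ctx},  # override the tool's default context size
    }


# Hypothetical model tag; use whichever model you have pulled locally.
body = build_generate_request("qwen3-coder", "Summarize what this program does.")
print(json.dumps(body, indent=2))
```

With a local Ollama server, a body like this would be sent as a POST to `http://localhost:11434/api/generate`.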


ModelForge 311 days ago

I’ve been using the ollama version (uses about 13 GB of RAM on macOS) and haven’t had that issue yet. I wonder if that’s maybe an issue with the llama.cpp port?


mhitza 311 days ago

Never used ollama, only ready-to-go models via llamafile and llama.cpp.

Maybe ollama has some defaults it applies to models? I start testing models at 0 temp and tweak from there depending on how they behave.


smokel 311 days ago

The 20B version doesn't fit in 10GB. That might explain some issues?
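
A rough back-of-the-envelope check of that claim, as a sketch only. It assumes about 20 billion parameters, roughly 4.25 bits per weight for a 4-bit quantized format, and 1.5 GB of extra room for the KV cache and runtime buffers; all three numbers and the function names are assumptions, not measurements.

```python
def estimate_weights_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_vram(params_billion, bits_per_weight, vram_gb, overhead_gb=1.5):
    """True if weights plus an assumed overhead fit entirely in VRAM."""
    needed_gb = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed_gb <= vram_gb


weights_gb = estimate_weights_gb(20, 4.25)
print(f"estimated weights: {weights_gb:.2f} GB")
print("fits in 10 GB:", fits_in_vram(20, 4.25, vram_gb=10))
print("fits in 16 GB:", fits_in_vram(20, 4.25, vram_gb=16))
```

When the weights do not fit, runtimes typically keep some layers in system RAM and run them on the CPU, which is much slower and could look like the model "doing nothing".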


SV_BubbleTime 311 days ago

Are you using this in an agentic way, or in a copy-and-paste, “code this”, single-input, single-output way?

I’d like to know how far the frontier models are from the local ones for agentic coding.


panki27 311 days ago

What Qwen3-Coder model are you using? Quantized or not?

Asking because I'm looking for a good model that fits in 12GB VRAM.
