VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

	VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO (arxiv.org)
	33 points by timhigins 1 hour ago

2 comments

noperator 25 minutes ago

I'm having some success testing this model as a replacement for GPT-5 nano in source code security review. Running it on an RTX 3090 (24 GB VRAM) via vLLM. It's not great at structured output (as noted in the model card), but I'm working around that in my harness.

dummydummy1234 3 minutes ago

Can't you just force it to do structured output via constrained generation?
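
For anyone unfamiliar with the idea, here is a minimal sketch of constrained generation as logit masking. Everything in it is hypothetical (the toy vocabulary, the hand-written grammar, the `fake_logits` stand-in for a model); it is not vLLM's API or anything from the model card, just the general idea of forbidding tokens the output format does not allow at each step.

```python
import json
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"severity"', ':', '"low"', '"high"', 'Sure', '!']

# Hand-written state machine for the target shape {"severity": "low" | "high"}.
# Maps state -> {allowed token: next state}.
GRAMMAR = {
    "start": {'{': "key"},
    "key": {'"severity"': "colon"},
    "colon": {':': "value"},
    "value": {'"low"': "close", '"high"': "close"},
    "close": {'}': "done"},
}

def fake_logits(step):
    """Stand-in for a model forward pass: it always prefers chatty tokens."""
    scores = {tok: 0.0 for tok in VOCAB}
    scores['Sure'] = 5.0
    scores['!'] = 4.0
    scores['"high"'] = 1.0
    return scores

def constrained_decode(logits_fn, max_steps=10):
    state, out = "start", []
    for step in range(max_steps):
        if state == "done":
            break
        allowed = GRAMMAR[state]
        logits = logits_fn(step)
        # Mask: every token the grammar forbids in this state gets -inf.
        masked = {t: (s if t in allowed else -math.inf) for t, s in logits.items()}
        token = max(masked, key=masked.get)  # greedy pick among allowed tokens
        out.append(token)
        state = allowed[token]
    return "".join(out)

result = constrained_decode(fake_logits)
print(json.loads(result))
```

Even though the stand-in model prefers "Sure" and "!", the mask only ever leaves it grammar-legal choices, so the output parses as JSON. This guarantees the shape of the output, not that its content is right.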

aero2146 47 minutes ago

I tried generating the classic pelican SVG, but it failed horribly, just showing me a rectangle and a black circle...

fwipsy 21 minutes ago

I think this is predicted? Part of the story is how they were able to preserve core reasoning ability while cutting knowledge like "pelicans have wings."

> these findings motivate the Parametric Compression-Coverage Hypothesis, which views verifiable reasoning as compressible into compact reasoning cores, while open-domain knowledge and general-purpose competence require broad parameter coverage over facts, concepts, and long-tail scenarios.

pylotlight 7 minutes ago

The only really essential item here is tool-calling capability, is it not? So I assume they tested for strong read/write/edit tool consistency?

realitysballs 38 minutes ago

That’s all I needed to hear

pylotlight 8 minutes ago

As in, you learnt that a useless test that no one should be using was run here? That's what you meant, right?

physPop 37 minutes ago

It's for reasoning, not generating art?

websap 32 minutes ago

Can you explain this a bit more?

tyre 13 minutes ago

Imagine you want to make a smaller model that is really good at one thing, say, driving a car. You could remove the parameters that lead it to correctly answer, "What is the powerhouse of the cell?" or, "Who was the first president of the United States?"

It would look really dumb if someone asked it that, but that's fine. You're trying to make a model that is optimized for efficiency for a specific task. As much as possible, you should prune uncorrelated things.

pylotlight 7 minutes ago

SVG generation is a useless test; what more is there to know?
