HN Mirror

	by otabdeveloper4 132 days ago
	> are materially ahead of every open source model out there at this time

	They aren't. Any difference is in sampling parameters and post-training flavor choices. These aren't things that are "materially ahead", that's basically just LLM themes.

1 comments

achompas 132 days ago

I’m sorry but you’re demonstrably incorrect.

Listen, I want more open weight models in the world. They create entrepreneurial opportunities and support use cases which the foundation labs don’t want to support.

But open weight models are consistently three to six months behind on performance compared to closed models, as confirmed by both benchmarks and personal use. They’re closer on coding and much further away on non-coding tasks.

There are theories as to why these models lag, which I won’t get into. But anyone claiming open-weight models are close to closed-weight models is ignoring significant evidence to the contrary.

otabdeveloper4 131 days ago

> three to six months behind on performance

Yeah, like I said - it's just a post-training difference. That's not a material difference, that's a difference of chrome and polish.

ninjagoo 132 days ago

> I’m sorry but you’re demonstrably incorrect.

Please so demonstrate?

achompas 132 days ago

The onus isn’t on me. It’s on anyone contradicting findings by most benchmarks, because most of them show a clear advantage for Opus and GPT over OSS models.

ninjagoo 132 days ago

So Big Claim No Demonstration? :-)

orf 132 days ago

I mean just use them and compare, the gap is obvious.

otabdeveloper4 131 days ago

I did, and I fixed Qwen's issues with trivial sampling and loop detection hacks.

If I can do this, then a company that wants to sell local models seriously could do it too.

ninjagoo 131 days ago

> I did, and I fixed Qwen's issues with trivial sampling and loop detection hacks.

Wow, that's amazing! Care to share the changes? Would love to try them out.

otabdeveloper4 131 days ago

It's not amazing at all.

What's amazing is that LLM technologies are so immature that even basic engineering diligence isn't being done. (Like detecting token loops, for example.)
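
For readers wondering what "detecting token loops" could mean in practice, here is a minimal sketch. The commenter did not share their code, so this is only one plausible approach, not their method: it checks whether the tail of the generated token IDs is a short block repeated back to back. The function name `has_token_loop` and the parameters `max_period` and `min_repeats` are made up for illustration.

```python
from typing import Sequence


def has_token_loop(
    tokens: Sequence[int],
    max_period: int = 32,   # hypothetical default: longest repeating block to look for
    min_repeats: int = 3,   # hypothetical default: repetitions needed to call it a loop
) -> bool:
    """Return True if `tokens` ends with one block repeated `min_repeats` times in a row."""
    n = len(tokens)
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if span > n:
            break  # not enough tokens to hold this many repeats of a block this long
        tail = tokens[n - span:]
        block = tail[:period]
        # The tail is a loop if every position matches the block at the same offset.
        if all(tail[i] == block[i % period] for i in range(span)):
            return True
    return False


# Example: the sequence ends in the block (4, 5) repeated three times.
print(has_token_loop([1, 2, 3, 4, 5, 4, 5, 4, 5]))
print(has_token_loop([1, 2, 3, 4, 5, 6]))
```

A generation loop could call a check like this after each new token and then stop, resample, or raise the temperature when it fires. Which reaction works best is a judgment call that this sketch leaves open.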
