| HN Mirror



	by hnhn34 302 days ago
	>but after Deepseek stunned the world with the R1 model, subsequent models got heavily censored and languished in relative obscurity.

	I'm pretty sure this isn't what happened. DeepSeek just hasn't released a big model update. But in the meantime, Qwen, ByteDance, Zhipu and Moonshot AI have released extremely impressive models, some of which are SOTA or close to it. The open source/open weight world is still in love with Chinese labs as they keep releasing cool stuff and filling the void left by Meta and Mistral.

2 comments

Sammi 301 days ago

None of these are SOTA in any benchmark I've seen. Some of them get into the top 10 or top 5 sometimes.

What is interesting about these Chinese models is that they are much cheaper to run and they are open source. This has pushed the closed source SOTA models to make all the big updates we have seen over the last few months. I guess we can thank these Chinese models for creating some competitive pressure, which has pushed the forefront forward, but they aren't doing so by leading it.


freilanzer 298 days ago

Qwen scores very high on the MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard


freilanzer 301 days ago

Qwen3-Embedding is really good, imo. It allows me to release a global feature that would have been extremely difficult otherwise.
