


	by WhitneyLand 76 days ago
	StepFun is an interesting model. If you haven’t heard of it yet there’s some good discussion here: https://news.ycombinator.com/item?id=47069179


tarruda 76 days ago

Since that discussion, they released the base model and a midtrain checkpoint:

- https://huggingface.co/stepfun-ai/Step-3.5-Flash-Base

- https://huggingface.co/stepfun-ai/Step-3.5-Flash-Base-Midtra...

I'm not aware of other AI labs that have released a base checkpoint for models in this size class. Qwen released some base models for 3.5, but the biggest one is the 35B checkpoint.

They also released the entire training pipeline:

- https://huggingface.co/datasets/stepfun-ai/Step-3.5-Flash-SF...

- https://github.com/stepfun-ai/SteptronOss


lostmsu 76 days ago

Tuned Qwen 3.5 27B beats Step 3.5 on almost all benchmarks, so the point about the size class is moot.


tempaccount420 76 days ago

Benchmarks are not a useful way to decide the "size class". Bigger size means more knowledge. Also, Qwen 3.5 27B is a dense model, so all 27B parameters are active, while StepFun 3.5 Flash has 11B active parameters.


lostmsu 75 days ago

> Bigger size means more knowledge.

Qwen 3.5 27B beats StepFun 3.5 Flash on GPQA Diamond too, so probably no.


tarruda 75 days ago

Benchmarks don't tell the whole story. For one-shot coding tasks, I found Step 3.5 Flash to be stronger even than Qwen 3.5 397B.


anentropic 75 days ago

Benchmarks don't tell the whole story... for that you need anecdotes from random HN posters :)


skysniper 76 days ago

Thanks for the info. Before running the bench I had only tried it on arena.ai-type tasks, and it was not impressive. I didn't expect it to be that good at agentic tasks.
