by niklassheth 217 days ago
This is more evidence that Cognition's SWE-1.5 is a GLM-4.6 finetune
2 comments

Can you provide more context for this? (e.g. Was SWE-1.5 released recently? Is it considered good? Is it considered fast? Was there speculation about what the underlying model was? How does this prove that it's a GLM finetune?)
People saw Chinese characters in generations made by SWE-1.5 (Windsurf's model) and also in the one made by Cursor. This led to suspicions that the models are finetunes of Chinese models (which makes sense, as there aren't many strong US/EU coding models out there). GLM-4.5/4.6 are the "strongest" coding models atm (with DeepSeek V3.2 and Qwen somewhat behind), so that's where the speculation came from. Cerebras serving them at roughly the same speeds kinda adds to that story (e.g. if it were something heavier like DeepSeek V3 or Kimi K2, it would be slower).
Really appreciate this context. Thank you!
I suspect they are referencing the 950 tok/s claim on Cognition's page.
Ah. Thx. Blogpost for others: https://cognition.ai/blog/swe-1-5

Takeaway is that this is a Sonnet-ish model at 10x the speed.

Not at all. Any model with a somewhat similar architecture and roughly similar size should run at the same speed on Cerebras.

It's like saying Llama 3.2 3B and Gemma 4B are finetunes of each other because they run at similar speeds on NVIDIA hardware.
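
To put rough numbers on that, here's a back-of-envelope sketch. It assumes single-stream decoding is limited by how fast the active weights can be read from memory, which is a common simplification and not anything Cerebras or Cognition has published. The function name and all the figures (active parameter counts, bandwidth) are made up for illustration.

```python
def decode_tokens_per_second(active_params_billion: float,
                             bytes_per_param: float,
                             memory_bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed, assuming each generated token
    requires reading every active weight once (bandwidth-bound model)."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return memory_bandwidth_gb_s * 1e9 / bytes_per_token


# Two hypothetical, unrelated models with similar active parameter counts,
# served on the same hypothetical hardware at 1 byte per parameter.
model_a = decode_tokens_per_second(30, 1.0, 3000.0)
model_b = decode_tokens_per_second(32, 1.0, 3000.0)

# The speeds land within a few percent of each other even though nothing
# here says the two models share weights or lineage.
print(f"model A: {model_a:.1f} tok/s, model B: {model_b:.1f} tok/s")
```

Under this simplification, speed only tells you about active size and precision, so matching throughput is consistent with a shared base model but can't confirm it.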