| I have no affiliation with them, but here's what I think happened: 1. They claim the official model is based on Qwen 397B. They likely didn't disclose Nex Pro at all because Nex itself is built on the same base model (I'm not saying they shouldn't have). 2. The improvement would come from merging the weights PLUS on-policy distillation. The confusion arises because the uploaded model didn't include the distillation at all. 3. It's worth noting that they didn't advertise the model beyond posting it on Reddit 2 days ago. It went viral organically, over the weekend, and during Brazil's World Cup debut (Brazilians will understand). Of course the mayor of Rio took the opportunity to capitalize on the free coverage, but that wasn't done in conjunction with the researchers. 4. I don't see why they would disclose Qwen 397B as the base and mention the SwiReasoning paper but not mention Nex if all they did was merge the two models. 5. In any case, what they are claiming is easily verifiable once (if) they upload the right model. |
https://news.ycombinator.com/item?id=48529544