| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by a2128 409 days ago
	To be clear, this is not a model trained on zero data, this is a pretrained model (Qwen 2.5 trained on 18 trillion tokens) finetuned using self-generated data grounded by a Python interpreter

2 comments

scotty79 409 days ago

I think at this point the initial process of exposing the empty model to all the available domain data in bulk is no longer interesting to many people. It's an obvious first step so it's barely mentioned anymore. What's currently worked on is what you do afterwards to get a useful tool in the end.

link

ethan_smith 408 days ago

The breakthrough here is eliminating the need for human-labeled reasoning data while still achieving SOTA results, which has been a major bottleneck in developing reasoning capabilities.

link