| HN Mirror



	by futureshock 531 days ago
	It seems like an interesting fine-tuning idea. Drawing from reasoning models, I wonder if it would be effective to 10x or 100x the fine-tuning dataset by having a larger reasoning model write documentation and reasoning CoTs about the codebase's current state, plus speculation about future state updates. Maybe have it output some verbose execution flow analysis too.
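
	A rough sketch of what that augmentation could look like, purely as an illustration. Everything here is an assumption: `teacher` stands in for whatever larger reasoning model you would call, and the prompt wording, record fields, and `augment_dataset` name are made up for this example.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates, one per kind of synthetic text mentioned above.
PROMPTS: Dict[str, str] = {
    "documentation": "Write documentation for the following code:\n\n{code}",
    "reasoning": (
        "Reason step by step about what this code currently does, "
        "then speculate about likely future changes:\n\n{code}"
    ),
    "execution_flow": "Give a verbose execution flow analysis of this code:\n\n{code}",
}


def augment_dataset(
    files: Dict[str, str],
    teacher: Callable[[str], str],
    samples_per_prompt: int = 1,
) -> List[dict]:
    """Return the original files plus teacher-generated text about each one.

    `teacher` is a placeholder for a call to a larger reasoning model:
    it takes a prompt string and returns generated text.
    """
    records: List[dict] = []
    for path, code in files.items():
        # Keep the raw source as its own training record.
        records.append({"path": path, "kind": "source", "text": code})
        for kind, template in PROMPTS.items():
            for _ in range(samples_per_prompt):
                prompt = template.format(code=code)
                records.append({"path": path, "kind": kind, "text": teacher(prompt)})
    return records


if __name__ == "__main__":
    # Stub teacher so the sketch runs without any model behind it.
    def fake_teacher(prompt: str) -> str:
        return "generated text for: " + prompt[:30]

    data = augment_dataset({"app.py": "print('hi')"}, fake_teacher, samples_per_prompt=3)
    print(len(data), "records for 1 file")
```

	With three prompt kinds and `samples_per_prompt=3`, each source file turns into 1 + 3 × 3 = 10 records, so roughly the 10x case; more samples or more prompt kinds would push toward 100x.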

1 comment

samatdav 531 days ago

Thank you for the idea! We are also considering upsampling and distillation. But at a high level, correctly setting up the data for simple fine-tuning can already produce great results.
