HN Mirror



	by ImJasonH 302 days ago
	Is anybody working on making it easier and cheaper to build specialized things?

2 comments

-_- 302 days ago

Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and either a reward function or an environment.


selim-now 301 days ago

Yes! Check out https://distillabs.ai/. It follows a similar approach, except that the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust. (I'm affiliated.)
