by ImJasonH 254 days ago
Is anybody working on making building specialized things easier and cheaper?
2 comments

Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and reward function or environment.
Yes! Check out https://distillabs.ai/. It follows a similar approach, except that the evaluation set is held out before the synthetic data generation step, which I would argue makes it more robust, since the synthetic data can't leak into the evaluation. (I'm affiliated.)
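
For anyone wondering what "held out before the synthetic data generation" looks like in practice, here is a rough sketch. It is not the actual pipeline of either service mentioned above; `split_before_generation` and `generate_synthetic` are hypothetical names, and the generator is a trivial stand-in for whatever model would really produce the synthetic examples. The only point is the ordering: split first, then generate from the training seeds only.

```python
import random


def split_before_generation(examples, eval_fraction=0.2, seed=0):
    """Hold out an evaluation set BEFORE any synthetic data is generated.

    Returns (seed_examples, held_out_eval). Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def generate_synthetic(seed_examples, per_seed=3):
    """Stand-in for a real generator (assumed to be e.g. an LLM call).

    It only ever sees the seed examples, never the held-out set.
    """
    return [f"{ex} (variant {i})" for ex in seed_examples for i in range(per_seed)]


examples = [f"example {i}" for i in range(10)]
seeds, held_out = split_before_generation(examples)
synthetic = generate_synthetic(seeds)

# The training pool is seeds + synthetic; evaluation uses held_out only.
train_pool = seeds + synthetic
```

Because the split happens first, nothing in `train_pool` can be derived from an evaluation example, so the evaluation score is not inflated by near-duplicates of the test data.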