Hacker News
by patelajay285 859 days ago
Collecting data is hard, but the library is also a synthetic data generation library, so you can, for example, create the data for DPO fully synthetically. Check out the self-rewarding LLMs example: https://datadreamer.dev/docs/latest/pages/get_started/quick_...
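
To make "fully synthetic DPO data" concrete, here is a rough sketch of the general idea in plain Python. It is not the DataDreamer API: `build_preference_pairs` and `toy_score` are hypothetical names made up for illustration. In a self-rewarding setup, both the candidate responses and the scores would come from the LLM itself; here the scorer is a stand-in so the snippet runs on its own.

```python
from typing import Callable, Dict, List


def build_preference_pairs(
    candidates: Dict[str, List[str]],
    score: Callable[[str, str], float],
) -> List[Dict[str, str]]:
    """Turn scored candidate responses into DPO-style (chosen, rejected) pairs.

    Hypothetical helper: `candidates` maps each prompt to responses that a
    model generated, and `score` plays the role of the judge.
    """
    pairs = []
    for prompt, responses in candidates.items():
        # A preference pair needs at least two responses to compare.
        if len(responses) < 2:
            continue
        scored = [(score(prompt, r), r) for r in responses]
        best_score, best = max(scored, key=lambda item: item[0])
        worst_score, worst = min(scored, key=lambda item: item[0])
        # Skip ties: they carry no preference signal.
        if best_score == worst_score:
            continue
        pairs.append({"prompt": prompt, "chosen": best, "rejected": worst})
    return pairs


def toy_score(prompt: str, response: str) -> float:
    """Placeholder judge (assumption): prefers longer answers.

    A real pipeline would ask an LLM to rate the response instead.
    """
    return float(len(response.split()))
```

The resulting list of `prompt` / `chosen` / `rejected` records is the usual shape of a DPO training set, so no human-labeled preferences are needed as long as you trust the judge.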