Distilabel: Synthetic Data Generation and RLAIF at Scale (argilla.io)
4 points by dvilasuero 798 days ago
1 comment

Hey!

At Argilla, we've been using the previous version of distilabel to build open preference datasets that have been used to train hundreds of models, including top-performing ones like zephyr-141b.

Today we're releasing distilabel 1.0.0. We've completely revamped it to make building complex synthetic data pipelines easier and more robust, and to make the project more community-friendly.

We'd love to hear your thoughts!