by KaiserPro 330 days ago
It was always the case. We only managed to make a decent model once we created a decent dataset.

This meant building a rich synthetic dataset first to pre-train the model, then fine-tuning on real, expensive data to get the best results.

But this was always the case.
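
For anyone who wants the two-stage recipe above in concrete form, here is a toy sketch, not our actual setup: a linear model is pre-trained on plentiful synthetic data from a slightly wrong simulator, then fine-tuned on a handful of "real" points. Every name and number here (make_data, the slopes and offsets, the learning rate, the epoch counts) is made up for illustration.

```python
import random


def make_data(n, w, b, noise, rng):
    """Sample n points from y = w*x + b plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, w * x + b + rng.gauss(0.0, noise)))
    return data


def mse(params, data):
    """Mean squared error of the line (w, b) on data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params, data, lr, epochs):
    """Full-batch gradient descent on MSE, starting from params."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


rng = random.Random(0)

# Hypothetical data: the synthetic generator is cheap but slightly off,
# the "real" data is accurate but scarce.
synthetic = make_data(2000, w=2.0, b=0.0, noise=0.1, rng=rng)
real_train = make_data(20, w=2.3, b=0.5, noise=0.1, rng=rng)
real_test = make_data(200, w=2.3, b=0.5, noise=0.1, rng=rng)

# Stage 1: pre-train on synthetic data until roughly converged.
pretrained = train((0.0, 0.0), synthetic, lr=0.1, epochs=200)

# Stage 2: short fine-tune on the small real set.
finetuned = train(pretrained, real_train, lr=0.1, epochs=20)

# Baseline: same small budget on real data, but starting from zero.
scratch = train((0.0, 0.0), real_train, lr=0.1, epochs=20)

print("pre-trained only:", mse(pretrained, real_test))
print("fine-tuned:      ", mse(finetuned, real_test))
print("from scratch:    ", mse(scratch, real_test))
```

With the same small fine-tuning budget, the pre-trained start should end up closer to the real relationship than either skipping the fine-tune or starting from zero. That is all the toy is meant to show; it says nothing about how large the gap is on a real model.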

1 comment

RLHF wasn't needed for DeepSeek, only gobbling up the whole internet, both the good and the bad stuff. See their paper.