Hacker News
by nolist_policy 77 days ago
Is distillation or synthetic data used during pre-training? If so, how much?