by upbeat_general 55 days ago
This isn’t quite accurate. Data weighting matters a lot in pretraining.