by dabinat 1775 days ago
Well, that was kind of my point: you need to manually figure out what’s clean and what isn’t.

But it's much easier to do that for a small subset used for finetuning than to clean up the entire dataset.