Hacker News
by vl 1562 days ago
In my experience, ironically, most model gains come from understanding and fixing data pipelines, datasets, tokenizers, and vocabs. It’s surprising how a team can spend time on a complex model while nobody bothers to run stats and notice that 20% of samples are garbage or that the top tokens are nonsense. So in this sense a lot of “ML” work is data analytics or code debugging. I usually say that we should work on products, and do whatever work is required to advance the product at the moment.
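To make the "run stats" point concrete, here is a rough sketch of the kind of check I mean. Everything in it is an assumption for illustration: `is_garbage` is a made-up heuristic (too short, or mostly non-letter characters), the thresholds are arbitrary, and tokens are just whitespace-split. In a real pipeline you would plug in your own tokenizer and your own definition of a bad sample.

```python
from collections import Counter


def is_garbage(text: str, min_chars: int = 5, min_alpha_ratio: float = 0.5) -> bool:
    """Hypothetical heuristic: flag samples that are too short or mostly non-letters."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return True
    # Count letters and spaces; markup, hex dumps, and symbols lower the ratio.
    alpha = sum(ch.isalpha() or ch.isspace() for ch in stripped)
    return alpha / len(stripped) < min_alpha_ratio


def dataset_stats(samples, top_k=5):
    """Return (garbage fraction, top_k most common whitespace tokens)."""
    samples = list(samples)
    garbage = sum(is_garbage(s) for s in samples)
    # Whitespace splitting is an assumption; swap in the real tokenizer here.
    tokens = Counter(tok for s in samples for tok in s.split())
    fraction = garbage / len(samples) if samples else 0.0
    return fraction, tokens.most_common(top_k)


if __name__ == "__main__":
    toy = [
        "the cat sat on the mat",
        "the dog chased the cat",
        "<<<>>> ### 0x00 0x00 ###",
        "",
        "a short but valid sentence about the data",
    ]
    fraction, top = dataset_stats(toy, top_k=3)
    print(f"garbage fraction: {fraction:.0%}")
    print("top tokens:", top)
```

The point is not this particular heuristic but that the check is cheap: a pass like this over a sample of the dataset takes minutes and tells you whether the inputs are sane before anyone touches the model.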