by vl
1562 days ago
In my experience, ironically, most model gains come from understanding and fixing data pipelines, datasets, tokenizers, and vocabs. It’s surprising how much time a team can spend on a complex model while nobody bothers to run stats and see that 20% of the samples are garbage or that the top tokens are nonsense. In that sense, a lot of “ML” work is really data analytics or code debugging. I usually say that we should work on products, and do whatever work is required to advance the product at the moment.
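
To make "run stats" concrete, here is a rough sketch of the kind of check I mean. Everything in it is an assumption for illustration: the `dataset_stats` helper is hypothetical, whitespace splitting stands in for whatever tokenizer you actually use, and the "garbage" heuristic (too few tokens, or no letters at all) is just one crude filter among many you might try.

```python
from collections import Counter


def dataset_stats(samples, min_tokens=3, top_k=10):
    """Return (share of suspicious samples, most frequent tokens)."""
    token_counts = Counter()
    garbage = 0
    for text in samples:
        # Assumption: whitespace splitting stands in for the real tokenizer.
        tokens = text.split()
        token_counts.update(tokens)
        # Crude heuristic: flag samples that are very short or contain no letters.
        if len(tokens) < min_tokens or not any(ch.isalpha() for ch in text):
            garbage += 1
    garbage_ratio = garbage / len(samples) if samples else 0.0
    return garbage_ratio, token_counts.most_common(top_k)


# Toy data, made up for this example.
samples = [
    "the model predicts the next token",
    "",
    "12345 67890 !!!",
    "<br> <br> <br> <br>",
    "data pipelines need checks too",
]

ratio, top_tokens = dataset_stats(samples, top_k=3)
print(f"suspicious samples: {ratio:.0%}")
print("top tokens:", top_tokens)
```

On this toy data the most frequent token is leftover HTML, and that sample passes the garbage heuristic because it contains letters. That is the reason to look at both numbers rather than trust a single filter.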