by andai
682 days ago
I'm confused. Are you saying that removing low-quality inputs from training data (or, conversely, adding high-quality ones) doesn't improve a model? Or are you saying that we don't yet have the technology to do this reliably at scale?

We don’t (and by all accounts, no one does) have a way to create this kind of dataset at scale in this kind of complex, user-contributed content environment (specifically npm and other places like it).