by sillysaurusx
856 days ago
I’m extremely skeptical of this approach. Until someone demonstrates it with a model that users actually find useful, I don’t think it can work. It would be nice if it did, but I’ve seen too many nice ideas fall apart in practice to accept this one without some justification. Even if there are papers on the topic, and those papers show the models ranking highly on some eval metrics, the only metric that truly matters is "the user likes the model and it solves their problems."

By the way, on a separate topic: the 90/10 dataset split that you use in all of your examples is fraught with peril in practice. Validation dataset quality turns out to be crucial, and randomly yeeting 10% of your data into the validation set without manual review is a recipe for problems.
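
To make the validation-split point concrete, here is a minimal sketch of a random 90/10 split that does not trust the held-out 10% blindly. The function names (`split_with_review_queue`, `normalize`) and the specific checks are assumptions for illustration, not anything from the article: it flags validation candidates that are empty or that duplicate a training example (or another validation example) after whitespace and case normalization, and sets them aside for a human to look at. It is only a cheap pre-filter and does not replace manual review of the accepted examples.

```python
import random


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def split_with_review_queue(examples, val_fraction=0.1, seed=0):
    """Random split that routes suspicious validation candidates to review.

    Returns (train, accepted_val, needs_review). The checks here are
    illustrative assumptions: empty text and near-exact duplicates.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)

    n_val = max(1, int(len(shuffled) * val_fraction))
    val_candidates, train = shuffled[:n_val], shuffled[n_val:]

    train_keys = {normalize(x) for x in train}
    seen_val_keys = set()
    accepted_val, needs_review = [], []

    for example in val_candidates:
        key = normalize(example)
        # Flag empties, leakage from the training set, and in-set duplicates.
        if not key or key in train_keys or key in seen_val_keys:
            needs_review.append(example)
        else:
            accepted_val.append(example)
            seen_val_keys.add(key)

    return train, accepted_val, needs_review


if __name__ == "__main__":
    data = [f"example {i % 15}" for i in range(40)] + ["", "  "]
    train, val, review = split_with_review_queue(data)
    print(len(train), len(val), len(review))
```

With heavily duplicated data like the toy list above, most of the random 10% ends up in the review queue, which is the failure mode a blind split hides.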