
Fair enough, and I suppose that's why CAPTCHAs and the like make us do silly free labor.

But in general, hilarious labeling errors are already widespread in benchmarks:

Companies also seem rather carefree with their labeling, including even in contexts in which accuracy is paramount:

And things get interesting when you start labeling alleged political bias, for instance:

Thus, in general, I'd appreciate research on the "accuracy" (e.g., inter-annotator agreement) of labels used in the wild.
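
For what it's worth, one common way to put a number on agreement between two labelers is Cohen's kappa, which corrects the raw agreement rate for the agreement you'd expect by chance. A rough sketch in plain Python follows; the function name and the toy labels are made up for illustration and don't come from any real dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: computed from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, so this sketch chooses to report full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels, purely for illustration.
annotator_1 = ["cat", "cat", "dog", "dog", "cat", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog", "cat", "cat"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 means the labelers agree far more than chance would predict, and a value near 0 means their agreement is about what chance alone would give. With more than two labelers you'd want something like Fleiss' kappa or Krippendorff's alpha instead.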