by jonathanlb 1652 days ago
Speaking from my experience working at data labeling companies, the sabotage does occur, but it is not intentionally malicious.

What ends up happening is that some labelers learn what the predetermined questions and answers are and share them with other labelers via Facebook and Discord. That way, the other labelers can stay on the task longer while providing garbage responses to the non-predetermined question/answer pairs.

It's an arms race with labelers on one side, trying to make a quick buck, and data labeling platforms on the other, trying to get quality labeled data.