Are there any examples of this actually working? I keep seeing this fantasy repeated but have not seen a plausible explanation for how they wouldn't be contributing to the pile of negative examples, which are just as valuable if not more.
what about just... becoming mediocre? engineers are already infamously lazy at reviewing PRs. how is Meta incentivizing these Data Labelers to give a shit and actually scrutinize the AI-generated code they're supposed to be reviewing? what's the reward structure? what prevents engineers from flagging minor nitpicks all day while they look at LinkedIn?
Probably forcing them to review each other's work to panopticon "quality," and keeping track of the average throughput per engineer so if people fall behind the taskmasters can pay them a visit.
Poison pilling skills is a thing, though finding evidence for it is difficult given the crux is an absence of information. The baseline instruction and training are given to the model by the expert, but edge cases are willfully neglected. The degree of neglect generally determines how detectable it is, but if all the SMEs are in on it, a lot of those gaps will probably persist. Effectiveness and impact are obviously relative to the system and the edge case. Not particularly different from the fallout previously seen during the offshoring era.