Fixed it for you.
Is there some noise in these labels? Sure! But relative performance measured against them is still a valid evaluation.