Hacker News
by DalekBaldwin 2517 days ago
And due to scale-end effects, you should expect to compute a weaker correlation from a naive calibration analysis. If you are at the zeroth percentile, you can only overestimate your performance; if you are at the hundredth, you can only underestimate it. http://home.cerge-ei.cz/ortmann/TrentoCourse/Juslin_etal_Nai...
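To make the scale-end point concrete, here is a rough sketch (not taken from the linked paper). It assumes a self-estimate is just the true percentile plus symmetric Gaussian noise, clipped to the 0-100 scale. The function names, the noise level of 20 points, and the trial count are all made up for illustration.

```python
import random

def clipped_estimate(true_pct, noise_sd, rng):
    """Hypothetical self-estimate: true percentile plus symmetric noise,
    clipped so it cannot leave the 0-100 scale."""
    guess = true_pct + rng.gauss(0, noise_sd)
    return min(100.0, max(0.0, guess))

def mean_error(true_pct, noise_sd=20.0, trials=20000, seed=0):
    """Average of (estimate - truth) over many simulated people at one percentile."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += clipped_estimate(true_pct, noise_sd, rng) - true_pct
    return total / trials

if __name__ == "__main__":
    # Positive means overestimation on average, negative means underestimation.
    for pct in (0, 25, 50, 75, 100):
        print(pct, round(mean_error(pct), 1))
```

Under these assumptions the noise itself is unbiased, yet the average error comes out positive at the bottom of the scale and negative at the top, purely because of the clipping.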

But interestingly, D-K isn't a symmetric-about-median effect, which is what you'd expect from scale-end effects alone. The D-K finding was that people tend to estimate themselves closer to roughly the 70th percentile than they actually are.
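A toy way to see the difference between "symmetric about the median" and "pulled toward the 70th percentile": assume each estimate moves a fixed fraction of the way from the true percentile toward some anchor. This linear model, the `pull` value of 0.6, and the function names are assumptions for illustration, not anything fitted to the D-K data.

```python
def anchored_estimate(true_pct, anchor, pull=0.6):
    """Hypothetical model: the estimate moves a fraction `pull` of the way
    from the true percentile toward `anchor`."""
    return true_pct + pull * (anchor - true_pct)

def error(true_pct, anchor, pull=0.6):
    """Estimate minus truth; positive means overestimation."""
    return anchored_estimate(true_pct, anchor, pull) - true_pct

if __name__ == "__main__":
    for pct in (10, 30, 50, 70, 90):
        # Anchor at 50: errors mirror each other around the median.
        # Anchor at 70: the crossover point shifts up to the 70th percentile.
        print(pct, error(pct, anchor=50), error(pct, anchor=70))
```

With the anchor at 50, the 10th and 90th percentiles miss by the same amount in opposite directions. With the anchor at 70, the bottom overestimates by far more than the top underestimates, and everyone below the 70th percentile overestimates.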
70% is the bottom of C grade range. (https://en.m.wikipedia.org/wiki/Academic_grading_in_the_Unit...) Honestly I think this can explain a large part of D-K. People just aren't calibrated to imagine getting a 25%, let alone a 0%, even in something they are truly terrible at.
> 70% is the bottom of C grade range

The 70th percentile doesn't generally correspond to a 70% score, and when it does, it's almost certainly not on a grading scale (common, but very far from universal) where 70% is the bottom of the C range. Likewise, the 50th percentile is very rarely failing, though 50% is in (and not even at the top of) the F range on that same scale.

> Honestly I think this can explain a large part of D-K.

It almost certainly doesn't explain any of the high-end part of the effect. An explanation in which the low-end and high-end effects are unrelated coincidences is possible, but it's the kind of explanation that parsimony counsels against in the absence of evidence demanding separate effects.

I just looked at the original paper again and I see that in some of the tests the researchers asked participants to estimate the raw number of questions they got correct, in addition to the percentile relative to their peers. Even in raw numbers the bottom quartile overestimated their ability significantly. So I'm less confident in my theory than I was before.
Where are you getting this from? The actual test scores in the graph in the article don't correspond with the percentiles; I don't think the scales on the x and y axes are the same, which is what you seem to be contending.
In an eating contest, it's rare to finish having eaten less than one bite. It's not insane to imagine that last place is significantly better than 0.
Even for percentiles, it makes sense to calibrate to above average: people are rarely ranked on tasks they're not good at.
But that’s sort of what you’d expect: e.g., most people think they’re an above-average driver but don’t actually think they’re incredibly wicked good. I’d never seen it presented quite this way before, but it makes a lot of sense.
Nobody said it had to be scale-end effects alone. But yes, when I look at that Dunning-Kruger graph, I don't see what the authors of TFA say or what the common misinterpretation is (I agree with the authors about how it's commonly misconstrued). I see bad calibration, but the rank order of everything looks right.