Hacker News
by whatshisface 2386 days ago
The author is arguing that the real signal is zero and the systematic error is large, so you will always converge on a repeatable but useless value. Technically, taking only one sample could get you closer, because there is a 50% chance that the random error goes in the opposite direction to the systematic error. The author is wrong to phrase that as some kind of advantage, though, because the other 50% of the time the random error makes the total error even worse.
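For anyone who wants to poke at this, here is a rough simulation of the setup described above. It is only a sketch under my own assumptions: the true signal is zero, every rater shares the same systematic error `bias`, and the random error is Gaussian with standard deviation `sigma`. The function name and all parameter values are made up for illustration and do not come from the article.

```python
import random
import statistics


def simulate(bias=1.0, sigma=1.0, n_raters=8, trials=20_000, seed=0):
    """Compare one reading against the average of n_raters readings.

    Assumed model (not from the article): reading = 0 + bias + noise,
    where noise is Gaussian with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    opposite = 0
    single_errors = []
    averaged_errors = []
    for _ in range(trials):
        noise = [rng.gauss(0.0, sigma) for _ in range(n_raters)]
        readings = [bias + e for e in noise]
        # Count how often the first rater's random error points the
        # opposite way to the systematic error.
        if noise[0] * bias < 0:
            opposite += 1
        # The true signal is zero, so the absolute error is |reading|.
        single_errors.append(abs(readings[0]))
        averaged_errors.append(abs(statistics.fmean(readings)))
    return {
        "opposite_fraction": opposite / trials,
        "mean_abs_error_single": statistics.fmean(single_errors),
        "mean_abs_error_averaged": statistics.fmean(averaged_errors),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the random error opposes the bias in roughly half the trials, but the single reading still has the larger absolute error on average, and the averaged reading settles close to the bias itself.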
3 comments

They say later that

> When a feedback instrument surveys eight colleagues about your business acumen, your score of 3.79 is far greater a distortion than if it simply surveyed one person about you—the 3.79 number is all noise, no signal.

Which implies to me that they believe there is signal there, but that it goes away when aggregated?

I think by "surveyed" they don't mean "asked one person for a score" but rather "got some overall information from one person, including their qualitative feelings and perceptions". There is signal in those, as they discuss elsewhere in the article, but the quantitative rating allegedly has no value even when averaged. That's the charitable reading, anyway.
Yeah, I'd like to see what statistical theory they are using here. I don't think it's sound. It's unfortunate since I think the article is otherwise quite good.
If the methodology is unsound, there can be negative value and outcomes. E.g., active trading is a consistent loser for most people. Methodology can lead people astray.
? I still don't get it.
Averaging values with random divergence from the truth is useful. Averaging values with random divergence from nonsense is not.
Specifically, I don't get how one random value is supposed to be more accurate than many random values averaged.
I think that's more of a relative impact. When you have just one measurement, you know it's not particularly reliable. When you have a bunch, we are conditioned to think it's more reliable.

So in the latter case, the distance between its reliability and its perceived reliability is greater than in the former case.
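One way to make that gap concrete is to compare the standard error you would report from the sample against the error you actually have. This is a sketch under assumptions of my own: true value zero, a shared bias, Gaussian noise, and the usual standard error of the mean standing in for "perceived reliability". The function name is hypothetical.

```python
import math
import random
import statistics


def perceived_vs_actual(n_raters, bias=1.0, sigma=1.0, seed=0):
    """Return (reported standard error, actual absolute error).

    Assumed model: reading = 0 + bias + Gaussian noise.
    n_raters must be at least 2 so a sample standard deviation exists.
    """
    rng = random.Random(seed)
    readings = [bias + rng.gauss(0.0, sigma) for _ in range(n_raters)]
    mean = statistics.fmean(readings)
    # What the numbers appear to promise: the standard error of the mean.
    reported_sem = statistics.stdev(readings) / math.sqrt(n_raters)
    # What you actually get: distance from the true value, which is zero.
    actual_error = abs(mean)
    return reported_sem, actual_error


if __name__ == "__main__":
    for n in (2, 8, 100, 1000):
        sem, err = perceived_vs_actual(n)
        print(f"n={n:5d}  reported_sem={sem:.3f}  actual_error={err:.3f}")
```

The reported standard error shrinks as raters are added while the actual error stays near the bias, so the distance between the two grows with the sample size.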

I agree with you that it has to do with the perception of reliability. However, the article seems to state that there is an actually greater error with more inputs. That's what I don't understand.

"We cannot remove the error by adding more data inputs and averaging them out, and doing that actually makes the error bigger."

I don't see how it "makes the error bigger". Maybe I'm being too literal and the writer is truly referring to the perception of the results carrying more weight, and therefore having a "bigger error".
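For what it's worth, the textbook decomposition backs up the literal reading here. As a sketch, assume each rating is $x_i = \mu + b + \varepsilon_i$, where $\mu$ is the true value, $b$ is a bias shared by every rater, and the $\varepsilon_i$ are independent with mean zero and variance $\sigma^2$ (these symbols are mine, not the article's). Then:

```latex
\bar{x}_n - \mu = b + \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i ,
\qquad
\mathbb{E}\!\left[(\bar{x}_n - \mu)^2\right] = b^2 + \frac{\sigma^2}{n}.
```

Under those assumptions the expected squared error never grows with $n$. It falls toward the floor $b^2$ and stays there, so averaging does not remove the systematic part, but it does not make the error bigger either. If the raters' errors were correlated, or the bias changed with the number of raters, this would no longer hold as written.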