


	by borramakot 2386 days ago
	> Adding up all the inaccurate redness ratings—“gray,” “pretty gray,” “whitish gray,” “muddy brown,” and so on—and averaging them leads us further away both from learning anything reliable about the individuals’ personal experiences of the rose and from the actual truth of how red our rose really is.

	I don't understand this comment. How does averaging a noisy signal, even a systematically noisy one, result in something that is noisier than any individual signal? I would have assumed the average would converge on (real signal + systematic error).

2 comments

whatshisface 2386 days ago

The author is arguing that the real signal is zero and the systematic error is large, so you will always end up converging on a repeatable but useless value. Technically, taking only one sample could have gotten you closer, because there is a 50% chance that the random error would have gone in the opposite direction to the systematic error. The author is wrong to phrase that like it's some kind of advantage, though, because the other 50% of the time the random error will make the total error even worse.
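
A minimal simulation sketch of the point above, under assumptions that are not in the article: every rater shares the same systematic error (`bias`), each adds independent Gaussian noise (`noise_sd`), and the true value (`truth`) is known. The function name, parameters, and default values are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def simulate(truth=0.0, bias=2.0, noise_sd=1.0, n_raters=8, trials=20_000, seed=0):
    """Compare one rating against the average of n_raters ratings.

    Assumed model (illustrative only): rating = truth + bias + Gaussian noise.
    """
    rng = random.Random(seed)
    single_sq_errors = []   # squared error of the first rater alone
    average_sq_errors = []  # squared error of the averaged rating
    averages = []           # the averaged ratings themselves
    single_closer = 0       # trials where one rating beat the average

    for _ in range(trials):
        ratings = [truth + bias + rng.gauss(0.0, noise_sd) for _ in range(n_raters)]
        average = statistics.fmean(ratings)
        single_err = abs(ratings[0] - truth)
        average_err = abs(average - truth)

        single_sq_errors.append(single_err ** 2)
        average_sq_errors.append(average_err ** 2)
        averages.append(average)
        if single_err < average_err:
            single_closer += 1

    return {
        "mean_of_averages": statistics.fmean(averages),
        "mse_single": statistics.fmean(single_sq_errors),
        "mse_average": statistics.fmean(average_sq_errors),
        "share_single_closer": single_closer / trials,
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the averaged rating settles near `truth + bias` rather than `truth`, a single rating lands closer to the truth in roughly half of the trials, and the mean squared error of the average is still lower than that of a single rating.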


borramakot 2386 days ago

They say later that

> When a feedback instrument surveys eight colleagues about your business acumen, your score of 3.79 is far greater a distortion than if it simply surveyed one person about you—the 3.79 number is all noise, no signal.

Which implies to me that they believe there is signal there, but that it goes away when aggregated?


itsdrewmiller 2386 days ago

I think by "surveyed" they don't mean "asked one person for a score" but rather "got some overall information from one person, including their qualitative feelings and perceptions." There is signal in those, as they discuss elsewhere in the article, but the quantitative rating allegedly has no value even when averaged. That's the charitable reading, anyway.


zwieback 2386 days ago

Yeah, I'd like to see what statistical theory they are using here. I don't think it's sound. It's unfortunate since I think the article is otherwise quite good.


_y5hn 2386 days ago

If the methodology is unsound, there can be negative value and outcomes. E.g., active trading is a consistent loser for most people. Methodology can lead people astray.


elwell 2386 days ago

? I still don't get it.


tunesmith 2386 days ago

Averaging values with random divergence from the truth is useful. Averaging values with random divergence from nonsense is not.


elwell 2385 days ago

Specifically, I don't get how one random value is supposed to be more accurate than many random values averaged.


tunesmith 2385 days ago

I think that's more of a relative impact. When you have just one measurement, you know it's not particularly reliable. When you have a bunch, we are conditioned to think it's more reliable.

So in the latter case, the distance between its reliability and its perceived reliability is greater than in the former case.


elwell 2385 days ago

I agree with you that it has to do with the perception of reliability. However, the article seems to state that there is an actual greater error with more inputs. That's what I don't understand.

"We cannot remove the error by adding more data inputs and averaging them out, and doing that actually makes the error bigger."

I don't see how it "makes the error bigger". Maybe I'm being too literal and the writer is truly referring to the perception of the results carrying more weight, and therefore having a "bigger error".
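
To make the literal reading concrete, here is a sketch under assumptions that are not in the article: each of the $n$ ratings is $x_i = \mu + b + \varepsilon_i$, where $\mu$ is the true value, $b$ is a systematic error shared by all raters, and the $\varepsilon_i$ are independent with mean zero and variance $\sigma^2$. The symbols are introduced here only for illustration.

$$
\bar{x} - \mu = b + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i
\quad\Longrightarrow\quad
\mathbb{E}\big[(\bar{x} - \mu)^2\big] = b^2 + \frac{\sigma^2}{n}
\;\le\; b^2 + \sigma^2 = \mathbb{E}\big[(x_1 - \mu)^2\big].
$$

Under this model, averaging shrinks only the random part of the error and leaves $b$ untouched, so the expected squared error of the average is never larger than that of a single rating. It would take a different model (for example, one where the target is each individual's own experience rather than a shared $\mu$) for the average to be literally worse.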


ummonk 2386 days ago

An individual rating can tell you something about the individual’s experience.

Averaging the ratings of multiple people tells you nothing since it washes out the individual experiences. I.e. individual samples hold meaning about the samples themselves but aggregation of samples is just noise.
