by pbhjpbhj 5765 days ago
>The plural of 'anecdote' is not 'data'.

I've seen this a few times. I assume you mean anecdotes told by liars? Surely a plurality of true statements constitutes data? The strength of that data for interpretation or prediction is limited by the number of data points, for sure, but I don't really get why this aphorism is supposed to be true.

2 comments

It is, of course, literally true that data is composed of a collection of data points -- you could call them anecdotes if you wanted to. But the point of the aphorism is that you can't go the other direction.

Anecdotes are isolated instances, things people remember and tell because they are interesting. They lack any sort of statistical control or rigorous documentation. If you have a bunch of anecdotes about socially awkward homeschoolers, you don't have any actual data on the incidence of social awkwardness among homeschoolers. You just have a bunch of anecdotes.

This is because of the selection process. If you are trying to make a judgment about an entire population based on a small sample, the chance of getting a non-representative sample is already high. If the sample is selected on the basis of what sticks out in memory, the chance of it being representative is almost nil. Though an interesting anecdote or two can be considered data, anecdotes are notoriously unreliable as a basis for statistical judgments.
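
To make the selection effect concrete, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: the 10% true rate, the idea that an awkward case is 20 times more likely to be retold as an anecdote, and the function and parameter names. It compares a uniform random sample against a "what sticks out in memory" sample drawn from the same population.

```python
import random


def simulate(true_rate=0.10, memorability=20.0, population_size=20_000,
             sample_size=50, seed=0):
    """Compare a random sample with a memorability-biased 'anecdote' sample.

    All numbers are hypothetical; they only illustrate the selection effect.
    """
    rng = random.Random(seed)

    # Each person is True if they have the trait of interest, else False.
    population = [rng.random() < true_rate for _ in range(population_size)]

    # Unbiased: every person is equally likely to end up in the sample.
    random_sample = rng.sample(population, sample_size)

    # Biased: people with the trait are `memorability` times more likely
    # to be remembered and retold, so they dominate the anecdotes.
    weights = [memorability if has_trait else 1.0 for has_trait in population]
    anecdotes = rng.choices(population, weights=weights, k=sample_size)

    return {
        "true_rate": sum(population) / population_size,
        "random_sample_rate": sum(random_sample) / sample_size,
        "anecdote_rate": sum(anecdotes) / sample_size,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

Under these assumed numbers the random sample should land near the true rate, while the anecdote sample overstates it badly, even though every individual anecdote is true.
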

I think any reasonable approach recognises that anecdotal evidence needs weighing appropriately. In the sibling comment, for example, if you merely wanted to establish whether there are any instances of home-educated children with "social awkwardness" (or whether, for example, this is a product of a particular method of schooling), then you've got your tentative result off the bat. The fact that there are some such results allows you to hone your null hypothesis and tune your approach to getting statistically significant data.

Anyway, I digress, anon.