by imgabe 3688 days ago
> Even today it's no problem to analyze the Facebook picture of some random person and calculate a chance of that person being an alcoholic in X years based on the number of party pictures they share.

The problem is not that this calculation exists, but that people so easily misinterpret it. It doesn't mean "a person with these photos has a 65% chance of being an alcoholic in 5 years". What it REALLY means is "65% of the people with these photos became alcoholics within 5 years".

That is a HUGE difference. For the remaining 35% there is perhaps some additional factor not included in the existence of the photos that guarantees there is a 0% chance of them ever becoming an alcoholic. When you try to apply general statistics about a large population to a single individual, it's not as simple as just saying "65% of the time this person will become an alcoholic". That's ridiculous anyway, because the person will only live one life, one time.
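To make the distinction concrete, here is a minimal sketch in Python. The population is entirely hypothetical: it assumes the extreme case described above, where a hidden factor fixes each person's outcome, so 650 of 1,000 people have a personal risk of 100% and the other 350 have a personal risk of 0%. The names `individual_risks` and `group_rate` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical population of 1,000 people who all share the same kind of photos.
# Assumption: a hidden factor (not visible in the photos) fully decides the outcome,
# so each person's individual risk is either 1 (certain) or 0 (impossible).
individual_risks = [Fraction(1)] * 650 + [Fraction(0)] * 350

# The statistic the photo analysis reports is the group rate.
group_rate = sum(individual_risks) / len(individual_risks)

# How many individuals actually have a personal risk equal to the group rate?
people_at_group_rate = sum(1 for risk in individual_risks if risk == group_rate)

print(float(group_rate))      # 0.65
print(people_at_group_rate)   # 0
```

The group rate comes out at 65%, yet under these assumptions no single person in the group has a 65% risk. Real populations presumably sit somewhere between this extreme and one where everybody has the same risk, and the photos alone can't tell you which.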

2 comments

The calculation is useful from a marketing perspective. If I decide to advertise alcoholism treatments to these people in five years' time, I'll have a lot more success than if I'd just picked people at random to advertise to.
..and in the meantime, you can advertise alcohol to these people. It's a win-win! </evil>
This distinction seems relevant only if you target one person, which seems pretty rare. In reality you're going to use this metric on a pool of people, in which case 65% of them would become alcoholics within 5 years.
It becomes a problem when it's used for profiling, such as "Don't hire that person, there's a 65% chance they'll be an alcoholic" or "send a drone to kill that person, there's an 80% chance they're a terrorist".
I agree that it's a problem, but your distinction still seems generally pointless. If you have a company policy to not hire alcoholics and you reject 1,000 people based upon this metric, then 65% of them would have been alcoholics.
It is a problem for the 350 non-alcoholics who are unable to get a job because the hiring manager doesn't understand statistics.
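The arithmetic behind the 350 figure, as a quick Python sketch. It takes the thread's numbers at face value (1,000 rejected applicants, a 65% rate) and assumes the metric is perfectly calibrated; the variable names are made up for illustration.

```python
# Numbers taken from the thread; the calibration of the metric is an assumption.
rejected_applicants = 1000
rate_percent = 65  # share of flagged people who do become alcoholics

# Integer arithmetic keeps the head counts exact.
would_be_alcoholics = rejected_applicants * rate_percent // 100
wrongly_rejected = rejected_applicants - would_be_alcoholics

print(would_be_alcoholics)  # 650
print(wrongly_rejected)     # 350
```

So the policy turns away 650 people it meant to turn away, plus 350 who never would have been alcoholics at all.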