by tmarthal 4164 days ago
Whenever you say "that made no sense", I think that you are using too much bias and not giving enough credit to what the data is telling you.

If you look at the most "controversial" data science paper from 2013, a study that correlated intelligence with Liking the Facebook pages "Curly Fries" and "Thunderstorms" (here is a summary: http://www.wired.com/2013/03/facebook-like-research/), there were a lot of critics saying that there was no causation, that the correlation was unfounded, etc.

Of course, you would say the study "makes no sense". Intelligence can't be predicted by Facebook Likes. There is no correlation there, etc. But why not? If you read the paper (http://www.pnas.org/content/110/15/5802.full.pdf), their logic is sound. Are the marketing campaigns that the company bought based on the TV Stand<>DVD Player connection any different from other marketing campaigns? Facebook does all of its ad display based on data analysis similar to the above, and it seems to be working for them.

Note: There is the not-so-hidden machine learning feedback loop now (explained better here: http://www.john-foreman.com/blog/the-perilous-world-of-machi...), where people Like the 'Curly Fries' and 'Thunderstorms' pages because of the research.

1 comment

> Whenever you say "that made no sense", I think that you are using too much bias and not giving enough credit to what the data is telling you.

What? If a data scientist sees something that seems illogical, there is no reason not to investigate it and see if he/she can find a more logical explanation. Sure, if the effect seems real but unexplained, you can accept it and use it, but advocating a kind of big data mysticism ("don't investigate, accept") seems to be buying into the senseless hype. And if you read the post, you'll notice the parent actually discovered that the association was just an artifact of an easily explained association.

And, no, there's not much reason for companies to advertise just a TV stand and a DVD player. Common sense tells one what the data actually said: those two items, by themselves, aren't and weren't what many people were dreaming about.