by jondoh 5951 days ago
It's nice to see somebody making an effective argument against those high-and-mighty OkCupid data lords. I've read a number of their posts, and they are actually quite careless in making assertions as though they are experts.

Anyone who works with data knows that subtle changes in how you define a metric can lead to drastically different findings. Ever heard the saying "There are lies, damned lies, and statistics"? In real science, you need to do everything possible to try to prove yourself wrong, and fail. They hardly seem to do anything so rigorous. They choose one metric, see what the outcome is (gasp, something sensational!), and write a post about it.

They lost me when I noticed they were mucking around with chart axes to make effects look more significant than they are (http://news.ycombinator.com/item?id=1065203), surely the oldest trick in the book. Besides that, the posts can only count as pseudo-scientific in the absence of the data being available for review. But such criticisms do kind of miss the point, which is that this has been very effective marketing.
What's wrong with 'mucking around' with chart axes? Scientists do that all the time, because effects are often hardly visible with 'normal' axes and you have something you wish to make clearly stand out. Skipping part of an axis and changing the scales is a normal, even required, thing to do.
You should explicitly note it when you do, though.

I've actually really liked the posts from OkCupid; they are very interesting. But this article pokes a lot of holes in their theories: they presume that, because of their size, they are dealing with a cross section of society, and from reading this it seems entirely likely they are mistaken.

Really? Because it's also a notorious way to distort data. It sounds like you have more experience with this than I do, but I'm puzzled by the contradiction here. When is it a normal, even required scientific technique and when is it the oldest "how to lie with statistics" trick in the book?
OK, that's an interesting question. I guess it depends on the audience. When your audience is scientifically minded, they will know how to interpret your axes and understand why you made a certain choice. They might criticize you for it, but you know such a choice will be scrutinized, so you won't try to deceive them. When your audience consists of less numerically literate folks, you have to be careful with your axes, as you can make something appear more interesting than it actually is. If I take a graph from a scientific presentation and use it in an article for lay people, I'm not trying to trick them, but it may appear as such. I guess it's a fine line between making the interesting bits stand out and making bits stand out to make them seem interesting.
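To put a rough number on the axis point, here is a minimal Python sketch of how much a truncated y-axis inflates the visual difference between two bars. The function name `apparent_ratio` and the values 50 and 55 are made up for illustration; they are not taken from any OkCupid chart.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar heights for values a and b when the
    y-axis starts at `baseline` instead of zero (hypothetical helper)."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (b - baseline) / (a - baseline)


# Hypothetical values: two groups scoring 50 and 55.
true_ratio = apparent_ratio(50, 55)                # axis starts at 0
drawn_ratio = apparent_ratio(50, 55, baseline=45)  # axis starts at 45

# How many times larger the drawn difference looks than the real one.
exaggeration = drawn_ratio / true_ratio

print(true_ratio, drawn_ratio, exaggeration)
```

With the axis starting at 45, the second bar is drawn twice as tall as the first even though the underlying value is only 10% larger. Whether that counts as emphasis or distortion depends, as above, on whether the reader notices the baseline.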