by mytype 5807 days ago
Our sampling rigor cannot be compared to political polls, it's true. However, we do weight the sample, based not on guesses but on official data about the gender, age, and personality distribution in the US. We did not use geographic weights because our respondents were more or less properly distributed.
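As a rough illustration of the kind of weighting described above, here is a minimal Python sketch of post-stratification, where each respondent's weight is the population share of their group divided by that group's share of the sample. The group labels, population shares, and answers are made-up placeholders, not MyType's data, and this is not necessarily the exact method used in the study.

```python
from collections import Counter

def poststratify(cells, population_share):
    """Return one weight per respondent: population share / sample share of their cell."""
    n = len(cells)
    sample_share = {c: k / n for c, k in Counter(cells).items()}
    return [population_share[c] / sample_share[c] for c in cells]

def weighted_rate(answers, weights):
    """Weighted fraction of respondents whose answer is True."""
    return sum(w for a, w in zip(answers, weights) if a) / sum(weights)

# Hypothetical sample: women are over-represented (3 of 4) versus a 50/50 population.
cells = ["F", "F", "F", "M"]
plans_to_buy = [False, False, True, True]

weights = poststratify(cells, {"F": 0.5, "M": 0.5})
print(weighted_rate(plans_to_buy, weights))  # raw rate is 0.5; the weighted rate is higher
```

Weighting like this only corrects for the variables you weight on. It cannot adjust for differences between respondents and non-respondents on traits that were never measured.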

Any poll can be attacked. Even political polls are subject to bias in who picks up the phone, who responds to the surveyor, etc. This bias is partially driven by personality type, which we are able to nearly eliminate through weighting by personality traits. How many polls can do that?

Most polling is much less random than political polls, and yet the results are still treated as worthwhile. Some firms put up surveys on sites (site audience bias), or have carefully selected paid volunteers (bias toward people who are willing to do that), etc.

The fact that our results are corroborated by Forrester and Sybase, to the extent that we can be compared, is very satisfying to me personally. Sure, we may be a few percentage points off on some things, but are the overall results invalid? No. I challenge anyone to point to any result in the study and give a reasonable explanation why that result is likely way off. The only things that are suspect to me are the major outliers, subsets like Native Americans for which we did not have much data. I'll give you that. Anything else? Let me know.

2 comments

There is an element of non-randomness in drawing from Facebook users although I'm not sure who it favors. (I'm one of the minority not being on there. I don't text people either)

Looks like you did a pretty good job with this. It's a thought-provoking read for sure. It occurred to me that there may be an issue with the "lust" category. It lumps those who lust for an Escalade in with those who lust for a human - very different things. I suspect that geeks feeling plenty of lust for humans may be less likely to regard it as a sin and might be less likely to respond to that choice.

It's interesting that the iPad demographic seems to lean away from slender and no children (which probably includes most gays?). If I had to guess, I'd say that group leans more than average towards the Mac. It would have been interesting to see sexual orientation and OSes used/avoided or primary OS in this. I suppose it might have generated a number of emotional responses though LOL

"I challenge anyone to point to any result in the study and give a reasonable explanation why that result is likely way off."

plan to buy: forrester 3.8%
plan to buy: you 2.0%

so your results differ markedly from the forrester report itself?

Your results are different (and likely differ from reality) because you did not use a random sample; you used a self-selected sample of people who belong to Facebook and wanted to respond to a poll. That is simply a fact.

Having taken a non-random sample you then projected your own opinions into all your conclusions. It's an amusing post, but not really an interesting one.

There are two explanations for that difference:

1) Forrester's 3.8% was collected in June 2010, whereas our data was collected from March through May. Clearly the iPad is gaining momentum; as time goes on, more people are planning to buy one (at least for now).

2) More importantly, Forrester said that "no time frame was specified in the survey question". So they asked people who don't own an iPad if they intend to buy one. That's fairly different from our question, which allowed the respondent to choose from:

1) plan to buy one
2) want to play with one first
3) will wait for later versions
4) waiting for the consensus opinion

Clearly, with the variety of options, some people who might have simply answered "yes" to the "intend to buy" Forrester question would pick 2, 3 or 4 in our question.

Does that settle that? And by the way, these are people who wanted to take a personality quiz, not answer a poll. Their motivation had nothing to do with the iPad; that question was randomly inserted into the personality quiz.

While you seem to be "amused" by a blogger who in your mind completely makes stuff up, I'm amused by a commenter who after 5 minutes of review feels confident enough to slam a study that took multiple people dozens of hours to complete.

We're not publishing this in a science journal. That's not what we're going for. It's not complete BS either though. We do normalization, we have lots of data, and our psychological measures are based on the best contemporary research. The results are worthwhile. I'll say again: I challenge anyone to point out a major flaw in our data (not my interpretation of it).

So, in summary, you asked different questions during a different time period and came up with different responses, but still feel happy claiming that the Forrester report somehow validates your own results? Why?

"these are people who wanted to take a personality quiz"

that is exactly my point. what subset of iPad users want to take a personality quiz? there is a clear self-selection bias there that you are, for some reason I don't understand, entirely ignoring.

"I'm amused by a commenter who after 5 minutes of review feels confident enough to slam a study that took multiple people dozens of hours to complete"

mytype, I'm not slamming it, it's amusing, I'm just pointing out the obvious. It's not science you were engaged in, it's opinionated blogging. Your blog post would have lost nothing if you had just skipped the dozens of hours of work to implement the poll.

It's not a useful poll in any sense of the word.

It's a non-random poll of self-selected Facebook users and gives your post about the same degree of additional validity as the personal anecdote below regarding the people I know who own iPads gives my posting.

"It's not complete BS either though"

in what sense of the word?

from a scientific POV it is a poll from a totally non-random sample with entirely unreliable results, into which you manage to project conclusions that happen to suit you.

That is not intended as a slam, it is simply a fact.

If you are unhappy with that fact, you should try harder next time to use a decent user sample.

or phrase things differently. Although the poll doesn't talk about iPad users as a group (not even close), it does give a clear result of what a subset of iPad owners, who belong to Facebook and chose to take a personality quiz, might be intending.

Yes, it's a biased sample. I've already admitted that. My point is that it's not egregiously more biased than any other sample from academia or professional surveyors, and in many cases it's less biased.

Do you reject most academic psychology research because it is based on students at the college of the researcher? Certainly that's a more biased sample.

Do you reject political polls because they're based on who answers calls from random phone numbers and then does not hang up once they realize it's a pollster?

Do you reject most commercial research based on paid volunteers or visitors to sites with much smaller and more biased audiences than Facebook?

I state upfront in the article that this is based on MyType users who are on Facebook. What more do you want? If you want no bias, just do math and don't believe any data based on people, written by people, spoken by people, anything having to do with people.

The bottom line is, this is much more rigorous than much of the crap blogs and media report on. I'll take a random example that I googled for the iPad: http://techcrunch.com/2010/04/06/ipad-sentiment-analysis/. "87% of tweets indicate intent to purchase the iPad". Give me a break. Talk about bias. The sampling errors there are horrific.

I'm just trying to maintain a reasonable perspective on MyType's data, not hide any facts about the shortcomings of it. There are shortcomings, they're just not so bad as to make the results "entirely unreliable".

"87% of tweets indicate intent to purchase the iPad"

oddly enough, I have less of a problem with that article than I do with your post.

They clearly acknowledge all the limitations of the data right up front.

I can read that article and understand within the first 2-3 sentences that they are playing a game of mental masturbation, and then grin at the conclusions.

It's clearly a pointless piece of puff, and perfectly enjoyable as such.

My problem with your blog post arises because it is inviting me to take it more seriously than that.

You state upfront that it is based on MyType users who are on Facebook and who participated in a personality quiz.

The question you never speak to, and need to answer, is why on earth do you believe that a narrow sample like that can reasonably be used to draw conclusions about the broader set of iPad users?

do you fully intend that the blog post be a pointless piece of puffery similar to the techcrunch article? in that case, make that explicit.

do you actually believe that you can, using the statistics you have available, speak usefully about the broader set of iPad owners? explain why, giving your confidence level and other assumptions you have made.
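For reference, the kind of confidence statement being asked for here usually takes the form of a margin of error on a reported proportion. A minimal Python sketch, assuming a simple random sample (which is exactly the assumption in dispute in this thread) and a hypothetical sample size, since the thread doesn't give one:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence. Only meaningful if the
    n respondents are a simple random sample of the population of interest.
    """
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.02 is the "plan to buy" figure quoted in the thread;
# n = 10_000 is a made-up sample size used purely for illustration.
moe = margin_of_error(0.02, 10_000)
print(f"2.0% +/- {moe * 100:.2f} points")
```

A number like this covers random sampling error only. It says nothing about how far a self-selected sample might sit from the wider population.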

If you want me to take it seriously, you need to take it seriously.

Bias in sample data is unavoidable, but the bias should be clearly called out before, during and after the conclusions to ensure that the context is not missed.

and yes, I do reject any poll that does not take the idea of sample context and data bias seriously, regardless of its source.

If you do not clearly acknowledge the limitations of the data you have, you might just as well spend your time making numbers up.

"I challenge anyone to point out a major flaw in our data"

it's based on a non-random set of self-selected participants who happen to use Facebook.

That is a major flaw in your data.

It's useless to draw conclusions from it, unless you are specifically interested in drawing conclusions about iPad users who have a Facebook account and like to fill in personality quizzes.

Having said that, if that small subgroup of iPad users is, in fact, the group that you wish to discuss then go for it, but please be clear about that at the top of your post.