Hacker News
by o09rdk 2382 days ago
There are whole fields of research devoted to the questions you're raising. As such, it's hard to reply with anything that would do justice to them. This isn't to say your questions aren't important, just that your lack of answers reflects your ignorance more so than that of the researchers. I say this not antagonistically but to suggest that it's important to understand that what you see is not always all there is to say.

It is true that these are self-report questionnaires, but as such they are small samples of behaviors of the people in question. Samples of how they perceive themselves, how they think about life, how they think about others, and what they value.

The Big Five, and the measures used in studies such as this, have been validated (in the sense that the ratings have been associated concurrently and predictively) over decades in many ways, with regard to daily reports of behavior, emotion, and life events, diagnoses, work ratings, performance on tests, ratings by peers and colleagues, ratings by strangers, just about everything you can imagine. These self-reports aren't perfect, but they do provide a fuzzy snapshot of someone at a given moment in time. Yes, it would be better to obtain all sorts of other measures of behavior, but they would be too expensive to obtain on large enough samples to be representative.

A major paradox in understanding human behavioral differences is that the more specific and "real world" you get, the less and less they generalize. That is, you can get a very concrete measure of a real-world behavior, but it ceases to be representative of that person across a large number of contexts and situations. Say you want to measure theft, for example. Do you set up a honeypot? Is that representative of that person? Do you use police reports or records? Is that representative? It turns out self-report on online questionnaires is a very good measure of things like this because people are less self-conscious, and report things that don't go on the official record.

Faking is also controversial in this area. You're right to bring it up as an issue, but to understand the research on it, it's important to think about why someone would fake. That is, what's the motivation for large proportions of people to systematically fake in one direction? And if they do go to the trouble of doing that, what's "real" and what's "fake"? That is, let's say people make themselves look more dominant than they really are -- what does it mean if one person does that and another does not? It turns out that the person who wants to make themselves look more dominant often is more dominant, all other things equal, because it means they value that.

Also, strangely enough, it turns out that people who are callous and aggressive don't really care about that, especially on online questionnaires, because they are callous and aggressive.

This has all been very thoroughly researched and it turns out to be much more complicated than it seems at first glance. It doesn't mean things can't be better, but it does mean that over very large samples of persons answering questions on a low-stakes questionnaire (in the sense there aren't real consequences to them answering one way or another), a lot of these things average out. It's not the end of the story, but it's not something to be dismissed either.

In the end, questions of sex differences in behavior are about sex differences in behavior. And that's what this research addresses.

2 comments

If there are all these ways to verify that the answers to questionnaires are accurate you'd think those ways would have been used instead of questionnaires as proof in this highly controversial and inflammatory subject. Extraordinary claims require extraordinary evidence, don't they?
> Extraordinary claims require extraordinary evidence, don't they?

But these claims aren't extraordinary in this scientific field.

They may of course seem extraordinary to those not familiar with the science.

Thank you for your patient and civil answer.

It's getting past my bed time and your comment deserves a more thorough answer that I'll try to write tomorrow, but for the time being this is what strikes me the most about your reply:

>> Also, strangely enough, it turns out that people who are callous and aggressive don't really care about that, especially on online questionnaires, because they are callous and aggressive.

How do you know that someone who looks callous and aggressive on online questionnaires is actually callous and aggressive? The obvious answer seems to be that you know because you've given them another questionnaire separately. Is that the case?

I'm not trying to catch you out, so I'll spell it out: if that is the case then I don't see how you can ever know that someone is callous and aggressive in any objective sense of the word. Like I say in another comment, that would be "questionnaires all the way down". This is a really strong signal that I get from discussions like this and it makes me very suspicious of assurances that it's all been studied and it's all based on solid evidence.

I mean, I'm sorry, I don't want to sound like a square but "how [people] perceive themselves" is exactly the opposite of what I'd think of as an objective measure of how they really are. For example- I perceive myself as pretty (I like myself, that is) but I am not always perceived as pretty by others. What value is there in asking me how pretty I am?

Edit: I get that some of your comment addresses this. But it still seems to me like the solution is to try to second-guess the participant. That also doesn't sound like it should make for objective observations.

> How do you know that someone who looks callous and aggressive on online questionnaires is actually callous and aggressive?

You don't, but it's also not necessary. It's impossible to objectively assess someone's subjective experience; the best we can do is look at groups of people and attempt to find reliable indicators.

The point is that some people will over-emphasize any given trait, and others will under-emphasize it, so on average it evens out.
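
A minimal sketch of that averaging argument, in Python. Everything here is an assumption made for illustration: the function name `simulate_mean_error`, the 1-to-5-style trait scale, and the idea that each person's self-report is their true score plus independent, symmetric noise. It only shows the arithmetic of the claim. The `bias` parameter shows the case that does not even out, where everyone shifts their answers in the same direction.

```python
import random
import statistics


def simulate_mean_error(n_people, bias=0.0, noise_sd=1.0, seed=0):
    """Return (mean reported score) - (mean true score) for a simulated sample.

    Assumption: reported = true + bias + symmetric random noise.
    `bias` is a shared shift applied to everyone; `noise_sd` controls how much
    individuals over- or under-state the trait.
    """
    rng = random.Random(seed)
    # Hypothetical "true" trait levels, centered on 3.0
    true_scores = [rng.gauss(3.0, 0.5) for _ in range(n_people)]
    # Each person over- or under-reports by a random amount, plus any shared bias
    reported = [t + bias + rng.gauss(0.0, noise_sd) for t in true_scores]
    return statistics.fmean(reported) - statistics.fmean(true_scores)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, simulate_mean_error(n))
    # A shared bias survives averaging, however large the sample
    print("biased:", simulate_mean_error(100_000, bias=0.5))
```

With symmetric noise, the gap between the reported and true group means shrinks as the sample grows. With a shared bias, the gap stays near the bias regardless of sample size, which is why the question of systematic faking in one direction matters.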

Think of color perception for a similar conundrum. How can you be sure that the red you see is the same as everybody else is seeing?

>> How can you be sure that the red you see is the same as everybody else is seeing?

I can't, but my understanding is that if we all agree to call a certain frequency of visible light "red", the frequency won't change because some people perceive it in a different way than others. Neither will measuring the frequency depend on how people perceive it.

That seems to me to be a more consistent definition of "red" than the definitions of personality traits that are discussed here.

A bit more about your comment as promised.

>> There are whole fields of research devoted to the questions you're raising. As such, it's hard to reply with anything that would do justice to them. This isn't to say your questions aren't important, just that your lack of answers reflects your ignorance more so than that of the researchers. I say this not antagonistically but to suggest that it's important to understand that what you see is not always all there is to say.

Another comment brought up the term "construct validity" and it seems to match my concerns exactly. I am glad there is debate on that.

I study for a PhD in AI and I have similar concerns about research in my field. For instance, in AI, research often claims to have modelled human abilities such as "reasoning", "emotion" or "intuition". I'm personally uncomfortable even with well-established terms like "learning" (as in "machine learning") and "vision" (as in "machine vision")- because we don't really know what it means to "see" or to "learn" in human terms so we shouldn't be hasty to apply that terminology to machines.

This tendency has been criticised from the early days of the field but we seem to have regressed in recent years, with the success of machine learning for object classification in images and speech processing taking the field by storm and leaving no room for careful study anymore, it seems. But that's a conversation for another thread.

In AI, I'm worried that calling what algorithms do "attention" or "learning to learn" etc, gives a false impression to people outside the field about the progress of the field, and, in the end, about what we know and what we don't know. This is certainly not advancing the science.

I think the same about psychology and studies like the ones we're discussing here. If psychologists are happy measuring the correlations of the answers in their questionnaires, and they call the quantities measured in this way with names like "agreeableness" and "sensitivity"- doesn't that just give the entirely wrong impression to people outside the field who have a very different concept of what "agreeableness" etc means?

I say that this is "not advancing the science". You could argue that the science is doing fine, thank you, even if lay people don't get it. But, if the way the science is carried out creates confusion and influences real behaviour and decisions, as studies like the ones discussed above have the potential to do- is that really a beneficial outcome of research?

To put it plainly: as a researcher I don't aspire to create confusion, but to bring clarity in subjects that are hard to understand. Isn't that the whole point?

>> In the end, questions of sex differences in behavior are about sex differences in behavior. And that's what this research addresses.

I understand this. But, my concern here is that asking people "what do you think about sex differences in behaviour" is likely to return results tainted by ungodly amounts of cultural bias that would be impossible to disentangle from any other results. How is this addressed in such studies? How do you account for people answering questions about sex differences in behaviour based on what they are used to thinking about sex differences in behaviour, rather than what they actually observe?

P.S. Hey, your answer does do justice to my questions. Thanks for your patience, again.