by YeGoblynQueenne
2422 days ago
Regarding the gaydar paper, yes, I have read the full paper (if memory serves,
I read two versions, a pre-print and the published paper). At the time, I
wanted to publish a rebuttal, perhaps a letter in a journal or something, but
in the end I didn't think I'd be adding much to the debate and the paper had
been widely discredited already anyway.

My objection to the methodology in the paper was that the authors had
assembled a dataset where the distribution of gay men and women was 50% of the
population, i.e. there were as many gay women as straight and as many gay men
as straight in the data. This was for one of their datasets, the one where
everyone had a picture. There were two more where the distribution was less
even but still nothing like what it's usually estimated to be. This despite
the fact that the paper itself cited a result that gay men and women are
around 7% of the population.

The reason for this discrepancy was clearly to improve the results by reducing
the number of false negatives which are expected when there are many more
negative than positive examples in binary classification.

That was from the point of view of machine learning. There were other flaws that
others pointed out, e.g. the choice of metric (I don't remember what it was
now, I can look it up if you like), the premising of the paper on prenatal
hormone theory, which is another piece of bunkum without any evidence to back it,
etc. And of course there were the ethical considerations.

Sorry but I don't have the courage to reply to the rest of your comment. You
write way too much.
|
If there is signal in the rebalanced dataset, there should be signal in the imbalanced dataset. If they'd switched to logloss or AUC and an imbalanced dataset, do you think their results would now be as good as random? Because that is what you are implying, and you are basically implying the research is fraudulent. This is a very strong claim to make in the absence of legit discrediting studies that failed to replicate any predictability, and it requires more than guessing that the authors' rebalancing act was "clearly" to improve the accuracy (with a 7% positive class, you could get 93% accuracy by always predicting the negative class, so if they wanted to inflate the accuracy, they shouldn't have rebalanced).
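To make the accuracy-versus-AUC point concrete, here is a minimal sketch on synthetic data. Nothing in it comes from the paper: the sample size, the Gaussian score model, and the helper names (`make_dataset`, `accuracy`, `auc`) are all assumptions made up for illustration. It only shows that a constant "always negative" classifier scores 93% accuracy at a 7% positive rate while its AUC stays at 0.5, and that a scorer with real signal keeps an AUC well above 0.5 in both the balanced and the imbalanced sample.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_dataset(n, positive_rate, rng):
    """Hypothetical data: positives score higher on average than negatives."""
    n_pos = round(n * positive_rate)
    y = [1] * n_pos + [0] * (n - n_pos)
    scores = [rng.gauss(1.0 if t else 0.0, 1.0) for t in y]
    return y, scores

rng = random.Random(0)  # fixed seed so the run is repeatable
y_imb, s_imb = make_dataset(1000, 0.07, rng)  # 7% positives
y_bal, s_bal = make_dataset(1000, 0.50, rng)  # 50% positives

# Baseline: always predict the negative (majority) class.
majority_imb = accuracy(y_imb, [0] * len(y_imb))
majority_bal = accuracy(y_bal, [0] * len(y_bal))
constant_auc = auc(y_imb, [0.0] * len(y_imb))

# A scorer with genuine signal, evaluated on both class mixes.
auc_imb = auc(y_imb, s_imb)
auc_bal = auc(y_bal, s_bal)

print("majority-class accuracy, imbalanced:", majority_imb)
print("majority-class accuracy, balanced:  ", majority_bal)
print("AUC of the constant predictor:      ", constant_auc)
print("AUC with signal, imbalanced:        ", round(auc_imb, 3))
print("AUC with signal, balanced:          ", round(auc_bal, 3))
```

The design choice here is to use a rank-based metric: AUC only compares positives against negatives pairwise, so it does not reward a classifier for simply matching the class mix the way raw accuracy does.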
The ethical considerations are moot/personal opinion, as they passed the ethics board of Stanford. Those are people who evaluate ethics of academic research for a living, or are you saying they were also shoddy and wrong to give this a pass?
Magical thinking is not wanting something to be true, because it would be an uncomfortable truth, and so deeming that something which is objectively true, must be false, so you can continue to think happy thoughts in line with your world view.
You keep talking about the paper being widely discredited, but can't provide a single academic source for this. Instead, you question my sources (Business Insider?) while posting articles from The Next Web written by a journalist with a History degree who does not want the concept of binary sexuality to be true, or even allow it in constructing a dataset of gay and straight people by self-classification.
It takes more energy and letters to attack a point than to make a point. You made quite a lot of weak points.