by imh 3751 days ago
In addition to the caveats they list, there may be systematic biases in who includes a profile pic and who doesn't. Even an enormous sample and perfect inferences from the face API can't overcome a bad dataset (bad relative to what you are trying to learn from it).

To be fair, they did address that:

    Lang        FacesDetected
    ruby        71
    r           38
    javascript  60
    java        47
    html        59
    go          53
    cpp         34
    c           24
    python      49
    php         66
    perl        45
    swift       49
    csharp      61
Not quite. That table only speaks to sample sizes, and the interpretation still treats the users without a detected face as if they were missing at random. In reality, they are probably missing not at random: whether someone shows a face likely correlates with other characteristics, and that changes the interpretation.
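
To make that concrete, here is a toy simulation, not a model of the actual dataset. Every number in it is made up for illustration: a hypothetical trait with a true rate of 30%, and hypothetical probabilities of posting a profile pic. It only sketches how the estimate drifts when posting a pic depends on the trait, no matter how large the sample is.

```python
import random


def observed_rate(n, true_rate, p_pic_with, p_pic_without, seed=0):
    """Rate of a trait among simulated users who posted a profile pic.

    All parameters are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    shown = 0        # users with a profile pic
    shown_with = 0   # of those, users who have the trait
    for _ in range(n):
        has_trait = rng.random() < true_rate
        p_pic = p_pic_with if has_trait else p_pic_without
        if rng.random() < p_pic:
            shown += 1
            shown_with += has_trait
    return shown_with / shown


TRUE_RATE = 0.30  # assumed true rate of the trait in the whole population

# Posting a pic is unrelated to the trait: the estimate lands near the truth.
unrelated = observed_rate(200_000, TRUE_RATE, 0.5, 0.5)

# Posting a pic is twice as likely with the trait: the estimate is biased,
# and a bigger n only makes the wrong number more precise.
trait_linked = observed_rate(200_000, TRUE_RATE, 0.8, 0.4)

print(f"true rate:            {TRUE_RATE:.3f}")
print(f"pic unrelated:        {unrelated:.3f}")
print(f"pic linked to trait:  {trait_linked:.3f}")
```

In the second case the expected estimate is 0.24 / 0.52, roughly 0.46 rather than 0.30, and collecting more profiles does not pull it back.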