by chrisloy 3201 days ago
If you read the paper, the photos and labels were sourced from a dating website. In my opinion, there is a good chance that the model is overfitting to how people wish to present themselves in that context, e.g. the framing of the photo, facial expression, etc. These are all things that are heavily culturally conditioned.

Some of the press around this seems a bit alarmist - I doubt you would see anywhere near this accuracy out in the real world.


They address overfitting, presentation and context in the paper. Their DNN used facial features extracted by VGG-Face, a widely used model that reduces a face to a vector of scores meant to be independent of transient features such as facial expression, background, orientation, lighting, and contrast.

By having their DNN train on faces that have been processed by VGG-Face, they greatly reduce the risk of overfitting or relying on things that would be present in dating site pictures but not in pictures of the same people in other contexts.

They use multiple pictures from the same profile. Does the test set include any people that were in the training set?
The problem is that even if the photos are different, having the same people in both the training and test sets means the model may just be recognizing individuals.

Instead of learning "that person looks like a gay person",

it learns "that person looks like Tim, who is gay".
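
To make the concern concrete, here is a minimal sketch of a split that keeps every profile entirely on one side, so the test set never contains a person the model has already seen. This is not the paper's code. The `(profile_id, photo)` pair format, the `split_by_profile` name, and the toy file names are all assumptions for illustration.

```python
import random
from collections import defaultdict


def split_by_profile(photos, test_fraction=0.2, seed=0):
    """Split (profile_id, photo) pairs so no profile appears in both sets.

    Hypothetical helper: the input format is an assumption, not the paper's.
    """
    # Group photos by the profile they came from.
    by_profile = defaultdict(list)
    for profile_id, photo in photos:
        by_profile[profile_id].append(photo)

    # Shuffle profiles (not photos), then cut the list of profiles.
    profile_ids = sorted(by_profile)
    random.Random(seed).shuffle(profile_ids)
    n_test = max(1, round(len(profile_ids) * test_fraction))

    test = [(pid, p) for pid in profile_ids[:n_test] for p in by_profile[pid]]
    train = [(pid, p) for pid in profile_ids[n_test:] for p in by_profile[pid]]
    return train, test


# Toy data: 10 profiles with 3 photos each.
photos = [(f"profile_{i}", f"photo_{i}_{j}.jpg") for i in range(10) for j in range(3)]
train, test = split_by_profile(photos)

train_ids = {pid for pid, _ in train}
test_ids = {pid for pid, _ in test}
assert train_ids.isdisjoint(test_ids)  # no person is in both sets
```

Shuffling individual photos instead of profiles would put different pictures of the same person on both sides, which is exactly the "looks like Tim" failure described above.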

Ah, I had missed that. I guess this will mitigate the risk a lot, although I would still like to have seen results against a test set of images from a different context (social media for example).
Good point. I was thinking something similar. Context matters, and in this case it matters a lot. Certainly, there are signals people want to send on dating sites. Evidently the algorithm picked up those signals and built a pattern from them (because it has more and finer capacity than mere humans).

Still quite an accomplishment (maybe). But they pretty much already led the horse to water, yes?

This is a very annoying part of non-explanatory models. I think the result defies common sense a bit, and the model can't explain why this is so.

So in the circumstance, why should we believe it's generalizable?

This also goes to the heart of the problem with deep learning on neural nets. We have this algorithm that apparently identifies homosexual and heterosexual people, presumably based on a variety of subtle features, but we have pretty much no clue as to which features and why.

The human judges may have been less accurate, but they could likely explain each decision they made and the visual features they based their decision on.

Humans are known to be unreliable in explaining how they come to conclusions as well. Humans just like to pretend they can verbalise all knowledge ;)
Even if they verbalize their knowledge incorrectly, they give you something which, if you choose, you can further test and replicate. In other words, even if they're BSing, they're still falsifiable. That is not so with "magic models", whose publisher may not want them falsified.
Some are better at it than others, and no doubt many are pretty bad at it, but I have yet to see a neural net explain to me, accurately or not, why it came to the decision it did.
Don't forget that we can listen to their attempt at verbalizing that knowledge and then, in turn, draw/verbalize our own sketchy conclusions... and so on.
Something to do with clouds and tanks: https://www.jefftk.com/p/detecting-tanks