| HN Mirror


	by erichahn 1897 days ago
	How is entropy related to "male gaze"? This approach seems to be unsupervised, I don't see the problem.

4 comments

SiempreViernes 1897 days ago

I don't think the claim is that the behaviour is caused by "male gaze", but rather that the outcome of always focusing the cropping around any visible cleavage is functionally identical.

skavi 1897 days ago

Whether or not it's unsupervised, whether or not it's sexist, it seems that a thumbnail focusing on a person's face rather than their breasts is typically going to be more desirable. Depending on context, of course.

teachingassist 1897 days ago

> This approach seems to be unsupervised, I don't see the problem.

Someone wrote and tested this algorithm, and either:

a) didn't test it on pictures of women, or,

b) didn't notice that it cropped breasts rather than faces, or,

c) didn't think that was a problem.

If they had noticed and cared, this wouldn't be the approach in use.

refenestrator 1896 days ago

It sounds like it was blasted out in an afternoon.

Maybe they weren't thinking about boobies a whole lot, tried like 5 random test images and shipped it?

Presumably there are a ton of failure modes for that algorithm, why get so moralistic and high-horse about just one?

p1necone 1896 days ago

Nobody is trying to assign blame to the person who wrote the algorithm; it's just being pointed out that the output it produces is suboptimal in this specific way.

eyelidlessness 1896 days ago

Blasting out an ML algorithm in an afternoon that causes millions of people to see wildly different representations of people depending on their race or gender seems, maybe, like a bad thing you wouldn’t want to defend?

refenestrator 1896 days ago

Sure, more testing would be better, but nobody cares how many landscape shots it messed up, right?

Years ago reddit was, IIRC, not very staffed at all compared to their traffic. It's a pretty privileged take to say they should have done expensive QA entirely around your particular things that you care about.

eyelidlessness 1896 days ago

But that’s the problem. The entire premise of these algorithms is based around what you (or the developers producing it) care about. It’s the common thread between image crops preferring white faces and women’s breasts, and automated cars preferring dead black pedestrians over vehicle collisions.

If you don’t have the capacity to use new technologies without increasing harm, maybe you don’t have the capacity to use them.

refenestrator 1896 days ago

No, I'm positing the premise that it was shipped with an absolute minimum of QA that didn't approach the level of trying to build 'inclusive' reference sets, on the cheap. It wasn't about caring in terms of priority of what was tested, it was about NOT caring that much in any direction and shipping it.

And it was a naive image cropping algorithm, years ago, and not making use of any sophisticated 'new technologies'. The beauty of the algorithm is that it was a simple function that could have been written in 1975 and required no training, deep learning or any of that. If you want to talk self-driving cars, you've got a much more relevant measure of harm and I'm right there with you.

As it is, I'd say there's a disconnect between where you and those years-ago shoestring developers stand on Maslow's hierarchy of needs. They were being scrappy with limited resources, and you're mad at them for not having an amount of QA that would have seemed unbelievable to them under their resource constraints.

jtbayly 1896 days ago

Machine learning? It’s not in my book.

teachingassist 1896 days ago

> why get so moralistic and high-horse about just one?

Not doing so; just observing the facts.

If a proposed 'unsupervised' algorithm of this simplicity highlighted women's faces perfectly, but zoomed in on men's receding hairlines, it wouldn't have made it past the drawing board. Indeed, it's reasonable to believe that nobody would have noticed that it consistently worked for women. We certainly wouldn't know that this algorithm existed or be talking about it here.

We observe a bias in what is considered important to check before shipping.

refenestrator 1896 days ago

That's an AWFUL lot of assumptions, there. You've constructed an entire complex narrative around something where all we know is that it's very simplistic.

teachingassist 1896 days ago

I don't think this is a complex narrative. It's the reality of development:

The simplest case is just to pick the most central square, and then you could probably improve that by picking a standard square according to the rule of thirds. Those are the naive algorithms - this choice of alternative algorithm is deliberate and isn't as naive or simplistic as you're claiming.

The algorithm is only considered useful because it appears to do better than that, on whatever examples the developers tried (i.e. there was a business case for using it), and against other possible code.

Including, likely, pictures of their own selves. That's certainly what I would test it on, until it vaguely worked.

What 'awful lot of assumptions' do you think I am making? I don't imagine we are in disagreement about this.
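
For reference, the "most central square" baseline mentioned above can be sketched in a few lines. This is only an illustration; the function name `center_square` and the (left, top, right, bottom) box convention are assumptions, not anything taken from the reddit code.

```python
def center_square(width, height):
    """Return (left, top, right, bottom) of the largest centred square.

    Hypothetical helper for illustration: it looks only at the image
    dimensions and never at the pixel content.
    """
    side = min(width, height)          # the square is limited by the shorter edge
    left = (width - side) // 2         # split the leftover width evenly
    top = (height - side) // 2         # split the leftover height evenly
    return left, top, left + side, top + side


print(center_square(1920, 1080))       # landscape image
print(center_square(600, 800))         # portrait image
```

Any content-aware crop only earns its place if it beats this on the images people actually test with.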

opsy2 1897 days ago

How is entropy defined in this context?

Clearly there is human-derived input in the system (otherwise, what's the point? Just crop randomly.)

jedberg 1897 days ago

Here is the code:

https://github.com/reddit-archive/reddit/blob/753b17407e9a9d...

But in short, it's a histogram of the values of the pixels.
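
A rough sketch of what "entropy from a histogram of pixel values" can mean, assuming a flattened list of grayscale values. This is not the linked code (the URL above is truncated); the function name `histogram_entropy` and the sample data are made up for illustration.

```python
import math
from collections import Counter


def histogram_entropy(pixels):
    """Shannon entropy, in bits, of a flat sequence of pixel values."""
    total = len(pixels)
    if total == 0:
        return 0.0
    counts = Counter(pixels)           # histogram: pixel value -> how often it occurs
    entropy = 0.0
    for count in counts.values():
        p = count / total              # empirical probability of this value
        entropy -= p * math.log2(p)
    return entropy


flat = [10] * 16                       # one value everywhere, e.g. a blank wall
busy = list(range(16))                 # sixteen distinct values, e.g. a detailed region
print(histogram_entropy(flat), histogram_entropy(busy))
```

A cropper built on this would presumably prefer the region with the higher score, which says nothing about whether that region is a face.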

andbberger 1897 days ago

Which is not really image entropy at all, as it totally neglects spatial structure. You could sort an image and get the same entropy using the histogram approach.
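
A quick way to check the sorting claim, as a sketch: the random "image" below is a stand-in for a flattened grayscale picture, and `histogram_entropy` is a hypothetical helper, not the reddit implementation.

```python
import math
import random
from collections import Counter


def histogram_entropy(pixels):
    """Histogram-based Shannon entropy in bits (assumes a non-empty sequence)."""
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(pixels).values())


random.seed(0)
image = [random.randrange(256) for _ in range(64 * 64)]   # stand-in flattened image
sorted_image = sorted(image)           # same pixels rearranged into a dark-to-light ramp

# The histograms are identical, so the entropies agree up to float rounding.
same = math.isclose(histogram_entropy(image), histogram_entropy(sorted_image))
print(same)
```

The sorted version has no spatial detail left at all, yet the score cannot tell the two apart.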

bavell 1897 days ago

"Entropy" in this context also left me wondering. Perhaps "variance" or "deviation from the mean"?

Thanks for the insights!

MauranKilom 1897 days ago

Entropy usually boils down to "-sum p * log(p) for all p (and possibly normalize)", assuming you have discrete probabilities p. It is not related to variance or mean.
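
Written out with the conventional leading minus sign, plus a small worked case showing why it is independent of variance. The two toy images are made up for illustration, and normalizing by the log of the number of possible values n is one common choice, not the only one.

```latex
H = -\sum_i p_i \log_2 p_i, \qquad H_{\text{norm}} = \frac{H}{\log_2 n}

% Image A: half the pixels are 0, half are 255
% Image B: half the pixels are 127, half are 128
H_A = H_B = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1 \text{ bit}

\sigma_A^2 = 127.5^2 = 16256.25, \qquad \sigma_B^2 = 0.5^2 = 0.25
```

Same entropy, wildly different variance: only the probabilities enter the sum, never the pixel values themselves.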

TheGallopedHigh 1897 days ago

Randomness in pixel values
