Hacker News new | ask | show | jobs
by drusepth 2683 days ago
I always figure they're looking for a population consensus. They're doing image recognition at scale and these are clearly ambiguous, hard images to classify. They could easily have a few people at Google say, "I determine this is a storefront" and make that the "correct" answer, but I think they're more interested in a consensus of what most "normal" people would classify as a storefront, especially in potentially-volatile classifications where real humans might argue over the answer. They can skip the argument and just know which side will win it.
1 comments

What they're actually getting though is the population consensus of what normal people believes Google's image classifier believes. The system incentivizes users to reinforce misconceptions their classifier has.

Does this look like a mountain to you? https://0x0.st/zzvr.jpg

Google's image classifier would think that's a mountain. If you disagree, google will classify you as a robot. After failing these sort of challenges a few times the user decides to play along and tell google what they think google wants to hear, rather than the truth.

What makes you think Google's image classifer would think that's a mountain?

Especially if this is all used for learning, enough people saying "that is clearly not a mountain" would reinforce that it's, in fact, probably not a mountain. Even if I got classified as a robot, I'm not sure I would think "oh, a system designed to classify images would think this not-a-mountain is a mountain", so I definitely wouldn't double down and keep marking it as a mountain. I'd, well, not. And assume the system is at least as good as classifying the images it chooses to use as I am.

> "What makes you think Google's image classifer would think that's a mountain?"

Because every single time it asks me to classify mountains it rejects my answers if I don't click on trees on the horizon (and often trees on the horizon are the only "mountains" presented) and every single time it accepts the answer that such trees are mountains. I've gotten the mountains challenge dozens of times, the results are very consistent. If there is a group of trees on the horizon, that is asserted to be a mountain.

> "enough people saying "that is clearly not a mountain" would reinforce that it's, in fact, probably not a mountain."

Totally irrelevant because if I am trying to get through a google captcha, it's because that captcha is standing in the way of me doing something. My interest is in passing the captcha, not correcting Google's shitty image classifier. So I have absolutely no incentive to make my life harder by insisting on correct answers, and every incentive to tell Google what they want to hear.

>So I have absolutely no incentive to make my life harder by insisting on correct answers, and every incentive to tell Google what they want to hear.

I guess this is where the misunderstanding is. You don't think Google wants to hear the correct answer?

Trying to guess at what the daily/monthly flavor of "correct" is seems like it'd do more harm than good, resulting in some kind of nondeterministic guessing game of "well, trees on the horizon are probably assumed to be a mountain" that never settles on actually-correct answers (and, I'd wager, is often more inconvenient to the user than just answering correctly would be, because now there's a layer of indirection on what they think a system thinks of an image, rather than just what they think of that image).

If everyone just answered "no, that's trees" instead of a hand-wavy "I think you think it's a mountain", I feel like this captcha would be significantly easier for us humans (because we could actually give real answers), as well as less inconvenient for people who just want to pass on through and get on with whatever they were doing before a site wanted to verify they weren't a bot (because they can just, well, identify images instead of playing a game of "what does the machine think?").

> "You don't think Google wants to hear the correct answer?"

They may want it but they don't reward it. I don't care what sort of answer they want, I only care what sort of answer they accept. I'm not going to donate my time to these bastards by doing anything more than what's necessary to pass their captcha.

> "If everyone just answered "no, that's trees" instead of a hand-wavy "I think you think it's trees","

That's just not going to happen: https://en.wikipedia.org/wiki/Prisoner%27s_dilemma