| HN Mirror



	by saltedonion 1747 days ago
	The space of potential edge cases is so large that it will be whack-a-mole. Inference tasks aren’t like software testing, where the states are well defined.

3 comments

stathibus 1747 days ago

Another way to put this is: nobody working in this field has control over what they are building or can make any promises about how it will behave.

It's like launching a rocket ship that you can't test in a physical simulation first. Something is going to go wrong that can't be predicted because we lack the understanding, but this is seen as an acceptable risk.


toxik 1747 days ago

Which is inherent to the problem. The domain is the space of _all possible natural images_, it’s so big, it’s ridiculous. Fundamentally, these techniques do not analyze images like humans do, but are rather trained to pick out any salient signal they can latch on to. That seems to be “primates are mostly like humans but darker”, which is superficially true but a pretty weak definition as it includes dark-skinned humans.


pvaldes 1747 days ago

Primates are not darker than humans. Primates is the order; they come in all colors, including brilliant blue, white with black stripes, orange, yellow or red.

We live in a world where everybody is offended by nothing. The problem is that nobody should be offended by being called a gorilla, in the same way that nobody is offended by being called an eagle or a wolf. It is a wonderful animal: smart, strong, protective and gentle. What if some idiots used the term pejoratively five generations ago? We know better. Societies can change.

If white people are not being classified as primates, the algorithm should be corrected so they are, not fixed by excluding black people from humanity.

People should also be educated to understand that an AI algorithm returns a probability, not truth.


wizzwizz4 1747 days ago

It's probably because humans are primates – but the AI systems often have to treat “human” as a completely separate category from “primate”, so they have to draw weird, complex boundaries around “primate” (actually “all non-human primates”). When the “primate” classification is stronger than the “human” classification, the system says “primate” rather than “human”, and if it's predominantly been trained on “pictures of white Americans are not pictures of primates”, its “primate” boundary might not be drawn to exclude everyone else.

I expect you'd get better results if you allowed the system to call humans “primates”, then accept “human primate” as “human” when parsing the output. (That is, leave the “is_primate” output line floating while training on pictures of humans.) I don't know whether that would work, though.
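A minimal sketch of what "leaving the output floating" could look like, assuming independent sigmoid outputs and a per-example mask that drops `is_primate` from the loss for pictures of humans. The output names, the mask convention, and the `parse` rule are illustrative assumptions, not anything a real system is known to do, and it says nothing about whether this would train well in practice.

```python
import math

OUTPUTS = ["is_human", "is_primate"]  # hypothetical output lines


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def masked_bce(logits, targets, mask):
    """Binary cross-entropy summed over outputs whose mask entry is 1.

    Outputs with mask 0 are "floating": they contribute no loss, so
    training neither rewards nor punishes whatever the model predicts.
    """
    loss = 0.0
    for z, t, m in zip(logits, targets, mask):
        if m == 0:
            continue
        p = sigmoid(z)
        loss -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return loss


def parse(logits, threshold=0.5):
    """Accept "human primate" as "human" when reading the output."""
    probs = dict(zip(OUTPUTS, (sigmoid(z) for z in logits)))
    if probs["is_human"] >= threshold:
        return "human"
    if probs["is_primate"] >= threshold:
        return "primate"
    return "unknown"


# A picture of a human where the model fires on both outputs.
logits = [2.0, 3.0]
loss_floating = masked_bce(logits, targets=[1, 0], mask=[1, 0])
loss_penalized = masked_bce(logits, targets=[1, 0], mask=[1, 1])
label = parse(logits)
```

With the mask, the model is not pushed to suppress `is_primate` on human images, so `loss_floating` is smaller than `loss_penalized`, and the parsing step still reports "human".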


toxik 1747 days ago

These models typically don’t have hierarchical labels like that, and they apply a softmax to their output - which means /one/ label will be considered correct. (A softmax means taking the exp of your predicted scores, then dividing each by their sum.)
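The softmax described above, as a small standalone sketch. The labels and scores are made up for illustration; subtracting the maximum score before exponentiating is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(scores):
    """Exponentiate each score, then divide by the sum of the exponentials."""
    shift = max(scores)  # avoids overflow in exp; the output is unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical labels and raw scores, for illustration only.
labels = ["cat", "dog", "bird"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)
top_label = labels[probs.index(max(probs))]
```

Because the outputs sum to 1, raising the probability of one label necessarily lowers the others, which is why a softmax classifier ends up committing to a single label.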


wizzwizz4 1747 days ago

I know – but there's no technical reason they shouldn't have more complex relationships between labels (assuming you can train that, which I don't know). If we can't get better training data, at least trying to fix the problem in the algorithms (instead of slapping crude filters on the end of them) would be nice.
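One simple form such a relationship could take, sketched under the assumption of a fixed two-level hierarchy: keep the usual per-leaf probabilities, then sum them per parent to get a coarser label. The `PARENT` mapping and the numbers are hypothetical, and this only post-processes the output; it does not address how to train with the hierarchy.

```python
from collections import defaultdict

# Hypothetical leaf -> parent mapping; names are illustrative only.
PARENT = {
    "human": "primate",
    "gorilla": "primate",
    "lemur": "primate",
    "house cat": "feline",
    "lion": "feline",
}


def parent_probabilities(leaf_probs):
    """Sum leaf probabilities under each parent label."""
    totals = defaultdict(float)
    for leaf, p in leaf_probs.items():
        totals[PARENT[leaf]] += p
    return dict(totals)


# Made-up softmax output in which no single leaf is a confident winner.
leaf_probs = {
    "human": 0.45,
    "gorilla": 0.30,
    "lemur": 0.15,
    "house cat": 0.06,
    "lion": 0.04,
}
parents = parent_probabilities(leaf_probs)
best_leaf = max(leaf_probs, key=leaf_probs.get)
```

In this made-up case the model is confident at the coarse level while no leaf reaches 50%, so a system could choose to report only the coarse label, or nothing at all, rather than commit to an uncertain leaf.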


toxik 1747 days ago

I agree in spirit but disagree in practice, I think. Like we said previously, the domain is humongous, so even establishing meaningful relationships between labels and sublabels is extremely difficult. Many cases are likely ambiguous too; our human understanding isn’t actually hierarchical - it’s much more elusive. It’s a square-peg-round-hole type of problem, really. Humans don’t think in terms of labels in the first place; we mostly use them for the purposes of language.


diskzero 1746 days ago

"nobody working in this field has control over what they are building, and can't make any promises about how it will behave." If you are working on machine learning as an employee of Facebook, Google, etc., shouldn't you assume the worst possible outcome from your efforts? I often wonder how people manage this cognitive dissonance.


dontreact 1747 days ago

Yeah, but it shows some real lack of care to fall into the same edge case that Google was widely criticized for 3 years ago.


bryan0 1747 days ago

I don’t think this is an edge case. We are asking AI to classify images and videos of people where the results can be disastrous.


daenz 1747 days ago

A large and complex task is well suited for the talented people behind Facebook. But even a list of basic searches that exercised the engine would have caught this. No excuses imo, particularly since it happened to Google[0] not that long ago. Was nobody paying attention?

0. https://www.theverge.com/2018/1/12/16882408/google-racist-go...


ta8902 1747 days ago

Maybe they thought it would be racist if they explicitly devised rules preventing black people being identified as primates.
