| HN Mirror



	by wizzwizz4 1747 days ago
	It's probably because humans are primates, but the AI systems often have to treat “human” as a completely separate category from “primate”, so they have to draw weird, complex boundaries around “primate” (actually “all non-human primates”). When the “primate” classification is stronger than the “human” classification, the system says “primate” rather than “human”. If it's predominantly been trained on “pictures of white Americans are not pictures of primates”, its “primate” boundary may have been carved to exclude white Americans but not everyone else. I expect you'd get better results if you allowed the system to call humans “primates”, then accepted “human primate” as “human” when parsing the output. (That is, leave the “is_primate” output line floating, with no training signal either way, while training on pictures of humans.) I don't know whether that would work, though.
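To make the "floating output" idea concrete, here is a minimal sketch, assuming one independent sigmoid output per label rather than a single softmax. The names `masked_bce` and `parse`, the `None` convention for a floating target, and the 0.5 threshold are all made up for illustration; nothing here is taken from a real system.

```python
import math

def masked_bce(logits, targets):
    """Mean binary cross-entropy over labels that have a target.

    A target of None marks a "floating" output: it is skipped, so it
    contributes no loss (and would get no gradient) for this example.
    """
    total, count = 0.0, 0
    for z, t in zip(logits, targets):
        if t is None:
            continue  # floating label: no training signal either way
        # Numerically stable BCE computed directly from the logit z.
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        count += 1
    return total / count if count else 0.0

def parse(probs, threshold=0.5):
    """Accept "human primate" as "human" when reading the output."""
    if probs["human"] >= threshold:
        return "human"  # regardless of what the primate output says
    if probs["primate"] >= threshold:
        return "primate"
    return "other"

# Picture of a human: is_human target is 1, is_primate is left floating.
# Hypothetical logits in the order [is_human, is_primate].
loss_a = masked_bce([3.0, 4.0], [1, None])
loss_b = masked_bce([3.0, -4.0], [1, None])
# loss_a == loss_b: the primate logit does not affect the loss here.
```

Whether training like this actually gives better boundaries is exactly the open question; the sketch only shows that the mechanics are simple.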


toxik 1747 days ago

These models typically don’t have hierarchical labels like that, and they apply a softmax to their output, which means /one/ label will be considered correct. (A softmax means taking the exp of each predicted score, then dividing by the sum of those exps.)
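A quick sketch of that definition, in plain Python. The label names and scores are invented for illustration, and subtracting the max first is just the usual numerical-stability trick (it doesn't change the result).

```python
import math

def softmax(scores):
    # Shift by the max so exp() cannot overflow; the output is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and raw scores, not from any real model.
labels = ["human", "primate", "cat"]
scores = [2.0, 1.9, -3.0]

probs = softmax(scores)
best = labels[probs.index(max(probs))]
```

The probabilities always sum to 1, so "human" and "primate" compete for the same mass: both cannot be confidently true at once, and a small change in the scores flips which single label wins.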


wizzwizz4 1747 days ago

I know, but there's no technical reason they shouldn't have more complex relationships between labels (assuming you can train that, which I don't know). If we can't get better training data, at least trying to fix the problem in the algorithms (instead of slapping crude filters on the end of them) would be nice.


toxik 1747 days ago

I agree in spirit but disagree in practice, I think. Like we said previously, the domain is humongous, so even establishing meaningful relationships between labels and sublabels is extremely difficult. Many cases are likely ambiguous too; our human understanding isn’t actually hierarchical, it’s much more elusive. It’s a square-peg-round-hole type of problem really: humans don’t really think in terms of labels in the first place, we mostly use them for the purposes of language.
