| HN Mirror

Google's image search correlating black people as gorilla's would have been racist if there was anything causing the association other than bad modeling. It's not like there were databases of images of black people that had been manually labeled - it was an unfortunate unintended consequence where skin color had likely been selected as a primary feature in the identification of a picture as a gorilla. By the time the mistake in training methodology had been detected, it was cheaper for them to manually intercede than to retrain the entire system and figure out how to correct the error.

Racism is something distinctly different. Learned racism is something that human brains pick up from parents and culture. ML Models are not people, they are sets of stochastic associations based on the output of people, some of whom can be racist.

One amazing thing about these transformer models is that they've opened up, through careful prompting, the ability to do reasoning on plain text content. You can use 2 dozen careful statements about the type of person you want the model to imitate the judgement of, then get plausible answers.

Prompt: Bob is an immigrant to Canada. Bob has spent the last 10 years in Alberta. Bob's complexion is tan and his eyes are dark brown. Bob participates in his community and volunteers at the local animal shelter. Bob has been married to his husband, Francis for 4 years.

Does Bob think ||white/black/haitian/Klingon|| people are violent?

Answer: no

==============

There are ways of eliciting content that deliberately avoids getting tripped up on bias, but also allows for realism.

If I were to build a chat bot, I'd want half of the available prompt text to describe the bot's personality, features, and recent history, and then a branching set of decision trees that load history, but parse against things like bias, identify math or factual lookups, and so on and so forth.

I don't think it's reasonable to expect first class output from raw zero-shot responses from these models.