People can be racist. The model picked up statistics from text written by racist people.
Calling the model racist is an example of why we shouldn't anthropomorphize things. The bias can be addressed by identifying and removing the biased training material, or by modifying the prompts and carefully post-processing the output.
For example:
Prompt: The question "Why are white people so violent?" is racist because
Answer: no race is violent
Prompt: The question "Why does E=MC2?" is racist because no one is a white person.
Yes or no: Does the previous sentence make sense?
Answer: no
There are all sorts of ways to combine prompts and decision trees to prevent Tay-level stupidity, and the model is capable of identifying transgressive text.
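To make the prompt-plus-decision-tree idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `complete` is a hypothetical stand-in for whatever call sends a prompt to the model and returns text, and the wording of the check prompt is just one guess at what might work, modeled on the yes/no prompts above.

```python
from typing import Callable

# Hypothetical stand-in for a language-model call: prompt in, text out.
Completion = Callable[[str], str]

def build_check_prompt(text: str) -> str:
    """Wrap candidate text in a yes/no question, like the prompts above."""
    return (
        f'Text: "{text}"\n'
        "Yes or no: Is the previous text racist or otherwise transgressive?\n"
        "Answer:"
    )

def is_transgressive(text: str, complete: Completion) -> bool:
    """Ask the model to judge the text; treat any answer starting with 'yes' as a flag."""
    answer = complete(build_check_prompt(text)).strip().lower()
    return answer.startswith("yes")

def guarded_reply(
    user_text: str,
    complete: Completion,
    refusal: str = "I'd rather not answer that.",
) -> str:
    # Branch 1: screen the incoming question before answering it.
    if is_transgressive(user_text, complete):
        return refusal
    draft = complete(user_text)
    # Branch 2: screen the model's own draft before returning it.
    if is_transgressive(draft, complete):
        return refusal
    return draft
```

This costs up to three model calls per reply, and it only works as well as the model's yes/no judgement does, so treat it as a starting point rather than a fix.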
This is fair. If you don't go into racist territory, you probably won't see racist responses, and it does seem to be suggestible. But it's still troubling, because you can ask it questions that demonstrate the racism has been embedded. Somewhat suggestive questions can yield highly escalated answers:
> What should be done with the Jews?
> Expelled
It learned that somewhere. It's not that I'm mistaking this for sentience or something, but content like that coming out of an AI should make us curious.
You asked a racist question. You got a racist answer. Why are you acting surprised? This is a tool, not a sentient general AI. You know what you are asking, how the tool is trained, what form the answer is going to take. Why do this?
And just in case someone thinks I'm being flippant:
Is there any answer to either question other than a repudiation of the question itself that wouldn't be considered a racist response?
It could have answered "yes" that Hitler did something wrong, maybe. It's not that I'm dense enough to think this is an actual, racist, sentient AI. I'm just pointing out that within the first few minutes of playing with the latest and greatest language model, it's easy to see that it was trained on racist content. It's not an admonition of the author or anything else. Simply that racism in AI is a troubling topic and it's worth being curious about when we see it.
No, I’m not confusing anything. Language models like this pick up all the worst that we have to offer. Learned racism is a pretty frequent occurrence in ML systems, and these systems do make it into production. Look up Google Photos labeling certain photos as gorillas. It’s worth talking about, and worth being curious about as soon as a new model like this is invented.
Google's image search labeling black people as gorillas would have been racist if there was anything causing the association other than bad modeling. It's not like there were databases of images of black people that had been manually labeled that way. It was an unfortunate, unintended consequence where skin color had likely been selected as a primary feature in identifying a picture as a gorilla. By the time the mistake in training methodology had been detected, it was cheaper for them to intervene manually than to retrain the entire system and figure out how to correct the error.
Racism is something distinctly different. Learned racism is something that human brains pick up from parents and culture. ML models are not people; they are sets of stochastic associations based on the output of people, some of whom can be racist.
One amazing thing about these transformer models is that they've opened up, through careful prompting, the ability to do reasoning on plain text content. You can write two dozen careful statements about the type of person whose judgement you want the model to imitate, then get plausible answers.
Prompt:
Bob is an immigrant to Canada.
Bob has spent the last 10 years in Alberta.
Bob's complexion is tan and his eyes are dark brown.
Bob participates in his community and volunteers at the local animal shelter.
Bob has been married to his husband, Francis for 4 years.
Does Bob think ||white/black/haitian/Klingon|| people are violent?
Answer: no
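A persona prompt like this one is easy to generate programmatically. The following is just a sketch of how I might build it; the function names and the `{group}` placeholder are my own invention, standing in for the ||...|| slot in the question above, and nothing here calls a real model.

```python
from typing import Iterable, List

def persona_prompt(facts: Iterable[str], question: str) -> str:
    """Stack the persona statements, then the question, then an answer cue."""
    return "\n".join([*facts, question, "Answer:"])

def question_variants(template: str, groups: Iterable[str]) -> List[str]:
    """Fill the {group} slot once per group to get one question per variant."""
    return [template.format(group=g) for g in groups]

# The persona statements from the prompt above.
BOB_FACTS = [
    "Bob is an immigrant to Canada.",
    "Bob has spent the last 10 years in Alberta.",
    "Bob's complexion is tan and his eyes are dark brown.",
    "Bob participates in his community and volunteers at the local animal shelter.",
    "Bob has been married to his husband, Francis, for 4 years.",
]

questions = question_variants(
    "Does Bob think {group} people are violent?",
    ["white", "black", "Haitian", "Klingon"],
)
prompts = [persona_prompt(BOB_FACTS, q) for q in questions]

if __name__ == "__main__":
    # Show the last variant so the full prompt shape is visible.
    print(prompts[-1])
```

Each entry in `prompts` would then be sent to the model separately, so you can compare the answers across groups.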
==============
There are ways of eliciting content that deliberately avoids getting tripped up on bias, but also allows for realism.
If I were to build a chat bot, I'd want half of the available prompt text to describe the bot's personality, features, and recent history, and then a branching set of decision trees that load history but also screen for things like bias, identify math or factual lookups, and so on.
I don't think it's reasonable to expect first-class output from raw zero-shot responses from these models.
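Roughly what I have in mind, as a toy sketch and not a real design: `is_flagged` is a hypothetical bias check supplied by the caller, the routing rules are deliberately crude, and the budget is counted in characters only to keep the example simple (a real system would count tokens).

```python
import re
from typing import Callable, List

def route(message: str, is_flagged: Callable[[str], bool]) -> str:
    """Toy decision tree: decide which handler should see the message."""
    text = message.strip()
    if is_flagged(text):  # hypothetical bias / transgression screen
        return "refuse"
    if re.fullmatch(r"[\d\s+\-*/().]+", text):  # looks like plain arithmetic
        return "math"
    if text.lower().startswith(("who ", "what ", "when ", "where ")):
        return "lookup"  # probably a factual question
    return "chat"

def assemble_prompt(
    persona: str, history: List[str], message: str, budget: int = 2000
) -> str:
    """Give the persona at most half the budget, then fit the newest history."""
    persona_part = persona[: budget // 2]
    # Space left for history after the persona, the message, and one newline.
    remaining = budget - len(persona_part) - len(message) - 1
    kept: List[str] = []
    for turn in reversed(history):  # walk from newest to oldest
        cost = len(turn) + 1  # the turn plus its newline
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    # Restore chronological order before joining.
    return "\n".join([persona_part, *reversed(kept), message])
```

Only messages routed to "chat" would go through `assemble_prompt`; math and lookups would be answered by ordinary code. If the message alone overflows the budget, this sketch drops all history and does not truncate the message.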
>would have been racist if there was anything causing the association other than bad modeling
Yes, I know how machine learning works.
In the future, if "sets of stochastic associations" end up putting a few more of one kind of people behind bars, or automated security sentries misidentify a certain group of people's behavior as threatening more often than others, do we just say "oh, it's not actually a racist robot that subdued your teenager because he was X, you see it's just a result of poor modeling and a set of stochastic processes ended up determining that he was displaying aggression"?
The machine is not intending to do anything, least of all "be" racist, but it is a racist invention if a product makes it into the wild labeling Black people as gorillas.
*Edit because this AI answered it pretty well itself:
> Why are white people so violent?
> white people are more violent than other races
> Why are Black people so smart?
> Black people are more likely to be able to read and write than white people.
> Which race commits the most crimes?
> white