Hacker News new | ask | show | jobs
by wzdd 970 days ago
> of course I'd paint a stereotype to make sure it looks Indian

Would you? That's pretty boring: Given the vagueness of the prompt, you're actually free to paint anyone, from Kumari Mayawati to Satya Nadella.

Not doing "the obvious", whether that's a harmful stereotype or just a tired trope, is part of what makes art art. But from the images, of which there are hundreds, all of them extremely similar, I wouldn't think "an Indian person". I'd think something far more specific: an old bearded Indian man wearing a turban. Which is sort of the article's point.

Interestingly, trying the same prompt in my local installation of Stable Diffusion, I got quite a lot more variety in terms of age and sex (though I couldn't really escape turbans and bindis). So this actually seems fixable even for very vague prompts, despite the implication of your comment that the problem is with the user.

2 comments

> Would you?

If I were to be honest, yes. There would probably be a lot more diversity in my paintings than demonstrated in the article, but ultimately my experience would be limited to what I see in the immigrant community, popular culture, and the news. For the most part, those are very narrow slices of Indian society. More important, it will reflect what I see most often in those categories and is unlikely to reflect facets I rarely see.

If anything, AI art could probably do better than I when properly prompted. One could choose someone who would is likely to exist (a farmer in India or a university student in India) and the model would likely have some "idea" of what they look like. Perhaps a language model can massage vague prompts to create more specific and representative ones automatically, to further reduce individual bias. (I say reduce because it's ultimately limited to the data that has been fed to it, but it should have a broader scope than an individual person has.)

Why should we lower the bar on AI models to your superficial understanding of Indian culture?
>Would you? That's pretty boring: Given the vagueness of the prompt, you're actually free to paint anyone, from Kumari Mayawati to Satya Nadella.

this would be a valid point if the person doing the painting was of sufficient artistic ability that they could paint a picture of a specific Indian person and have it be recognizable, if they knew what specific Indian person would be recognized by the person requesting they drawn an Indian person.

This response demonstrates the same issue as the OP, which is to think like an engineer and attempt to reverse-engineer the design goals of the software rather than to consider the prompt in and of itself, without context.

If you commission someone to paint "an Indian person", would you withhold payment if they painted a specific Indian person, or an Indian person not in traditional dress? (And, to be clear, Midjourney is certainly capable of doing this recognisably). Hopefully you would instead be happy with the result, because it would be what you asked for -- if you specifically wanted a "stereotypical Indian person" you would have asked for that instead. "Be recognised by the largest amount of people" is not typically the goal of an artistic work. Is it the goal of Midjourney? Well, to the extent that it is, that's the problem that the article is pointing out: if you attempt to cater to everyone, you will necessarily produce a picture which is at best conventional and at worst extremely stereotypical.

A few seconds of playing around with Stable Diffusion shows that this need not be the case, so the article actually points out a specific deficiency of Midjourney.

You said: >Given the vagueness of the prompt, you're actually free to paint anyone, from Kumari Mayawati to Satya Nadella.

let me re-emphasize

>you're

the abbreviation you are is evidently in reference to a person, a person free to paint anyone. Specifically you asked the previous person if THEY would paint a stereotype if asked to paint an Indian.

If I am free to paint anyone when asked to paint an Indian I will never paint a specific Indian and always attempt to paint a stereotype because my personal painting skills are not good enough to paint any specific Indian and have them be recognizable by anyone.

I assume the abilities of Stable Diffusion and Midjourney are actually good enough to paint a specific Indian, their abilities are definitely greater than mine when it comes to 'painting'.

For some reason you decided my response had something to do with Stable Diffusion and Midjourney from an engineer's perspective, rather than the specific subject of what a human would do if given the same prompt.

I don't know why you would make this mistake, maybe the response demonstrates the tendency of engineers to misunderstand the meaning of simple texts if they do not match up to their preconceptions?