Hacker News new | ask | show | jobs
by wzdd 969 days ago
This response demonstrates the same issue as the OP, which is to think like an engineer and attempt to reverse-engineer the design goals of the software rather than to consider the prompt in and of itself, without context.

If you commission someone to paint "an Indian person", would you withhold payment if they painted a specific Indian person, or an Indian person not in traditional dress? (And, to be clear, Midjourney is certainly capable of doing this recognisably). Hopefully you would instead be happy with the result, because it would be what you asked for -- if you specifically wanted a "stereotypical Indian person" you would have asked for that instead. "Be recognised by the largest amount of people" is not typically the goal of an artistic work. Is it the goal of Midjourney? Well, to the extent that it is, that's the problem that the article is pointing out: if you attempt to cater to everyone, you will necessarily produce a picture which is at best conventional and at worst extremely stereotypical.

A few seconds of playing around with Stable Diffusion shows that this need not be the case, so the article actually points out a specific deficiency of Midjourney.

1 comments

You said: >Given the vagueness of the prompt, you're actually free to paint anyone, from Kumari Mayawati to Satya Nadella.

let me re-emphasize

>you're

the abbreviation you are is evidently in reference to a person, a person free to paint anyone. Specifically you asked the previous person if THEY would paint a stereotype if asked to paint an Indian.

If I am free to paint anyone when asked to paint an Indian I will never paint a specific Indian and always attempt to paint a stereotype because my personal painting skills are not good enough to paint any specific Indian and have them be recognizable by anyone.

I assume the abilities of Stable Diffusion and Midjourney are actually good enough to paint a specific Indian, their abilities are definitely greater than mine when it comes to 'painting'.

For some reason you decided my response had something to do with Stable Diffusion and Midjourney from an engineer's perspective, rather than the specific subject of what a human would do if given the same prompt.

I don't know why you would make this mistake, maybe the response demonstrates the tendency of engineers to misunderstand the meaning of simple texts if they do not match up to their preconceptions?