|
|
|
|
|
by tianqi
132 days ago
|
|
I've seen many comments describing the "horse riding man" example as extremely bizarre (which it actually is), so I'd like to provide some background context here. The "horse riding man" is a Chinese internet meme originating from an entertainment awards ceremony, when the renowned host Tsai Kang-yong wore an elaborate outfit featuring a horse riding on his back[1]. At the time, he was embroiled in a rumor about his unpublicized homosexual partner, whose name sounded "Ma Qi Ren" which coincidentally translates to "horse riding man" in Mandarin. This incident spread widely across Chinese internet and turned into a meme. So they used "horse riding man" as an example isn't entirely nonsensical, though the image per se is undeniably bizarre and carries an unsettling vibe. [1] The photo of the outfit: https://share.google/mHJbchlsTNJ771yBa |
|
This problem is infamous because it persisted (unlike other early problems, like creating the wrong number of fingers) for much more capable models, and the Qwen Image people are certainly very aware of this difficult test. Even Imagen 4 Ultra, which might be the most advanced pure diffusion model without editing loop, fails at it.
And obviously an astronaut is similar to a man, which connects this benchmark to the Chinese meme.