by ericpauley 56 days ago
Going to have to disagree on the backup test. Opus's flamingo is actually on the pedals and seat, with functional spokes and beak. In terms of adherence to physical reality, Qwen is completely off. To me it's a little puzzling that someone would prefer the Qwen output.

I'd say the example actually does (vaguely) suggest that Qwen might be overfitting to the Pelican.

4 comments

Qwen's flamingo is artistically far more interesting. It's a one-eyed flamingo with sunglasses and a bow tie who smokes pot. Meanwhile Opus just made a boring, somewhat dorky flamingo. Even the ground and sky are more interesting in Qwen's version.

But in terms of making something physically plausible, Opus certainly got a lot closer.

Given adherence is a more significant practical barrier, it's probably the better signal. That is, if we decide to look for signal here.
The fundamental challenge of AI is preventing unprompted creativity. I can spin up a random initialization and call all of its output avant-garde if we want to get creative.
I recently fell down the rabbit hole of AI-generated videos, and realised that many of the "flaws" that make them distinctive, such as objects morphing and doing unusual things, would have been nearly impossible to create, or would have required very advanced CGI.
"artistically interesting" is IMHO both a subjective and 'solved' problem. These models are trained with an "artistically interesting" reward model that tries to guide the model towards higher quality photos.

I think getting the models to generate realistic and proportional objects is a much harder and more important challenge (remember when the models would generate 6 fingers?).

The Opus bike isn't very physically plausible though.
Even the first one - Qwen added extra details in the background, sure. But the pelican itself is a stork with a bent beak, and its feet are cut off its legs. While impressive for a local model, I don't think it's a winner.
Did you see the Opus bike for that same test, though? I know it is about the flamingo, but that is bad.
Qwen, at least, can draw a complete bicycle frame. The Opus frame will snap in half and can’t steer.
Qwen's frame is so strong that it broke both feet off the pelican.
Clearly he's riding a fixie and trying to stop. Pelican didn't drink his Ovaltine.
It's a 3B model. It should not be this close. Debating their artistic qualities is missing the point.
35B, but your point stands I think.