by GaggiX 1064 days ago
>current industry frontier

>Dalle 2

I'm sorry, OpenAI, but your model is not the frontier. It's also funny that it's the only text-to-image model mentioned; they probably know how much better the other models are.


Yes, there are so many better ways to produce ethically dubious, derivative trash.
I don't think they meant better in the artistic sense but in the sense that they outperform on metrics?
Yes and no. For models like Stable Diffusion, the team behind it has released metrics, so you can see even on paper that the model performs better than Dalle 2 (SD has a lower MS-COCO FID, and lower is better). For models like Midjourney there are no metrics, but the difference in quality is so big that none are really needed if you just want to know which model is better. At a high level, Dalle 2 is worse because it generates a lot of artifacts and no details whatsoever, it has a fixed resolution and aspect ratio, and because the model is proprietary you can only do what the frontend and API expose, which is not much.
I'm snarking on this guy because I don't care which model we're talking about. Stability might perform better on whatever quantitative metric we want to use; I just think using models like this also makes you a hack who doesn't understand or care about making things.
Like tracing artwork with a pencil, these models are unfortunately too transformative compared to humans to compete with them in creating derivative content. Perhaps AGI will come to help us.
I think continuing to make fun of these people in public is a better solution than AGI.