by TeMPOraL 973 days ago
OpenAI justified that in a paper the other day, saying that DALL-E 3 performs better on longer, more detailed prompts describing all aspects of the image in rich language - so they put GPT-4 in front to expand the typical user's short and vague prompt, so such users get nicer results by default.

My own observation: this kind of hack has only become possible with GPT-4 - it takes an LLM this powerful to reliably extend and enrich arbitrary user input into a much longer prompt that's coherent, consistent, and a plausible (to a human) interpretation of the original input.

Now this may fan the flames of the "is it or is it not" AI discussions, but: you could almost say that GPT-4 is engaging in a creative process here.