by bonoboTP
442 days ago
> I haven't seen any details on how OpenAI's model works

Exactly. People just confidently make things up. There are many possible ways to do it, and without details, "native generation" is just a marketing buzzword with no clear definition. It's a proprietary system: there is no code release and no publication. We simply don't know exactly how it's done.
|
It's probably an implementation of VAR (https://arxiv.org/abs/2404.02905): autoregressive image generation with a small twist. Rather than predicting every token at the target resolution directly, it starts by predicting the image at a small resolution, then predicts it at progressively higher resolutions until it reaches the desired one.
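To make the coarse-to-fine idea concrete, here is a toy sketch in Python/NumPy. It is not the VAR reference code and says nothing about what OpenAI actually does. The names `generate_coarse_to_fine`, `block_mean`, `upsample_nearest` and the `predict` callback are all made up for illustration, and the "predictor" is a stand-in oracle that peeks at a known target instead of a learned transformer. The only point is the loop structure: each step emits a whole map at one scale, conditioned on everything generated at the coarser scales.

```python
import numpy as np

def block_mean(x, s):
    """Downsample a square (f, f) array to (s, s) by averaging blocks (f must be divisible by s)."""
    f = x.shape[0]
    return x.reshape(s, f // s, s, f // s).mean(axis=(1, 3))

def upsample_nearest(x, size):
    """Upsample a square (h, h) array to (size, size) with nearest-neighbour lookup."""
    h = x.shape[0]
    idx = (np.arange(size) * h) // size
    return x[np.ix_(idx, idx)]

def generate_coarse_to_fine(scales, predict):
    """Build an image scale by scale.

    `predict(context, s)` is a hypothetical stand-in for the model: given the
    canvas so far (downsampled to s x s), it returns an s x s residual map.
    """
    final = scales[-1]
    canvas = np.zeros((final, final))
    shapes = []
    for s in scales:
        context = block_mean(canvas, s)      # what has been generated so far, seen at scale s
        residual = predict(context, s)       # one whole map per step, not one token per step
        shapes.append(residual.shape)
        canvas += upsample_nearest(residual, final)
    return canvas, shapes

# Stand-in "model": an oracle that knows the target and predicts what is still missing.
target = np.arange(64, dtype=float).reshape(8, 8)
oracle = lambda context, s: block_mean(target, s) - context

image, shapes = generate_coarse_to_fine([1, 2, 4, 8], oracle)
print(shapes)
print(np.allclose(image, target))
```

In a real system the oracle would be replaced by a trained network sampling discrete tokens, but the outer loop over increasing resolutions is the "twist" described above.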