by phire
216 days ago
Most image models are diffusion models, not LLMs, and have a bunch of other idiosyncrasies. So I suspect it's more that lessons from diffusion image models don't carry over to text LLMs. And the image models that are based on multimodal LLMs (like Nano Banana) seem to do a lot better at novel concepts.