Hacker News
by jerf 1478 days ago
Expected after the fact, somewhat. Beforehand, though, it would not be unreasonable to expect that the output text and the input text aren't necessarily that kind of connected, especially since, as I understand it, DALL-E was not given input labelling explaining the text in various images. To it, text is just a frequently recurring set of shapes that relate to each other a lot. This may yet be a false positive, based on other discussion.

That the model would have a consistent form of some kind of gibberish would be a given. Even humans have it (see https://en.wikipedia.org/wiki/Bouba/kiki_effect), and I'm sure if you asked native English speakers, "Hey, we know this isn't a word, but if it was a word, what would it be? 'Apoploe vesrreaitars'" you would get something very far from a uniformly random distribution of all nameable concepts.