Hacker News
by nutanc 1476 days ago
I don't think it's a secret language per se. It's just that the tokens generated for these sentences are for some reason coming close to a bird latent space. Maybe if we dig deeper and do a Google search for kinds of birds, we can find the connection. Tokens from OpenAI below.

https://t.co/Of8CBGdGAE.

Found this answer:

https://twitter.com/BarneyFlames/status/1531736708903051265?...

2 comments

> I don't think it's a secret language per se. It's just that the tokens generated for these sentences are for some reason coming close to a bird latent space.

Serious question: what else do you think language is? How else would your brain associate the word "bird" with the concept?

I would associate a bird with a bird. I wouldn't associate birmongle with bird just because both start with the token "bir". What DALL-E is doing here is what it has been trained to do: find the closest token match and try to draw something. It does not understand anything, and that's absolutely fine. I am just saying it does not have a secret language or anything like that. It's just regular language and its closest matches.

But there's nothing particularly avian about the phonemes "bird". Your brain just matches that pattern of sounds to the nearest concept in thought-space, which happens to describe birds. That's what language is, a consistent mapping of words to concepts.

Which seems to also be what this article is describing.
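
To make the "nearest concept" idea concrete, here is a minimal sketch of a nearest-neighbour lookup in an embedding space. Everything in it is an assumption for illustration: the three concepts, their 3-d vectors, and the vector standing in for a nonsense prompt are made up, and this is not how DALL-E is actually implemented.

```python
import math

# Hypothetical 3-d "concept" embeddings; the numbers are invented for illustration.
CONCEPTS = {
    "bird": [0.9, 0.1, 0.0],
    "vegetable": [0.1, 0.9, 0.0],
    "whale": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_concept(vec):
    """Return (name, similarity) for the concept most similar to vec."""
    scored = ((name, cosine(vec, emb)) for name, emb in CONCEPTS.items())
    return max(scored, key=lambda pair: pair[1])


# Made-up vector standing in for the embedding of a nonsense prompt.
nonsense = [0.6, 0.3, 0.1]
name, similarity = nearest_concept(nonsense)
print(name, round(similarity, 3))
```

The point of the sketch is that the lookup always returns some concept, however meaningless the input vector is; whether that counts as a "language" is exactly what the thread is arguing about.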

I think the weird thing is that it creates these "words" and uses them in its output.

Indeed, this is an expected outcome from the way that DALL-E is trained. It’s still an interesting finding, and sometimes affirming the expected can lay the groundwork for future interesting discoveries.

This is an example of an application where uncertainty modelling would help greatly. Any and every input will lead to an output. That doesn’t mean that all regions of latent/embedding space are equally valid.

I’m in the camp that large/modern ML models are nearing human intelligence, in some aspects. What’s currently missing is the universal ability to estimate uncertainty and identify inputs that are out of distribution. Many groups are working on this and perhaps we already have the solution but are not combining the right uncertainty estimation approach with the right foundational model.
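
As a rough illustration of flagging out-of-distribution inputs, here is a minimal sketch of one simple heuristic: score an input by its distance to the nearest known embedding and compare that against a threshold fitted on the known points. The 2-d points, the function names, and the 0.95 quantile are all assumptions made up for this example; it is not the approach of any particular group or model.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_distance(point, reference):
    """Distance from point to its nearest neighbour in reference."""
    return min(dist(point, r) for r in reference)


def fit_threshold(reference, quantile=0.95):
    """Pick a threshold from leave-one-out nearest-neighbour distances."""
    loo = sorted(
        nn_distance(p, reference[:i] + reference[i + 1:])
        for i, p in enumerate(reference)
    )
    idx = min(len(loo) - 1, int(quantile * len(loo)))
    return loo[idx]


def is_out_of_distribution(point, reference, threshold):
    """Flag a point that lies farther from the reference set than the threshold."""
    return nn_distance(point, reference) > threshold


# Made-up 2-d embeddings standing in for inputs seen during training.
TRAIN = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]]
threshold = fit_threshold(TRAIN)

print(is_out_of_distribution([0.05, 0.02], TRAIN, threshold))  # close to the training points
print(is_out_of_distribution([3.0, 3.0], TRAIN, threshold))    # far from the training points
```

A model with a check like this could decline to draw anything, or at least report low confidence, when a prompt lands far from everything it was trained on, instead of always producing an output.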