by lotaezenwa 1477 days ago
I concur that the tweeter is being pedantic.

This is largely some embedding of semantics that we currently do not fully have a mapping for, precisely because it was generated stochastically.

Saying it was "not true" seems like clickbait.

If DALL-E had a choice to output "Command not understood", maybe we wouldn't be discussing this.

Like those AIs that guess what you draw and recognize random doodling as "clouds", DALL-E is probably taking the least unlikely route. A gibberish word may be drawn as a bird simply because the candidates were "bird (2%), goat (1%), radish (1%)".

1. https://quickdraw.withgoogle.com

That's extremely optimistic. When faced with gibberish, the "confidences" are routinely 90%+, just as with "meaningful" input.

It's almost as if it's an illusion designed to fool us, the users: by only providing inputs meaningful to us, we come to the foolish idea that it understands these inputs.
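
A rough sketch of why both pictures are possible for a classifier like the doodle guesser (not DALL-E itself, which generates images rather than labels). The logits below are made up for illustration and do not come from any real model. Softmax only looks at the differences between scores, so nothing forces the top "confidence" to be low on gibberish; it is low only if the scores happen to be close together.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the classes bird, goat, radish.
flat = softmax([0.2, 0.1, 0.1])    # scores close together: no class stands out
peaked = softmax([6.0, 1.0, 1.0])  # one score dominates: top class is above 90%

print([round(p, 3) for p in flat])
print([round(p, 3) for p in peaked])
```

Whether a given network produces the flat or the peaked case on nonsense input is an empirical question; the normalization itself guarantees neither.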

This is a good point. The fact that DALL-E will try to render something, no matter how meaningless the input, is a trait it has in common with many neural networks. If you want to use them for actual work, they should be able to fail rather than freestyle.
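
A minimal sketch of what "fail rather than freestyle" could look like for a classifier, assuming its confidences are actually meaningful (which the comment above disputes). The function name, the threshold, and the probabilities are all hypothetical.

```python
def predict_or_abstain(probs, labels, threshold=0.5):
    """Return the top label, or refuse if the top probability is below the threshold."""
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "Command not understood"
    return labels[best]

labels = ["bird", "goat", "radish"]

# The "bird (2%), goat (1%), radish (1%)" case: the model abstains.
low = predict_or_abstain([0.02, 0.01, 0.01], labels)

# A confident case: the model answers.
high = predict_or_abstain([0.90, 0.05, 0.05], labels)

print(low, "|", high)
```

If the model reports 90%+ on gibberish too, a threshold like this never fires, so the reject option is only as good as the confidence behind it.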
Especially since his results confirm most of what the original thread claimed. A couple of the inputs did not reliably replicate, but "for the most part, they're not true" seems straightforwardly false. He even seems to deliberately ignore this sometimes, such as when he says "I don't see any bugs" when there is very obviously a bug in the beak of all but two or three of the birds.
When I zoomed in, I felt only four in ten birds clearly had anything in their beaks, and in each case it looked like vegetable matter. In the original set, only one clearly has an insect in its beak.

Are there higher-resolution images to be had?

Lower in the same thread he accepts that his main tweet was clickbaity, and that actually there's consistency in some of the results.
Not really; afterwards he says he was mostly trying to inject some humility. He really doesn't think this is measuring anything of interest. For the birds result in particular, see https://twitter.com/BarneyFlames/status/1531736708903051265.
If I read what that tweet says properly, the system ended up outputting things that were almost scientific nomenclature for the general class of items it was being asked to draw. There are probably many examples of "bird is an instance of class X" in the text but they are not consistent, and the resulting token vector is a point near the center of "birdspace".
Yes. Indeed, it seems to interpret a lot of nonsense tokens it doesn't recognize as though it's probably the Latin / scientific term for some sort of species it doesn't remember very well (keeping in mind that all these systems are attempting to compress a large corpus into a relatively small space). I think https://twitter.com/realmeatyhuman/status/153173904648934195... is best illustrative of this phenomenon.

So, it's certainly an "interesting" result in the sense that it shows how these kinds of systems work, but it's definitely not a language.

Why is it important if it's "a language" or not? What we're talking about are concept representations (nouns), not languages. But I think most people who read "DALL-E has a secret language" probably picked up on that because we're accustomed to the hype in machine learning naming things to sound like they are more profound and powerful than they really are.
When questioned about the change of tone, he answers "Well... a little bit of twitter hype makes a thread go a long way".

https://twitter.com/emnode/status/1531852124501553153