by TOMDM 1482 days ago
Shouldn't this be expected to a certain extent?

Gibberish has to map _somewhere_ in the model's concept space.

Whether or not it maps onto anything we'd recognise as consistent, the AI may still have some concept of where it relates. As other people have noted, the gibberish breaks down when you move it into another context, but who's to say that Dall-E 2 isn't remaining consistent to some concept it understands that isn't immediately recognisable to us?

The interesting part is whether you can trick it into spitting out gibberish in targeted areas of that concept space using crafted queries.

9 comments

I mean, everything is easy to predict in retrospect. :) Personally, I’m a bit surprised that it has learned any connection between the letters in the generated image and the prompt text at all. I had assumed (somewhat falsely it seems) that the gibberish means that the generator just thinks of text as a “pretty pattern” that it fills in without meaning. For example, a recent post on HN suggested that it likes the word “Bay”, simply because that appears so often on maps.
Yes, specifically a prompt about Thomas Bayes generated the caption "Bay of Tayees" and the theory was that "Bayes" got corrupted to "Bay of" because of maps.

I agree that this shows a focus on the appearance of the words rather than their meaning.

https://astralcodexten.substack.com/p/a-guide-to-asking-robo...

In the spirit of that article, I wonder what DALL-E would spit out if you ask for "GilaWhamm" - probably images of scary medieval-looking men wielding scary medieval cutting weapons?
I like how psychology (or at least behavioural studies) is edging closer to being relevant in computer science.
You could expect gibberish to be distributed uniformly in latent space, disconnected from its linguistic counterpart. After all, those are textual inputs that the model has never seen, and it can't even properly map words it has seen many times to their written form in an image: the word "seafood" and an image of seafood are in the same place in latent space, but the word "seafood" written in an image isn't. Yet some gibberish word written in an image is, and so is the same gibberish word as text. It's very counterintuitive to me.
A uniform distribution makes sense for gibberish, not something I'd considered.

A counterpoint I'd raise is I wonder how aggressive Dall-E 2 is in making assumptions about words it hasn't seen before.

Hard to do given that it's read essentially the entire internet; however, someone could make up some Latin-esque words that people would be able to guess the meaning of.

If the model is as good as people at guessing the meaning of such made-up words, it stands to reason that, if it were aggressive enough, it might be doing the same thing with gibberish and thus ending up with its own interpretation of the word, which would land it back in a more targeted concept space.

I'd love to see someone craft some words that most people could guess the meaning of, and see how Dall-E 2 fares.

This might be considerably different, and calling it "prior art" fails to consider what is actually going on here. The appearance may be similar, but lots of things can look similar while being completely distinct. And this is indeed such a case.

One of the words I got was "charlite" for the pale green colour of charcoal used as a dye. Charlite might not be a real word, but it is made up the same way a real word would be.

The method is important, because "charlite" probably came about by specifically asking GPT2 for a definition of the non-word "charlite."

In fact, this shows up in the source code examples:

    # definition for a word you make up
    print(word_generator.generate_definition("glooberyblipboop"))

This is literally the opposite of what OP presented: with the GPT2 examples we know where the "defined" word comes from, which means that was a demo of GPT2 trying to work out a human-provided word. It is literally a function of the program: generate_definition(). It was specifically written to do that.

But with the DALL-E 2 examples we don't know where the words come from, even though they are internally consistent. As far as we can tell, it's an internal phenomenon not based on intentional human input.

Having said that, GPT2 probably has the same phenomenon. But the link you provided is not demonstrating that.

OK, so a proposed study design: provide a sample of these, along with obscure English words, to a number of individuals and get them to try to pick out the real words.

From there, take the fake words people ranked the most real.

Select a number of those words and get Dall-E 2 to try and make images of them, then see how many of those images contain results that represent the imaginary word.
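
The selection step of that design could be sketched roughly as below. This is only an illustration: the function name, the data layout, and every response in the sample are invented placeholders, not results from any real survey. The two fake words are borrowed from elsewhere in this thread.

```python
from collections import defaultdict

def most_convincing_fakes(judgments, fake_words, k=3):
    """Return the k fake words most often judged to be real, highest first.

    judgments:  iterable of (word, judged_real) pairs, one per response
    fake_words: set of the words that are actually made up
    """
    votes = defaultdict(list)
    for word, judged_real in judgments:
        if word in fake_words:          # ignore responses about real words
            votes[word].append(judged_real)
    # Fraction of participants who believed each fake word was real.
    rate = {w: sum(v) / len(v) for w, v in votes.items()}
    # Sort by descending rate, breaking ties alphabetically.
    return sorted(rate, key=lambda w: (-rate[w], w))[:k]

# Invented placeholder responses, purely to show the shape of the data.
sample = [
    ("charlite", True), ("charlite", True), ("charlite", False),
    ("glooberyblipboop", False), ("glooberyblipboop", True),
    ("petrichor", True),                # a real word, so it is skipped
]
print(most_convincing_fakes(sample, {"charlite", "glooberyblipboop"}, k=1))
```

The words this returns would be the ones to hand to Dall-E 2; judging whether the resulting images "represent" the word would still need human raters.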

If anyone who has access to Dall-E 2 wants to try this, I would _love_ to see the results.

Apparently you can suggest prompts to their Instagram account.
Expected after the fact, somewhat. Beforehand, it would not be unreasonable to expect that the output text and the input text aren't necessarily connected in that way, especially since, as I understand it, DALL-E was not given input labelling explaining the text in various images. To it, text is just a frequently-recurring set of shapes that relate to each other a lot. This may yet be a false positive, based on other discussion.

That the model would have a consistent form of some kind of gibberish would be a given. Even humans have it: https://en.wikipedia.org/wiki/Bouba/kiki_effect And I'm sure if you asked native English speakers, "Hey, we know this isn't a word, but if it was a word, what would it be? 'Apoploe vesrreaitars'" you would get something very far from a uniformly random distribution of all nameable concepts.

> Shouldn't this be expected to a certain extent?

In hindsight, sure. Given enough time someone might have predicted the phenomenon. But I don't think most of us did.

What's more fascinating to me is how often this has happened in this space in just the last few years.

1. Some phenomenon is discovered

2. I'm surprised

3. It makes sense in hindsight

NP
> Gibberish has to map _somewhere_ in the model's concept space.

Why? It could just go to noise images, or vaguely real-looking objects that don't look like anything in particular.

Are these algorithms even capable of generating noise images? And I don't mean asking them to generate "an image of tv static".
Of course this should be expected. The models are trained on internet data of natural language, where people make typos and use abbreviations, some are not native speakers of English, and others are writing in Greeklish, or Arabenglish, or whatever.

The machine is always trying to associate words with other words that are semantically close. E.g. when given strong_man, strng_man or srong_man as input, these all mean the same thing, because that combination of letters is usually used with the word man, and there is no competitor word to replace srong except strong.
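
As a rough illustration of the "no competitor word" point, here is a minimal sketch using plain character similarity from Python's standard `difflib`. The tiny vocabulary and the `nearest_word` helper are made up for the example, and this is not a claim about how DALL-E 2's text encoder actually handles typos; it only shows that a misspelling with one close neighbour resolves to that neighbour.

```python
import difflib

# Hypothetical toy vocabulary; a real model sees vastly more text.
VOCAB = ["strong", "wrong", "man", "woman", "weak"]

def nearest_word(token, vocab=VOCAB, cutoff=0.6):
    """Return the vocabulary word most similar to token, or None if nothing is close."""
    matches = difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for token in ["strng", "srong", "zzzzqq"]:
    print(token, "->", nearest_word(token))
```

Here both misspellings resolve to "strong", while a string with no close neighbour resolves to nothing at all.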

Now, why that should be considered a secret language is beyond me. The input language for the machine is a natural human language, and that means it is a very poorly defined language for the machine to recognize. That is always going to produce a lot of gibberish.

> Shouldn't this be expected to a certain extent?

Not really. It's a stochastic model, so after a bunch of random denoising steps, it could easily just be mapping every bit of gibberish to a random image, and it would be vanishingly unlikely for any of them to be similar or for the relationship to run in reverse.

This is really interesting because I was just looking at gibberish detection using GPT models. Mitigating AI with AI doesn't sound all that secure, since you can probably mess with the gibberish detection in a similar way. Or maybe the 'secret language', as they're calling it here, passes GPT gibberish detection? [1]

[1] https://arr.am/2020/07/25/gpt-3-uncertainty-prompts/

> Gibberish has to map _somewhere_ in the model's concept space.

No, it doesn't. The model in use maps all input to some output, but that isn't a necessary feature of the problem at all. It's actually a terrible idea.
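
To make the alternative concrete, here is a toy sketch of a lookup with a reject option: it returns the nearest concept only when the similarity clears a threshold, and otherwise returns nothing. The two-dimensional vectors, the concept names, and the 0.8 threshold are all invented for illustration and say nothing about DALL-E 2's internals.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_concept(query, concepts, min_similarity=0.8):
    """Return the closest concept name, or None when nothing is similar enough."""
    best_name, best_sim = None, -1.0
    for name, vec in concepts.items():
        sim = cosine(query, vec)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= min_similarity else None

# Hypothetical toy "concept space".
concepts = {"bird": (1.0, 0.0), "vegetable": (0.0, 1.0)}
print(nearest_concept((0.9, 0.1), concepts))  # close to one concept
print(nearest_concept((0.7, 0.7), concepts))  # equally far from both: rejected
```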