by habitue 1532 days ago
It is important context, but just to push back against people over-correcting on this, my guess is that the ones he rejected also looked approximately this good.

I think people are wowed by this thread not mainly because of the subtle effect of his cherry-picking, but because of the overall quality of any image generated by DALL-E 2.


Yeah, that’s right. There were very few strictly bad ones across the entire thread of generations.

The rejections were most commonly:

1. Kind of just slightly boring or literally drawing the thing rather than being cool and artistic

2. Cool but similar to the artistic style of bios near it in the thread, whereas I wanted to keep it diverse (surreal followed by literal, oil followed by sharp lines etc) so it's more fun to scroll through

Whereas a few years ago, generative models (GANs etc.) would often render something like static noise or completely wrong things. I've only seen that problem once with DALL-E across hundreds or thousands of images now (it generated a fully white image).

> 2. Cool but similar to the artistic style of bios near it in the thread, whereas I wanted to keep it diverse (surreal followed by literal, oil followed by sharp lines etc) so it's more fun to scroll through

Has anyone compiled a list of the styles and artists DALL-E "knows"? How niche does it get? Decorative Initial Caps? Florid Victorian Ornaments? Googie Architecture? SFF artists like Michael Whelan, Vincent Di Fate, Jeffrey Catherine Jones, or Jim Burns? Banksy? Sculptors like Bathsheba Grossman or Markus Pierson? Early animation artists like Ub Iwerks or E. C. Segar?

I was experimenting with one of the VQGAN+CLIP notebooks a while ago, and it did pretty well with some styles, but not so well with "Heroic Realism" or "Soviet Propaganda Poster" or "Shepard Fairey". It did even worse when I asked it to draw, in one of those styles, an object that could itself be construed as implying a style, like "retro robot" or "50s raygun" (e.g. "A retro robot drawn in a heroic realism style" or "A cubist painting of a steampunk pistol"). Is that kind of dissonance a problem for DALL-E?

Can you ask DALL-E to draw itself?
Yes, and it sees itself as a really cute little demon: https://mobile.twitter.com/gdb/status/1512521912064229377
Somehow this makes me feel a bit more at ease about this whole thing.
Maybe that's how they want you to feel.
Is that like asking GitHub’s code autocomplete to write a code autocompleter?
Asking Copilot to write Copilot. Hmmmmmm…
That's very cool but once you have stable image output how do you define good image output when it comes to art?

The stuff on deviantart is pretty good too and neatly tagged and classified by art style.

I’d often send like six images to the person whose bio I was making and ask them to choose two :)
For better results send three, two being static noise :)
https://twitter.com/nickcammarata/status/1512123067803344899...

You're absolutely right, here he displays the full set for a given prompt. They all look fantastic!

I've been sitting here with my mouth wide open for 5 minutes unable to move past what you just showed me. I can't fathom that this exists.
DALL-E 2 isn't the first superhuman AI, but it is the first capable of teaching the whole world just what that means for all of us.
I've been casually following this space for a while (as a full stack web/mobile engineer, nothing to do with AI) and this feels substantially different from what I've seen before.

Would you have names or links for some other projects you're aware of? Would love to check them out.

GPT-3 is surely as jaw dropping as this?
No, GPT-3 still produces gibberish at times. The majority of the good examples still ramble like a schizophrenic person. Much of the output is uncanny, interesting, and impressive in its own right but I wouldn't describe it as human level.

DALL-E 2 is different from what I've seen. The things it produces seem to actually make sense the majority of the time. The outputs are strikingly similar to what a competent human might output as opposed to one with a severe mental illness.

I'm sure part of this is an inherent advantage that DALL-E enjoys regarding context. Art is supposed to be artistic whereas text is expected to maintain long distance logical consistency of abstract concepts across a stream of output and also to communicate something concrete. So in a sense the bar for art is probably lower in many ways.

You cannot absorb words as fast as pictures. GPT-3 is more impressive, as it seems to have a much broader depth of understanding of context. The disadvantage of GPT-3 is that it is sometimes very wrong, as with simple math problems.
Did GPT-3 write this comment?
Having worked with Nick extensively, take what he says with a grain of salt. He’s well known even by close friends to be a reality distorter, to put it softly.
Sir, this is a public discussion over a well-enough documented breakthrough with good-faith non-corporate actors on both sides of the original friend-oriented equation. There’s no practical nor epistemic need to hijack it as if we were all hanging out in the laundromat of your worldview.
This is a public forum discussing a public tweet made by an employee of a for-profit private company that sells you this technology. And said employee is a traveling salesman and consummate hype machine, acknowledged as such by his own best friends, and even by himself, many times.

That is practical and epistemically relevant knowledge for anyone deciding how interesting these results are, given that they were originally presented without mentioning they were cherry-picked. I’m doing a favor by providing it, as doing so isn’t exactly something that makes me look great, but it is very much worth knowing for anyone following him.

Side note - there’s this SF club of effete intellectualistas who fashion themselves as modern-day Florentines during a de novo renaissance. They do a lot of back-patting. They have the exact mentality of your reply - be kind, love is all you need, etc.

It’s sort of the exact opposite of the East Coast mentality that willingly sacrifices looking good and “getting along” in favor of finding the truth despite some discomfort. Discomfort is very taboo to this group.

Of course, this don’t-rock-the-boat mentality is very much intentional as it gives said club the ability to instantly shun anyone who deigns to critique it, allowing them to continue building their following.

Your first critique was ad-hominem.

Your second critique: assumes the original presentation had to be accompanied by methodology and proof to be of value; derives an implicit attempt-to-distort from your perspective of the scene at hand; devolves into paternalism to end in unsubstantiated moralism.

I might even agree with the spirit underlying your words—given, say, the meaning-loss of the company’s name—this just isn’t the way to convey it.

I’m adding to truth finding, in relevant context. My name’s in my profile. It’s a discussion forum - the opinions are the point.
If only James Randi was around. What a fantastic example of cold reading.

Gather round, gather round, give me a text, any text at all and I will produce you an image of some kind. And you will call it "good" if it looks like anything at all.

Because all art is subjective and your mind will work overtime to connect it back to the text you provided.

Even if the text just serves as random entropy, it's alright for people to feel a subjective connection between the artwork and the text.
Does OpenAI have a GUI that you're using or is that a CLI?