by simiones 1280 days ago

> Diffusion models can be compared to a superhumanly talented artist that can be cloned in unlimited fashion by anyone having the software and hardware means.

How can you claim with a straight face that this is a better explanation of what an NN is?

An NN is simply an approximation of a multi-valued function, whose parameters are adjusted by minimizing the difference between the output of the NN and the output of the real function for a certain input. It is much much closer to "a giant archive of compressed images being used to interpolate between them" (though it's not that) than it is to a "superhumanly talented artist".
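To make that description concrete, here is a minimal sketch of "adjust the parameters by minimizing the difference between the NN output and the real function's output". Everything in it is an assumption chosen for illustration: the target function (`sin`), the one-hidden-layer `tanh` network, the names `init_params`, `forward`, `train`, and the learning rate. It is plain stochastic gradient descent on squared error, nothing like the scale or architecture of a diffusion model.

```python
import math
import random

def target(x):
    # Hypothetical "real function" that the network should approximate.
    return math.sin(x)

def init_params(hidden=8, seed=0):
    # Small random weights; a fixed seed keeps the run reproducible.
    rng = random.Random(seed)
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    # One hidden layer of tanh units followed by a linear output.
    h = [math.tanh(w * x + b) for w, b in zip(p["w1"], p["b1"])]
    y = sum(w * hi for w, hi in zip(p["w2"], h)) + p["b2"]
    return h, y

def mean_squared_error(p, xs):
    return sum((forward(p, x)[1] - target(x)) ** 2 for x in xs) / len(xs)

def train(p, xs, lr=0.01, epochs=300):
    for _ in range(epochs):
        for x in xs:
            h, y = forward(p, x)
            err = y - target(x)  # NN output minus "real" output for this input
            for i, hi in enumerate(h):
                # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
                grad_hidden = err * p["w2"][i] * (1.0 - hi * hi)
                p["w2"][i] -= lr * err * hi
                p["w1"][i] -= lr * grad_hidden * x
                p["b1"][i] -= lr * grad_hidden
            p["b2"] -= lr * err
    return p

if __name__ == "__main__":
    xs = [i * 0.25 - 3.0 for i in range(25)]  # sample inputs in [-3, 3]
    params = init_params()
    print("error before training:", mean_squared_error(params, xs))
    train(params, xs)
    print("error after training: ", mean_squared_error(params, xs))
```

The only thing the loop "knows" is the error on each sampled input; whatever structure ends up in the weights is a by-product of driving that error down.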

tshadley 1279 days ago

> An NN is simply an approximation of a multi-valued function, whose parameters are adjusted by minimizing the difference between the output of the NN and the output of the real function for a certain input.

Right, but that equally fits a biological NN if you zoom in that close. You'll need more than Wikipedia to appreciate what deep neural networks are doing here; it's high-dimensional space that's key. What DNNs do that is similar to the human brain is that they order "concepts" in high-dimensional space. Colors, textures, shapes and hierarchies of the same are organized and cross-referenced with text in an incredibly complex connectome. It would be useless to memorize images with their textual descriptions, as that would be horrendously inefficient and ineffective during inference. Rather, the model must do what we do and understand what makes an image a "landscape" or a "portrait" or a "cartoon". It needs to understand what an artist's style is and how to apply it to a work never before created.

"Understanding" can only mean ordering meaningless letters and pixels in multidimensional space so that they line up with human understanding (and human 'understanding', in turn, can only mean ordering meaningless sensory perceptions in the brain's multidimensional connectome such that reality turns out to be approximately predicted and controlled). The only systems that work this way efficiently are neural networks, biological and artificial.

simiones 1279 days ago

> Right, but that equally fits a biological NN if you zoom in that close.

First of all, we have no idea how biological NNs learn, how they represent information, how they reason etc. Given what we do know, there is no reason to assume any similarity with ANNs on any of those fronts. Just to give one example, we know very well that a single biological neuron encodes significant information and is capable of reasoning on its own. In fact, even non-neuronal biological cells are capable of such - especially looking at single-celled organisms, which display extraordinarily complex behaviors with no NN in sight.

Second of all, we don't exactly understand how the huge models we have actually encode the higher-level representations of the training set that they store. Of course, we can say for sure that they are not literally storing a copy of the data, on simple space-requirement grounds. But we can also say for sure that their "understanding" of the data, as well as their capacity for inference, is significantly different from our own, since they make certain mistakes that are nearly impossible for a human to make while showing superhuman abilities in other respects. So, if anything, we must conclude that whatever it is they are doing, it is most certainly not understanding the information the way we understand it.

tshadley 1279 days ago

My initial definition was "like a super-humanly talented artist": this is very different from a human being who also happens to be an artist. Stable Diffusion does only text-prompted art well, nothing else, and will take a very different "mental" route to creation than a human would. But nevertheless it still creates "super-humanly talented" art because it is widely recognized as incredibly good, few artists can do this as well, probably none can do it with comparable range, and certainly no one can match its speed. Therefore its effect on society is as if a super-humanly talented artist could be effortlessly cloned. Where is the laughable conclusion that requires me to force a straight face?

What is laughable is the idea that these abilities could come from interpolation or collage (not your claim but the plaintiff's). The only way these abilities could occur is if Stable Diffusion can represent images and text very similarly to the way human brains comprehend them. The argument here is simple: what are the odds that Stable Diffusion/DNNs have hit on a representational method that is totally different from human brains yet yields the same recognition, praise and admiration for the artist from everyone who sees it? Seems to me close to 0.

tsimionescu 1277 days ago

> it is widely recognized as incredibly good, few artists can do this as well, probably none can do it with comparable range

I very much disagree. Pretty much any halfway decent artist (say, anyone able to at least caricature recognizable people) is able to produce this kind of imitative art, when/if they are aiming for this type of copying. I've seen nothing coming out of SD that I couldn't expect to find on DeviantArt, at least if I commissioned it specifically. Most human artists of course don't do this, since people usually don't like reproductions (apart from posters) or copying others' style. Note as well the huge problem SD has with consistent fine details (especially text, but also often hands and even faces).

Of course, I fully agree on the speed factor (and would add scale and cost), where SD is without question far beyond any human. That is a meaningful difference that is very likely to affect markets like decorative prints and other low-value art.

Perhaps this basic disagreement (which is of course ultimately subjective, unless someone is going to do a blind taste test) explains the difference of opinions for the rest of the points.

tshadley 1276 days ago

I'm assuming that you disagree specifically with the claim that few artists can create derivative art (i.e., avoiding plagiarism but being clearly influenced by artist X or paying homage to artist Y, etc.) as well as top SD models. Well, what about a blind test: https://www.vice.com/en/article/bvmvqm/an-ai-generated-artwo...

Now, sure, winning just one contest isn't going to settle the matter, but I think it provides reasonable evidence that SD is heading to elite quality at a rapid pace. In the contest above, the submitter, Allen, was responsible for directing and cleaning up the results, so he deserves credit. However, Allen was also using a year-old MidJourney model that has likely improved dramatically already.

It seems uncontroversial that even if SD models aren't in the top tier of imitative/derivative art (or "art from text" as that genre evolves) right now, they will be soon.
