Hacker News
by akavi 1375 days ago
What could contain information is a book of all 512x512 images that a human being would perceive as being "an image". I.e., the vast majority of possible 512x512 images look like random noise to humans. Excluding those massively shrinks the size of the book.
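To put a rough number on the size of the unfiltered book, here is a back-of-the-envelope sketch. It assumes 8-bit RGB pixels, which the comment doesn't specify, so treat the exact figures as illustrative only.

```python
import math

WIDTH, HEIGHT = 512, 512
CHANNELS = 3          # assumption: RGB
BITS_PER_CHANNEL = 8  # assumption: 8 bits per channel

# Bits needed to write down one raw image, pixel by pixel.
bits_per_image = WIDTH * HEIGHT * CHANNELS * BITS_PER_CHANNEL

# The number of distinct images is 2 ** bits_per_image.
# Count its decimal digits with logarithms instead of building the integer.
decimal_digits = math.floor(bits_per_image * math.log10(2)) + 1

print(f"bits per raw image: {bits_per_image:,}")
print(f"number of possible images has about {decimal_digits:,} decimal digits")
```

Almost all of those entries are the noise-like pages the comment suggests throwing out.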

So that does mean image models like DALL-E/SD are effectively compression over the space of images they can generate (which at least attempts to emulate the space of 'meaningful images' to a human). Given a seed, with the prompt and settings held fixed, they'll deterministically produce the same image, and that seed is much smaller than the information needed to describe every pixel in the image.
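A toy sketch of the seed-vs-pixels argument. `toy_generate` is a hypothetical stand-in that just expands a seed into pseudo-random bytes; it is not Stable Diffusion or DALL-E, and real models also depend on the prompt, the weights and the sampler settings. The 64-bit seed size is likewise an assumption for illustration.

```python
import random

def toy_generate(seed, width=512, height=512, channels=3):
    """Hypothetical stand-in for a generator: seed in, raw pixel bytes out."""
    rng = random.Random(seed)
    n = width * height * channels
    # Deterministically expand the seed into n bytes of "pixel" data.
    return rng.getrandbits(8 * n).to_bytes(n, "little")

SEED_BITS = 64  # assumption: the seed fits in a 64-bit integer

a = toy_generate(1234)
b = toy_generate(1234)
c = toy_generate(1235)

raw_bits = len(a) * 8
ratio = raw_bits // SEED_BITS

print("same seed, same image:", a == b)
print("different seed, different image:", a != c)
print(f"raw image is {ratio:,}x larger than the seed")
```

The flip side is that a 64-bit seed can only ever index 2**64 outputs, so for a fixed prompt and settings the generator reaches a tiny subset of all possible pixel grids. That's the sense in which it's compression over the images it can generate, not over every image.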

1 comment

Ah yes, conceptually, SD is a filter that removes noise (~ things that look like noise to us) from the space of all possible pixel combinations. It's interesting that this is also how it works in practice: it generates an image by iteratively denoising.