Hacker News
by rasz 2994 days ago
The picture is not compressed, it's hallucinated from a vague memory of the real thing, a mere dream. Cars vanish, buildings change wall structure, even the license plate receives fake text absent from the source material.

It's giant guesswork about what was there originally. Reminds me of Xerox scanners lying about scanned-in numbers http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...

3 comments

All lossy compression algorithms hallucinate. That's the whole point: reducing image size by dropping some information and then hallucinating a plausible replacement to decompress.

The only difference is that this compression is better at hallucinating, so you don't get ringing artifacts or blocks, but some internally consistent alternate reality.

If you don't want to lose data you should not use lossy compression at all. JPEG can erase the distinction between digits as well.
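
To make the "drop information, then guess a replacement" point concrete, here is a minimal sketch of uniform quantization, the kind of lossy step JPEG-style codecs rely on. It is only an illustration: real JPEG quantizes DCT coefficients rather than raw values, and the function names, step size, and sample values below are made up for this example.

```python
def quantize(values, step):
    """Lossy 'compress': keep only the index of the nearest multiple of step."""
    # Integer arithmetic (add half a step, then floor-divide) rounds to nearest.
    return [(v + step // 2) // step for v in values]


def dequantize(indices, step):
    """'Decompress': substitute one plausible value for every stored index."""
    return [i * step for i in indices]


# Hypothetical sample values; 12 vs 15 and 100 vs 103 are distinct in the source.
original = [12, 15, 100, 103, 200]
step = 16

codes = quantize(original, step)
restored = dequantize(codes, step)

print(codes)     # [1, 1, 6, 6, 13]
print(restored)  # [16, 16, 96, 96, 208]
```

After the round trip, 12 and 15 both come back as 16, and 100 and 103 both come back as 96. The decoder never returns "unknown"; it returns a confident value, and the distinction between the original inputs is gone.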

Okay, but still, some kinds of changes are better than others. They should probably start testing for this in the visual perception tests: is lost information greyed out in a visible way? Are words and digits always fuzzed, or replaced?

Because it turns out that fuzziness and compression artifacts have a higher-level meaning: when you see them, you know something has been lost. That's an important (if inadvertent) signal. We need to make sure the artifacts don't go away.

I can kind of see regulations coming that require neurally compressed image or video data to carry a little Ⓝ on-screen graphic in one of the corners, in addition to (not necessarily perceivable) watermarks that can make even small crops of the image identifiable as neurally generated. And that is probably the best case.

In the worst case (and more likely?), we are going to ban computational substrates large enough to perfectly forge important data altogether, because they will be too easy to misuse. We’d essentially go back to ~1960s electronics to have at least halfway functioning mechanisms of creating social trust, namely high-bandwidth personal interactions where every thought and every action has a high chance of leaving a trace in the real world and thus contributing to someone’s reputation. No blockchain and no other technology can create nearly as much trust as that without being highly prone to misuse.

It could be a problem if it hallucinates the wrong license plate number at a crime scene. If all you want is a gigantic 8K resolution stock photo of a woman holding her baby in front of a laptop without devouring 10 MB of the user's data cap, it may be fine if the woman has a slightly different (but still highly detailed) hair style.
It would always decompress the same way, so for artistic purposes, if you preview it and it looks good, it is good.

But if you're using photography to look at things in the world, that's a whole different story.

If fuzziness and compression artifacts are a signal, it makes sense that eliminating them would reduce filesize.
There is a difference between dropping data and replacing it with fake data. Unreadable text is better than a scanned cheque with a different amount.
Sounds more or less like our human memory and image recall. Sometimes it's more accurate, but sometimes we make up details that were not there originally.
It's not as bad as you say. Sure, the positions of the individual leaves and of the grain of the concrete and of little puffs of the clouds are hallucinated, but most salient semantic features like the presence of a car or of a person are left untouched.

In other words, mostly only unimportant details are hallucinated, which is what we want.

The car behind the bus is gone, and the bus license plate got some fake text. It's pretty bad as far as reproducing the original goes.