Hacker News
by jrockway 1047 days ago
Is that challenging? Humans have awful color resolution perception, so even if you have a huge black-and-white image, people would think it looks right even with very low-resolution color information. Or, if the AI hallucinates a lot of high-frequency color noise, it wouldn't be noticeable.

Wikipedia has a great example image here: https://en.wikipedia.org/wiki/Chroma_subsampling. Most people would say all of them looked fine at 1:1 resolution.
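
To put rough numbers on how much color information those schemes throw away, here is a small sketch. It assumes the usual J:a:b reading of the notation (a region J pixels wide and 2 rows high, luma kept at full resolution); `samples_per_pixel` is a name made up for this example.

```python
from fractions import Fraction

def samples_per_pixel(j, a, b):
    """Average stored samples per pixel for a J:a:b subsampling scheme.

    The notation describes a region J pixels wide and 2 rows high:
    `a` chroma samples in the first row, `b` in the second row
    (0 means the second row reuses the first row's samples).
    Luma is stored at full resolution (2 * J samples), and there are
    two chroma planes (Cb and Cr), each holding a + b samples.
    """
    luma = 2 * j
    chroma = 2 * (a + b)
    return Fraction(luma + chroma, 2 * j)

for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)]:
    ratio = samples_per_pixel(*scheme)
    print(":".join(map(str, scheme)), float(ratio), "samples/pixel")
```

So 4:2:0 stores 1.5 samples per pixel instead of 3, i.e. half the data, and each chroma plane has a quarter of the luma resolution.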

I meant more from a compute standpoint; the models are expensive to run at full res.
I see what you mean. I think that you can happily scale the B&W image down, run the model, and then scale the chroma information back up.
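
A minimal sketch of that idea, under some assumptions: the image is a list of rows of 0..255 luma values, its sides are multiples of the scale factor, and `predict_chroma` stands in for whatever colorization model you actually run (here a dummy that returns neutral chroma). All function names are made up for illustration. The conversion uses the JFIF full-range YCbCr formulas.

```python
def downscale_box(plane, factor):
    """Average each factor x factor block (sides must be multiples of factor)."""
    h, w = len(plane), len(plane[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [plane[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def upscale_nearest(plane, factor):
    """Repeat every sample factor times horizontally and vertically."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def ycbcr_to_rgb(y, cb, cr):
    """JFIF full-range YCbCr to RGB, clamped to 0..255."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(max(0, min(255, round(c))) for c in (r, g, b))

def neutral_chroma(small_luma):
    """Placeholder for the real model: predicts 'no color' everywhere."""
    h, w = len(small_luma), len(small_luma[0])
    flat = [[128.0] * w for _ in range(h)]
    return flat, [row[:] for row in flat]

def colorize(luma, predict_chroma, factor=4):
    """Run the (expensive) model at low res, keep luma at full res."""
    small = downscale_box(luma, factor)          # cheap input for the model
    cb_small, cr_small = predict_chroma(small)   # model only predicts chroma
    cb = upscale_nearest(cb_small, factor)       # bring chroma back up
    cr = upscale_nearest(cr_small, factor)
    return [[ycbcr_to_rgb(luma[y][x], cb[y][x], cr[y][x])
             for x in range(len(luma[0]))]
            for y in range(len(luma))]
```

The model sees `factor * factor` fewer pixels, while all the detail in the output still comes from the original full-resolution luma. A real pipeline would probably use a smoother upscale than nearest-neighbor for the chroma.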

Something I was thinking about after writing the comment is that the model is probably trained on chroma-subsampled images. Digital cameras do it with the Bayer filter, and video cameras add 4:2:0 or similar subsampling as they compress the image. So the AI is probably biased towards "look like this photo was taken with a digital camera" versus "actually reconstruct the colors of the image". What effect this actually has, I don't know!

Good point, I hadn’t realized that you only need to predict chroma! That actually greatly simplifies things.

re. chroma subsampling in training data: this is actually a big problem, and a good generative model will absolutely learn to predict chroma-subsampled values (or even JPEG artifacts!). You can get around it by applying random downscaling with antialiasing during training.
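
Roughly what that augmentation could look like, as a sketch rather than anyone's actual training code. It assumes a single image plane stored as a list of rows, uses box averaging as the antialiasing filter (a real pipeline might prefer area or Lanczos resampling from an image library), and the function names and the `factors` default are made up for the example.

```python
import random

def box_downscale(plane, factor):
    """Antialiased downscale: average each factor x factor block.

    Rows and columns that do not fill a whole block are cropped.
    """
    h = len(plane) - len(plane) % factor
    w = len(plane[0]) - len(plane[0]) % factor
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [plane[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def random_antialiased_downscale(plane, rng, factors=(1, 2, 3, 4)):
    """Pick a random scale factor per training sample and downscale by it.

    For a multi-channel image, call box_downscale on every channel with
    the same factor so the planes stay aligned.
    """
    factor = rng.choice(factors)
    return box_downscale(plane, factor), factor

rng = random.Random(0)
plane = [[(x * 7 + y * 13) % 256 for x in range(12)] for y in range(12)]
augmented, factor = random_antialiased_downscale(plane, rng)
print(factor, len(augmented), len(augmented[0]))
```

The intuition is that averaging over blocks at least as large as the subsampling grid blurs away the blocky chroma and much of the compression artifact structure, so the model sees fewer of those patterns to imitate.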