by fireattack 2110 days ago
The result seems pretty poor to me?

(I just used the example image that is already in the notebook)

Original: https://i.imgur.com/Q66mHTD.png Result: https://i.imgur.com/4R6qn8e.png

There are lots of random spots on the image, and the brightness level changes totally.

Sure, 5232 kB to 124 kB is impressive, but people would probably prefer a badly compressed JPEG over this, since at least JPEG artifacts are predictable (and if the image isn't displayed at 100%, the artifacts would be less obvious, unlike the brightness change and spots in this result).

Edit: I just saw the result at https://hific.github.io/ for the same picture, and that one has none of these flaws (no brightness change, no weird spots here and there) at an even smaller file size. Why?

2 comments

Hey, thanks for bringing the brightness issue to my attention - turns out I wasn't normalizing the output correctly - I just pushed a fix and the output images don't have the brightness change now.
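For readers wondering how a normalization slip can shift brightness, here is a minimal sketch of that class of bug. It is not the actual fix that was pushed: the function name `to_uint8` and the assumption that the decoder emits floats in [-1, 1] are both hypothetical, chosen only for illustration.

```python
def to_uint8(x, lo=-1.0, hi=1.0):
    """Map a float in [lo, hi] to an integer in [0, 255], clipping out-of-range input.

    Hypothetical helper: the [-1, 1] default range is an assumption,
    not something confirmed by the repo.
    """
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * 255)


# A mid-gray value under the assumed [-1, 1] convention.
mid_gray = 0.0

# Correct range: mid-gray stays mid-gray.
correct = to_uint8(mid_gray, lo=-1.0, hi=1.0)

# Wrong range (treating the output as [0, 1]): the same value becomes black,
# so the whole image comes out darker than the original.
wrong = to_uint8(mid_gray, lo=0.0, hi=1.0)
```

If the assumed range does not match what the network actually produces, every pixel is shifted the same way, which shows up as a global brightness change rather than a local artifact.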

As for the random spots, that's an artifact of the entropy coding algorithm. In principle this is lossless, but there is some distortion because I'm using a custom vectorized version of an rANS encoder and it's hard to encode overflow values in a vectorized fashion. I'm working on this, though. If you can live with really slow decoding times (2-3 minutes), you can disable vectorization to eliminate these small imperfections entirely.
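For anyone unfamiliar with rANS, here is a toy scalar sketch of why the coder itself is lossless and where distortion can creep in. It is not the vectorized coder from the repo: the frequency table `FREQS` is made up, Python big integers stand in for a renormalized fixed-width state, and treating "overflow values" as symbols outside the table's range that get clamped is an assumption made for illustration.

```python
# Toy scalar rANS: big-int state, no renormalization, no vectorization.
# FREQS is a hypothetical frequency table for symbols 0..3.
FREQS = [4, 2, 1, 1]
M = sum(FREQS)                                       # total frequency
CUM = [sum(FREQS[:s]) for s in range(len(FREQS))]    # cumulative starts


def encode(symbols, state=1):
    """Fold the symbols into a single integer state (last symbol first)."""
    for s in reversed(symbols):
        f = FREQS[s]
        state = (state // f) * M + CUM[s] + (state % f)
    return state


def decode(state, n):
    """Recover n symbols from the state, in their original order."""
    out = []
    for _ in range(n):
        slot = state % M
        # The symbol whose cumulative range contains the slot.
        s = max(i for i, c in enumerate(CUM) if c <= slot)
        state = FREQS[s] * (state // M) + slot - CUM[s]
        out.append(s)
    return out


def clamp(v):
    """Assumed overflow handling: force out-of-range values into the table."""
    return min(max(v, 0), len(FREQS) - 1)


latents = [0, 2, 7, 1]                  # 7 falls outside the table (an "overflow")
coded = [clamp(v) for v in latents]     # what actually gets entropy coded
decoded = decode(encode(coded), len(coded))
```

Here `decoded` equals `coded` exactly, so the entropy coder loses nothing; the mismatch with `latents` comes entirely from the clamped value. A lossless scheme has to signal such values separately (for example with an escape code), which is the part that is awkward to do in a vectorized fashion.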

As for the comparison to the official model, that's mainly because of compute constraints vs. Google (this is just my weekend project). My model uses a smaller architecture and was trained for only 4e5 steps versus the 2e6 steps they reported in the paper - even then it took 4+ days on AWS! The model is also trained on the Openimages dataset, which is presumably much smaller and noisier than the massive internal dataset Google used.

Just curious, is the change on the model side? I didn't see anything relevant in the notebook's revision history [1].

[1] https://colab.research.google.com/github/Justin-Tan/high-fid...

Thank you!
It's a research demo and you seem to frame it almost exclusively in a negative light. Not that your remarks aren't valid (they are), but it seems a bit flippant.