by jharsman 3447 days ago
That's not how lossless compression of JPEGs works.

Besides removing information from the file that doesn't affect the rendered image (like EXIF data), lossless recompressors typically replace the Huffman coding of the DCT coefficients with a more efficient arithmetic coder. So you don't start over from raw pixels; you keep the same coefficients and only swap the entropy-coding stage for a more modern and efficient algorithm. That means ordinary software can't read the recompressed file (since you've essentially created a new format), but you can just decompress back into standard JPEG whenever someone wants to look at the image.
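
To make the "more efficient" part concrete, here is a toy sketch (not how any real recompressor is written). It builds a Huffman code for a skewed symbol stream and compares its total size with the Shannon entropy, which is the bound an arithmetic coder can get close to. The symbol stream and the function names are made up for illustration; real JPEG Huffman symbols are run/size pairs, and real recompressors also add context modeling.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth}); the unique
    # tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_bits(symbols):
    """Total bits needed to Huffman-code the stream (code table not counted)."""
    freqs = Counter(symbols)
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[s] * lengths[s] for s in freqs)


def entropy_bits(symbols):
    """Shannon entropy of the stream in bits: the target for arithmetic coding."""
    freqs = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in freqs.values())


# Hypothetical, heavily skewed stream (think: mostly-zero coefficients).
symbols = [0] * 900 + [1] * 60 + [2] * 30 + [3] * 10

print("Huffman bits:", huffman_bits(symbols))
print("Entropy bits: %.1f" % entropy_bits(symbols))
```

Huffman has to spend at least one whole bit per symbol, so a symbol that shows up 90% of the time is badly overpaid for. An arithmetic coder can spend a fraction of a bit on it, and that is where the savings come from without touching the coefficients themselves.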

1 comments

> Besides removing information from the file that doesn't affect the rendered image

You can do this if the goal is pixel-perfect accuracy, but Flickr can’t do this since they have “a long-standing commitment to keeping uploaded images byte-for-byte intact”…

I bet a lot of those ICC color profiles are the same across many images though... You could strip the metadata, keep it in a separate deduplicated database, and then reassemble the original file when the user accesses it.
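
A rough sketch of what I mean, under some assumptions: the ICC profile lives in APP2 segments tagged `ICC_PROFILE\0`, every header segment before start-of-scan carries a 2-byte big-endian length, and an in-memory dict stands in for the deduplicated database. The names `profile_store`, `strip_icc` and `reassemble` are made up here; this is not Flickr's actual pipeline, and a real parser would need to handle more edge cases.

```python
import hashlib

ICC_TAG = b"ICC_PROFILE\x00"
profile_store = {}  # hypothetical dedup store: sha256 hex digest -> segment bytes


def strip_icc(jpeg):
    """Return (stripped bytes, recipe). The recipe lists (offset, key) pairs:
    where each removed ICC segment sat in the stripped output, and its store key."""
    assert jpeg[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    out = bytearray(jpeg[:2])
    recipe = []
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: stop walking header segments
            break
        seg_len = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        segment = jpeg[pos:pos + 2 + seg_len]
        if marker == 0xE2 and segment[4:4 + len(ICC_TAG)] == ICC_TAG:
            key = hashlib.sha256(segment).hexdigest()
            profile_store.setdefault(key, bytes(segment))  # stored once per unique blob
            recipe.append((len(out), key))
        else:
            out += segment
        pos += 2 + seg_len
    out += jpeg[pos:]  # scan data and everything after it, copied verbatim
    return bytes(out), recipe


def reassemble(stripped, recipe):
    """Rebuild the original bytes by splicing the stored segments back in."""
    out = bytearray()
    prev = 0
    for offset, key in recipe:
        out += stripped[prev:offset]
        out += profile_store[key]
        prev = offset
    out += stripped[prev:]
    return bytes(out)


def _seg(marker, payload):
    """Build one marker segment for the synthetic demo files below."""
    return b"\xff" + bytes([marker]) + (len(payload) + 2).to_bytes(2, "big") + payload


# Two fake "images" that share one fake ICC segment (not real JPEG data).
icc_segment = _seg(0xE2, ICC_TAG + b"\x01\x01" + b"fake-profile-bytes")
image_a = b"\xff\xd8" + _seg(0xE0, b"JFIF\x00") + icc_segment + b"\xff\xda" + b"scan-a" + b"\xff\xd9"
image_b = b"\xff\xd8" + _seg(0xE0, b"JFIF\x00") + icc_segment + b"\xff\xda" + b"scan-b" + b"\xff\xd9"

stripped_a, recipe_a = strip_icc(image_a)
stripped_b, recipe_b = strip_icc(image_b)
print(len(profile_store), "unique profile segment(s) stored for 2 images")
print("round trip intact:", reassemble(stripped_a, recipe_a) == image_a)
```

Since the removed segments are stored verbatim and spliced back at the recorded offsets, the rebuilt file is the same bytes as the upload, which is what the byte-for-byte commitment needs.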