by kloch 813 days ago
> 10+ bits. Jpegli can be encoded with 10+ bits per component.

If you are making a new image/video codec in 2024 please don't just give us 2 measly extra bits of DR. Support up to 16 bit unsigned integer and floating point options. Sheesh.


We insert/extract about 2.5 bits more information from 8-bit JPEGs, leading to about 10.5 bits of precision. Quite some handwaving is necessary here. Basically, it comes down to coefficient distributions that have very high probability around zero. Luckily, this is the case for all smooth, noiseless gradients, where banding could otherwise be observed.
Does the decoder have to be aware of it to properly display such an image?
To display it at all, no. To display it smoothly, yes.
From a purely theoretical viewpoint, 10+ bit encoding will lead to slightly better results even if rendered using a traditional 8-bit decoder. One source of error has been removed from the pipeline.
Ideally, the decoder should be dithering, I suppose. (I know of zero JPEG decoders that do this in practice.)
Jpegli, of course, does this when you ask for 8 bit output.
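To illustrate why dithering matters when a higher-precision image is reduced to 8 bits, here is a minimal sketch. It uses plain random dither on a synthetic gradient; it is not Jpegli's actual dithering method, and the function names, ramp, and window size are made up for the example.

```python
import math
import random

def to_8bit(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, math.floor(value + 0.5)))

def to_8bit_dithered(value, rng):
    """Add uniform noise in [-0.5, 0.5) before rounding (plain random dither)."""
    return to_8bit(value + rng.random() - 0.5)

def worst_window_error(ramp, pixels, window):
    """Largest gap between true and rendered average brightness over any window."""
    worst = 0.0
    for start in range(0, len(ramp), window):
        true_avg = sum(ramp[start:start + window]) / window
        shown_avg = sum(pixels[start:start + window]) / window
        worst = max(worst, abs(true_avg - shown_avg))
    return worst

def compare(n=4096, window=256, seed=0):
    rng = random.Random(seed)
    # A very shallow gradient: one 8-bit step spread over n pixels.
    ramp = [100.0 + i / (n - 1) for i in range(n)]
    plain = [to_8bit(v) for v in ramp]
    dithered = [to_8bit_dithered(v, rng) for v in ramp]
    return (worst_window_error(ramp, plain, window),
            worst_window_error(ramp, dithered, window))

plain_err, dithered_err = compare()
print(f"plain rounding: {plain_err:.3f}, dithered: {dithered_err:.3f}")
```

Plain rounding turns the ramp into a single hard step, so the local average is off by almost half a level near the step (a visible band). With dither, the local average follows the ramp much more closely, at the cost of some noise.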
Has there been any outreach to get a new HDR decoder for the extra bits into any software?

I might be wrong, but it seems like Apple is the primary game in town for supporting HDR. How do you intend to persuade Apple to upgrade their JPG decoder to support Jpegli?

p.s. keep up the great work!

I tried to reach out to their devrel person Jen Simmons here: https://twitter.com/jyzg/status/1763141558042243470

I didn't follow up and I don't know if she read it or understood the proposal.

How does the data get encoded into 10.5 bits but displayable correctly by an 8 bit decoder while also potentially displaying even more accurately by a 10 bit decoder?
Through non-standard API extensions you can provide a 16 bit data buffer to jpegli.

The data is carefully encoded in the DCT coefficients. They are 12 bits, so in some situations you can get even 12-bit precision. Quantization errors, however, add up, and the worst case is about 7 bits. Luckily, that occurs only in the noisiest environments; in smooth slopes we can get 10.5 bits or so.

8-bit JPEG actually uses 12-bit DCT coefficients, and traditional JPEG coders have lots of errors due to rounding to 8 bits quite often, while Jpegli always uses floating point internally.
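As a rough illustration of how rounding to 8 bits between pipeline stages adds error, the sketch below round-trips RGB through YCbCr using the standard JFIF conversion constants, once with the intermediate values rounded to 8 bits and once kept in floating point. This is only one stage of a JPEG pipeline and is not Jpegli code; the helper names and the sampling grid are assumptions made for the example.

```python
import math

def clamp8(x):
    """Round to nearest and clamp to [0, 255]."""
    return max(0, min(255, math.floor(x + 0.5)))

def rgb_to_ycbcr(r, g, b):
    # JFIF full-range RGB -> YCbCr
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    # JFIF full-range YCbCr -> RGB
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b

def count_mismatches(round_intermediate, step=17):
    """Count RGB triples on a coarse grid that do not survive the round trip."""
    mismatches = 0
    levels = range(0, 256, step)
    for r in levels:
        for g in levels:
            for b in levels:
                ycc = rgb_to_ycbcr(r, g, b)
                if round_intermediate:
                    # Simulate storing the intermediate result as 8-bit integers.
                    ycc = tuple(clamp8(c) for c in ycc)
                out = tuple(clamp8(c) for c in ycbcr_to_rgb(*ycc))
                if out != (r, g, b):
                    mismatches += 1
    return mismatches

print("8-bit intermediate:", count_mismatches(True))
print("float intermediate:", count_mismatches(False))
```

With floating-point intermediates every color on the grid comes back unchanged, while the 8-bit intermediates lose some colors before any DCT or quantization has even happened.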
It's not a new codec, it's a new encoder/decoder for JPEG.
I consider codec to mean a pair of encoder and decoder programs.

I don't consider it to necessarily mean a new data format.

One data format can be implemented by multiple codecs.

Semantics and nomenclature within our field are likely underdeveloped, and the use of these terms varies.

This should have been in an H1 tag at the top of the page. I had to dig into a paragraph to find out that Google wasn’t about to launch another image format supported in only a scattering of apps yet served as Image Search results.
It is. (well, h3 actually)

> Introducing Jpegli: A New JPEG Coding Library

From their Github:

> Support for 16-bit unsigned and 32-bit floating point input buffers.

"10+" means 10 bits or more.

Would not ">10" be a better way to denote that?
That means something different, but "≥10" would be better IMHO. Really there's an upper limit of 12, and 10.5 is more likely in practice: https://news.ycombinator.com/item?id=39922511
I decided to call it 10.5 bits based on rather fuzzy theoretical analysis and a small number of practical experiments with jpegli in HDR use, where more bits are good to have. My thinking is that in the slowest, smoothest gradients (where banding would otherwise be visible) only three quantization decisions generate error: the (0,0), (0,1) and (1,0) coefficients. The others are close to zero. I treat these as a sum of stochastic variables with uniform error. On average they start to behave a bit like a Gaussian distribution, but each block samples those distributions 64 times, so some pixels are going to be more lucky and some less. Suppose every block has one maximally unlucky corner pixel that gets all three wrong; then:

log(4096/3)/log(2) = 10.41

So, very handwavy analysis.

Experimentally it seems to roughly hold.
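The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is just the arithmetic from the comment (12-bit coefficients, three full-size errors stacking on the unluckiest pixel); the function and parameter names are made up for the example.

```python
import math

def effective_bits(coefficient_bits=12, stacked_errors=3):
    """Bits of precision left when `stacked_errors` worst-case errors add up."""
    return math.log2(2 ** coefficient_bits / stacked_errors)

print(f"{effective_bits():.3f}")                  # 10.415
print(f"{effective_bits(stacked_errors=1):.3f}")  # 12.000
```

With a single error source the full 12 bits of the coefficient remain; with three stacked errors the estimate drops to roughly 10.4 bits.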

Yeah, >=, my bad.
For pure viewing of non-HDR content 10 bits is good enough. Very few humans can tell the difference between adjacent shades among 1024 shades. Gradients look smooth.

16 bits is useful for image capture and manipulation. But then you should just use RAW/DNG.