Hacker News
by snvzz 202 days ago
>medical images

Isn't JPEG-XL a lossy codec?


JPEG-XL is both a lossy and a lossless codec. It is already used in the DNG camera raw format, where it makes RAW images smaller.

While lossy codecs are hard to compare and up for debate, JPEG-XL is actually better as a lossless codec in terms of compression ratio and compression complexity. Only one other codec beats it, and it is not open source.

What is the non-open source codec?
HALIC is by far the best lossless codec in terms of speed/compression ratio. If a lossy mode were similarly available, we might not be discussing all these issues. I think its developer stopped working on HALIC for a long time due to lack of interest.

Its developer is also developing HALAC (High Availability Lossless Audio Compression). He recently released the source code for the first version of HALAC. And I don't think anyone cared.

Thank you for the information! Appreciate it. Will look into this more.
HALIC (High Availability Lossless Image Compression)

https://news.ycombinator.com/item?id=38990568

Thank you! Totally missed this.
It has both lossy and lossless modes.
Good to hear.

I sure hope they came up with a good, clear system to distinguish them.

As in, a clear way to detect whether a given file is lossy or lossless?

I was thinking that too, but on the other hand, even a lossless file can't guarantee that its contents aren't the result of going through a lossy intermediate format, such as a screenshot created from a JPEG.
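
To make that concrete, here is a minimal Python sketch. It uses coarse quantization as a stand-in for a lossy codec and zlib as a stand-in for a lossless one; neither is JPEG or JPEG-XL, and the sample values are made up for illustration.

```python
import zlib

# Made-up 8-bit sample values standing in for pixel data.
original = bytes([12, 13, 200, 201, 77, 78, 255, 0])

# Stand-in for a lossy step: round every value down to a multiple of 16.
degraded = bytes((v // 16) * 16 for v in original)

# Stand-in for a lossless format: zlib round-trips its input exactly.
stored = zlib.compress(degraded)
restored = zlib.decompress(stored)

# The lossless container faithfully preserves the already-degraded data...
lossless_roundtrip_ok = restored == degraded
# ...but nothing in it can bring back what the lossy step discarded.
original_recovered = restored == original

print(lossless_roundtrip_ok, original_recovered)  # True False
```

The "lossless" label only describes the last encoding step, not the history of the pixels.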

I meant like a filename convention, and tags in the file itself.
There is some sort of tag: jxlinfo can tell you if a file is "lossy" or "(possibly) lossless".
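
A rough Python sketch of how one might script that check. It assumes jxlinfo (the tool from libjxl) is on PATH and that its output contains one of the two phrases quoted above; the exact output layout and the sample strings below are assumptions, and the function names are made up for this example.

```python
import subprocess

def classify_jxlinfo_output(text: str) -> str:
    """Return 'possibly lossless', 'lossy', or 'unknown' for jxlinfo output.

    Assumes the output contains "lossy" or "(possibly) lossless".
    """
    lowered = text.lower()
    if "lossless" in lowered:
        return "possibly lossless"
    if "lossy" in lowered:
        return "lossy"
    return "unknown"

def classify_file(path: str) -> str:
    # Runs the real jxlinfo tool; raises if it is missing or fails.
    result = subprocess.run(
        ["jxlinfo", path], capture_output=True, text=True, check=True
    )
    return classify_jxlinfo_output(result.stdout)

# Illustrative strings only, not captured from a real run.
print(classify_jxlinfo_output("JPEG XL image, 640x480, (possibly) lossless, 8-bit RGB"))
print(classify_jxlinfo_output("JPEG XL image, 640x480, lossy, 8-bit RGB"))
```

Note the "possibly": the tool reports how the file was encoded, not where its pixels came from.
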
Presumably you can look at the file and tell which mode is used, though why would you care to know from the filename?
I find it incredibly helpful to know that .jpg is lossy and .png is lossless.

There are so many reasons why it's almost hard to know where to begin. But it's basically the same reason why it's helpful for some documents to end in .docx and others to end in .xlsx. It tells you what kind of data is inside.

And at least for me, for standard 24-bit RGB images, the distinction between lossy and lossless is much more important than between TIFF and PNG, or between JPG and HEIC. Knowing whether an image is degraded or not is the #1 important fact about an image for me, before anything else. It says so much about what the file is for and not for -- how I should or shouldn't edit it, what kind of format and compression level is suitable for saving after editing, etc.

After that comes whether it's animated or not, which is why .apng is so helpful to distinguish it from .png.

There's a good reason Microsoft Office documents aren't all just something like .msox, with an internal tag indicating whether they're a text document or a spreadsheet or a presentation. File extensions carry semantic meaning around the type of data they contain, and it's good practice to choose extensions that communicate the most important conceptual distinctions.
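
As a small illustration of what the extension can and can't tell you, here is a Python sketch. The table reflects only the conventional expectations described above, and the names `CONVENTIONAL_MODE` and `mode_from_extension` are invented for this example.

```python
from pathlib import Path

# Conventional expectations only, not a guarantee about any given file.
# .jxl maps to "either" because one extension covers both modes.
CONVENTIONAL_MODE = {
    ".jpg": "lossy",
    ".jpeg": "lossy",
    ".png": "lossless",
    ".apng": "lossless",
    ".jxl": "either",
}

def mode_from_extension(filename: str) -> str:
    # Compare case-insensitively so "photo.JPG" matches ".jpg".
    suffix = Path(filename).suffix.lower()
    return CONVENTIONAL_MODE.get(suffix, "unknown")

for name in ["photo.JPG", "diagram.png", "scan.jxl", "notes.txt"]:
    print(name, "->", mode_from_extension(name))
```

With .jpg and .png the name alone answers the question; with .jxl you have to open the file to find out.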

Surely something close to perceptually lossless is sufficient for most use cases?
Think of all the use cases where the output is going to be ingested by another machine. You don't know that "perceptually lossless" as designed for normal human eyeballs on normal screens in normal lighting environments is going to contain all the information an ML system will use. You want to preserve data as long as possible, until you make an active choice to throw it away. Even the system designer may not know whether it's appropriate to throw that information away, for example if they're designing digital archival systems and having to consider future users who aren't available to provide requirements.
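
A toy Python sketch of that point. Dropping the two lowest bits is only a stand-in for a "perceptually lossless" encoder (real codecs work very differently), and the sample values are made up.

```python
# Made-up 8-bit samples whose lowest bit carries a pattern a machine might read.
original = [100, 101, 150, 151, 200, 201, 50, 51]
low_bits_before = [v & 1 for v in original]

# Stand-in for a "perceptually lossless" step: clear the two lowest bits.
approx = [v & ~0b11 for v in original]

# The per-sample error stays tiny relative to the 0..255 range...
max_error = max(abs(a - b) for a, b in zip(original, approx))
# ...yet the low-bit pattern is wiped out entirely.
low_bits_after = [v & 1 for v in approx]

print(max_error)
print(low_bits_before)
print(low_bits_after)
```

An error that small would likely go unnoticed by a viewer, but any downstream system relying on those bits gets nothing.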