by adastra22 898 days ago
FLAC’s compression algorithm was pretty much garbage when it came out, and is much worse now compared to the state of the art. Even mp3 + gzip residuals would probably compress better.

FLAC doesn’t support more modern sampling formats (e.g. floating point for mastering), or complex multi channel compression for surround sound formats.

There just isn’t something better (and free) to replace it yet.

> There just isn’t something better (and free) to replace it yet.

Apple's ALAC (Apple Lossless Audio Codec) format is an open-source and patent-free alternative. I believe both ALAC and FLAC support up to 8 channels of audio, which allows them to support 5.1 and 7.1 surround. https://en.wikipedia.org/wiki/Apple_Lossless_Audio_Codec#His...

These are distribution formats, so I'd be surprised if there were demand for floating-point audio support. And in contexts where floating point audio is used, audio size is not really a problem.

When FLAC compresses stereo audio, it does a diff of the left and right channels and compresses that. This often results in a 2x additional compression ratio because the left and right channels are tightly correlated.
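As a rough illustration of the stereo trick described above, here is a minimal Python sketch of a reversible mid/side transform on integer samples. The function names are made up for this example, and the bit handling follows my understanding of how a lossless mid/side mode can work; it is not taken from the FLAC reference encoder.

```python
def to_mid_side(left, right):
    """Split integer stereo samples into an average (mid) and a difference (side)."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # drops the lowest bit of l + r
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Invert to_mid_side exactly; l + r and l - r always share the same parity."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)      # restore the bit that the shift dropped
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right
```

When the two channels are tightly correlated, the side signal stays close to zero, so it needs far fewer bits than a full channel would.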

Unless things have changed substantially and I missed it, FLAC does not do similar tricks for other multichannel audio modes. Meaning that for surround sound, each channel is independently compressed and it is unable to exploit signal correlation between channels.

Proprietary formats like Dolby on the other hand do support rather intelligent handling of multichannel modes.

FLAC is not solely a distribution format. Indeed, as a distribution format it sucks in a number of ways. It is chiefly used as an archival format, and would in fact be ideal as a mastering format if these deficiencies could be addressed.

In what ways does FLAC suck for distribution? All the music I download from Bandcamp is in that format, and it works great for me.

It could be much smaller, maybe 2-3x better compression. Better support for surround sound / multichannel audio. If an AAC stream were used for the lossy predictive stage, then existing hardware acceleration could be used for energy-efficient playback.

How would 2-3x better compression be achievable?

I don't use or desire multichannel audio but that and the hardware acceleration are interesting points.

FLAC uses 1970s-era compression technology for both compression stages (lossy and residual) in order to conservatively avoid patents in the implementation. Just replace the lossy component with AAC, which is now out of patent protection, and replace Rice coding for the residual with the much better arithmetic coding (which was still patented in the 90s). Those two changes should get a 2-4x performance improvement, as well as hardware-accelerated encoding and playback as a free bonus.
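For readers unfamiliar with the residual stage mentioned above, Rice coding can be sketched in a few lines of Python. This is a toy under stated assumptions: the function names, the zigzag mapping of signed values, and the use of a text bit string are all choices made for this example and do not match FLAC's actual bitstream layout.

```python
def rice_encode(values, k):
    """Encode signed integers as a bit string using Rice parameter k (k >= 0)."""
    out = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1          # zigzag: fold sign into low bit
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        out.append("1" * quotient + "0")             # unary quotient, 0-terminated
        if k:
            out.append(format(remainder, "0{}b".format(k)))  # k-bit binary remainder
    return "".join(out)


def rice_decode(bits, k, count):
    """Decode `count` signed integers produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1                                     # skip the terminating 0
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (quotient << k) | remainder
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values
```

Small residuals cost only a few bits each, while large ones get expensive quickly because the quotient is stored in unary. That is why the quality of the prediction stage matters so much for the final size.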

Multichannel audio support is nice because it is often used in distribution of media files sourced from DVD/BluRay. It would be good to have a high quality, free codec for that use.

> FLAC’s compression algorithm was pretty much garbage when it came out, and is much worse now compared to the state of the art. Even mp3 + gzip residuals would probably compress better.

MP3 is a lossy format so I would practically guarantee that you’d end up with a smaller file but that’s not the purpose of FLAC. Lossless encoding makes a file smaller than WAV while still being the same data.

> e.g. floating point for mastering

I’m 0% sold on floating point for mastering. 32-bit yes, but anyone who’s played a video game can tell you about those flickering textures, and those are caused not by bad floating point calculations, but by good floating point calculations (the bad part is putting textures “on top” of each other at the same coordinates). Floating point math is “fast” but not accurate. Why would anyone want that for audio? (Not trying to bash here, I’m genuinely puzzled and would love some knowledgeable insight.)
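One concrete data point for the accuracy question: IEEE 754 single precision carries a 24-bit significand, so every 16-bit or 24-bit integer PCM sample value is exactly representable as a 32-bit float. A minimal Python check, with helper names that are hypothetical and exist only for this example:

```python
import struct


def through_float32(value):
    """Round a Python float to the nearest IEEE 754 single and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def exactly_representable(n):
    """True if integer n survives a round trip through float32 unchanged."""
    return through_float32(float(n)) == float(n)
```

Integers start to get rounded only above 2**24 in magnitude, which is outside the range of 24-bit samples.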

> MP3 is a lossy format so I would practically guarantee that you’d end up with a smaller file but that’s not the purpose of FLAC. Lossless encoding makes a file smaller than WAV while still being the same data.

You misunderstood what you are replying to. FLAC works by running a lossy compression pass, and then LZ encoding the residual. The better the lossy pass, the less entropy in the residual and the smaller it compresses. FLAC’s lossy compressor pass was shit when it came out, and hasn’t gotten any better.

Flickering textures are caused by truncation and wouldn’t be any better with integer math. The same issues apply (and are solved the same way, with explicit biases; flickering shouldn’t be a thing in any quality game engine).

Floating point math is largely desired for mastering because compression (technical term overloaded meaning! Compression here means something totally different than above) results in samples having vastly different dynamic ranges. If rescaled onto the same basis, one would necessarily lose a lot of precision to truncation in intermediate calculations. Using floating point with sufficient precision makes this a non-concern.
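The truncation problem described above can be shown with a tiny Python sketch: attenuate a sample heavily, then bring it back up, once with integer intermediates and once with floating point. The function names and the gain value are made up for this illustration; a real mastering chain is of course far more involved.

```python
def roundtrip_int(sample, gain):
    """Attenuate then restore, truncating to an integer after each step."""
    attenuated = int(sample * gain)     # low-order detail is thrown away here
    return int(attenuated / gain)


def roundtrip_float(sample, gain):
    """Attenuate then restore, keeping the intermediate value in floating point."""
    return (sample * gain) / gain
```

With a sample of 12345 and a gain of 0.001 (about -60 dB), the integer path keeps only the value 12 in the middle and comes back near 12000, while the floating point path returns essentially the original value.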

> FLAC works by running a lossy compression pass, and then LZ encoding the residual.

Since when does FLAC run a lossy pass? You can recover the original soundwave from a FLAC file, you can't do the same with an MP3.

I'm pretty sure FLAC does not run a lossy compression pass.

Flickering textures in game engines are likely due to z-fighting, unless you're referring to some other type of flickering.

If you're looking to preserve as much detail as possible from your masters, then floating point makes sense. But it's really overkill.

> The FLAC encoding algorithm consists of multiple stages. In the first stage, the input audio is split into blocks. If the audio contains multiple channels, each channel is encoded separately as a subblock. The encoder then tries to find a good mathematical approximation of the block, either by fitting a simple polynomial, or through general linear predictive coding. A description of the approximation, which is only a few bytes in length, is then written. Finally, the difference between the approximation and the input, called residual, is encoded using Rice coding.

A linear predictor is a form of lossy encoding.
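The scheme in the quoted description (an approximate prediction plus an exactly stored residual) can be sketched in Python as follows. This assumes a fixed second-order polynomial predictor, which is only one of the options the quote mentions, and the function names are hypothetical. The prediction alone does not reproduce the signal, but prediction plus residual does, bit for bit.

```python
def encode_block(samples):
    """Return (warm-up samples, residuals) for a fixed second-order predictor."""
    warmup = list(samples[:2])           # stored verbatim so the decoder can start
    residual = [
        samples[n] - (2 * samples[n - 1] - samples[n - 2])   # actual minus predicted
        for n in range(2, len(samples))
    ]
    return warmup, residual


def decode_block(warmup, residual):
    """Rebuild the original samples by re-running the predictor and adding residuals."""
    out = list(warmup)
    for r in residual:
        out.append(2 * out[-1] - out[-2] + r)
    return out
```

For a smooth input such as [10, 12, 15, 19, 24, 30, 29, 27] the residuals are [1, 1, 1, 1, -7, -1], which are much smaller than the samples themselves and therefore cheaper to entropy-code.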

LPC is lossy, but FLAC maintains enough information to be able to reproduce the original data. Therefore it's lossless even though LPC is a part of the compression.

Yes exactly. What you’re saying lines up with what I’ve learned through experience.

> If you're looking to preserving as much detail as possible from your masters then floating points make sense.

I’ve been searching for hours and gotten nothing more than the classic floats vs ints handwaving. Can you explain what you know about why using floats preserves detail?

Do you actually have experience writing a FLAC encoder/decoder? I do. Go read the format specification. There is a lossy compression pass, then it uses a general compressor on the residual after you subtract out the lossy signal. The two combined allow you to reconstruct the original signal losslessly.

What do you suggest instead?

I suggest that people who care enough about these things (not me, I’m just informed about it) come together and make a new lossless encoder format that has feature parity with the proprietary/“professional” codecs.

What codec are you suggesting is better, and how much better is it? Unless encoders have wildly improved, Apple's ALAC is not better than FLAC. APE and WavPack seem to do a bit better, but not much.