by zigzag312 1646 days ago
Is there any open source audio compression format like that? Lossless and very fast. I haven't found any yet.

EDIT: I'm thinking about a format that would be suitable as a replacement for uncompressed WAV files in DAWs. Rendered tracks often have large sections of silence and uncompressed WAVs have always seemed wasteful to me.

6 comments

FLAC is always lossless, but has a variable compression ratio so you can trade compression for speed.

Using the command line "flac" tool, "flac -0" is the fastest, "flac -8" is the slowest, but produces the smallest files.

In my experience, 0-2 all produce roughly equivalent sized files, as do 4-8.

I tried passing stereo WAVs to QOI as RGBA, with 2 x 16 bits (4 bytes) per pixel, but I haven't been very successful.
That's not surprising. QOI is heavily optimized for images which tend to be relatively continuous, while audio tends to oscillate a ton.
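A rough way to see the continuity point: QOI's cheapest difference op only covers per-channel byte deltas from -2 to 1, so it pays off when neighbouring bytes are close. The sketch below is only an illustration, not QOI itself. The function name `small_delta_fraction`, the test tone, and the choice to look at the low byte of each 16-bit sample are all assumptions made for the example.

```python
import math

def small_delta_fraction(values, lo=-2, hi=1):
    """Fraction of adjacent byte pairs whose wrapped difference lies in [lo, hi]."""
    hits = 0
    for prev, cur in zip(values, values[1:]):
        # Wrap the difference into -128..127, like 8-bit channel arithmetic.
        delta = ((cur - prev + 128) % 256) - 128
        if lo <= delta <= hi:
            hits += 1
    return hits / (len(values) - 1)

# A smooth, image-like ramp: each byte value repeats four times, then steps by one.
ramp = [(i // 4) % 256 for i in range(1024)]

# Low bytes of a 16-bit, 1 kHz sine sampled at 44.1 kHz (hypothetical test tone).
tone = [round(20000 * math.sin(2 * math.pi * 1000 * n / 44100)) for n in range(4410)]
tone_low_bytes = [s & 0xFF for s in tone]

print(small_delta_fraction(ramp))            # every step fits the small-delta range
print(small_delta_fraction(tone_low_bytes))  # only a small fraction fits
```

The low byte of an audio sample behaves almost like noise from one sample to the next, which is probably why packing samples into RGBA channels gives an image codec so little to work with.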
It might work to Fourier transform first (although that will likely kill performance).
A fixed-size FFT (e.g. length 64) can be made scary fast.
I'd also like to know what's the best (or any) lossless audio compression process/tools.

My application is to send audio (podcast recordings) to a remote audio engineer friend who will do the post processing, then round trip it to me to complete the editing.

Wav is so big it makes a 1 hr podcast a difficult proposition.

MP3 is unsuitable because the compression introduces too many artefacts and the quality suffers unacceptably.

What do other people do in these circumstances?

1 hour of CD quality mono FLAC encoded is about 100-150 MB. Is that small enough?
Well, I'm using a Rodecaster in multi-channel mode with 3 mics so an hour is more like 450MB.
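For reference, the arithmetic behind those sizes can be sketched as below. This assumes 44.1 kHz / 16-bit mono tracks (the thread does not say what format the Rodecaster actually records) and a FLAC ratio of about 0.5, which is only a rule of thumb; the helper names are made up for the example.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of raw (uncompressed) PCM audio data, ignoring the small WAV header."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

def estimated_flac_mb(raw_bytes, assumed_ratio=0.5):
    """Rough FLAC size in MB. assumed_ratio is a guess; real ratios depend on the material."""
    return raw_bytes * assumed_ratio / 1_000_000

one_mic = pcm_bytes(44_100, 16, 1, 3600)     # one hour, CD-quality mono
three_mics = pcm_bytes(44_100, 16, 3, 3600)  # three mono mic tracks

print(one_mic / 1_000_000, estimated_flac_mb(one_mic))        # raw MB, estimated FLAC MB
print(three_mics / 1_000_000, estimated_flac_mb(three_mics))
```

Under those assumptions one hour of mono is about 318 MB raw and roughly 160 MB as FLAC, and three tracks come to roughly 475 MB, in the same ballpark as the figures quoted above.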
FLAC and ALAC can be losslessly converted back to WAV, and they cut the file size roughly in half.
ALAC? FLAC? What is the problem with these?
FLAC is limited to 24 bit depth. I was thinking of an intermediate format suitable for use in DAWs and samplers that also supports floating point to avoid clipping.
24-bit integer and 32-bit float have the same dynamic range available, so you are not losing any fidelity.

However, frankly, if you're working professionally with audio like that, the best solution is simply to have sufficient disk space available to work with raw audio.

Use FLAC to compress the final product, when you are done.

They have the same precision but float has vastly larger dynamic range due to the 8-bit exponent. When normalized and quantized for output this does result in roughly the same effective dynamic range (depending on how much of the integer range was originally used).
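To put numbers on the precision vs. range distinction, here is a small sketch using the standard IEEE 754 single-precision layout (24-bit significand including the implicit bit, 8-bit exponent). The variable names are just for illustration, and only normal floats are counted.

```python
import math

def decibels(ratio):
    """Convert an amplitude ratio to dB."""
    return 20 * math.log10(ratio)

# 24-bit integer: 2**24 distinct levels between the smallest step and full scale.
int24_range_db = decibels(2**24)

# 32-bit float precision: 23 stored significand bits plus 1 implicit bit.
float32_precision_db = decibels(2**24)

# 32-bit float range: largest normal value over smallest normal value.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127
FLOAT32_MIN_NORMAL = 2.0**-126
float32_range_db = decibels(FLOAT32_MAX / FLOAT32_MIN_NORMAL)

print(round(int24_range_db, 1))        # about 144.5 dB
print(round(float32_precision_db, 1))  # same figure: the precision matches
print(round(float32_range_db))         # well over 1500 dB of representable range
```

So at any given signal level both formats resolve about 144 dB, but the float's exponent lets that window slide up and down by a huge amount.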

The issue is audio is typically mixed close to maximum so any processing steps can easily lead to clipping. One solution is to use float or larger integers internally during each processing step and normalize/convert back to 24-bit integer to write to disk. Another (better imo) option would be to do all intermediate steps and disk saves in a floating point format and only normalize/quantize for output once.

I haven't worked with professional audio in over 25 years (before everything went fully digital) but I would be surprised if floating point formats were not an option for encoding and intermediate workflows. Many quantization steps seem like a bad idea.

Most DAWs and plugins and audio interfaces nowadays use floating point internally.
> I would be surprised if floating point formats were not an option for encoding and intermediate workflows.

For bouncing tracks to disk, uncompressed 32-bit floating point formats are available, but I am not aware of any fast losslessly compressed 32-bit floating point format.

All professional audio production software these days internally works with 32/64 bit floats. That's the native format, because it allows you to go above 0 dBFS (maximum level), as long as you go back below it at the end of the chain.
With 24-bit integer you are at risk of clipping.

EDIT: Floating point is useful while you are working to avoid any accidental clipping. As an intermediate format, like a ProRes for video. FLAC is great as a final format.
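A toy illustration of the clipping risk, assuming 1.0 represents 0 dBFS and that the intermediate result is written out as 24-bit integer between two processing steps. The function names and sample values are hypothetical, and plain Python floats stand in for the DAW's internal float format.

```python
INT24_MAX = 2**23 - 1
INT24_MIN = -2**23

def to_int24(x):
    """Quantize a float sample (1.0 == 0 dBFS) to a 24-bit integer, hard-clipping overs."""
    return max(INT24_MIN, min(INT24_MAX, round(x * 2**23)))

def from_int24(v):
    """Convert a 24-bit integer sample back to a float in roughly [-1.0, 1.0)."""
    return v / 2**23

def chain_float(samples, gain):
    """Boost, keep the intermediate in floating point, then attenuate again."""
    boosted = [s * gain for s in samples]
    return [s / gain for s in boosted]

def chain_int24(samples, gain):
    """Boost, store the intermediate as 24-bit integer, then attenuate again."""
    boosted = [from_int24(to_int24(s * gain)) for s in samples]
    return [s / gain for s in boosted]

samples = [0.0, 0.25, 0.9, -0.9]
print(chain_float(samples, 2.0))  # the 0.9 peaks pass above 0 dBFS and come back intact
print(chain_int24(samples, 2.0))  # the 0.9 peaks are flattened to about 0.5
```

The float chain returns the original samples, while the integer intermediate permanently loses everything that went above full scale.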

Check WavPack (32-bit PCM, floats, etc.), but it's slower (not by much) than FLAC, while offering slightly better compression.
WavPack seems a bit too slow already. 3x slower decode compared to FLAC in this test https://stsaz.github.io/fmedia/audio-formats/
Wavpack on a modern CPU, from your own link, decodes at approx. 250x realtime. How fast is 'fast enough' if that isn't?
Projects with 100+ tracks are not uncommon. Sampler/rompler of a single virtual instrument can play 10+ sounds simultaneously. Playback of an orchestral score with virtual instruments can easily go over 250 simultaneous sounds, so just a real-time playback (without any additional processing) would already be a challenge.
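As a back-of-the-envelope check of that claim: if one core decodes at N times realtime, each simultaneous voice costs about 1/N of that core. The sketch below assumes the roughly 250x and "3x faster" figures quoted in this thread apply per voice on a single core, which is a simplification; `core_fraction` is a made-up helper.

```python
def core_fraction(voices, decode_speed_x_realtime):
    """Share of one CPU core spent purely on decoding the given number of voices."""
    return voices / decode_speed_x_realtime

wavpack_speed = 250            # approx. x realtime, as quoted above
flac_speed = 3 * wavpack_speed  # assuming the "3x slower than FLAC" figure

for voices in (10, 100, 250):
    print(voices, core_fraction(voices, wavpack_speed), core_fraction(voices, flac_speed))
```

With 250 voices at 250x realtime, an entire core is consumed by decoding alone, before any mixing or effects.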
Nice, I was not aware.
WavPack might fit the bill. It has decent software support. Not sure if DAWs can use it natively, they might unpack it to a temp folder.

https://www.wavpack.com/

Reaper does. Unfortunately, WavPack has a bit too much performance overhead.
It's a 20 year old format.

ZStandard is a very good compressor, with an especially fast decompressor. Maybe someone should try using this instead of zlib in an audio format (FLAC, WavPack, ...)

I mean, is there really a need for utilizing ZStd for audio compression?

FLAC is extremely good at compressing audio, has very fast encode and uber fast decode. It also doesn't use zlib...

gzip -1 is lossless and fast. It will somewhat compress pcm data :)
You would lose fast seeking ability with gzip. Or am I mistaken?
You can only seek within a gzip file if you write it with some number of Z_FULL_FLUSH points which are resumable. The command line gzip program does not support this, but it's easy using zlib. For example you might do a Z_FULL_FLUSH roughly every 50 MB of compressed data. Then you can seek to any byte in the file, search forward or backward for the flush marker, and decompress forward from there as much as you want. If your data is sorted or is a time series, it's easy to implement binary search this way. And standard gunzip etc will still be able to read the whole file as normal.
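A minimal sketch of that idea using Python's `zlib` module. One deliberate simplification: instead of scanning the file for flush markers, it records the flush offsets in a side index while writing, which avoids false matches but is not what the comment describes. The function names and the tiny chunk size are made up for the example.

```python
import gzip
import zlib

def compress_with_flush_points(chunks):
    """Gzip-compress chunks, doing a Z_FULL_FLUSH after each one.

    Returns (gzip_bytes, index); index holds (raw_offset, compressed_offset)
    pairs marking points where decompression can restart.
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, 31)  # wbits=31 selects the gzip container
    out = bytearray()
    index = []
    raw_pos = 0
    for chunk in chunks:
        out += comp.compress(chunk)
        out += comp.flush(zlib.Z_FULL_FLUSH)  # byte-align and reset the dictionary
        raw_pos += len(chunk)
        index.append((raw_pos, len(out)))
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), index

def read_from(gz_bytes, index, raw_offset, length):
    """Read `length` bytes at `raw_offset`, starting from the nearest earlier flush point."""
    start_raw, start_comp = 0, None
    for raw_pos, comp_pos in index:
        if raw_pos <= raw_offset:
            start_raw, start_comp = raw_pos, comp_pos
    skip = raw_offset - start_raw
    if start_comp is None:
        # Before the first flush point: decode from the gzip header as usual.
        data = zlib.decompressobj(31).decompress(gz_bytes, skip + length)
    else:
        # After a full flush the stream continues as raw deflate (wbits=-15).
        data = zlib.decompressobj(-15).decompress(gz_bytes[start_comp:], skip + length)
    return data[skip:skip + length]

chunks = [bytes([i]) * 1000 for i in range(5)]
gz, idx = compress_with_flush_points(chunks)
print(gzip.decompress(gz) == b"".join(chunks))  # ordinary gzip readers still work
print(read_from(gz, idx, 2995, 10))
```

In practice the flush interval would be far larger than 1000 bytes, since every full flush resets the dictionary and costs some compression.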
MOD files ;)

(but seriously, MODs can encode hours of audio into kilobytes, the downside is of course that they require a special authoring process which seems to be a bit of a lost art today)