|
I haven't, but now I have. I took https://opus-codec.org/static/examples/samples/music_orig.wa... from https://opus-codec.org/examples/. Then I wrote the following snippet of Python code:

from scipy.io import wavfile
import numpy as np
import zstd
sampling_rate, samples = wavfile.read(r'data/bootleg-compress/music_orig.wav')
orig = samples.tobytes()
naive_compressed = zstd.ZSTD_compress(orig)
deltas = np.diff(samples, prepend=samples.dtype.type(0), axis=0) # Per-channel deltas.
compressed_deltas = zstd.ZSTD_compress(deltas.ravel()) # Interleave channels and compress.
decompressed_deltas = np.frombuffer(zstd.ZSTD_uncompress(compressed_deltas), dtype=samples.dtype)
decompressed = np.cumsum(decompressed_deltas.reshape(deltas.shape), axis=0, dtype=samples.dtype)
assert np.array_equal(samples, decompressed)
print(len(orig))
print(len(naive_compressed))
print(len(compressed_deltas))
giving:
17432876
15518973
12817602
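One detail the snippet above relies on: differences of 16-bit samples can overflow 16 bits, and the round trip still holds because both the differencing and the running sum wrap around modulo 2**16. Here is a minimal stdlib-only sketch of that property. The helper names `wrap_int16`, `delta_encode` and `delta_decode` are made up for illustration, and the wraparound is spelled out by hand rather than relying on NumPy's int16 arithmetic.

```python
def wrap_int16(x):
    """Reduce an integer to the signed 16-bit range, as int16 arithmetic would."""
    return (x + 2**15) % 2**16 - 2**15


def delta_encode(samples):
    """First-order differences, each wrapped to int16 (first delta is against 0)."""
    prev = 0
    deltas = []
    for s in samples:
        deltas.append(wrap_int16(s - prev))
        prev = s
    return deltas


def delta_decode(deltas):
    """Running sum with the same wraparound, which undoes delta_encode."""
    total = 0
    samples = []
    for d in deltas:
        total = wrap_int16(total + d)
        samples.append(total)
    return samples


# Extreme jumps such as 32767 -> -32768 do not fit in 16 bits as a true
# difference (-65535), yet the wrapped delta still decodes correctly.
samples = [32767, -32768, 0, 12345, -1]
deltas = delta_encode(samples)
assert delta_decode(deltas) == samples
```

The same argument applies to second-order differences: each differencing step is undone by one wrapped running sum.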
Looks like my initial estimate of 2-4x was way off (FLAC achieving only ~2x should have been a red flag), but you do get a ~1.36x reduction in space at basically memory read speed.

Using an encoding of the second-order differences that stores -127 <= d <= 127 in 1 byte and everything else in 3 bytes (an escape byte followed by the 16-bit value, for 16-bit input audio), I got a ratio of ~1.50 for something that can still operate entirely at RAM speed:

orig = samples.tobytes()
deltas = np.diff(samples, prepend=samples.dtype.type(0), axis=0) # Per-channel deltas.
delta_deltas = np.diff(deltas, prepend=samples.dtype.type(0), axis=0) # Per-channel second-order differences.
# Many small differences, encode almost all 1-byte differences using 1 byte,
# using 3 bytes for larger differences. Interleave channels and encode.
small = np.sum(np.abs(delta_deltas.ravel()) <= 127)
bootleg = np.zeros(small + (len(delta_deltas.ravel()) - small) * 3, dtype=np.uint8)
i = 0
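for dda in delta_deltas.flatten():
    dda = int(dda)  # Plain Python int, so dda + 2**15 cannot overflow int16.
    if -127 <= dda <= 127:
        bootleg[i] = dda + 127  # Small value: one byte in 0..254.
        i += 1
    else:
        bootleg[i] = 255  # Escape byte: a 16-bit value follows, low byte first.
        bootleg[i + 1] = (dda + 2**15) % 256
        bootleg[i + 2] = (dda + 2**15) // 256
        i += 3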
compressed_bootleg = zstd.ZSTD_compress(bootleg)
print(len(compressed_bootleg))
decompressed_bootleg = zstd.ZSTD_uncompress(compressed_bootleg)
result = []
i = 0
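# Decode from the decompressed bytes only, so the round trip is actually tested.
while i < len(decompressed_bootleg):
    if decompressed_bootleg[i] < 255:
        result.append(decompressed_bootleg[i] - 127)
        i += 1
    else:
        lo = decompressed_bootleg[i + 1]
        hi = decompressed_bootleg[i + 2]
        result.append(256*hi + lo - 2**15)
        i += 3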
decompressed_delta_deltas = np.array(result, dtype=samples.dtype).reshape(delta_deltas.shape)
decompressed_deltas = np.cumsum(decompressed_delta_deltas, axis=0, dtype=samples.dtype)
decompressed = np.cumsum(decompressed_deltas, axis=0, dtype=samples.dtype)
assert np.array_equal(samples, decompressed)
Prints 11593846. |
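For anyone who wants to poke at the byte format without the audio file, here is the same escape-byte scheme as a pair of standalone functions, checked on the boundary values. This is only a sketch: the names `ESCAPE`, `encode` and `decode` are made up for illustration, it uses plain Python integers instead of NumPy, and it leaves out the zstd step entirely.

```python
ESCAPE = 255  # Marker byte: the next two bytes hold a full 16-bit value.


def encode(values):
    """Encode signed 16-bit integers using 1 byte if |v| <= 127, else 3 bytes."""
    out = bytearray()
    for v in values:
        if -127 <= v <= 127:
            out.append(v + 127)  # Maps -127..127 to 0..254.
        else:
            u = v + 2**15  # Shift to the unsigned range 0..65535.
            out += bytes((ESCAPE, u % 256, u // 256))  # Low byte first.
    return bytes(out)


def decode(data):
    """Invert encode(), reading only the encoded bytes."""
    values = []
    i = 0
    while i < len(data):
        if data[i] != ESCAPE:
            values.append(data[i] - 127)
            i += 1
        else:
            values.append(256 * data[i + 2] + data[i + 1] - 2**15)
            i += 3
    return values


# Boundary cases: the largest 1-byte values, the smallest 3-byte values,
# and the extremes of the int16 range.
values = [0, 127, -127, 128, -128, 32767, -32768]
encoded = encode(values)
assert decode(encoded) == values
assert len(encoded) == 3 * 1 + 4 * 3  # Three small values, four escaped ones.
```

Byte 255 is never produced for a small value, which is what lets the decoder tell the two cases apart.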