by theptip 1708 days ago
What does a "numerical text representation" of a waveform look like? (Not familiar with audio processing but interested to understand your suggestion.)

Here's a fragment of the representation of a stereo file:

       4.9600227   0.094451904297 -0.014831542969 
       4.9600454   0.089172363281 -0.0092468261719 
        4.960068   0.087493896484 -0.0065612792969 
       4.9600907   0.090179443359 -0.0028686523438 
       4.9601134   0.093963623047 0.0060729980469 
       4.9601361   0.095367431641  0.020538330078 
       4.9601587   0.094299316406  0.035186767578 
       4.9601814    0.09228515625  0.045013427734 
       4.9602041   0.089691162109  0.051422119141 
       4.9602268   0.086059570312  0.058929443359 
Columns are: [time in seconds] [left channel sample] [right channel sample]

This was generated using

      sox somefile.wav somefile.dat
You can reverse that by reversing the argument order above.
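
If you want to work with those columns in code, here is a minimal Python sketch of reading them back. `parse_dat` is a name I made up for illustration, not part of sox. It assumes whitespace-separated columns as shown above, and it skips lines starting with `;`, which as far as I know is how sox writes the header lines (sample rate, channel count) at the top of a .dat file.

```python
def parse_dat(text):
    """Parse sox-style .dat text into (time, [channel samples]) rows."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: header/comment lines start with ';' and carry no samples.
        if not line or line.startswith(";"):
            continue
        fields = [float(f) for f in line.split()]
        # First column is time in seconds, the rest are one sample per channel.
        rows.append((fields[0], fields[1:]))
    return rows


sample = """\
       4.9600227   0.094451904297 -0.014831542969
       4.9600454   0.089172363281 -0.0092468261719
"""
rows = parse_dat(sample)
```

Each entry of `rows` is a time plus a list of per-channel samples, so the same function handles mono or stereo without changes.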
This numeric representation has some advantages: it's numerically precise and can be more flexible. But it has some downsides compared to the approach suggested in the post.

- The quantization of the graphs is a feature to add some tolerance to the tests. I admit this is a mixed blessing.

- This is a lot more opaque to someone looking at a text file of the test output than what is described in the post.

The opacity of the .dat file is real and deep. But I'd expect the opacity of the go/python/lua/whatever code that generates the .dat to be extremely low, and that's what you'd read.
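
As a rough illustration of what such generating code might look like, here is a Python sketch. All three helpers (`sine_rows`, `format_dat`, `rows_close`) are hypothetical names of mine, and the text layout is only loosely modelled on the columns above, not guaranteed to match what sox writes byte for byte. The explicit tolerance in `rows_close` is one way to get back some of the slack that the quantized graphs provide.

```python
import math


def sine_rows(freq_hz, rate_hz, n_samples, amplitude=0.5):
    """Generate (time, left, right) rows: a sine on the left, silence on the right."""
    rows = []
    for i in range(n_samples):
        t = i / rate_hz
        rows.append((t, amplitude * math.sin(2 * math.pi * freq_hz * t), 0.0))
    return rows


def format_dat(rows):
    """Render rows as whitespace-separated text columns, one sample per line."""
    return "\n".join(" ".join(f"{v:.12g}" for v in row) for row in rows) + "\n"


def rows_close(actual, expected, tol=1e-4):
    """Compare two row lists value by value with an absolute tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, abs_tol=tol)
        for row_a, row_e in zip(actual, expected)
        for a, e in zip(row_a, row_e)
    )


expected = sine_rows(440, 44100, 100)
# Simulate a tiny numerical difference in the left channel.
slightly_off = [(t, left + 1e-6, right) for t, left, right in expected]
matches = rows_close(slightly_off, expected)
```

The test reads as "a 440 Hz sine, 100 samples, should match within 1e-4", which is the part a reviewer would actually look at. The .dat text itself only matters when a comparison fails.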